Re: [BUGFIX PATCH V2 0/3] kprobes/arm: Improve kprobes implementation on arm
On Sat, 2017-02-18 at 06:43 +0900, Masami Hiramatsu wrote: > On Fri, 17 Feb 2017 11:01:10 + > "Jon Medhurst (Tixy)" <t...@linaro.org> wrote: [...] > > Meantime > > I'll investigate the kprobes test failures I see (which actually looks > > like cache/TLB issues and not test code problems after all). > > OK, btw, I couldn't reproduce the kprobes test failure with > CONFIG_DEBUG_RODATA=y on qemu... That's because it's a problem with caches which qemu doesn't emulate. I worked out the cause and posted a patch... http://lists.infradead.org/pipermail/linux-arm-kernel/2017-February/489517.html -- Tixy
Re: [BUGFIX PATCH V2 0/3] kprobes/arm: Improve kprobes implementation on arm
On Sat, 2017-02-18 at 06:43 +0900, Masami Hiramatsu wrote: > On Fri, 17 Feb 2017 11:01:10 + > "Jon Medhurst (Tixy)" wrote: [...] > > Meantime > > I'll investigate the kprobes test failures I see (which actually looks > > like cache/TLB issues and not test code problems after all). > > OK, btw, I couldn't reproduce the kprobes test failure with > CONFIG_DEBUG_RODATA=y on qemu... That's because it's a problem with caches which qemu doesn't emulate. I worked out the cause and posted a patch... http://lists.infradead.org/pipermail/linux-arm-kernel/2017-February/489517.html -- Tixy
Re: [BUGFIX PATCH V2 0/3] kprobes/arm: Improve kprobes implementation on arm
On Wed, 2017-02-15 at 09:27 +0900, Masami Hiramatsu wrote: > Hi, > > Here is the 2nd version of the patches which improve kprobe > on arm implementation (a kind of bugfix). Version 1 is here; > > https://lkml.org/lkml/2017/2/13/538 > > In this version I didn't update the code, just update the > patch description according to Tixy's comment and add his Ack. > > Thank you, > > --- > > Masami Hiramatsu (3): > kprobes/arm: Allow to handle reentered kprobe on single-stepping > kprobes/arm: Skip single-stepping in recursing path if possible > kprobes/arm: Fix the return address of multiple kretprobes > Thanks for doing these. Am I correct in assuming we don't need to consider these fixes urgent or critical? Only the first looks like it could be serious, and the x86 fix for that is 3 years old and ARM has gone without it all this time. So I'm guessing it's fine to wait for the normal development process and deal with it after the about to open merge window is completed? If so, I propose that I put the patches in a branch for Russell to pull later (unless he pipes up with objections or says otherwise). Meantime I'll investigate the kprobes test failures I see (which actually looks like cache/TLB issues and not test code problems after all). BTW, I added the ARM kernel list to the CC. I spotted you didn't add it to you patch postings, which means people interested in ARM (other than Russell) wouldn't have seen them. Thanks -- Tixy
Re: [BUGFIX PATCH V2 0/3] kprobes/arm: Improve kprobes implementation on arm
On Wed, 2017-02-15 at 09:27 +0900, Masami Hiramatsu wrote: > Hi, > > Here is the 2nd version of the patches which improve kprobe > on arm implementation (a kind of bugfix). Version 1 is here; > > https://lkml.org/lkml/2017/2/13/538 > > In this version I didn't update the code, just update the > patch description according to Tixy's comment and add his Ack. > > Thank you, > > --- > > Masami Hiramatsu (3): > kprobes/arm: Allow to handle reentered kprobe on single-stepping > kprobes/arm: Skip single-stepping in recursing path if possible > kprobes/arm: Fix the return address of multiple kretprobes > Thanks for doing these. Am I correct in assuming we don't need to consider these fixes urgent or critical? Only the first looks like it could be serious, and the x86 fix for that is 3 years old and ARM has gone without it all this time. So I'm guessing it's fine to wait for the normal development process and deal with it after the about to open merge window is completed? If so, I propose that I put the patches in a branch for Russell to pull later (unless he pipes up with objections or says otherwise). Meantime I'll investigate the kprobes test failures I see (which actually looks like cache/TLB issues and not test code problems after all). BTW, I added the ARM kernel list to the CC. I spotted you didn't add it to you patch postings, which means people interested in ARM (other than Russell) wouldn't have seen them. Thanks -- Tixy
Re: [BUGFIX PATCH 3/3] kprobes/arm: Fix the return address of multiple kretprobes
On Wed, 2017-02-15 at 01:01 +0900, Masami Hiramatsu wrote: > On Tue, 14 Feb 2017 13:47:07 + > "Jon Medhurst (Tixy)" <t...@linaro.org> wrote: > > > On Tue, 2017-02-14 at 10:32 +, Jon Medhurst (Tixy) wrote: > > > On Tue, 2017-02-14 at 00:05 +0900, Masami Hiramatsu wrote: > > > > This is arm port of commit 737480a0d525 ("kprobes/x86: > > > > Fix the return address of multiple kretprobes"). > > > > > > > > Fix the return address of subsequent kretprobes when multiple > > > > kretprobes are set on the same function. > > > > > > > > For example: > > > > > > > > # cd /sys/kernel/debug/tracing > > > > # echo "r:event1 sys_symlink" > kprobe_events > > > > # echo "r:event2 sys_symlink" >> kprobe_events > > > > # echo 1 > events/kprobes/enable > > > > # ln -s /tmp/foo /tmp/bar > > > > > > > > (without this patch) > > > > > > > > # cat trace | grep -v ^# > > > > ln-82[000] dn.268.446525: event1: > > > > (kretprobe_trampoline+0x0/0x18 <- SyS_symlink) > > > > ln-82[000] dn.268.447831: event2: > > > > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > > > > > > > (with this patch) > > > > > > > > # cat trace | grep -v ^# > > > > ln-81[000] dn.139.463469: event1: > > > > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > > > ln-81[000] dn.139.464701: event2: > > > > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > > > > > > > Signed-off-by: Masami Hiramatsu <mhira...@kernel.org> > > > > Cc: KUMANO Syuhei <kumano.p...@gmail.com> > > > > --- > > > > > > I don't fully understand this function, but I've checked that the ARM > > > version now matches the x86 version (apart from the x86 specific > > > register fixup and some comments). So, FWIW > > > > > > Acked-by: Jon Medhurst <t...@linaro.org> > > > > > > I ran the before and after test case in the commit log on ARM and > > > verified the result is correct. I also tried running the ARM kprobe > > > tests with these 3 fixes but the tests fail. However, they also fail > > > without any of these changes, so I'll investigate that further... > > > > Bisecting the issue led me back to Linux 4.5 and commit 25362dc496ed > > ("ARM: 8501/1: mm: flip priority of CONFIG_DEBUG_RODATA") > > > > This sets CONFIG_DEBUG_RODATA to be enabled by default. If I disable > > that on 4.10-rc4, with the three patches in this series, then the ARM > > kprobes tests pass OK. > > > > I'll stick the DEBUG_RODATA issue on my todo list (it's been around for > > a year, so can probably wait a little longer). > > Hmm, I'm running arm kernel on qemu, which maybe the reason why > the test case passed in my environment, since my kconfig also sets > CONFIG_DEBUG_RODATA=y. > > BTW, would you see that any kprobe_events didn't work with > CONFIG_DEBUG_RODATA=y? (what the failure messages were?) The tests I'm running are the ARM specific tests that are enabled by CONFIG_ARM_KPROBES_TEST=y. I'm running the tests on real multicore ARM hardware (Versatile Express with a TC2 CoreTile) For me, sometimes the first test gave: Beginning kprobe tests... Probe ARM code kprobe FAIL: test regs not OK Other times, for the specific instruction emulation tests they return FAIL: test_before_handler not run Not sure how much of the diagnostic appear without setting the tests to be verbose, which I do with: sed -e 's/VERBOSE 0/VERBOSE 1/' -i arch/arm/probes/kprobes/test-core.h Whilst writing a reply, I looked at the test code in arch/arm/probes/kprobes/test-core.c (which I wrote some years ago) and there is possibly a clue staring at us in the comments at the top of the file... * * The above would expand to assembler looking something like: * * @ TESTCASE_START * bl __kprobes_test_case_start * .pushsection .rodata * "10: * .ascii "mov r0, r7" @ text title for test case * .byte 0 * .popsection * @ start of inline data... * .word 10b @ pointer to title in .rodata section Note the ".pushsection .rodata" (though I don't see an immediate obvious reason why that would cause a problem. It certainly seems likely that the problem is with the ARM test code rather than actual kprobe implementation itself. Like I said, this issue has been there for a year or more, so I wasn't planning on spending time on it for a few more days yet whilst I get on with other urgent matters. -- Tixy Basically, m
Re: [BUGFIX PATCH 3/3] kprobes/arm: Fix the return address of multiple kretprobes
On Wed, 2017-02-15 at 01:01 +0900, Masami Hiramatsu wrote: > On Tue, 14 Feb 2017 13:47:07 + > "Jon Medhurst (Tixy)" wrote: > > > On Tue, 2017-02-14 at 10:32 +, Jon Medhurst (Tixy) wrote: > > > On Tue, 2017-02-14 at 00:05 +0900, Masami Hiramatsu wrote: > > > > This is arm port of commit 737480a0d525 ("kprobes/x86: > > > > Fix the return address of multiple kretprobes"). > > > > > > > > Fix the return address of subsequent kretprobes when multiple > > > > kretprobes are set on the same function. > > > > > > > > For example: > > > > > > > > # cd /sys/kernel/debug/tracing > > > > # echo "r:event1 sys_symlink" > kprobe_events > > > > # echo "r:event2 sys_symlink" >> kprobe_events > > > > # echo 1 > events/kprobes/enable > > > > # ln -s /tmp/foo /tmp/bar > > > > > > > > (without this patch) > > > > > > > > # cat trace | grep -v ^# > > > > ln-82[000] dn.268.446525: event1: > > > > (kretprobe_trampoline+0x0/0x18 <- SyS_symlink) > > > > ln-82[000] dn.268.447831: event2: > > > > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > > > > > > > (with this patch) > > > > > > > > # cat trace | grep -v ^# > > > > ln-81[000] dn.139.463469: event1: > > > > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > > > ln-81[000] dn.139.464701: event2: > > > > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > > > > > > > Signed-off-by: Masami Hiramatsu > > > > Cc: KUMANO Syuhei > > > > --- > > > > > > I don't fully understand this function, but I've checked that the ARM > > > version now matches the x86 version (apart from the x86 specific > > > register fixup and some comments). So, FWIW > > > > > > Acked-by: Jon Medhurst > > > > > > I ran the before and after test case in the commit log on ARM and > > > verified the result is correct. I also tried running the ARM kprobe > > > tests with these 3 fixes but the tests fail. However, they also fail > > > without any of these changes, so I'll investigate that further... > > > > Bisecting the issue led me back to Linux 4.5 and commit 25362dc496ed > > ("ARM: 8501/1: mm: flip priority of CONFIG_DEBUG_RODATA") > > > > This sets CONFIG_DEBUG_RODATA to be enabled by default. If I disable > > that on 4.10-rc4, with the three patches in this series, then the ARM > > kprobes tests pass OK. > > > > I'll stick the DEBUG_RODATA issue on my todo list (it's been around for > > a year, so can probably wait a little longer). > > Hmm, I'm running arm kernel on qemu, which maybe the reason why > the test case passed in my environment, since my kconfig also sets > CONFIG_DEBUG_RODATA=y. > > BTW, would you see that any kprobe_events didn't work with > CONFIG_DEBUG_RODATA=y? (what the failure messages were?) The tests I'm running are the ARM specific tests that are enabled by CONFIG_ARM_KPROBES_TEST=y. I'm running the tests on real multicore ARM hardware (Versatile Express with a TC2 CoreTile) For me, sometimes the first test gave: Beginning kprobe tests... Probe ARM code kprobe FAIL: test regs not OK Other times, for the specific instruction emulation tests they return FAIL: test_before_handler not run Not sure how much of the diagnostic appear without setting the tests to be verbose, which I do with: sed -e 's/VERBOSE 0/VERBOSE 1/' -i arch/arm/probes/kprobes/test-core.h Whilst writing a reply, I looked at the test code in arch/arm/probes/kprobes/test-core.c (which I wrote some years ago) and there is possibly a clue staring at us in the comments at the top of the file... * * The above would expand to assembler looking something like: * * @ TESTCASE_START * bl __kprobes_test_case_start * .pushsection .rodata * "10: * .ascii "mov r0, r7" @ text title for test case * .byte 0 * .popsection * @ start of inline data... * .word 10b @ pointer to title in .rodata section Note the ".pushsection .rodata" (though I don't see an immediate obvious reason why that would cause a problem. It certainly seems likely that the problem is with the ARM test code rather than actual kprobe implementation itself. Like I said, this issue has been there for a year or more, so I wasn't planning on spending time on it for a few more days yet whilst I get on with other urgent matters. -- Tixy Basically, m
Re: [BUGFIX PATCH 3/3] kprobes/arm: Fix the return address of multiple kretprobes
On Tue, 2017-02-14 at 10:32 +, Jon Medhurst (Tixy) wrote: > On Tue, 2017-02-14 at 00:05 +0900, Masami Hiramatsu wrote: > > This is arm port of commit 737480a0d525 ("kprobes/x86: > > Fix the return address of multiple kretprobes"). > > > > Fix the return address of subsequent kretprobes when multiple > > kretprobes are set on the same function. > > > > For example: > > > > # cd /sys/kernel/debug/tracing > > # echo "r:event1 sys_symlink" > kprobe_events > > # echo "r:event2 sys_symlink" >> kprobe_events > > # echo 1 > events/kprobes/enable > > # ln -s /tmp/foo /tmp/bar > > > > (without this patch) > > > > # cat trace | grep -v ^# > > ln-82[000] dn.268.446525: event1: > > (kretprobe_trampoline+0x0/0x18 <- SyS_symlink) > > ln-82[000] dn.268.447831: event2: > > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > > > (with this patch) > > > > # cat trace | grep -v ^# > > ln-81[000] dn.139.463469: event1: > > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > ln-81[000] dn.139.464701: event2: > > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > > > Signed-off-by: Masami Hiramatsu <mhira...@kernel.org> > > Cc: KUMANO Syuhei <kumano.p...@gmail.com> > > --- > > I don't fully understand this function, but I've checked that the ARM > version now matches the x86 version (apart from the x86 specific > register fixup and some comments). So, FWIW > > Acked-by: Jon Medhurst <t...@linaro.org> > > I ran the before and after test case in the commit log on ARM and > verified the result is correct. I also tried running the ARM kprobe > tests with these 3 fixes but the tests fail. However, they also fail > without any of these changes, so I'll investigate that further... Bisecting the issue led me back to Linux 4.5 and commit 25362dc496ed ("ARM: 8501/1: mm: flip priority of CONFIG_DEBUG_RODATA") This sets CONFIG_DEBUG_RODATA to be enabled by default. If I disable that on 4.10-rc4, with the three patches in this series, then the ARM kprobes tests pass OK. I'll stick the DEBUG_RODATA issue on my todo list (it's been around for a year, so can probably wait a little longer). -- Tixy
Re: [BUGFIX PATCH 3/3] kprobes/arm: Fix the return address of multiple kretprobes
On Tue, 2017-02-14 at 10:32 +, Jon Medhurst (Tixy) wrote: > On Tue, 2017-02-14 at 00:05 +0900, Masami Hiramatsu wrote: > > This is arm port of commit 737480a0d525 ("kprobes/x86: > > Fix the return address of multiple kretprobes"). > > > > Fix the return address of subsequent kretprobes when multiple > > kretprobes are set on the same function. > > > > For example: > > > > # cd /sys/kernel/debug/tracing > > # echo "r:event1 sys_symlink" > kprobe_events > > # echo "r:event2 sys_symlink" >> kprobe_events > > # echo 1 > events/kprobes/enable > > # ln -s /tmp/foo /tmp/bar > > > > (without this patch) > > > > # cat trace | grep -v ^# > > ln-82[000] dn.268.446525: event1: > > (kretprobe_trampoline+0x0/0x18 <- SyS_symlink) > > ln-82[000] dn.268.447831: event2: > > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > > > (with this patch) > > > > # cat trace | grep -v ^# > > ln-81[000] dn.139.463469: event1: > > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > ln-81[000] dn.139.464701: event2: > > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > > > Signed-off-by: Masami Hiramatsu > > Cc: KUMANO Syuhei > > --- > > I don't fully understand this function, but I've checked that the ARM > version now matches the x86 version (apart from the x86 specific > register fixup and some comments). So, FWIW > > Acked-by: Jon Medhurst > > I ran the before and after test case in the commit log on ARM and > verified the result is correct. I also tried running the ARM kprobe > tests with these 3 fixes but the tests fail. However, they also fail > without any of these changes, so I'll investigate that further... Bisecting the issue led me back to Linux 4.5 and commit 25362dc496ed ("ARM: 8501/1: mm: flip priority of CONFIG_DEBUG_RODATA") This sets CONFIG_DEBUG_RODATA to be enabled by default. If I disable that on 4.10-rc4, with the three patches in this series, then the ARM kprobes tests pass OK. I'll stick the DEBUG_RODATA issue on my todo list (it's been around for a year, so can probably wait a little longer). -- Tixy
Re: [BUGFIX PATCH 3/3] kprobes/arm: Fix the return address of multiple kretprobes
On Tue, 2017-02-14 at 00:05 +0900, Masami Hiramatsu wrote: > This is arm port of commit 737480a0d525 ("kprobes/x86: > Fix the return address of multiple kretprobes"). > > Fix the return address of subsequent kretprobes when multiple > kretprobes are set on the same function. > > For example: > > # cd /sys/kernel/debug/tracing > # echo "r:event1 sys_symlink" > kprobe_events > # echo "r:event2 sys_symlink" >> kprobe_events > # echo 1 > events/kprobes/enable > # ln -s /tmp/foo /tmp/bar > > (without this patch) > > # cat trace | grep -v ^# > ln-82[000] dn.268.446525: event1: > (kretprobe_trampoline+0x0/0x18 <- SyS_symlink) > ln-82[000] dn.268.447831: event2: > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > (with this patch) > > # cat trace | grep -v ^# > ln-81[000] dn.139.463469: event1: > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > ln-81[000] dn.139.464701: event2: > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > Signed-off-by: Masami Hiramatsu> Cc: KUMANO Syuhei > --- I don't fully understand this function, but I've checked that the ARM version now matches the x86 version (apart from the x86 specific register fixup and some comments). So, FWIW Acked-by: Jon Medhurst I ran the before and after test case in the commit log on ARM and verified the result is correct. I also tried running the ARM kprobe tests with these 3 fixes but the tests fail. However, they also fail without any of these changes, so I'll investigate that further... > arch/arm/probes/kprobes/core.c | 24 ++-- > 1 file changed, 22 insertions(+), 2 deletions(-) > > diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c > index 84989ae..023800a 100644 > --- a/arch/arm/probes/kprobes/core.c > +++ b/arch/arm/probes/kprobes/core.c > @@ -440,6 +440,7 @@ static __used __kprobes void *trampoline_handler(struct > pt_regs *regs) > struct hlist_node *tmp; > unsigned long flags, orig_ret_address = 0; > unsigned long trampoline_address = (unsigned long)_trampoline; > + kprobe_opcode_t *correct_ret_addr = NULL; > > INIT_HLIST_HEAD(_rp); > kretprobe_hash_lock(current, , ); > @@ -462,14 +463,34 @@ static __used __kprobes void *trampoline_handler(struct > pt_regs *regs) > /* another task is sharing our hash bucket */ > continue; > > + orig_ret_address = (unsigned long)ri->ret_addr; > + > + if (orig_ret_address != trampoline_address) > + /* > + * This is the real return address. Any other > + * instances associated with this task are for > + * other calls deeper on the call stack > + */ > + break; > + } > + > + kretprobe_assert(ri, orig_ret_address, trampoline_address); > + > + correct_ret_addr = ri->ret_addr; > + hlist_for_each_entry_safe(ri, tmp, head, hlist) { > + if (ri->task != current) > + /* another task is sharing our hash bucket */ > + continue; > + > + orig_ret_address = (unsigned long)ri->ret_addr; > if (ri->rp && ri->rp->handler) { > __this_cpu_write(current_kprobe, >rp->kp); > get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE; > + ri->ret_addr = correct_ret_addr; > ri->rp->handler(ri, regs); > __this_cpu_write(current_kprobe, NULL); > } > > - orig_ret_address = (unsigned long)ri->ret_addr; > recycle_rp_inst(ri, _rp); > > if (orig_ret_address != trampoline_address) > @@ -481,7 +502,6 @@ static __used __kprobes void *trampoline_handler(struct > pt_regs *regs) > break; > } > > - kretprobe_assert(ri, orig_ret_address, trampoline_address); > kretprobe_hash_unlock(current, ); > > hlist_for_each_entry_safe(ri, tmp, _rp, hlist) { >
Re: [BUGFIX PATCH 3/3] kprobes/arm: Fix the return address of multiple kretprobes
On Tue, 2017-02-14 at 00:05 +0900, Masami Hiramatsu wrote: > This is arm port of commit 737480a0d525 ("kprobes/x86: > Fix the return address of multiple kretprobes"). > > Fix the return address of subsequent kretprobes when multiple > kretprobes are set on the same function. > > For example: > > # cd /sys/kernel/debug/tracing > # echo "r:event1 sys_symlink" > kprobe_events > # echo "r:event2 sys_symlink" >> kprobe_events > # echo 1 > events/kprobes/enable > # ln -s /tmp/foo /tmp/bar > > (without this patch) > > # cat trace | grep -v ^# > ln-82[000] dn.268.446525: event1: > (kretprobe_trampoline+0x0/0x18 <- SyS_symlink) > ln-82[000] dn.268.447831: event2: > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > (with this patch) > > # cat trace | grep -v ^# > ln-81[000] dn.139.463469: event1: > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > ln-81[000] dn.139.464701: event2: > (ret_fast_syscall+0x0/0x1c <- SyS_symlink) > > Signed-off-by: Masami Hiramatsu > Cc: KUMANO Syuhei > --- I don't fully understand this function, but I've checked that the ARM version now matches the x86 version (apart from the x86 specific register fixup and some comments). So, FWIW Acked-by: Jon Medhurst I ran the before and after test case in the commit log on ARM and verified the result is correct. I also tried running the ARM kprobe tests with these 3 fixes but the tests fail. However, they also fail without any of these changes, so I'll investigate that further... > arch/arm/probes/kprobes/core.c | 24 ++-- > 1 file changed, 22 insertions(+), 2 deletions(-) > > diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c > index 84989ae..023800a 100644 > --- a/arch/arm/probes/kprobes/core.c > +++ b/arch/arm/probes/kprobes/core.c > @@ -440,6 +440,7 @@ static __used __kprobes void *trampoline_handler(struct > pt_regs *regs) > struct hlist_node *tmp; > unsigned long flags, orig_ret_address = 0; > unsigned long trampoline_address = (unsigned long)_trampoline; > + kprobe_opcode_t *correct_ret_addr = NULL; > > INIT_HLIST_HEAD(_rp); > kretprobe_hash_lock(current, , ); > @@ -462,14 +463,34 @@ static __used __kprobes void *trampoline_handler(struct > pt_regs *regs) > /* another task is sharing our hash bucket */ > continue; > > + orig_ret_address = (unsigned long)ri->ret_addr; > + > + if (orig_ret_address != trampoline_address) > + /* > + * This is the real return address. Any other > + * instances associated with this task are for > + * other calls deeper on the call stack > + */ > + break; > + } > + > + kretprobe_assert(ri, orig_ret_address, trampoline_address); > + > + correct_ret_addr = ri->ret_addr; > + hlist_for_each_entry_safe(ri, tmp, head, hlist) { > + if (ri->task != current) > + /* another task is sharing our hash bucket */ > + continue; > + > + orig_ret_address = (unsigned long)ri->ret_addr; > if (ri->rp && ri->rp->handler) { > __this_cpu_write(current_kprobe, >rp->kp); > get_kprobe_ctlblk()->kprobe_status = KPROBE_HIT_ACTIVE; > + ri->ret_addr = correct_ret_addr; > ri->rp->handler(ri, regs); > __this_cpu_write(current_kprobe, NULL); > } > > - orig_ret_address = (unsigned long)ri->ret_addr; > recycle_rp_inst(ri, _rp); > > if (orig_ret_address != trampoline_address) > @@ -481,7 +502,6 @@ static __used __kprobes void *trampoline_handler(struct > pt_regs *regs) > break; > } > > - kretprobe_assert(ri, orig_ret_address, trampoline_address); > kretprobe_hash_unlock(current, ); > > hlist_for_each_entry_safe(ri, tmp, _rp, hlist) { >
Re: [BUGFIX PATCH 1/3] kprobes/arm: Allow to handle reentered kprobe on single-stepping
On Tue, 2017-02-14 at 00:03 +0900, Masami Hiramatsu wrote: > This is arm port of commit 6a5022a56ac3 ("kprobes/x86: Allow to > handle reentered kprobe on single-stepping") > > Since the FIQ handlers can interrupt in the single stepping > (or preparing the single stepping, do_debug etc.), we should > consider a kprobe is hit in the NMI handler. Even in that > case, the kprobe is allowed to be reentered as same as the > kprobes hit in kprobe handlers > (KPROBE_HIT_ACTIVE or KPROBE_HIT_SSDONE). > > The real issue will happen when a kprobe hit while another Could to with 'is' being inserted above ^^^ (I know this is a copy of the x86 commit message) > reentered kprobe is processing (KPROBE_REENTER), because > we already consumed a saved-area for the previous kprobe. > > Signed-off-by: Masami Hiramatsu> --- Acked-by: Jon Medhurst > arch/arm/probes/kprobes/core.c |6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c > index a4ec240..264fedb 100644 > --- a/arch/arm/probes/kprobes/core.c > +++ b/arch/arm/probes/kprobes/core.c > @@ -270,6 +270,7 @@ void __kprobes kprobe_handler(struct pt_regs *regs) > switch (kcb->kprobe_status) { > case KPROBE_HIT_ACTIVE: > case KPROBE_HIT_SSDONE: > + case KPROBE_HIT_SS: > /* A pre- or post-handler probe got us here. */ > kprobes_inc_nmissed_count(p); > save_previous_kprobe(kcb); > @@ -278,6 +279,11 @@ void __kprobes kprobe_handler(struct pt_regs *regs) > singlestep(p, regs, kcb); > restore_previous_kprobe(kcb); > break; > + case KPROBE_REENTER: > + /* A nested probe was hit in FIQ, it is a BUG */ > + pr_warn("Unrecoverable kprobe detected at > %p.\n", > + p->addr); > + /* fall through */ > default: > /* impossible cases */ > BUG(); >
Re: [BUGFIX PATCH 1/3] kprobes/arm: Allow to handle reentered kprobe on single-stepping
On Tue, 2017-02-14 at 00:03 +0900, Masami Hiramatsu wrote: > This is arm port of commit 6a5022a56ac3 ("kprobes/x86: Allow to > handle reentered kprobe on single-stepping") > > Since the FIQ handlers can interrupt in the single stepping > (or preparing the single stepping, do_debug etc.), we should > consider a kprobe is hit in the NMI handler. Even in that > case, the kprobe is allowed to be reentered as same as the > kprobes hit in kprobe handlers > (KPROBE_HIT_ACTIVE or KPROBE_HIT_SSDONE). > > The real issue will happen when a kprobe hit while another Could to with 'is' being inserted above ^^^ (I know this is a copy of the x86 commit message) > reentered kprobe is processing (KPROBE_REENTER), because > we already consumed a saved-area for the previous kprobe. > > Signed-off-by: Masami Hiramatsu > --- Acked-by: Jon Medhurst > arch/arm/probes/kprobes/core.c |6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c > index a4ec240..264fedb 100644 > --- a/arch/arm/probes/kprobes/core.c > +++ b/arch/arm/probes/kprobes/core.c > @@ -270,6 +270,7 @@ void __kprobes kprobe_handler(struct pt_regs *regs) > switch (kcb->kprobe_status) { > case KPROBE_HIT_ACTIVE: > case KPROBE_HIT_SSDONE: > + case KPROBE_HIT_SS: > /* A pre- or post-handler probe got us here. */ > kprobes_inc_nmissed_count(p); > save_previous_kprobe(kcb); > @@ -278,6 +279,11 @@ void __kprobes kprobe_handler(struct pt_regs *regs) > singlestep(p, regs, kcb); > restore_previous_kprobe(kcb); > break; > + case KPROBE_REENTER: > + /* A nested probe was hit in FIQ, it is a BUG */ > + pr_warn("Unrecoverable kprobe detected at > %p.\n", > + p->addr); > + /* fall through */ > default: > /* impossible cases */ > BUG(); >
Re: [BUGFIX PATCH 2/3] kprobes/arm: Skip single-stepping in recursing path if possible
On Tue, 2017-02-14 at 00:04 +0900, Masami Hiramatsu wrote: > Kprobes/arm skips single-stepping (moreover handling the event) > if the conditional instruction must not be executed. This > also apply the rule when we hit the recursing kprobe, so > that kprobe does not count nmissed up in that case. Perhaps that last sentence would read better if written something like: "This also applies that rule when we hit a recursing kprobe, so that the nmissed count isn't incremented in that case." > Signed-off-by: Masami HiramatsuAcked-by: Jon Medhurst > --- > arch/arm/probes/kprobes/core.c | 19 ++- > 1 file changed, 10 insertions(+), 9 deletions(-) > > diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c > index 264fedb..84989ae 100644 > --- a/arch/arm/probes/kprobes/core.c > +++ b/arch/arm/probes/kprobes/core.c > @@ -265,7 +265,15 @@ void __kprobes kprobe_handler(struct pt_regs *regs) > #endif > > if (p) { > - if (cur) { > + if (!p->ainsn.insn_check_cc(regs->ARM_cpsr)) { > + /* > + * Probe hit but conditional execution check failed, > + * so just skip the instruction and continue as if > + * nothing had happened. > + * In this case, we can skip recursing check too. > + */ > + singlestep_skip(p, regs); > + } else if (cur) { > /* Kprobe is pending, so we're recursing. */ > switch (kcb->kprobe_status) { > case KPROBE_HIT_ACTIVE: > @@ -288,7 +296,7 @@ void __kprobes kprobe_handler(struct pt_regs *regs) > /* impossible cases */ > BUG(); > } > - } else if (p->ainsn.insn_check_cc(regs->ARM_cpsr)) { > + } else { > /* Probe hit and conditional execution check ok. */ > set_current_kprobe(p); > kcb->kprobe_status = KPROBE_HIT_ACTIVE; > @@ -309,13 +317,6 @@ void __kprobes kprobe_handler(struct pt_regs *regs) > } > reset_current_kprobe(); > } > - } else { > - /* > - * Probe hit but conditional execution check failed, > - * so just skip the instruction and continue as if > - * nothing had happened. > - */ > - singlestep_skip(p, regs); > } > } else if (cur) { > /* We probably hit a jprobe. Call its break handler. */ >
Re: [BUGFIX PATCH 2/3] kprobes/arm: Skip single-stepping in recursing path if possible
On Tue, 2017-02-14 at 00:04 +0900, Masami Hiramatsu wrote: > Kprobes/arm skips single-stepping (moreover handling the event) > if the conditional instruction must not be executed. This > also apply the rule when we hit the recursing kprobe, so > that kprobe does not count nmissed up in that case. Perhaps that last sentence would read better if written something like: "This also applies that rule when we hit a recursing kprobe, so that the nmissed count isn't incremented in that case." > Signed-off-by: Masami Hiramatsu Acked-by: Jon Medhurst > --- > arch/arm/probes/kprobes/core.c | 19 ++- > 1 file changed, 10 insertions(+), 9 deletions(-) > > diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c > index 264fedb..84989ae 100644 > --- a/arch/arm/probes/kprobes/core.c > +++ b/arch/arm/probes/kprobes/core.c > @@ -265,7 +265,15 @@ void __kprobes kprobe_handler(struct pt_regs *regs) > #endif > > if (p) { > - if (cur) { > + if (!p->ainsn.insn_check_cc(regs->ARM_cpsr)) { > + /* > + * Probe hit but conditional execution check failed, > + * so just skip the instruction and continue as if > + * nothing had happened. > + * In this case, we can skip recursing check too. > + */ > + singlestep_skip(p, regs); > + } else if (cur) { > /* Kprobe is pending, so we're recursing. */ > switch (kcb->kprobe_status) { > case KPROBE_HIT_ACTIVE: > @@ -288,7 +296,7 @@ void __kprobes kprobe_handler(struct pt_regs *regs) > /* impossible cases */ > BUG(); > } > - } else if (p->ainsn.insn_check_cc(regs->ARM_cpsr)) { > + } else { > /* Probe hit and conditional execution check ok. */ > set_current_kprobe(p); > kcb->kprobe_status = KPROBE_HIT_ACTIVE; > @@ -309,13 +317,6 @@ void __kprobes kprobe_handler(struct pt_regs *regs) > } > reset_current_kprobe(); > } > - } else { > - /* > - * Probe hit but conditional execution check failed, > - * so just skip the instruction and continue as if > - * nothing had happened. > - */ > - singlestep_skip(p, regs); > } > } else if (cur) { > /* We probably hit a jprobe. Call its break handler. */ >
[PATCH v2] ASoC: hdmi-codec: Fix hdmi_of_xlate_dai_name when #sound-dai-cells = <0>
If a DAI specifies "#sound-dai-cells = <0>" in device-tree then hdmi_of_xlate_dai_name() will be called with zero args, which it isn't implemented to cope with. The resulting use of an uninitialised variable for the id will usually result in an error like: asoc-simple-card sound: parse error -11 asoc-simple-card: probe of sound failed with error -11 Fix this by using and id of zero if no arg is provided. Fixes: 9731f82d6016 ("ASoC: hdmi-codec: enable multi probe for same device") Signed-off-by: Jon Medhurst--- Changes since v1 - Replace ternary ?: operator with if/else sound/soc/codecs/hdmi-codec.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/sound/soc/codecs/hdmi-codec.c b/sound/soc/codecs/hdmi-codec.c index b904492..90b5948 100644 --- a/sound/soc/codecs/hdmi-codec.c +++ b/sound/soc/codecs/hdmi-codec.c @@ -364,7 +364,12 @@ static int hdmi_of_xlate_dai_name(struct snd_soc_component *component, struct of_phandle_args *args, const char **dai_name) { - int id = args->args[0]; + int id; + + if (args->args_count) + id = args->args[0]; + else + id = 0; if (id < ARRAY_SIZE(hdmi_dai_name)) { *dai_name = hdmi_dai_name[id]; -- 2.1.4
[PATCH v2] ASoC: hdmi-codec: Fix hdmi_of_xlate_dai_name when #sound-dai-cells = <0>
If a DAI specifies "#sound-dai-cells = <0>" in device-tree then hdmi_of_xlate_dai_name() will be called with zero args, which it isn't implemented to cope with. The resulting use of an uninitialised variable for the id will usually result in an error like: asoc-simple-card sound: parse error -11 asoc-simple-card: probe of sound failed with error -11 Fix this by using and id of zero if no arg is provided. Fixes: 9731f82d6016 ("ASoC: hdmi-codec: enable multi probe for same device") Signed-off-by: Jon Medhurst --- Changes since v1 - Replace ternary ?: operator with if/else sound/soc/codecs/hdmi-codec.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/sound/soc/codecs/hdmi-codec.c b/sound/soc/codecs/hdmi-codec.c index b904492..90b5948 100644 --- a/sound/soc/codecs/hdmi-codec.c +++ b/sound/soc/codecs/hdmi-codec.c @@ -364,7 +364,12 @@ static int hdmi_of_xlate_dai_name(struct snd_soc_component *component, struct of_phandle_args *args, const char **dai_name) { - int id = args->args[0]; + int id; + + if (args->args_count) + id = args->args[0]; + else + id = 0; if (id < ARRAY_SIZE(hdmi_dai_name)) { *dai_name = hdmi_dai_name[id]; -- 2.1.4
[PATCH] ASoC: hdmi-codec: Fix hdmi_of_xlate_dai_name when #sound-dai-cells = <0>
If a DAI specifies "#sound-dai-cells = <0>" in device-tree then hdmi_of_xlate_dai_name() will be called with zero args, which it isn't implemented to cope with. The resulting use of an uninitialised variable for the id will usually result in an error like: asoc-simple-card sound: parse error -11 asoc-simple-card: probe of sound failed with error -11 Fix this by using and id of zero if no arg is provided. Fixes: 9731f82d6016 ("ASoC: hdmi-codec: enable multi probe for same device") Signed-off-by: Jon Medhurst--- Jyri, FYI, I believe this issue broke your commit df0bd1e8f3c508 ("ARM: dts: am335x-boneblack: Add HDMI audio support") sound/soc/codecs/hdmi-codec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/codecs/hdmi-codec.c b/sound/soc/codecs/hdmi-codec.c index b904492..6661ace 100644 --- a/sound/soc/codecs/hdmi-codec.c +++ b/sound/soc/codecs/hdmi-codec.c @@ -364,7 +364,7 @@ static int hdmi_of_xlate_dai_name(struct snd_soc_component *component, struct of_phandle_args *args, const char **dai_name) { - int id = args->args[0]; + int id = args->args_count ? args->args[0] : 0; if (id < ARRAY_SIZE(hdmi_dai_name)) { *dai_name = hdmi_dai_name[id]; -- 2.1.4
[PATCH] ASoC: hdmi-codec: Fix hdmi_of_xlate_dai_name when #sound-dai-cells = <0>
If a DAI specifies "#sound-dai-cells = <0>" in device-tree then hdmi_of_xlate_dai_name() will be called with zero args, which it isn't implemented to cope with. The resulting use of an uninitialised variable for the id will usually result in an error like: asoc-simple-card sound: parse error -11 asoc-simple-card: probe of sound failed with error -11 Fix this by using and id of zero if no arg is provided. Fixes: 9731f82d6016 ("ASoC: hdmi-codec: enable multi probe for same device") Signed-off-by: Jon Medhurst --- Jyri, FYI, I believe this issue broke your commit df0bd1e8f3c508 ("ARM: dts: am335x-boneblack: Add HDMI audio support") sound/soc/codecs/hdmi-codec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sound/soc/codecs/hdmi-codec.c b/sound/soc/codecs/hdmi-codec.c index b904492..6661ace 100644 --- a/sound/soc/codecs/hdmi-codec.c +++ b/sound/soc/codecs/hdmi-codec.c @@ -364,7 +364,7 @@ static int hdmi_of_xlate_dai_name(struct snd_soc_component *component, struct of_phandle_args *args, const char **dai_name) { - int id = args->args[0]; + int id = args->args_count ? args->args[0] : 0; if (id < ARRAY_SIZE(hdmi_dai_name)) { *dai_name = hdmi_dai_name[id]; -- 2.1.4
Re: [PATCH v2 3/3] firmware: scpi: add device power domain support using genpd
On Thu, 2016-06-16 at 18:59 +0100, Sudeep Holla wrote: > > On 16/06/16 18:47, Jon Medhurst (Tixy) wrote: > > On Thu, 2016-06-16 at 11:38 +0100, Sudeep Holla wrote: > > [...] > >> +enum scpi_power_domain_state { > >> + SCPI_PD_STATE_ON = 0, > >> + SCPI_PD_STATE_OFF = 3, > >> +}; > > > > The SCPI doc defines the meaning of these numbers (0 and 3) in the 'Juno > > specifics' chapter. So does these values need to come from device-tree > > to allow for other hardware or SCP implementations? > > > > Ah unfortunately true :(. I had not noticed that. But I would like to > check if this can be made as part of the standard protocol. Adding such > details to DT seems overkill and defeat of the whole purpose of the > standard protocol. Well. it seems to me the 'standard protocol' is whatever the current implementation of ARM's closed source SCP firmware is. It also seems to me that people are making things up as they go along, without a clue as to how to make things generic, robust and future proof. Basically, Status Normal ARM Fucked Up. -- Tixy
Re: [PATCH v2 3/3] firmware: scpi: add device power domain support using genpd
On Thu, 2016-06-16 at 18:59 +0100, Sudeep Holla wrote: > > On 16/06/16 18:47, Jon Medhurst (Tixy) wrote: > > On Thu, 2016-06-16 at 11:38 +0100, Sudeep Holla wrote: > > [...] > >> +enum scpi_power_domain_state { > >> + SCPI_PD_STATE_ON = 0, > >> + SCPI_PD_STATE_OFF = 3, > >> +}; > > > > The SCPI doc defines the meaning of these numbers (0 and 3) in the 'Juno > > specifics' chapter. So does these values need to come from device-tree > > to allow for other hardware or SCP implementations? > > > > Ah unfortunately true :(. I had not noticed that. But I would like to > check if this can be made as part of the standard protocol. Adding such > details to DT seems overkill and defeat of the whole purpose of the > standard protocol. Well. it seems to me the 'standard protocol' is whatever the current implementation of ARM's closed source SCP firmware is. It also seems to me that people are making things up as they go along, without a clue as to how to make things generic, robust and future proof. Basically, Status Normal ARM Fucked Up. -- Tixy
Re: [PATCH v2 1/3] firmware: arm_scpi: add support for device power state management
On Thu, 2016-06-16 at 11:37 +0100, Sudeep Holla wrote: > SCPI protocol supports device power state management. This deals with > power states of various peripheral devices in the system other than the > core compute subsystem. > > This patch adds support for the power state management of those > peripheral devices. > > Signed-off-by: Sudeep HollaReviewed-by: Jon Medhurst Tested-by: Jon Medhurst I've also tested this series on Juno r2 together with the coresight changes [1], to the extent that enabling various kernel configs gets the kernel log showing several lines like: coresight-etm4x 2204.etm: ETM 4.0 initialized and if I omit the power domain entries from device-tree then boot hangs as you indicated it would be expected due to coresight hardware being powered off. [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-June/433769.html > --- > drivers/firmware/arm_scpi.c | 30 ++ > include/linux/scpi_protocol.h | 2 ++ > 2 files changed, 32 insertions(+) > > diff --git a/drivers/firmware/arm_scpi.c b/drivers/firmware/arm_scpi.c > index 51c6db0774cc..438893762076 100644 > --- a/drivers/firmware/arm_scpi.c > +++ b/drivers/firmware/arm_scpi.c > @@ -231,6 +231,11 @@ struct sensor_value { > __le32 hi_val; > } __packed; > > +struct dev_pstate_set { > + u16 dev_id; > + u8 pstate; > +} __packed; > + > static struct scpi_drvinfo *scpi_info; > > static int scpi_linux_errmap[SCPI_ERR_MAX] = { > @@ -537,6 +542,29 @@ static int scpi_sensor_get_value(u16 sensor, u64 *val) > return ret; > } > > +static int scpi_device_get_power_state(u16 dev_id) > +{ > + int ret; > + u8 pstate; > + __le16 id = cpu_to_le16(dev_id); > + > + ret = scpi_send_message(SCPI_CMD_GET_DEVICE_PWR_STATE, , > + sizeof(id), , sizeof(pstate)); > + return ret ? ret : pstate; > +} > + > +static int scpi_device_set_power_state(u16 dev_id, u8 pstate) > +{ > + int stat; > + struct dev_pstate_set dev_set = { > + .dev_id = cpu_to_le16(dev_id), > + .pstate = pstate, > + }; > + > + return scpi_send_message(SCPI_CMD_SET_DEVICE_PWR_STATE, _set, > + sizeof(dev_set), , sizeof(stat)); > +} > + > static struct scpi_ops scpi_ops = { > .get_version = scpi_get_version, > .clk_get_range = scpi_clk_get_range, > @@ -548,6 +576,8 @@ static struct scpi_ops scpi_ops = { > .sensor_get_capability = scpi_sensor_get_capability, > .sensor_get_info = scpi_sensor_get_info, > .sensor_get_value = scpi_sensor_get_value, > + .device_get_power_state = scpi_device_get_power_state, > + .device_set_power_state = scpi_device_set_power_state, > }; > > struct scpi_ops *get_scpi_ops(void) > diff --git a/include/linux/scpi_protocol.h b/include/linux/scpi_protocol.h > index 35de50a65665..dc5f989be226 100644 > --- a/include/linux/scpi_protocol.h > +++ b/include/linux/scpi_protocol.h > @@ -70,6 +70,8 @@ struct scpi_ops { > int (*sensor_get_capability)(u16 *sensors); > int (*sensor_get_info)(u16 sensor_id, struct scpi_sensor_info *); > int (*sensor_get_value)(u16, u64 *); > + int (*device_get_power_state)(u16); > + int (*device_set_power_state)(u16, u8); > }; > > #if IS_REACHABLE(CONFIG_ARM_SCPI_PROTOCOL) > -- > 2.7.4 >
Re: [PATCH v2 1/3] firmware: arm_scpi: add support for device power state management
On Thu, 2016-06-16 at 11:37 +0100, Sudeep Holla wrote: > SCPI protocol supports device power state management. This deals with > power states of various peripheral devices in the system other than the > core compute subsystem. > > This patch adds support for the power state management of those > peripheral devices. > > Signed-off-by: Sudeep Holla Reviewed-by: Jon Medhurst Tested-by: Jon Medhurst I've also tested this series on Juno r2 together with the coresight changes [1], to the extent that enabling various kernel configs gets the kernel log showing several lines like: coresight-etm4x 2204.etm: ETM 4.0 initialized and if I omit the power domain entries from device-tree then boot hangs as you indicated it would be expected due to coresight hardware being powered off. [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-June/433769.html > --- > drivers/firmware/arm_scpi.c | 30 ++ > include/linux/scpi_protocol.h | 2 ++ > 2 files changed, 32 insertions(+) > > diff --git a/drivers/firmware/arm_scpi.c b/drivers/firmware/arm_scpi.c > index 51c6db0774cc..438893762076 100644 > --- a/drivers/firmware/arm_scpi.c > +++ b/drivers/firmware/arm_scpi.c > @@ -231,6 +231,11 @@ struct sensor_value { > __le32 hi_val; > } __packed; > > +struct dev_pstate_set { > + u16 dev_id; > + u8 pstate; > +} __packed; > + > static struct scpi_drvinfo *scpi_info; > > static int scpi_linux_errmap[SCPI_ERR_MAX] = { > @@ -537,6 +542,29 @@ static int scpi_sensor_get_value(u16 sensor, u64 *val) > return ret; > } > > +static int scpi_device_get_power_state(u16 dev_id) > +{ > + int ret; > + u8 pstate; > + __le16 id = cpu_to_le16(dev_id); > + > + ret = scpi_send_message(SCPI_CMD_GET_DEVICE_PWR_STATE, , > + sizeof(id), , sizeof(pstate)); > + return ret ? ret : pstate; > +} > + > +static int scpi_device_set_power_state(u16 dev_id, u8 pstate) > +{ > + int stat; > + struct dev_pstate_set dev_set = { > + .dev_id = cpu_to_le16(dev_id), > + .pstate = pstate, > + }; > + > + return scpi_send_message(SCPI_CMD_SET_DEVICE_PWR_STATE, _set, > + sizeof(dev_set), , sizeof(stat)); > +} > + > static struct scpi_ops scpi_ops = { > .get_version = scpi_get_version, > .clk_get_range = scpi_clk_get_range, > @@ -548,6 +576,8 @@ static struct scpi_ops scpi_ops = { > .sensor_get_capability = scpi_sensor_get_capability, > .sensor_get_info = scpi_sensor_get_info, > .sensor_get_value = scpi_sensor_get_value, > + .device_get_power_state = scpi_device_get_power_state, > + .device_set_power_state = scpi_device_set_power_state, > }; > > struct scpi_ops *get_scpi_ops(void) > diff --git a/include/linux/scpi_protocol.h b/include/linux/scpi_protocol.h > index 35de50a65665..dc5f989be226 100644 > --- a/include/linux/scpi_protocol.h > +++ b/include/linux/scpi_protocol.h > @@ -70,6 +70,8 @@ struct scpi_ops { > int (*sensor_get_capability)(u16 *sensors); > int (*sensor_get_info)(u16 sensor_id, struct scpi_sensor_info *); > int (*sensor_get_value)(u16, u64 *); > + int (*device_get_power_state)(u16); > + int (*device_set_power_state)(u16, u8); > }; > > #if IS_REACHABLE(CONFIG_ARM_SCPI_PROTOCOL) > -- > 2.7.4 >
Re: [PATCH v2 3/3] firmware: scpi: add device power domain support using genpd
On Thu, 2016-06-16 at 11:38 +0100, Sudeep Holla wrote: [...] > +enum scpi_power_domain_state { > + SCPI_PD_STATE_ON = 0, > + SCPI_PD_STATE_OFF = 3, > +}; The SCPI doc defines the meaning of these numbers (0 and 3) in the 'Juno specifics' chapter. So does these values need to come from device-tree to allow for other hardware or SCP implementations? -- Tixy
Re: [PATCH v2 3/3] firmware: scpi: add device power domain support using genpd
On Thu, 2016-06-16 at 11:38 +0100, Sudeep Holla wrote: [...] > +enum scpi_power_domain_state { > + SCPI_PD_STATE_ON = 0, > + SCPI_PD_STATE_OFF = 3, > +}; The SCPI doc defines the meaning of these numbers (0 and 3) in the 'Juno specifics' chapter. So does these values need to come from device-tree to allow for other hardware or SCP implementations? -- Tixy
Re: [PATCH v2 0/3] firmware: scpi: add device power domain support
On Thu, 2016-06-16 at 11:37 +0100, Sudeep Holla wrote: > This series add support for SCPI based device device power state > management using genpd. > > Regards, > Sudeep > > v1[1]->v2: > - Fixed the endianness handling in scpi_device_get_power_state > as spotted by Tixy > - Renamed scpi_pd.c to scpi_pm_domain.c as suggested by Ulf You seemed to also have made some changes to the contents of the file, is that deliberate? This is the diff... --- a/drivers/firmware/scpi_pm_domain.c +++ b/drivers/firmware/scpi_pd.c @@ -27,7 +27,7 @@ struct scpi_pm_domain { struct generic_pm_domain genpd; struct scpi_ops *ops; u32 domain; - char name[20]; + char name[30]; }; enum scpi_power_domain_state { @@ -51,7 +51,7 @@ static int scpi_pd_power(struct scpi_pm_domain *pd, bool power_on) if (ret) return ret; - return (state == pd->ops->device_get_power_state(pd->domain)); + return !(state == pd->ops->device_get_power_state(pd->domain)); } static int scpi_pd_power_on(struct generic_pm_domain *domain)
Re: [PATCH v2 0/3] firmware: scpi: add device power domain support
On Thu, 2016-06-16 at 11:37 +0100, Sudeep Holla wrote: > This series add support for SCPI based device device power state > management using genpd. > > Regards, > Sudeep > > v1[1]->v2: > - Fixed the endianness handling in scpi_device_get_power_state > as spotted by Tixy > - Renamed scpi_pd.c to scpi_pm_domain.c as suggested by Ulf You seemed to also have made some changes to the contents of the file, is that deliberate? This is the diff... --- a/drivers/firmware/scpi_pm_domain.c +++ b/drivers/firmware/scpi_pd.c @@ -27,7 +27,7 @@ struct scpi_pm_domain { struct generic_pm_domain genpd; struct scpi_ops *ops; u32 domain; - char name[20]; + char name[30]; }; enum scpi_power_domain_state { @@ -51,7 +51,7 @@ static int scpi_pd_power(struct scpi_pm_domain *pd, bool power_on) if (ret) return ret; - return (state == pd->ops->device_get_power_state(pd->domain)); + return !(state == pd->ops->device_get_power_state(pd->domain)); } static int scpi_pd_power_on(struct generic_pm_domain *domain)
Re: [PATCH 3/3] firmware: scpi: add device power domain support using genpd
On Mon, 2016-06-06 at 16:53 +0100, Sudeep Holla wrote: > This patch hooks up the support for device power domain provided by > SCPI using the Linux generic power domain infrastructure. > > Cc: "Rafael J. Wysocki"> Cc: Kevin Hilman > Cc: Ulf Hansson > Cc: linux...@vger.kernel.org > Signed-off-by: Sudeep Holla > --- > drivers/firmware/Kconfig | 8 +++ > drivers/firmware/Makefile | 1 + > drivers/firmware/scpi_pd.c | 152 > + > 3 files changed, 161 insertions(+) > create mode 100644 drivers/firmware/scpi_pd.c > > Hi, > > Since most of the power controller drivers are place in > drivers/soc/, > I am not sure where to put this SCPI power domain code as it can be used > on multiple SoC. I have placed it in drivers/firmware temporarily for > review. Please suggest the most apt place to put this driver. > > Regards, > Sudeep > > diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig > index 41abdc54815e..80c963c60f13 100644 > --- a/drivers/firmware/Kconfig > +++ b/drivers/firmware/Kconfig > @@ -27,6 +27,14 @@ config ARM_SCPI_PROTOCOL > This protocol library provides interface for all the client drivers > making use of the features offered by the SCP. > > +config ARM_SCPI_POWER_DOMAIN > + tristate "SCPI power domain driver" > + depends on (ARM_SCPI_PROTOCOL && PM) || COMPILE_TEST > + select PM_GENERIC_DOMAINS_OF That select doesn't work for me and gives: warning: (ARM_SCPI_POWER_DOMAIN) selects PM_GENERIC_DOMAINS_OF which has unmet direct dependencies (PM_GENERIC_DOMAINS && OF) Followed by link errors due to missing symbols. I think you need to select PM_GENERIC_DOMAINS as well. Or perhaps just instead of, as PM_GENERIC_DOMAINS_OF defaults 'y' and isn't user selectable. From kernel/power/Kconfig ... config PM_GENERIC_DOMAINS_OF def_bool y depends on PM_GENERIC_DOMAINS && OF [...] -- Tixy
Re: [PATCH 3/3] firmware: scpi: add device power domain support using genpd
On Mon, 2016-06-06 at 16:53 +0100, Sudeep Holla wrote: > This patch hooks up the support for device power domain provided by > SCPI using the Linux generic power domain infrastructure. > > Cc: "Rafael J. Wysocki" > Cc: Kevin Hilman > Cc: Ulf Hansson > Cc: linux...@vger.kernel.org > Signed-off-by: Sudeep Holla > --- > drivers/firmware/Kconfig | 8 +++ > drivers/firmware/Makefile | 1 + > drivers/firmware/scpi_pd.c | 152 > + > 3 files changed, 161 insertions(+) > create mode 100644 drivers/firmware/scpi_pd.c > > Hi, > > Since most of the power controller drivers are place in > drivers/soc/, > I am not sure where to put this SCPI power domain code as it can be used > on multiple SoC. I have placed it in drivers/firmware temporarily for > review. Please suggest the most apt place to put this driver. > > Regards, > Sudeep > > diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig > index 41abdc54815e..80c963c60f13 100644 > --- a/drivers/firmware/Kconfig > +++ b/drivers/firmware/Kconfig > @@ -27,6 +27,14 @@ config ARM_SCPI_PROTOCOL > This protocol library provides interface for all the client drivers > making use of the features offered by the SCP. > > +config ARM_SCPI_POWER_DOMAIN > + tristate "SCPI power domain driver" > + depends on (ARM_SCPI_PROTOCOL && PM) || COMPILE_TEST > + select PM_GENERIC_DOMAINS_OF That select doesn't work for me and gives: warning: (ARM_SCPI_POWER_DOMAIN) selects PM_GENERIC_DOMAINS_OF which has unmet direct dependencies (PM_GENERIC_DOMAINS && OF) Followed by link errors due to missing symbols. I think you need to select PM_GENERIC_DOMAINS as well. Or perhaps just instead of, as PM_GENERIC_DOMAINS_OF defaults 'y' and isn't user selectable. From kernel/power/Kconfig ... config PM_GENERIC_DOMAINS_OF def_bool y depends on PM_GENERIC_DOMAINS && OF [...] -- Tixy
Re: [PATCH 1/3] firmware: arm_scpi: add support for device power state management
On Mon, 2016-06-06 at 16:53 +0100, Sudeep Holla wrote: > SCPI protocol supports device power state management. This deals with > power states of various peripheral devices in the system other than the > core compute subsystem. > > This patch adds support for the power state management of those > peripheral devices. > > Cc: linux-arm-ker...@lists.infradead.org > Signed-off-by: Sudeep Holla> --- > drivers/firmware/arm_scpi.c | 29 + > include/linux/scpi_protocol.h | 2 ++ > 2 files changed, 31 insertions(+) > > diff --git a/drivers/firmware/arm_scpi.c b/drivers/firmware/arm_scpi.c > index 51c6db0774cc..ca6afd2217a8 100644 > --- a/drivers/firmware/arm_scpi.c > +++ b/drivers/firmware/arm_scpi.c > @@ -231,6 +231,11 @@ struct sensor_value { > __le32 hi_val; > } __packed; > > +struct dev_pstate_set { > + u16 dev_id; > + u8 pstate; > +} __packed; > + > static struct scpi_drvinfo *scpi_info; > > static int scpi_linux_errmap[SCPI_ERR_MAX] = { > @@ -537,6 +542,28 @@ static int scpi_sensor_get_value(u16 sensor, u64 *val) > return ret; > } > > +static int scpi_device_get_power_state(u16 dev_id) > +{ > + int ret; > + u8 pstate; > + > + ret = scpi_send_message(SCPI_CMD_GET_DEVICE_PWR_STATE, _id, > + sizeof(dev_id), , sizeof(pstate)); Don't you need to run translate dev_id to little-endian before sending it? You remembered to do that next function... > + return ret ? ret : pstate; > +} > + > +static int scpi_device_set_power_state(u16 dev_id, u8 pstate) > +{ > + int stat; > + struct dev_pstate_set dev_set = { > + .dev_id = cpu_to_le16(dev_id), > + .pstate = pstate, > + }; > + > + return scpi_send_message(SCPI_CMD_SET_DEVICE_PWR_STATE, _set, > + sizeof(dev_set), , sizeof(stat)); > +} > + [...] -- Tixy
Re: [PATCH 1/3] firmware: arm_scpi: add support for device power state management
On Mon, 2016-06-06 at 16:53 +0100, Sudeep Holla wrote: > SCPI protocol supports device power state management. This deals with > power states of various peripheral devices in the system other than the > core compute subsystem. > > This patch adds support for the power state management of those > peripheral devices. > > Cc: linux-arm-ker...@lists.infradead.org > Signed-off-by: Sudeep Holla > --- > drivers/firmware/arm_scpi.c | 29 + > include/linux/scpi_protocol.h | 2 ++ > 2 files changed, 31 insertions(+) > > diff --git a/drivers/firmware/arm_scpi.c b/drivers/firmware/arm_scpi.c > index 51c6db0774cc..ca6afd2217a8 100644 > --- a/drivers/firmware/arm_scpi.c > +++ b/drivers/firmware/arm_scpi.c > @@ -231,6 +231,11 @@ struct sensor_value { > __le32 hi_val; > } __packed; > > +struct dev_pstate_set { > + u16 dev_id; > + u8 pstate; > +} __packed; > + > static struct scpi_drvinfo *scpi_info; > > static int scpi_linux_errmap[SCPI_ERR_MAX] = { > @@ -537,6 +542,28 @@ static int scpi_sensor_get_value(u16 sensor, u64 *val) > return ret; > } > > +static int scpi_device_get_power_state(u16 dev_id) > +{ > + int ret; > + u8 pstate; > + > + ret = scpi_send_message(SCPI_CMD_GET_DEVICE_PWR_STATE, _id, > + sizeof(dev_id), , sizeof(pstate)); Don't you need to run translate dev_id to little-endian before sending it? You remembered to do that next function... > + return ret ? ret : pstate; > +} > + > +static int scpi_device_set_power_state(u16 dev_id, u8 pstate) > +{ > + int stat; > + struct dev_pstate_set dev_set = { > + .dev_id = cpu_to_le16(dev_id), > + .pstate = pstate, > + }; > + > + return scpi_send_message(SCPI_CMD_SET_DEVICE_PWR_STATE, _set, > + sizeof(dev_set), , sizeof(stat)); > +} > + [...] -- Tixy
Re: [PATCH 3.14 17/17] arm64: Make arch_randomize_brk avoid stack area
On Mon, 2016-05-16 at 18:14 -0700, Greg Kroah-Hartman wrote: > 3.14-stable review patch. If anyone has any objections, please let me know. As reported by Guenter Roeck, this patch doesn't compile on 3.14 because it deleted randomize_base which is still used by the macro ELF_ET_DYN_BASE. That use was removed in 3.18 by commit 92980405f353 ("arm64: ASLR: Don't randomise text when randomise_va_space == 0") Looking at that commit it seems to be what caused the bug $subject patch fixes because it stopped the arm64 implementation putting loaded binaries 2/3rds the way up a task's address range. So it seems to me, either $subject patch should only be applied to Linux versions 3.18 through 4.0 inclusive; or the fix in commit 92980405f353 also needs backporting to stable kernels before 3.18. (Or some more other solution.) Either way, it seems $subject patch also needs... Fixes: 92980405f353 ("arm64: ASLR: Don't randomise text when randomise_va_space == 0") Leaving patch quoted below for reference... > > From: Jon Medhurst> > [As mentioned in the commit message, the problem this patch fixes can't > occur in kernels with commit d1fd836dcf00, i.e Linux 4.1 and later., > but earlier kernel versions need this fix.] > > When a process is created, various address randomisations could end up > colluding to place the address of brk in the stack memory. This would > mean processes making small heap based memory allocations are in danger > of having them overwriting, or overwritten by, the stack. > > Another consequence, is that even for processes that make no use of > brk, the output of /proc/*/maps may show the stack area listed as > '[heap]' rather than '[stack]'. Apart from being misleading this causes > fatal errors with the Android run-time like: > "No [stack] line found in /proc/self/task/*/maps" > > To prevent this problem pick a limit for brk that allows for the stack's > memory. At the same time we remove randomize_base() as that was only > used by arch_randomize_brk(). > > Note, in practice, since commit d1fd836dcf00 ("mm: split ET_DYN ASLR > from mmap ASLR") this problem shouldn't occur because the address chosen > for loading binaries is well clear of the stack, however, prior to that > the problem does occur because of the following... > > The memory layout of a task is determined by arch_pick_mmap_layout. If > address randomisation is enabled (task has flag PF_RANDOMIZE) this sets > mmap_base to a random address at the top of task memory just below a > region calculated to allow for a stack which itself may have a random > base address. Any mmap operations that then happen which require an > address allocating will use the topdown allocation method, i.e. the > first allocated memory will be at the top of memory, just below the > area set aside for the stack. > > When a relocatable binary is loaded into a new process by > load_elf_binary and randomised address are enabled, it uses a > 'load_bias' of zero, so that when mmap is called to create a memory > region for it, a new address is picked (address zero not being > available). As this is the first memory region in the task, it gets the > region just below the stack, as described previously. > > The loader then set's brk to the end of the elf data section, which will > be near the end of the loaded binary and then it calls > arch_randomize_brk. As this currently stands, this adds a random amount > to brk, which unfortunately may take it into the address range where the > stack lies. > > Testing: > > These changes have been tested on Linux 3.18 (where the collision of brk > and stack can happen) using 10 invocations of a program [1] that can > display the offset of a process's brk... > > $for i in $(seq 10); do ./aslr --report brk ; done > > This shows values of brk are evenly distributed over a 1GB range before > this change is applied. After this change the distribution shows a slope > where lower values for brk are more common and upper values have about > half the frequency of those. > > [1] > http://bazaar.launchpad.net/~ubuntu-bugcontrol/qa-regression-testing/master/files/2499/scripts/kernel-security/aslr/ > > Signed-off-by: Jon Medhurst > Acked-by: Kees Cook > Acked-by: Catalin Marinas > Signed-off-by: Greg Kroah-Hartman > --- > > > I originally posted this to the ARM kernel list and arm64 maintainers, > see http://www.spinics.net/lists/arm-kernel/msg502238.html > > arch/arm64/kernel/process.c | 24 ++-- > 1 file changed, 18 insertions(+), 6 deletions(-) > > --- a/arch/arm64/kernel/process.c > +++ b/arch/arm64/kernel/process.c > @@ -350,15 +350,27 @@ unsigned long arch_align_stack(unsigned > return sp & ~0xf; > } > > -static unsigned long randomize_base(unsigned long base) > +unsigned long arch_randomize_brk(struct mm_struct *mm) > { > + unsigned long base = mm->brk;
Re: [PATCH 3.14 17/17] arm64: Make arch_randomize_brk avoid stack area
On Mon, 2016-05-16 at 18:14 -0700, Greg Kroah-Hartman wrote: > 3.14-stable review patch. If anyone has any objections, please let me know. As reported by Guenter Roeck, this patch doesn't compile on 3.14 because it deleted randomize_base which is still used by the macro ELF_ET_DYN_BASE. That use was removed in 3.18 by commit 92980405f353 ("arm64: ASLR: Don't randomise text when randomise_va_space == 0") Looking at that commit it seems to be what caused the bug $subject patch fixes because it stopped the arm64 implementation putting loaded binaries 2/3rds the way up a task's address range. So it seems to me, either $subject patch should only be applied to Linux versions 3.18 through 4.0 inclusive; or the fix in commit 92980405f353 also needs backporting to stable kernels before 3.18. (Or some more other solution.) Either way, it seems $subject patch also needs... Fixes: 92980405f353 ("arm64: ASLR: Don't randomise text when randomise_va_space == 0") Leaving patch quoted below for reference... > > From: Jon Medhurst > > [As mentioned in the commit message, the problem this patch fixes can't > occur in kernels with commit d1fd836dcf00, i.e Linux 4.1 and later., > but earlier kernel versions need this fix.] > > When a process is created, various address randomisations could end up > colluding to place the address of brk in the stack memory. This would > mean processes making small heap based memory allocations are in danger > of having them overwriting, or overwritten by, the stack. > > Another consequence, is that even for processes that make no use of > brk, the output of /proc/*/maps may show the stack area listed as > '[heap]' rather than '[stack]'. Apart from being misleading this causes > fatal errors with the Android run-time like: > "No [stack] line found in /proc/self/task/*/maps" > > To prevent this problem pick a limit for brk that allows for the stack's > memory. At the same time we remove randomize_base() as that was only > used by arch_randomize_brk(). > > Note, in practice, since commit d1fd836dcf00 ("mm: split ET_DYN ASLR > from mmap ASLR") this problem shouldn't occur because the address chosen > for loading binaries is well clear of the stack, however, prior to that > the problem does occur because of the following... > > The memory layout of a task is determined by arch_pick_mmap_layout. If > address randomisation is enabled (task has flag PF_RANDOMIZE) this sets > mmap_base to a random address at the top of task memory just below a > region calculated to allow for a stack which itself may have a random > base address. Any mmap operations that then happen which require an > address allocating will use the topdown allocation method, i.e. the > first allocated memory will be at the top of memory, just below the > area set aside for the stack. > > When a relocatable binary is loaded into a new process by > load_elf_binary and randomised address are enabled, it uses a > 'load_bias' of zero, so that when mmap is called to create a memory > region for it, a new address is picked (address zero not being > available). As this is the first memory region in the task, it gets the > region just below the stack, as described previously. > > The loader then set's brk to the end of the elf data section, which will > be near the end of the loaded binary and then it calls > arch_randomize_brk. As this currently stands, this adds a random amount > to brk, which unfortunately may take it into the address range where the > stack lies. > > Testing: > > These changes have been tested on Linux 3.18 (where the collision of brk > and stack can happen) using 10 invocations of a program [1] that can > display the offset of a process's brk... > > $for i in $(seq 10); do ./aslr --report brk ; done > > This shows values of brk are evenly distributed over a 1GB range before > this change is applied. After this change the distribution shows a slope > where lower values for brk are more common and upper values have about > half the frequency of those. > > [1] > http://bazaar.launchpad.net/~ubuntu-bugcontrol/qa-regression-testing/master/files/2499/scripts/kernel-security/aslr/ > > Signed-off-by: Jon Medhurst > Acked-by: Kees Cook > Acked-by: Catalin Marinas > Signed-off-by: Greg Kroah-Hartman > --- > > > I originally posted this to the ARM kernel list and arm64 maintainers, > see http://www.spinics.net/lists/arm-kernel/msg502238.html > > arch/arm64/kernel/process.c | 24 ++-- > 1 file changed, 18 insertions(+), 6 deletions(-) > > --- a/arch/arm64/kernel/process.c > +++ b/arch/arm64/kernel/process.c > @@ -350,15 +350,27 @@ unsigned long arch_align_stack(unsigned > return sp & ~0xf; > } > > -static unsigned long randomize_base(unsigned long base) > +unsigned long arch_randomize_brk(struct mm_struct *mm) > { > + unsigned long base = mm->brk; > unsigned long range_end = base + (STACK_RND_MASK << PAGE_SHIFT) + 1; > - return
Re: [PATCH] arm64: kernel: Fix incorrect brk randomization
On Tue, 2016-05-10 at 10:55 -0700, Kees Cook wrote: > This fixes two issues with the arm64 brk randomziation. First, the > STACK_RND_MASK was being used incorrectly. The original code was: > > unsigned long range_end = base + (STACK_RND_MASK << PAGE_SHIFT) + 1; > > STACK_RND_MASK is 0x7ff (32-bit) or 0x3 (64-bit), with 4K pages where > PAGE_SHIFT is 12: > > #define STACK_RND_MASK (test_thread_flag(TIF_32BIT) ? \ > 0x7ff >> (PAGE_SHIFT - 12) : \ > 0x3 >> (PAGE_SHIFT - 12)) > > This means the resulting offset from base would be 0x7ff0001 or 0x30001, > which is wrong since it creates an unaligned end address. It was likely > intended to be: > > unsigned long range_end = base + ((STACK_RND_MASK + 1) << PAGE_SHIFT) > > Which would result in offsets of 0x80 (32-bit) and 0x4000 (64-bit). > > However, even this corrected 32-bit compat offset (0x0080) is much > smaller than native ARM's brk randomization value (0x0200): > > unsigned long arch_randomize_brk(struct mm_struct *mm) > { > unsigned long range_end = mm->brk + 0x0200; > return randomize_range(mm->brk, range_end, 0) ? : mm->brk; > } > > So, instead of basing arm64's brk randomization on mistaken STACK_RND_MASK > calculations, just use specific corrected values for compat (0x200) > and native arm64 (0x4000). > > Signed-off-by: Kees CookThere seems to be a helper 'is_compat_task()' that does 'test_thread_flag(TIF_32BIT)' so could perhaps be used instead, but that's too nit-picky. This change makes things more consistent with arch/arm and other arches, and stops use of STACK_RND_MASK for things that aren't stack related so seems good all round to me. Reviewed-by: Jon Medhurst > --- > arch/arm64/kernel/process.c | 15 --- > 1 file changed, 8 insertions(+), 7 deletions(-) > > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c > index 80624829db61..0d0969bcd76d 100644 > --- a/arch/arm64/kernel/process.c > +++ b/arch/arm64/kernel/process.c > @@ -382,13 +382,14 @@ unsigned long arch_align_stack(unsigned long sp) > return sp & ~0xf; > } > > -static unsigned long randomize_base(unsigned long base) > -{ > - unsigned long range_end = base + (STACK_RND_MASK << PAGE_SHIFT) + 1; > - return randomize_range(base, range_end, 0) ? : base; > -} > - > unsigned long arch_randomize_brk(struct mm_struct *mm) > { > - return randomize_base(mm->brk); > + unsigned long range_end = mm->brk; > + > + if (test_thread_flag(TIF_32BIT)) > + range_end += 0x0200; > + else > + range_end += 0x4000; > + > + return randomize_range(mm->brk, range_end, 0) ? : mm->brk; > } > -- > 2.6.3 > >
Re: [PATCH] arm64: kernel: Fix incorrect brk randomization
On Tue, 2016-05-10 at 10:55 -0700, Kees Cook wrote: > This fixes two issues with the arm64 brk randomziation. First, the > STACK_RND_MASK was being used incorrectly. The original code was: > > unsigned long range_end = base + (STACK_RND_MASK << PAGE_SHIFT) + 1; > > STACK_RND_MASK is 0x7ff (32-bit) or 0x3 (64-bit), with 4K pages where > PAGE_SHIFT is 12: > > #define STACK_RND_MASK (test_thread_flag(TIF_32BIT) ? \ > 0x7ff >> (PAGE_SHIFT - 12) : \ > 0x3 >> (PAGE_SHIFT - 12)) > > This means the resulting offset from base would be 0x7ff0001 or 0x30001, > which is wrong since it creates an unaligned end address. It was likely > intended to be: > > unsigned long range_end = base + ((STACK_RND_MASK + 1) << PAGE_SHIFT) > > Which would result in offsets of 0x80 (32-bit) and 0x4000 (64-bit). > > However, even this corrected 32-bit compat offset (0x0080) is much > smaller than native ARM's brk randomization value (0x0200): > > unsigned long arch_randomize_brk(struct mm_struct *mm) > { > unsigned long range_end = mm->brk + 0x0200; > return randomize_range(mm->brk, range_end, 0) ? : mm->brk; > } > > So, instead of basing arm64's brk randomization on mistaken STACK_RND_MASK > calculations, just use specific corrected values for compat (0x200) > and native arm64 (0x4000). > > Signed-off-by: Kees Cook There seems to be a helper 'is_compat_task()' that does 'test_thread_flag(TIF_32BIT)' so could perhaps be used instead, but that's too nit-picky. This change makes things more consistent with arch/arm and other arches, and stops use of STACK_RND_MASK for things that aren't stack related so seems good all round to me. Reviewed-by: Jon Medhurst > --- > arch/arm64/kernel/process.c | 15 --- > 1 file changed, 8 insertions(+), 7 deletions(-) > > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c > index 80624829db61..0d0969bcd76d 100644 > --- a/arch/arm64/kernel/process.c > +++ b/arch/arm64/kernel/process.c > @@ -382,13 +382,14 @@ unsigned long arch_align_stack(unsigned long sp) > return sp & ~0xf; > } > > -static unsigned long randomize_base(unsigned long base) > -{ > - unsigned long range_end = base + (STACK_RND_MASK << PAGE_SHIFT) + 1; > - return randomize_range(base, range_end, 0) ? : base; > -} > - > unsigned long arch_randomize_brk(struct mm_struct *mm) > { > - return randomize_base(mm->brk); > + unsigned long range_end = mm->brk; > + > + if (test_thread_flag(TIF_32BIT)) > + range_end += 0x0200; > + else > + range_end += 0x4000; > + > + return randomize_range(mm->brk, range_end, 0) ? : mm->brk; > } > -- > 2.6.3 > >
Re: [PATCH] dmaengine: pl330: Fix some race conditions in residue calculation
On Fri, 2016-03-11 at 13:13 +0530, Vinod Koul wrote: > On Tue, Mar 08, 2016 at 03:50:41PM +0000, Jon Medhurst (Tixy) wrote: > > > > The reside is requested for "a descriptor". For example if you have > > > prepared > > > two descriptors A and B and submitted them, then you can request status > > > and > > > reside for A and you need to calculate that for A only and not take into > > > account status of B > > > > But, in the case of the pl330 driver, A and B may each consist of > > multiple internal/hidden descriptors. So the residue calculation has to > > sum up the residue of all these internal/hidden descriptors as well. > > This is what the current pl330_tx_status() function does, but has bugs. > > > > I've only just managed to clearly understand all the above details > > whilst writing this email, and this confusion obviously means the code > > and any commit messages need to explain things better. > > Okay I relooked at the code, the driver is creating descriptor per period, > so yes while calculating that should be taken into account but only for > cyclic case and not for rest, as it will break reside values for for them. So not for scatter gather lists passed to pl330_prep_slave_sg? It seems to me, that as for the cyclic case, the descriptors used for each element in the list are equally internal to the pl330 driver and not exposed to users. pl330_tx_submit() only returns the cookie for the last descriptor, how do users get the cookies for the other elements so they can pass them to pl330_tx_status()? To me, the implementation of the pl330 driver looks consistent with the APIs I can see in header files and the Documentation directory, namely: - dmaengine_prep_* functions prepare a DMA transactions and return a descriptor (struct dma_async_tx_descriptor) - The transaction is queued for processing by passing that descriptor to dmaengine_submit() which returns 'cookie' (dma_cookie_t) - The current status of a transaction can be queried by passing it's cookie to dmaengine_tx_status Now, I've never looked at DMA API's or their working before I started investigating the bug I'm trying to fix, so I may very well have missed something. In fact, the more I look, the more uncertain I am. From Documentation/dmaengine/provider.txt it describes: * device_tx_status - Should report the bytes left to go over on the given channel - Should only care about the transaction descriptor passed as argument, not the currently active one on a given channel - The tx_state argument might be NULL - Should use dma_set_residue to report it - In the case of a cyclic transfer, it should only take into account the current period. So that's saying residue for cyclic transfers residue is only for the current period? I tried looking at other DMA drivers for inspiration, and it seems that some at least (e.g. drivers/dma/mmp_pdma.c) do things the same way as pl330, but that could just be a case of people copying broken code. -- Tixy
Re: [PATCH] dmaengine: pl330: Fix some race conditions in residue calculation
On Fri, 2016-03-11 at 13:13 +0530, Vinod Koul wrote: > On Tue, Mar 08, 2016 at 03:50:41PM +0000, Jon Medhurst (Tixy) wrote: > > > > The reside is requested for "a descriptor". For example if you have > > > prepared > > > two descriptors A and B and submitted them, then you can request status > > > and > > > reside for A and you need to calculate that for A only and not take into > > > account status of B > > > > But, in the case of the pl330 driver, A and B may each consist of > > multiple internal/hidden descriptors. So the residue calculation has to > > sum up the residue of all these internal/hidden descriptors as well. > > This is what the current pl330_tx_status() function does, but has bugs. > > > > I've only just managed to clearly understand all the above details > > whilst writing this email, and this confusion obviously means the code > > and any commit messages need to explain things better. > > Okay I relooked at the code, the driver is creating descriptor per period, > so yes while calculating that should be taken into account but only for > cyclic case and not for rest, as it will break reside values for for them. So not for scatter gather lists passed to pl330_prep_slave_sg? It seems to me, that as for the cyclic case, the descriptors used for each element in the list are equally internal to the pl330 driver and not exposed to users. pl330_tx_submit() only returns the cookie for the last descriptor, how do users get the cookies for the other elements so they can pass them to pl330_tx_status()? To me, the implementation of the pl330 driver looks consistent with the APIs I can see in header files and the Documentation directory, namely: - dmaengine_prep_* functions prepare a DMA transactions and return a descriptor (struct dma_async_tx_descriptor) - The transaction is queued for processing by passing that descriptor to dmaengine_submit() which returns 'cookie' (dma_cookie_t) - The current status of a transaction can be queried by passing it's cookie to dmaengine_tx_status Now, I've never looked at DMA API's or their working before I started investigating the bug I'm trying to fix, so I may very well have missed something. In fact, the more I look, the more uncertain I am. From Documentation/dmaengine/provider.txt it describes: * device_tx_status - Should report the bytes left to go over on the given channel - Should only care about the transaction descriptor passed as argument, not the currently active one on a given channel - The tx_state argument might be NULL - Should use dma_set_residue to report it - In the case of a cyclic transfer, it should only take into account the current period. So that's saying residue for cyclic transfers residue is only for the current period? I tried looking at other DMA drivers for inspiration, and it seems that some at least (e.g. drivers/dma/mmp_pdma.c) do things the same way as pl330, but that could just be a case of people copying broken code. -- Tixy
Re: [PATCH] dmaengine: pl330: Fix some race conditions in residue calculation
On Tue, 2016-03-08 at 19:45 +0530, Vinod Koul wrote: > On Tue, Mar 08, 2016 at 10:40:11AM +0000, Jon Medhurst (Tixy) wrote: > > On Tue, 2016-03-08 at 09:42 +0530, Vinod Koul wrote: > > > On Wed, Feb 24, 2016 at 01:14:34PM +0000, Jon Medhurst (Tixy) wrote: > > > > The residue calculation in pl330_tx_status doesn't handle transitional > > > > states that occur at the time one descriptor (A) is completed and the > > > > next (B) is started. Specifically, both A and B can simultaneously be in > > > > the BUSY state and at this time the thread's 'req_running' may (or may > > > > not) be -1. > > > > > > you are under lock so descriptor state wont be update while we are it. > > > > > > Also the query for residue is for "a descriptor" not whatever is the > > > current > > > running descriptor... > > > > > > > > > > > To cope with this situation we change the code to ensure A is treated as > > > > complete and B as having not yet started. Prior to the change, the code > > > > would calculate a transferred byte count as if both A and B had > > > > completed. > > > > > > You query for either A or B not both! > > > > I've probably been using wrong/ambiguous terminology... > > > > In my description I'm using 'descriptor' to refer to a 'struct > > dma_pl330_desc', I guess other people assume 'struct > > dma_async_tx_descriptor'? > > Nope > > > The situation I was debugging was audio playback, where ASoC ends up > > calling pl330_prep_dma_cyclic() with a period one quarter the length of > > the buffer it is using, so that results in four dma_pl330_desc > > 'descriptors' being created to cover that buffer. These later get > > submitted to a DMA channel (struct dma_pl330_chan) which has a list of > > these that it is processing (the 'work_list'). > > > > The residual calculation that currently exists in pl08x_dma_tx_status() > > is iterating this work_list and summing the length of currently > > transferring 'descriptor' with those later pending ones. I believe that > > is correct behaviour because these 'descriptors' (dma_pl330_desc) are > > all internal implementation details of the driver, and the dmaengine > > API's are dealing in units of 'dma_async_tx_descriptor' ? > > Not really. If you look closely dma_pl330_desc contains > dma_async_tx_descriptor. A descriptor represents a tranasaction and is > certainly not internal detail of your driver I think that for this driver it's kinda both, which is why it's confusing... pl330_prep_dma_cyclic(), after creating a number of dma_pl330_desc objects in a circular list, returns the address of the dma_async_tx_descriptor contained in the last one. I'm guessing that as far as users of the dmaengine API is concerned, this is treated as just a single descriptor. pl330_prep_slave_sg() creates a similar circular list. When these descriptors are later submitted to the driver for transfer, it is done by a call to pl330_tx_submit, which I has this comment... /* * We returned the last one of the circular list of descriptor(s) * from prep_xxx, so the argument to submit corresponds to the last * descriptor of the list. */ This goes on to add all the linked descriptors to the pending queue for sending to the DMA hardware, and then returns the cookie of the last descriptor, maintaining the fiction to the dmaengine API's that there is just a single descriptor. > > The reside is requested for "a descriptor". For example if you have prepared > two descriptors A and B and submitted them, then you can request status and > reside for A and you need to calculate that for A only and not take into > account status of B But, in the case of the pl330 driver, A and B may each consist of multiple internal/hidden descriptors. So the residue calculation has to sum up the residue of all these internal/hidden descriptors as well. This is what the current pl330_tx_status() function does, but has bugs. I've only just managed to clearly understand all the above details whilst writing this email, and this confusion obviously means the code and any commit messages need to explain things better. -- Tixy > > > > > If the current code is OK in this regard, it is definitely buggy because > > it doesn't cope with the situation when two dma_pl330_desc's are in the > > state 'BUSY' a, which I have seen occur when debugging this issue, had > > worked out can happen by analysing the code, and is acknowledged by the > > in-source comments for enum desc_status... > > > > /* > > * Sitting on the work_list and already submitted > >
Re: [PATCH] dmaengine: pl330: Fix some race conditions in residue calculation
On Tue, 2016-03-08 at 19:45 +0530, Vinod Koul wrote: > On Tue, Mar 08, 2016 at 10:40:11AM +0000, Jon Medhurst (Tixy) wrote: > > On Tue, 2016-03-08 at 09:42 +0530, Vinod Koul wrote: > > > On Wed, Feb 24, 2016 at 01:14:34PM +0000, Jon Medhurst (Tixy) wrote: > > > > The residue calculation in pl330_tx_status doesn't handle transitional > > > > states that occur at the time one descriptor (A) is completed and the > > > > next (B) is started. Specifically, both A and B can simultaneously be in > > > > the BUSY state and at this time the thread's 'req_running' may (or may > > > > not) be -1. > > > > > > you are under lock so descriptor state wont be update while we are it. > > > > > > Also the query for residue is for "a descriptor" not whatever is the > > > current > > > running descriptor... > > > > > > > > > > > To cope with this situation we change the code to ensure A is treated as > > > > complete and B as having not yet started. Prior to the change, the code > > > > would calculate a transferred byte count as if both A and B had > > > > completed. > > > > > > You query for either A or B not both! > > > > I've probably been using wrong/ambiguous terminology... > > > > In my description I'm using 'descriptor' to refer to a 'struct > > dma_pl330_desc', I guess other people assume 'struct > > dma_async_tx_descriptor'? > > Nope > > > The situation I was debugging was audio playback, where ASoC ends up > > calling pl330_prep_dma_cyclic() with a period one quarter the length of > > the buffer it is using, so that results in four dma_pl330_desc > > 'descriptors' being created to cover that buffer. These later get > > submitted to a DMA channel (struct dma_pl330_chan) which has a list of > > these that it is processing (the 'work_list'). > > > > The residual calculation that currently exists in pl08x_dma_tx_status() > > is iterating this work_list and summing the length of currently > > transferring 'descriptor' with those later pending ones. I believe that > > is correct behaviour because these 'descriptors' (dma_pl330_desc) are > > all internal implementation details of the driver, and the dmaengine > > API's are dealing in units of 'dma_async_tx_descriptor' ? > > Not really. If you look closely dma_pl330_desc contains > dma_async_tx_descriptor. A descriptor represents a tranasaction and is > certainly not internal detail of your driver I think that for this driver it's kinda both, which is why it's confusing... pl330_prep_dma_cyclic(), after creating a number of dma_pl330_desc objects in a circular list, returns the address of the dma_async_tx_descriptor contained in the last one. I'm guessing that as far as users of the dmaengine API is concerned, this is treated as just a single descriptor. pl330_prep_slave_sg() creates a similar circular list. When these descriptors are later submitted to the driver for transfer, it is done by a call to pl330_tx_submit, which I has this comment... /* * We returned the last one of the circular list of descriptor(s) * from prep_xxx, so the argument to submit corresponds to the last * descriptor of the list. */ This goes on to add all the linked descriptors to the pending queue for sending to the DMA hardware, and then returns the cookie of the last descriptor, maintaining the fiction to the dmaengine API's that there is just a single descriptor. > > The reside is requested for "a descriptor". For example if you have prepared > two descriptors A and B and submitted them, then you can request status and > reside for A and you need to calculate that for A only and not take into > account status of B But, in the case of the pl330 driver, A and B may each consist of multiple internal/hidden descriptors. So the residue calculation has to sum up the residue of all these internal/hidden descriptors as well. This is what the current pl330_tx_status() function does, but has bugs. I've only just managed to clearly understand all the above details whilst writing this email, and this confusion obviously means the code and any commit messages need to explain things better. -- Tixy > > > > > If the current code is OK in this regard, it is definitely buggy because > > it doesn't cope with the situation when two dma_pl330_desc's are in the > > state 'BUSY' a, which I have seen occur when debugging this issue, had > > worked out can happen by analysing the code, and is acknowledged by the > > in-source comments for enum desc_status... > > > > /* > > * Sitting on the work_list and already submitted > >
Re: [PATCH] dmaengine: pl330: Fix some race conditions in residue calculation
On Tue, 2016-03-08 at 09:42 +0530, Vinod Koul wrote: > On Wed, Feb 24, 2016 at 01:14:34PM +0000, Jon Medhurst (Tixy) wrote: > > The residue calculation in pl330_tx_status doesn't handle transitional > > states that occur at the time one descriptor (A) is completed and the > > next (B) is started. Specifically, both A and B can simultaneously be in > > the BUSY state and at this time the thread's 'req_running' may (or may > > not) be -1. > > you are under lock so descriptor state wont be update while we are it. > > Also the query for residue is for "a descriptor" not whatever is the current > running descriptor... > > > > > To cope with this situation we change the code to ensure A is treated as > > complete and B as having not yet started. Prior to the change, the code > > would calculate a transferred byte count as if both A and B had > > completed. > > You query for either A or B not both! I've probably been using wrong/ambiguous terminology... In my description I'm using 'descriptor' to refer to a 'struct dma_pl330_desc', I guess other people assume 'struct dma_async_tx_descriptor'? The situation I was debugging was audio playback, where ASoC ends up calling pl330_prep_dma_cyclic() with a period one quarter the length of the buffer it is using, so that results in four dma_pl330_desc 'descriptors' being created to cover that buffer. These later get submitted to a DMA channel (struct dma_pl330_chan) which has a list of these that it is processing (the 'work_list'). The residual calculation that currently exists in pl08x_dma_tx_status() is iterating this work_list and summing the length of currently transferring 'descriptor' with those later pending ones. I believe that is correct behaviour because these 'descriptors' (dma_pl330_desc) are all internal implementation details of the driver, and the dmaengine API's are dealing in units of 'dma_async_tx_descriptor' ? If the current code is OK in this regard, it is definitely buggy because it doesn't cope with the situation when two dma_pl330_desc's are in the state 'BUSY' a, which I have seen occur when debugging this issue, had worked out can happen by analysing the code, and is acknowledged by the in-source comments for enum desc_status... /* * Sitting on the work_list and already submitted * to the PL330 core. Not more than two descriptors * of a channel can be BUSY at any time. */ BUSY, In my problematic usecase I have userside code calling ALSA ioctls to poll the current audio playback position which results in pl08x_dma_tx_status() being called multiple times a second. After only a second or two the buggy situation gets hit, resulting in a miscalculation that ASoC interprets as a buffer underflow and so it stops the stream. I spent several days debugging this, with enough ad hoc tests and printk's littered everywhere to be very confident as to how things are going wrong - what I'm not not totally confident of is how things should be properly fixed. This patch appears to fix the situation that I was hitting, but it really looks like there isn't any locking that prevent this polling use of pl08x_dma_tx_status() from happening concurrently with the irq handler reprogramming the hardware for the next dma_pl330_desc. I didn't attempt any fix for that for fear of introducing bugs in what looks like complex code, and because it's not a problem I saw happen in practice. -- Tixy > > > > > Fixes: aee4d1fac887 ("dmaengine: pl330: improve pl330_tx_status() function") > > > > Signed-off-by: Jon Medhurst <t...@linaro.org> > > --- > > > > I discovered this issue when trying to work out why audio stopped > > working on ARM's Juno platform and bisected it to commit aee4d1fac887. > > Whilst this patch seems to fix the problems I was seeing, I can't help > > but think there are more race conditions with this code. E.g. if the > > running descriptor changes under us, pl330_get_current_xferred_count > > can end up reading values from hardware that relate to a different > > descriptor. And if we're really unlucky, the reading of the 'val' and > > 'addr' values in pl330_get_current_xferred_count can come from different > > descriptors. I don't know if there is any locks we can use to prevent > > such races or if we need to try and detect when things have changed and > > redo/abort the residue calculation... > > > > drivers/dma/pl330.c | 24 > > 1 file changed, 20 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c > > index 17ee758..55e3c5f 100644 > > --- a/drivers/dma/pl330.c > > +++ b/drivers/dma/pl330.c > > @@ -2240,6 +2240,7 @@ pl330_tx_status(struct dma_chan
Re: [PATCH] dmaengine: pl330: Fix some race conditions in residue calculation
On Tue, 2016-03-08 at 09:42 +0530, Vinod Koul wrote: > On Wed, Feb 24, 2016 at 01:14:34PM +0000, Jon Medhurst (Tixy) wrote: > > The residue calculation in pl330_tx_status doesn't handle transitional > > states that occur at the time one descriptor (A) is completed and the > > next (B) is started. Specifically, both A and B can simultaneously be in > > the BUSY state and at this time the thread's 'req_running' may (or may > > not) be -1. > > you are under lock so descriptor state wont be update while we are it. > > Also the query for residue is for "a descriptor" not whatever is the current > running descriptor... > > > > > To cope with this situation we change the code to ensure A is treated as > > complete and B as having not yet started. Prior to the change, the code > > would calculate a transferred byte count as if both A and B had > > completed. > > You query for either A or B not both! I've probably been using wrong/ambiguous terminology... In my description I'm using 'descriptor' to refer to a 'struct dma_pl330_desc', I guess other people assume 'struct dma_async_tx_descriptor'? The situation I was debugging was audio playback, where ASoC ends up calling pl330_prep_dma_cyclic() with a period one quarter the length of the buffer it is using, so that results in four dma_pl330_desc 'descriptors' being created to cover that buffer. These later get submitted to a DMA channel (struct dma_pl330_chan) which has a list of these that it is processing (the 'work_list'). The residual calculation that currently exists in pl08x_dma_tx_status() is iterating this work_list and summing the length of currently transferring 'descriptor' with those later pending ones. I believe that is correct behaviour because these 'descriptors' (dma_pl330_desc) are all internal implementation details of the driver, and the dmaengine API's are dealing in units of 'dma_async_tx_descriptor' ? If the current code is OK in this regard, it is definitely buggy because it doesn't cope with the situation when two dma_pl330_desc's are in the state 'BUSY' a, which I have seen occur when debugging this issue, had worked out can happen by analysing the code, and is acknowledged by the in-source comments for enum desc_status... /* * Sitting on the work_list and already submitted * to the PL330 core. Not more than two descriptors * of a channel can be BUSY at any time. */ BUSY, In my problematic usecase I have userside code calling ALSA ioctls to poll the current audio playback position which results in pl08x_dma_tx_status() being called multiple times a second. After only a second or two the buggy situation gets hit, resulting in a miscalculation that ASoC interprets as a buffer underflow and so it stops the stream. I spent several days debugging this, with enough ad hoc tests and printk's littered everywhere to be very confident as to how things are going wrong - what I'm not not totally confident of is how things should be properly fixed. This patch appears to fix the situation that I was hitting, but it really looks like there isn't any locking that prevent this polling use of pl08x_dma_tx_status() from happening concurrently with the irq handler reprogramming the hardware for the next dma_pl330_desc. I didn't attempt any fix for that for fear of introducing bugs in what looks like complex code, and because it's not a problem I saw happen in practice. -- Tixy > > > > > Fixes: aee4d1fac887 ("dmaengine: pl330: improve pl330_tx_status() function") > > > > Signed-off-by: Jon Medhurst > > --- > > > > I discovered this issue when trying to work out why audio stopped > > working on ARM's Juno platform and bisected it to commit aee4d1fac887. > > Whilst this patch seems to fix the problems I was seeing, I can't help > > but think there are more race conditions with this code. E.g. if the > > running descriptor changes under us, pl330_get_current_xferred_count > > can end up reading values from hardware that relate to a different > > descriptor. And if we're really unlucky, the reading of the 'val' and > > 'addr' values in pl330_get_current_xferred_count can come from different > > descriptors. I don't know if there is any locks we can use to prevent > > such races or if we need to try and detect when things have changed and > > redo/abort the residue calculation... > > > > drivers/dma/pl330.c | 24 > > 1 file changed, 20 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c > > index 17ee758..55e3c5f 100644 > > --- a/drivers/dma/pl330.c > > +++ b/drivers/dma/pl330.c > > @@ -2240,6 +2240,7 @@ pl330_tx_status(struct dma_chan *chan, dma_cookie_t > >
Re: [PATCH] ARM: dts: add "simple-bus" to "arm, amba-bus" compatible nodes
On Thu, 2016-03-03 at 12:07 +0900, Masahiro Yamada wrote: [...] > This patch is derived from Rob Herring' comment > "BTW, we should also kill off "amba-bus" which is an ambiguous term" > in the following thread: > http://lkml.iu.edu/hypermail/linux/kernel/1601.0/01822.html > > > So, the plan would be like this: > > [1] Make device trees not depend on "arm,amba-bus" (this commit) > [2] New device trees should no longer use "arm,amba-bus" alone. > [3] Go though some releases until we do not care about the backward > compatibility Why would we stop caring about backwards compatibility? If I was a user of any of the platforms in question and updated my kernel, I wouldn't expect to have to debug why it was broken, then install a new dtb to fix it - which may be a tricky thing to do, depending on the firmware used to boot Linux. > [4] Drop "arm,amba-bus" from of_default_bus_match_table -- Tixy
Re: [PATCH] ARM: dts: add "simple-bus" to "arm, amba-bus" compatible nodes
On Thu, 2016-03-03 at 12:07 +0900, Masahiro Yamada wrote: [...] > This patch is derived from Rob Herring' comment > "BTW, we should also kill off "amba-bus" which is an ambiguous term" > in the following thread: > http://lkml.iu.edu/hypermail/linux/kernel/1601.0/01822.html > > > So, the plan would be like this: > > [1] Make device trees not depend on "arm,amba-bus" (this commit) > [2] New device trees should no longer use "arm,amba-bus" alone. > [3] Go though some releases until we do not care about the backward > compatibility Why would we stop caring about backwards compatibility? If I was a user of any of the platforms in question and updated my kernel, I wouldn't expect to have to debug why it was broken, then install a new dtb to fix it - which may be a tricky thing to do, depending on the firmware used to boot Linux. > [4] Drop "arm,amba-bus" from of_default_bus_match_table -- Tixy
[PATCH] dmaengine: pl330: Fix some race conditions in residue calculation
The residue calculation in pl330_tx_status doesn't handle transitional states that occur at the time one descriptor (A) is completed and the next (B) is started. Specifically, both A and B can simultaneously be in the BUSY state and at this time the thread's 'req_running' may (or may not) be -1. To cope with this situation we change the code to ensure A is treated as complete and B as having not yet started. Prior to the change, the code would calculate a transferred byte count as if both A and B had completed. Fixes: aee4d1fac887 ("dmaengine: pl330: improve pl330_tx_status() function") Signed-off-by: Jon Medhurst--- I discovered this issue when trying to work out why audio stopped working on ARM's Juno platform and bisected it to commit aee4d1fac887. Whilst this patch seems to fix the problems I was seeing, I can't help but think there are more race conditions with this code. E.g. if the running descriptor changes under us, pl330_get_current_xferred_count can end up reading values from hardware that relate to a different descriptor. And if we're really unlucky, the reading of the 'val' and 'addr' values in pl330_get_current_xferred_count can come from different descriptors. I don't know if there is any locks we can use to prevent such races or if we need to try and detect when things have changed and redo/abort the residue calculation... drivers/dma/pl330.c | 24 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c index 17ee758..55e3c5f 100644 --- a/drivers/dma/pl330.c +++ b/drivers/dma/pl330.c @@ -2240,6 +2240,7 @@ pl330_tx_status(struct dma_chan *chan, dma_cookie_t cookie, struct dma_pl330_desc *desc, *running = NULL; struct dma_pl330_chan *pch = to_pchan(chan); unsigned int transferred, residual = 0; + bool first_busy; ret = dma_cookie_status(chan, cookie, txstate); @@ -2253,16 +2254,31 @@ pl330_tx_status(struct dma_chan *chan, dma_cookie_t cookie, if (pch->thread->req_running != -1) running = pch->thread->req[pch->thread->req_running].desc; + first_busy = true; /* Check in pending list */ list_for_each_entry(desc, >work_list, node) { if (desc->status == DONE) transferred = desc->bytes_requested; - else if (running && desc == running) - transferred = - pl330_get_current_xferred_count(pch, desc); - else + else if (desc->status == BUSY && first_busy) { + first_busy = false; + if (running && desc == running) { + transferred = + pl330_get_current_xferred_count(pch, desc); + } else { + /* BUSY but not running means it's just completed */ + transferred = desc->bytes_requested; + } + } else { + /* +* Descriptor is either in PREP state queued for future +* transfer or it is the second BUSY descriptor we have +* seen. The latter case means it has just, or is about +* to be, started, so treat it as having not yet +* transferred any bytes, the same as PREP. +*/ transferred = 0; + } residual += desc->bytes_requested - transferred; if (desc->txd.cookie == cookie) { switch (desc->status) { -- 2.1.4
[PATCH] dmaengine: pl330: Fix some race conditions in residue calculation
The residue calculation in pl330_tx_status doesn't handle transitional states that occur at the time one descriptor (A) is completed and the next (B) is started. Specifically, both A and B can simultaneously be in the BUSY state and at this time the thread's 'req_running' may (or may not) be -1. To cope with this situation we change the code to ensure A is treated as complete and B as having not yet started. Prior to the change, the code would calculate a transferred byte count as if both A and B had completed. Fixes: aee4d1fac887 ("dmaengine: pl330: improve pl330_tx_status() function") Signed-off-by: Jon Medhurst --- I discovered this issue when trying to work out why audio stopped working on ARM's Juno platform and bisected it to commit aee4d1fac887. Whilst this patch seems to fix the problems I was seeing, I can't help but think there are more race conditions with this code. E.g. if the running descriptor changes under us, pl330_get_current_xferred_count can end up reading values from hardware that relate to a different descriptor. And if we're really unlucky, the reading of the 'val' and 'addr' values in pl330_get_current_xferred_count can come from different descriptors. I don't know if there is any locks we can use to prevent such races or if we need to try and detect when things have changed and redo/abort the residue calculation... drivers/dma/pl330.c | 24 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c index 17ee758..55e3c5f 100644 --- a/drivers/dma/pl330.c +++ b/drivers/dma/pl330.c @@ -2240,6 +2240,7 @@ pl330_tx_status(struct dma_chan *chan, dma_cookie_t cookie, struct dma_pl330_desc *desc, *running = NULL; struct dma_pl330_chan *pch = to_pchan(chan); unsigned int transferred, residual = 0; + bool first_busy; ret = dma_cookie_status(chan, cookie, txstate); @@ -2253,16 +2254,31 @@ pl330_tx_status(struct dma_chan *chan, dma_cookie_t cookie, if (pch->thread->req_running != -1) running = pch->thread->req[pch->thread->req_running].desc; + first_busy = true; /* Check in pending list */ list_for_each_entry(desc, >work_list, node) { if (desc->status == DONE) transferred = desc->bytes_requested; - else if (running && desc == running) - transferred = - pl330_get_current_xferred_count(pch, desc); - else + else if (desc->status == BUSY && first_busy) { + first_busy = false; + if (running && desc == running) { + transferred = + pl330_get_current_xferred_count(pch, desc); + } else { + /* BUSY but not running means it's just completed */ + transferred = desc->bytes_requested; + } + } else { + /* +* Descriptor is either in PREP state queued for future +* transfer or it is the second BUSY descriptor we have +* seen. The latter case means it has just, or is about +* to be, started, so treat it as having not yet +* transferred any bytes, the same as PREP. +*/ transferred = 0; + } residual += desc->bytes_requested - transferred; if (desc->txd.cookie == cookie) { switch (desc->status) { -- 2.1.4
Re: [PATCH 3/3] [RESEND] ARM: kprobes: use "I" constraint for inline assembly offsets
On Thu, 2016-02-18 at 18:59 +, Robin Murphy wrote: > On 18/02/16 18:12, Jon Medhurst (Tixy) wrote: > > On Thu, 2016-02-18 at 18:05 +0100, Arnd Bergmann wrote: > >> build-testing with clang showed that the "J" constraint does not take > >> positive arguments on clang when building in for Thumb-2: > >> > >> core.c:540:3: error: invalid operand for inline asm constraint 'J' > >> > >> This has been reported as llvm bug > >> https://llvm.org/bugs/show_bug.cgi?id=26061 > >> > >> However, looking at the source code in depth, I found that the > >> kernel is also wrong, and it should not use "J" at all, but should > >> use "I" to pass an immediate argument to the inline assembly when that > >> is used as an offset to an 'ldr' instruction rather than the 'sub' > >> argument. > > > > This patch doesn't seem correct to me. > > > > The ARM ARM says the immediate offset to an ARM ldr instructions is "any > > value in the range 0-4095" and offsets may be added or subtracted, > > leading to values from −4095 to 4095". > > > > And GCC machine constraints [1] says > > > > I > > Integer that is valid as an immediate operand in a data processing > > instruction. That is, an integer in the range 0 to 255 rotated by a > > multiple of 2 > > J > > Integer in the range −4095 to 4095 > > > > So the current use of 'J' seems correct to me. > > Hmm, Arnd reports the failure when building for Thumb-2, and the code > under #ifdef CONFIG_THUMB2_KERNEL contains an ldrd, which takes a > different immediate of the form imm8 * 4. Maybe it's just operand %5 > which needs fixing, although I don't see that a suitable constraint for > that actually exists... Well, under Thumb-2 plain LDR is also different, the offset is up to +/-255, except for pre-indexed without writeback mode (what the code uses) which goes up to +4095. I saw this yesterday but also that there aren't asm constraints for that. Actually, there is constraint 'Uq' which is "A memory reference suitable for the ARMv4 ldrsb instruction" and that is a value +/- 255. In practice, for the code in question, which is getting offsets into struct pt_regs, either 'I', 'J' or 'Uq' would work. 'Uq' is the one that expresses the strictest restrictions, that if met, will work with all assembler instructions used, but as it's documented as being for the "ARMv4 ldrsb instruction", it would seem a bit confusing to use that to me. -- Tixy > > Robin. > > > [1] > > https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints > > > > > >> Signed-off-by: Arnd Bergmann <a...@arndb.de> > >> --- > >> arch/arm/probes/kprobes/core.c | 8 > >> 1 file changed, 4 insertions(+), 4 deletions(-) > >> > >> diff --git a/arch/arm/probes/kprobes/core.c > >> b/arch/arm/probes/kprobes/core.c > >> index a4ec240ee7ba..4b34b40ca917 100644 > >> --- a/arch/arm/probes/kprobes/core.c > >> +++ b/arch/arm/probes/kprobes/core.c > >> @@ -570,10 +570,10 @@ void __kprobes jprobe_return(void) > >>: > >>: "r" (kcb->jprobe_saved_regs.ARM_sp), > >> "I" (sizeof(struct pt_regs) * 2), > >> -"J" (offsetof(struct pt_regs, ARM_sp)), > >> -"J" (offsetof(struct pt_regs, ARM_pc)), > >> -"J" (offsetof(struct pt_regs, ARM_cpsr)), > >> -"J" (offsetof(struct pt_regs, ARM_lr)) > >> +"I" (offsetof(struct pt_regs, ARM_sp)), > >> +"I" (offsetof(struct pt_regs, ARM_pc)), > >> +"I" (offsetof(struct pt_regs, ARM_cpsr)), > >> +"I" (offsetof(struct pt_regs, ARM_lr)) > >>: "memory", "cc"); > >> } > >> > > > > > > > > ___ > > linux-arm-kernel mailing list > > linux-arm-ker...@lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > > >
Re: [PATCH 3/3] [RESEND] ARM: kprobes: use "I" constraint for inline assembly offsets
On Thu, 2016-02-18 at 18:59 +, Robin Murphy wrote: > On 18/02/16 18:12, Jon Medhurst (Tixy) wrote: > > On Thu, 2016-02-18 at 18:05 +0100, Arnd Bergmann wrote: > >> build-testing with clang showed that the "J" constraint does not take > >> positive arguments on clang when building in for Thumb-2: > >> > >> core.c:540:3: error: invalid operand for inline asm constraint 'J' > >> > >> This has been reported as llvm bug > >> https://llvm.org/bugs/show_bug.cgi?id=26061 > >> > >> However, looking at the source code in depth, I found that the > >> kernel is also wrong, and it should not use "J" at all, but should > >> use "I" to pass an immediate argument to the inline assembly when that > >> is used as an offset to an 'ldr' instruction rather than the 'sub' > >> argument. > > > > This patch doesn't seem correct to me. > > > > The ARM ARM says the immediate offset to an ARM ldr instructions is "any > > value in the range 0-4095" and offsets may be added or subtracted, > > leading to values from −4095 to 4095". > > > > And GCC machine constraints [1] says > > > > I > > Integer that is valid as an immediate operand in a data processing > > instruction. That is, an integer in the range 0 to 255 rotated by a > > multiple of 2 > > J > > Integer in the range −4095 to 4095 > > > > So the current use of 'J' seems correct to me. > > Hmm, Arnd reports the failure when building for Thumb-2, and the code > under #ifdef CONFIG_THUMB2_KERNEL contains an ldrd, which takes a > different immediate of the form imm8 * 4. Maybe it's just operand %5 > which needs fixing, although I don't see that a suitable constraint for > that actually exists... Well, under Thumb-2 plain LDR is also different, the offset is up to +/-255, except for pre-indexed without writeback mode (what the code uses) which goes up to +4095. I saw this yesterday but also that there aren't asm constraints for that. Actually, there is constraint 'Uq' which is "A memory reference suitable for the ARMv4 ldrsb instruction" and that is a value +/- 255. In practice, for the code in question, which is getting offsets into struct pt_regs, either 'I', 'J' or 'Uq' would work. 'Uq' is the one that expresses the strictest restrictions, that if met, will work with all assembler instructions used, but as it's documented as being for the "ARMv4 ldrsb instruction", it would seem a bit confusing to use that to me. -- Tixy > > Robin. > > > [1] > > https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints > > > > > >> Signed-off-by: Arnd Bergmann > >> --- > >> arch/arm/probes/kprobes/core.c | 8 > >> 1 file changed, 4 insertions(+), 4 deletions(-) > >> > >> diff --git a/arch/arm/probes/kprobes/core.c > >> b/arch/arm/probes/kprobes/core.c > >> index a4ec240ee7ba..4b34b40ca917 100644 > >> --- a/arch/arm/probes/kprobes/core.c > >> +++ b/arch/arm/probes/kprobes/core.c > >> @@ -570,10 +570,10 @@ void __kprobes jprobe_return(void) > >>: > >>: "r" (kcb->jprobe_saved_regs.ARM_sp), > >> "I" (sizeof(struct pt_regs) * 2), > >> -"J" (offsetof(struct pt_regs, ARM_sp)), > >> -"J" (offsetof(struct pt_regs, ARM_pc)), > >> -"J" (offsetof(struct pt_regs, ARM_cpsr)), > >> -"J" (offsetof(struct pt_regs, ARM_lr)) > >> +"I" (offsetof(struct pt_regs, ARM_sp)), > >> +"I" (offsetof(struct pt_regs, ARM_pc)), > >> +"I" (offsetof(struct pt_regs, ARM_cpsr)), > >> +"I" (offsetof(struct pt_regs, ARM_lr)) > >>: "memory", "cc"); > >> } > >> > > > > > > > > ___ > > linux-arm-kernel mailing list > > linux-arm-ker...@lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > > >
Re: [PATCH 3/3] [RESEND] ARM: kprobes: use "I" constraint for inline assembly offsets
On Thu, 2016-02-18 at 18:05 +0100, Arnd Bergmann wrote: > build-testing with clang showed that the "J" constraint does not take > positive arguments on clang when building in for Thumb-2: > > core.c:540:3: error: invalid operand for inline asm constraint 'J' > > This has been reported as llvm bug https://llvm.org/bugs/show_bug.cgi?id=26061 > > However, looking at the source code in depth, I found that the > kernel is also wrong, and it should not use "J" at all, but should > use "I" to pass an immediate argument to the inline assembly when that > is used as an offset to an 'ldr' instruction rather than the 'sub' > argument. This patch doesn't seem correct to me. The ARM ARM says the immediate offset to an ARM ldr instructions is "any value in the range 0-4095" and offsets may be added or subtracted, leading to values from −4095 to 4095". And GCC machine constraints [1] says I Integer that is valid as an immediate operand in a data processing instruction. That is, an integer in the range 0 to 255 rotated by a multiple of 2 J Integer in the range −4095 to 4095 So the current use of 'J' seems correct to me. [1] https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints > Signed-off-by: Arnd Bergmann> --- > arch/arm/probes/kprobes/core.c | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c > index a4ec240ee7ba..4b34b40ca917 100644 > --- a/arch/arm/probes/kprobes/core.c > +++ b/arch/arm/probes/kprobes/core.c > @@ -570,10 +570,10 @@ void __kprobes jprobe_return(void) > : > : "r" (kcb->jprobe_saved_regs.ARM_sp), > "I" (sizeof(struct pt_regs) * 2), > - "J" (offsetof(struct pt_regs, ARM_sp)), > - "J" (offsetof(struct pt_regs, ARM_pc)), > - "J" (offsetof(struct pt_regs, ARM_cpsr)), > - "J" (offsetof(struct pt_regs, ARM_lr)) > + "I" (offsetof(struct pt_regs, ARM_sp)), > + "I" (offsetof(struct pt_regs, ARM_pc)), > + "I" (offsetof(struct pt_regs, ARM_cpsr)), > + "I" (offsetof(struct pt_regs, ARM_lr)) > : "memory", "cc"); > } >
Re: [PATCH 3/3] [RESEND] ARM: kprobes: use "I" constraint for inline assembly offsets
On Thu, 2016-02-18 at 18:05 +0100, Arnd Bergmann wrote: > build-testing with clang showed that the "J" constraint does not take > positive arguments on clang when building in for Thumb-2: > > core.c:540:3: error: invalid operand for inline asm constraint 'J' > > This has been reported as llvm bug https://llvm.org/bugs/show_bug.cgi?id=26061 > > However, looking at the source code in depth, I found that the > kernel is also wrong, and it should not use "J" at all, but should > use "I" to pass an immediate argument to the inline assembly when that > is used as an offset to an 'ldr' instruction rather than the 'sub' > argument. This patch doesn't seem correct to me. The ARM ARM says the immediate offset to an ARM ldr instructions is "any value in the range 0-4095" and offsets may be added or subtracted, leading to values from −4095 to 4095". And GCC machine constraints [1] says I Integer that is valid as an immediate operand in a data processing instruction. That is, an integer in the range 0 to 255 rotated by a multiple of 2 J Integer in the range −4095 to 4095 So the current use of 'J' seems correct to me. [1] https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints > Signed-off-by: Arnd Bergmann > --- > arch/arm/probes/kprobes/core.c | 8 > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c > index a4ec240ee7ba..4b34b40ca917 100644 > --- a/arch/arm/probes/kprobes/core.c > +++ b/arch/arm/probes/kprobes/core.c > @@ -570,10 +570,10 @@ void __kprobes jprobe_return(void) > : > : "r" (kcb->jprobe_saved_regs.ARM_sp), > "I" (sizeof(struct pt_regs) * 2), > - "J" (offsetof(struct pt_regs, ARM_sp)), > - "J" (offsetof(struct pt_regs, ARM_pc)), > - "J" (offsetof(struct pt_regs, ARM_cpsr)), > - "J" (offsetof(struct pt_regs, ARM_lr)) > + "I" (offsetof(struct pt_regs, ARM_sp)), > + "I" (offsetof(struct pt_regs, ARM_pc)), > + "I" (offsetof(struct pt_regs, ARM_cpsr)), > + "I" (offsetof(struct pt_regs, ARM_lr)) > : "memory", "cc"); > } >
Re: [PATCH 9/9] ARM: fix kprobe test with CONFIG_CPU_32v3
On Thu, 2016-02-18 at 15:02 +0100, Arnd Bergmann wrote: > ARMv3 did not have 16-bit load/store or 32-bit multiply instructions, > so building the kprobe test code fails with lots of warnings about > these: > > /tmp/ccI4SKHx.s:19585: Error: selected processor does not support ARM mode > `umull r0,r1,r2,r3' > /tmp/ccI4SKHx.s:19617: Error: selected processor does not support ARM mode > `umullls r7,r8,r9,r10' > /tmp/ccI4SKHx.s:19645: Error: selected processor does not support ARM mode > `umull lr,r12,r11,r13' > /tmp/ccI4SKHx.s:19727: Error: selected processor does not support ARM mode > `umulls r0,r1,r2,r3' > ... > /tmp/ccI4SKHx.s:21273: Error: selected processor does not support ARM mode > `strh r0,[r1,-r2]' > /tmp/ccI4SKHx.s:21309: Error: selected processor does not support ARM mode > `streqh r14,[r11,r12]' > /tmp/ccI4SKHx.s:21333: Error: selected processor does not support ARM mode > `streqh r14,[r13,r12]' > > This puts all the affected instructions inside an #ifdef section, > like we do for the other architecture levels. > > Signed-off-by: Arnd Bergmann> --- I was about to say that I didn't know that we supported ARMv3 then got a feeling of deja vu :-) [1] Acked-by: Jon Medhurst [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/242997.html > arch/arm/probes/kprobes/test-arm.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/arch/arm/probes/kprobes/test-arm.c > b/arch/arm/probes/kprobes/test-arm.c > index 8866aedfdea2..4e8511f0582d 100644 > --- a/arch/arm/probes/kprobes/test-arm.c > +++ b/arch/arm/probes/kprobes/test-arm.c > @@ -391,6 +391,7 @@ void kprobe_arm_test_cases(void) > TEST_UNSUPPORTED(__inst_arm(0xe0700090) " @ undef") > TEST_UNSUPPORTED(__inst_arm(0xe07fff9f) " @ undef") > > +#if __LINUX_ARM_ARCH__ >= 4 > TEST_RR( "umullr0, r1, r",2, VAL1,", r",3, VAL2,"") > TEST_RR( "umullls r7, r8, r",9, VAL2,", r",10, VAL1,"") > TEST_R( "umulllr, r12, r",11,VAL3,", r13") > @@ -436,6 +437,7 @@ void kprobe_arm_test_cases(void) > TEST_UNSUPPORTED(__inst_arm(0xe0f0f392) " @ smlals r0, pc, r2, r3") > TEST_UNSUPPORTED(__inst_arm(0xe0f0139f) " @ smlals r0, r1, pc, r3") > TEST_UNSUPPORTED(__inst_arm(0xe0f01f92) " @ smlals r0, r1, r2, pc") > +#endif > > TEST_GROUP("Synchronization primitives") > > @@ -478,7 +480,7 @@ void kprobe_arm_test_cases(void) > TEST_UNSUPPORTED("ldrexhr2, [sp]") > #endif > TEST_GROUP("Extra load/store instructions") > - > +#if __LINUX_ARM_ARCH__ >= 4 > TEST_RPR( "strhr",0, VAL1,", [r",1, 48,", -r",2, 24,"]") > TEST_RPR( "streqh r",14,VAL2,", [r",11,0, ", r",12, 48,"]") > TEST_UNSUPPORTED( "streqh r14, [r13, r12]") > @@ -560,6 +562,7 @@ void kprobe_arm_test_cases(void) > TEST( "ldrsh r0, [pc, #0]") > TEST_UNSUPPORTED(__inst_arm(0xe1ffc3f0) " @ ldrsh r12, [pc, > #48]!") > TEST_UNSUPPORTED(__inst_arm(0xe0d9f3f0) " @ ldrsh pc, [r9], #48") > +#endif > > #if __LINUX_ARM_ARCH__ >= 7 > TEST_UNSUPPORTED("strht r1, [r2], r3")
Re: [PATCH 9/9] ARM: fix kprobe test with CONFIG_CPU_32v3
On Thu, 2016-02-18 at 15:02 +0100, Arnd Bergmann wrote: > ARMv3 did not have 16-bit load/store or 32-bit multiply instructions, > so building the kprobe test code fails with lots of warnings about > these: > > /tmp/ccI4SKHx.s:19585: Error: selected processor does not support ARM mode > `umull r0,r1,r2,r3' > /tmp/ccI4SKHx.s:19617: Error: selected processor does not support ARM mode > `umullls r7,r8,r9,r10' > /tmp/ccI4SKHx.s:19645: Error: selected processor does not support ARM mode > `umull lr,r12,r11,r13' > /tmp/ccI4SKHx.s:19727: Error: selected processor does not support ARM mode > `umulls r0,r1,r2,r3' > ... > /tmp/ccI4SKHx.s:21273: Error: selected processor does not support ARM mode > `strh r0,[r1,-r2]' > /tmp/ccI4SKHx.s:21309: Error: selected processor does not support ARM mode > `streqh r14,[r11,r12]' > /tmp/ccI4SKHx.s:21333: Error: selected processor does not support ARM mode > `streqh r14,[r13,r12]' > > This puts all the affected instructions inside an #ifdef section, > like we do for the other architecture levels. > > Signed-off-by: Arnd Bergmann > --- I was about to say that I didn't know that we supported ARMv3 then got a feeling of deja vu :-) [1] Acked-by: Jon Medhurst [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-March/242997.html > arch/arm/probes/kprobes/test-arm.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/arch/arm/probes/kprobes/test-arm.c > b/arch/arm/probes/kprobes/test-arm.c > index 8866aedfdea2..4e8511f0582d 100644 > --- a/arch/arm/probes/kprobes/test-arm.c > +++ b/arch/arm/probes/kprobes/test-arm.c > @@ -391,6 +391,7 @@ void kprobe_arm_test_cases(void) > TEST_UNSUPPORTED(__inst_arm(0xe0700090) " @ undef") > TEST_UNSUPPORTED(__inst_arm(0xe07fff9f) " @ undef") > > +#if __LINUX_ARM_ARCH__ >= 4 > TEST_RR( "umullr0, r1, r",2, VAL1,", r",3, VAL2,"") > TEST_RR( "umullls r7, r8, r",9, VAL2,", r",10, VAL1,"") > TEST_R( "umulllr, r12, r",11,VAL3,", r13") > @@ -436,6 +437,7 @@ void kprobe_arm_test_cases(void) > TEST_UNSUPPORTED(__inst_arm(0xe0f0f392) " @ smlals r0, pc, r2, r3") > TEST_UNSUPPORTED(__inst_arm(0xe0f0139f) " @ smlals r0, r1, pc, r3") > TEST_UNSUPPORTED(__inst_arm(0xe0f01f92) " @ smlals r0, r1, r2, pc") > +#endif > > TEST_GROUP("Synchronization primitives") > > @@ -478,7 +480,7 @@ void kprobe_arm_test_cases(void) > TEST_UNSUPPORTED("ldrexhr2, [sp]") > #endif > TEST_GROUP("Extra load/store instructions") > - > +#if __LINUX_ARM_ARCH__ >= 4 > TEST_RPR( "strhr",0, VAL1,", [r",1, 48,", -r",2, 24,"]") > TEST_RPR( "streqh r",14,VAL2,", [r",11,0, ", r",12, 48,"]") > TEST_UNSUPPORTED( "streqh r14, [r13, r12]") > @@ -560,6 +562,7 @@ void kprobe_arm_test_cases(void) > TEST( "ldrsh r0, [pc, #0]") > TEST_UNSUPPORTED(__inst_arm(0xe1ffc3f0) " @ ldrsh r12, [pc, > #48]!") > TEST_UNSUPPORTED(__inst_arm(0xe0d9f3f0) " @ ldrsh pc, [r9], #48") > +#endif > > #if __LINUX_ARM_ARCH__ >= 7 > TEST_UNSUPPORTED("strht r1, [r2], r3")
Re: [RFC] component: Fix: Unassign components' masters if bringing up master fails
On Thu, 2016-02-11 at 15:05 +0530, Archit Taneja wrote: > component_master_add_with_match can fail if the master's bind op doesn't > go through successfully. In such a scenario, all the components in the > master's match array have their 'master' pointer set to the given master. > These pointers need to be set to NULL again. If they aren't, successive > calls to component_master_add_with_match will fail because the driver > thinks these components already have a master. > > This issue can be seen when a driver defers probe because of missing > resources. It is seen after the introduction of commit: > > "component: track components via array rather than list" > > Add 'master_remove_components' which sets the all the components's masters > in the match array to NULL. This function is also re-used in > component_master_del and replaces code that did the same thing. > > Signed-off-by: Archit TanejaAs Daniel pointed out in his reply, there is already a fix for this issue in Linux which makes sure no components point to a master if it is deleted. See commit 57480484f9f7 ("component: Detach components when deleting master struct") Similarly, Daniel's fix for the mirror case has just been applied, which makes sure masters don't refer to components when they are deleted. Commit 8e7199c2c50f ("component: remove device from master match list on failed add"). It seems to me that for other error cases (that don't result in deletion of objects) we would want to leave the references between components and masters intact once they have been created. With regard to the $subject patch (below) it looks like it would go wrong in this situation... - component_add() is called to add a component - This calls try_to_bring_up_masters() which calls try_to_bring_up_master() for each master in the system - If that master doesn't yet have all components available yet find_components() returned false, then - master_remove_components() is called But, this isn't an error situation that needs rolling back, and as written the patch only half does this, because it stops components pointing to the master, but leaves the master's match list pointing to those components. The actual real error conditions in try_to_bring_up_master() only get triggered when actually trying to bring up a master, and that only happens when either: a) The last component for that master is being added with component_add() b) A master is added by component_master_add_with_match() and all the components it required where already registered. Both a) and b) should now be handled correctly by the deletion of the relevant component/master that was being added (thanks to the two fixes I mentioned at the beginning). The other components or master should subsequently get cleaned up by calling component_del() or component_master_del(), which take care of updating the relevant references between components and master. For component_master_del this is not immediately obvious, but take_down_master calls devres_release_group which causes devm_component_match_release to be called. -- Tixy > --- > drivers/base/component.c | 45 + > 1 file changed, 29 insertions(+), 16 deletions(-) > > diff --git a/drivers/base/component.c b/drivers/base/component.c > index 89f5cf68..a2ecc35 100644 > --- a/drivers/base/component.c > +++ b/drivers/base/component.c > @@ -130,6 +130,23 @@ static void remove_component(struct master *master, > struct component *c) > master->match->compare[i].component = NULL; > } > > +/* Detach all components from associated master */ > +static void master_remove_components(struct master *master) > +{ > + struct component_match *match = master->match; > + size_t i; > + > + if (!match) > + return; > + > + for (i = 0; i < match->num; i++) { > + struct component *c = match->compare[i].component; > + > + if (c) > + c->master = NULL; > + } > +} > + > /* > * Try to bring up a master. If component is NULL, we're interested in > * this master, otherwise it's a component which must be present to try > @@ -140,34 +157,39 @@ static void remove_component(struct master *master, > struct component *c) > static int try_to_bring_up_master(struct master *master, > struct component *component) > { > - int ret; > + int ret = 0; > > dev_dbg(master->dev, "trying to bring up master\n"); > > if (find_components(master)) { > dev_dbg(master->dev, "master has incomplete components\n"); > - return 0; > + goto err; > } > > if (component && component->master != master) { > dev_dbg(master->dev, "master is not for this component (%s)\n", > dev_name(component->dev)); > - return 0; > + goto err; > } > > - if (!devres_open_group(master->dev, NULL,
Re: [RFC] component: Fix: Unassign components' masters if bringing up master fails
On Thu, 2016-02-11 at 15:05 +0530, Archit Taneja wrote: > component_master_add_with_match can fail if the master's bind op doesn't > go through successfully. In such a scenario, all the components in the > master's match array have their 'master' pointer set to the given master. > These pointers need to be set to NULL again. If they aren't, successive > calls to component_master_add_with_match will fail because the driver > thinks these components already have a master. > > This issue can be seen when a driver defers probe because of missing > resources. It is seen after the introduction of commit: > > "component: track components via array rather than list" > > Add 'master_remove_components' which sets the all the components's masters > in the match array to NULL. This function is also re-used in > component_master_del and replaces code that did the same thing. > > Signed-off-by: Archit Taneja As Daniel pointed out in his reply, there is already a fix for this issue in Linux which makes sure no components point to a master if it is deleted. See commit 57480484f9f7 ("component: Detach components when deleting master struct") Similarly, Daniel's fix for the mirror case has just been applied, which makes sure masters don't refer to components when they are deleted. Commit 8e7199c2c50f ("component: remove device from master match list on failed add"). It seems to me that for other error cases (that don't result in deletion of objects) we would want to leave the references between components and masters intact once they have been created. With regard to the $subject patch (below) it looks like it would go wrong in this situation... - component_add() is called to add a component - This calls try_to_bring_up_masters() which calls try_to_bring_up_master() for each master in the system - If that master doesn't yet have all components available yet find_components() returned false, then - master_remove_components() is called But, this isn't an error situation that needs rolling back, and as written the patch only half does this, because it stops components pointing to the master, but leaves the master's match list pointing to those components. The actual real error conditions in try_to_bring_up_master() only get triggered when actually trying to bring up a master, and that only happens when either: a) The last component for that master is being added with component_add() b) A master is added by component_master_add_with_match() and all the components it required where already registered. Both a) and b) should now be handled correctly by the deletion of the relevant component/master that was being added (thanks to the two fixes I mentioned at the beginning). The other components or master should subsequently get cleaned up by calling component_del() or component_master_del(), which take care of updating the relevant references between components and master. For component_master_del this is not immediately obvious, but take_down_master calls devres_release_group which causes devm_component_match_release to be called. -- Tixy > --- > drivers/base/component.c | 45 + > 1 file changed, 29 insertions(+), 16 deletions(-) > > diff --git a/drivers/base/component.c b/drivers/base/component.c > index 89f5cf68..a2ecc35 100644 > --- a/drivers/base/component.c > +++ b/drivers/base/component.c > @@ -130,6 +130,23 @@ static void remove_component(struct master *master, > struct component *c) > master->match->compare[i].component = NULL; > } > > +/* Detach all components from associated master */ > +static void master_remove_components(struct master *master) > +{ > + struct component_match *match = master->match; > + size_t i; > + > + if (!match) > + return; > + > + for (i = 0; i < match->num; i++) { > + struct component *c = match->compare[i].component; > + > + if (c) > + c->master = NULL; > + } > +} > + > /* > * Try to bring up a master. If component is NULL, we're interested in > * this master, otherwise it's a component which must be present to try > @@ -140,34 +157,39 @@ static void remove_component(struct master *master, > struct component *c) > static int try_to_bring_up_master(struct master *master, > struct component *component) > { > - int ret; > + int ret = 0; > > dev_dbg(master->dev, "trying to bring up master\n"); > > if (find_components(master)) { > dev_dbg(master->dev, "master has incomplete components\n"); > - return 0; > + goto err; > } > > if (component && component->master != master) { > dev_dbg(master->dev, "master is not for this component (%s)\n", > dev_name(component->dev)); > - return 0; > + goto err; > } > > - if (!devres_open_group(master->dev, NULL, GFP_KERNEL)) > -
[PATCH] ASoC: dwc: Ensure i2s_reg_comp{1,2} is always initialised
In the case that the driver is configured from device-tree i2s_reg_comp1 and i2s_reg_comp2 aren't initialised, breaking the driver. Fix this by unconditionally setting these values before checking for quirks. Fixes: a242cac1d3aa ("ASoC: dwc: add quirk to override COMP_PARAM_1 register") Signed-off-by: Jon Medhurst --- sound/soc/dwc/designware_i2s.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/sound/soc/dwc/designware_i2s.c b/sound/soc/dwc/designware_i2s.c index d8afd8e..5f4e6aa 100644 --- a/sound/soc/dwc/designware_i2s.c +++ b/sound/soc/dwc/designware_i2s.c @@ -636,6 +636,8 @@ static int dw_i2s_probe(struct platform_device *pdev) dev->dev = >dev; + dev->i2s_reg_comp1 = I2S_COMP_PARAM_1; + dev->i2s_reg_comp2 = I2S_COMP_PARAM_2; if (pdata) { dev->capability = pdata->cap; clk_id = NULL; @@ -643,9 +645,6 @@ static int dw_i2s_probe(struct platform_device *pdev) if (dev->quirks & DW_I2S_QUIRK_COMP_REG_OFFSET) { dev->i2s_reg_comp1 = pdata->i2s_reg_comp1; dev->i2s_reg_comp2 = pdata->i2s_reg_comp2; - } else { - dev->i2s_reg_comp1 = I2S_COMP_PARAM_1; - dev->i2s_reg_comp2 = I2S_COMP_PARAM_2; } ret = dw_configure_dai_by_pd(dev, dw_i2s_dai, res, pdata); } else { -- 2.1.4
[PATCH] ASoC: dwc: Ensure i2s_reg_comp{1,2} is always initialised
In the case that the driver is configured from device-tree i2s_reg_comp1 and i2s_reg_comp2 aren't initialised, breaking the driver. Fix this by unconditionally setting these values before checking for quirks. Fixes: a242cac1d3aa ("ASoC: dwc: add quirk to override COMP_PARAM_1 register") Signed-off-by: Jon Medhurst--- sound/soc/dwc/designware_i2s.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/sound/soc/dwc/designware_i2s.c b/sound/soc/dwc/designware_i2s.c index d8afd8e..5f4e6aa 100644 --- a/sound/soc/dwc/designware_i2s.c +++ b/sound/soc/dwc/designware_i2s.c @@ -636,6 +636,8 @@ static int dw_i2s_probe(struct platform_device *pdev) dev->dev = >dev; + dev->i2s_reg_comp1 = I2S_COMP_PARAM_1; + dev->i2s_reg_comp2 = I2S_COMP_PARAM_2; if (pdata) { dev->capability = pdata->cap; clk_id = NULL; @@ -643,9 +645,6 @@ static int dw_i2s_probe(struct platform_device *pdev) if (dev->quirks & DW_I2S_QUIRK_COMP_REG_OFFSET) { dev->i2s_reg_comp1 = pdata->i2s_reg_comp1; dev->i2s_reg_comp2 = pdata->i2s_reg_comp2; - } else { - dev->i2s_reg_comp1 = I2S_COMP_PARAM_1; - dev->i2s_reg_comp2 = I2S_COMP_PARAM_2; } ret = dw_configure_dai_by_pd(dev, dw_i2s_dai, res, pdata); } else { -- 2.1.4
Re: [PATCH] staging: android: ion: Set the length of the DMA sg entries in buffer
On Thu, 2016-01-21 at 16:58 -0800, Laura Abbott wrote: > On 01/21/2016 12:19 PM, Jon Medhurst (Tixy) wrote: [...] > > If sg_dma_len() is correct or acceptable then it seems to me that the > > ION code should set that length. Especially as the comment in the code > > implies it's faking a call to map_sg and grepping the kernel tree for > > real implementations of that functionality seems to show the dma_address > > getting set. > > > > As you can probably tell, I feel I may be on shaky ground. This is > > because I don't fully understanding the code and suspecting both the ION > > and GPU code is rather dodgy (and possibly the bits in between :-) > > > > I blame the Ion code completely. I remember hitting a similar problem > with other out of tree drivers. The solution then was to have drivers > switch to using sg->length instead of sg_dma_len given the state of that > tree. For the Mali driver, if it is ever going to be backed by an IOMMU > you will need to use sg_dma_len so I think at least that part of your > code is correct. > > Thinking about it some, I'm okay with the patch going in. I thought > there was some reason why the out of tree code from before didn't just > do this hack but I can't remember it. It may have been an out of tree > use case. This does go well with Ion's behavior of pretending to do > DMA mapping. More out of tree users can plead their case if it breaks. Well, the $subject patch is copying the value of 'length' into 'dma_length', and if dma_length was previously uninitialised then I don't see that it can really cause additional problems for ION users. (I hate the way I've been using 'if' a lot so I'm going to spend some time educating myself.) > Acked-by: Laura Abbott Thanks -- Tixy
Re: [PATCH] staging: android: ion: Set the length of the DMA sg entries in buffer
On Thu, 2016-01-21 at 16:58 -0800, Laura Abbott wrote: > On 01/21/2016 12:19 PM, Jon Medhurst (Tixy) wrote: [...] > > If sg_dma_len() is correct or acceptable then it seems to me that the > > ION code should set that length. Especially as the comment in the code > > implies it's faking a call to map_sg and grepping the kernel tree for > > real implementations of that functionality seems to show the dma_address > > getting set. > > > > As you can probably tell, I feel I may be on shaky ground. This is > > because I don't fully understanding the code and suspecting both the ION > > and GPU code is rather dodgy (and possibly the bits in between :-) > > > > I blame the Ion code completely. I remember hitting a similar problem > with other out of tree drivers. The solution then was to have drivers > switch to using sg->length instead of sg_dma_len given the state of that > tree. For the Mali driver, if it is ever going to be backed by an IOMMU > you will need to use sg_dma_len so I think at least that part of your > code is correct. > > Thinking about it some, I'm okay with the patch going in. I thought > there was some reason why the out of tree code from before didn't just > do this hack but I can't remember it. It may have been an out of tree > use case. This does go well with Ion's behavior of pretending to do > DMA mapping. More out of tree users can plead their case if it breaks. Well, the $subject patch is copying the value of 'length' into 'dma_length', and if dma_length was previously uninitialised then I don't see that it can really cause additional problems for ION users. (I hate the way I've been using 'if' a lot so I'm going to spend some time educating myself.) > Acked-by: Laura Abbott <labb...@redhat.com> Thanks -- Tixy
Re: [PATCH] staging: android: ion: Set the length of the DMA sg entries in buffer
On Thu, 2016-01-21 at 09:39 -0800, Laura Abbott wrote: > On 01/21/2016 03:57 AM, Jon Medhurst (Tixy) wrote: > > From: Liviu Dudau > > > > ion_buffer_create() will allocate a buffer and then create a DMA > > mapping for it, but it forgot to set the length of the page entries. > > > > Signed-off-by: Liviu Dudau > > Signed-off-by: Jon Medhurst > > --- > > drivers/staging/android/ion/ion.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/staging/android/ion/ion.c > > b/drivers/staging/android/ion/ion.c > > index e237e9f..df56021 100644 > > --- a/drivers/staging/android/ion/ion.c > > +++ b/drivers/staging/android/ion/ion.c > > @@ -251,8 +251,10 @@ static struct ion_buffer *ion_buffer_create(struct > > ion_heap *heap, > > * memory coming from the heaps is ready for dma, ie if it has a > > * cached mapping that mapping has been invalidated > > */ > > - for_each_sg(buffer->sg_table->sgl, sg, buffer->sg_table->nents, i) > > + for_each_sg(buffer->sg_table->sgl, sg, buffer->sg_table->nents, i) { > > sg_dma_address(sg) = sg_phys(sg); > > + sg_dma_len(sg) = sg->length; > > + } > > mutex_lock(>buffer_lock); > > ion_buffer_add(dev, buffer); > > mutex_unlock(>buffer_lock); > > > > So Ion is really doing it wrong by setting the sg_dma_address manually as > the comment above notes. Ion has moved away from sg_dma_len though > (see 06e0dcaeb4fd72a010a1f5ad0c03abd8e0a58ef9). This isn't technically > a mapping as well. What's broken by not having sg_dma_len set? I fear this could end up being embarrassing... What's broken is that the out-of-tree kernel driver for ARM's Mali GPU is getting passed a dma_buf corresponding to the ION buffer. It is then calling dma_buf_map_attachment [1] on that and then parsing the resultant scatter-gather list to get the physical pages so it can pass them to the GPU hardware. In the process, it is using sg_dma_len() to get the length, which is garbage for ION buffers if ion_buffer_create() doesn't set it. [1] http://git.linaro.org/landing-teams/working/arm/kernel-release.git/blob/9660bff61ab296be02aad111d0bc2b9919493de5:/drivers/gpu/arm/midgard/mali_kbase_jd.c#l333 Now, I just tried making the Mali driver use sg->length rather than sg_dma_len() and, unsurprisingly, that also fixes the problem. So, my questions would be... Is it acceptable for a driver getting a dma_buf to parse the scatter-gather list for that by had? If so, should it use ->length or sg_dma_len() to get the length of each element? If sg_dma_len() is correct or acceptable then it seems to me that the ION code should set that length. Especially as the comment in the code implies it's faking a call to map_sg and grepping the kernel tree for real implementations of that functionality seems to show the dma_address getting set. As you can probably tell, I feel I may be on shaky ground. This is because I don't fully understanding the code and suspecting both the ION and GPU code is rather dodgy (and possibly the bits in between :-) -- Tixy
Problem with commit 924eb47512 ("ASoC: dwc: fix dma stop transferring issue")
Hi In trying to work out why audio stopped working on ARM's Juno board I bisected the problem to $subject commit and reverting this fixed things. The symptoms of broken audio was that alsa-utils' speaker-test produced very short duration 'churps' rather than the voice sample naming each speaker channel. (Additional note, audio on Juno is routed through a HDMI transmitter and the patches to support that aren't in the mainline kernel tree.) Now, I don't have any documentation for the Designware I2S device so the following is my speculation from looking at the code and guessing what's going on... 1. $subject patch makes i2s_start() unmask interrupts for all four hardware channels. 2. i2s_stop masks interrupts for all four channels. 3. dw_i2s_hw_params unmasks interrupts for only the number of hardware channels as required for the specified audio stream format. So, playing stereo sound (one hardware channel) on my Juno board is going wrong because i2s_start() now unmasks interrupts for the other 3 unused hardware channels and these are immediately generating interrupts, making the code think the buffer has finished transmission (or underflowed, or something like that, remember I'm speculating here). Now, if audio playback/capture is always achieved by using this sequence of calls: dw_i2s_hw_params i2s_start i2s_stop then it seems to me that $subject patch is redundant and should be reverted. However, if this is allowed (possibly is, depends on how dw_i2s_trigger can be called) ... dw_i2s_hw_params i2s_start i2s_stop i2s_start i2s_stop then we need to change i2s_start to unmask interrupts for just the hardware channels being used. I can have a go at producing a patch for that if that's deemed the correct solution. (Looks like we'd need to stash the number of used hardware channels in struct dw_i2s_dev). -- Tixy
[PATCH] staging: android: ion: Set the length of the DMA sg entries in buffer
From: Liviu Dudau ion_buffer_create() will allocate a buffer and then create a DMA mapping for it, but it forgot to set the length of the page entries. Signed-off-by: Liviu Dudau Signed-off-by: Jon Medhurst --- drivers/staging/android/ion/ion.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c index e237e9f..df56021 100644 --- a/drivers/staging/android/ion/ion.c +++ b/drivers/staging/android/ion/ion.c @@ -251,8 +251,10 @@ static struct ion_buffer *ion_buffer_create(struct ion_heap *heap, * memory coming from the heaps is ready for dma, ie if it has a * cached mapping that mapping has been invalidated */ - for_each_sg(buffer->sg_table->sgl, sg, buffer->sg_table->nents, i) + for_each_sg(buffer->sg_table->sgl, sg, buffer->sg_table->nents, i) { sg_dma_address(sg) = sg_phys(sg); + sg_dma_len(sg) = sg->length; + } mutex_lock(>buffer_lock); ion_buffer_add(dev, buffer); mutex_unlock(>buffer_lock); -- 2.1.4
[PATCH] staging: android: ion: Set the length of the DMA sg entries in buffer
From: Liviu Dudauion_buffer_create() will allocate a buffer and then create a DMA mapping for it, but it forgot to set the length of the page entries. Signed-off-by: Liviu Dudau Signed-off-by: Jon Medhurst --- drivers/staging/android/ion/ion.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c index e237e9f..df56021 100644 --- a/drivers/staging/android/ion/ion.c +++ b/drivers/staging/android/ion/ion.c @@ -251,8 +251,10 @@ static struct ion_buffer *ion_buffer_create(struct ion_heap *heap, * memory coming from the heaps is ready for dma, ie if it has a * cached mapping that mapping has been invalidated */ - for_each_sg(buffer->sg_table->sgl, sg, buffer->sg_table->nents, i) + for_each_sg(buffer->sg_table->sgl, sg, buffer->sg_table->nents, i) { sg_dma_address(sg) = sg_phys(sg); + sg_dma_len(sg) = sg->length; + } mutex_lock(>buffer_lock); ion_buffer_add(dev, buffer); mutex_unlock(>buffer_lock); -- 2.1.4
Problem with commit 924eb47512 ("ASoC: dwc: fix dma stop transferring issue")
Hi In trying to work out why audio stopped working on ARM's Juno board I bisected the problem to $subject commit and reverting this fixed things. The symptoms of broken audio was that alsa-utils' speaker-test produced very short duration 'churps' rather than the voice sample naming each speaker channel. (Additional note, audio on Juno is routed through a HDMI transmitter and the patches to support that aren't in the mainline kernel tree.) Now, I don't have any documentation for the Designware I2S device so the following is my speculation from looking at the code and guessing what's going on... 1. $subject patch makes i2s_start() unmask interrupts for all four hardware channels. 2. i2s_stop masks interrupts for all four channels. 3. dw_i2s_hw_params unmasks interrupts for only the number of hardware channels as required for the specified audio stream format. So, playing stereo sound (one hardware channel) on my Juno board is going wrong because i2s_start() now unmasks interrupts for the other 3 unused hardware channels and these are immediately generating interrupts, making the code think the buffer has finished transmission (or underflowed, or something like that, remember I'm speculating here). Now, if audio playback/capture is always achieved by using this sequence of calls: dw_i2s_hw_params i2s_start i2s_stop then it seems to me that $subject patch is redundant and should be reverted. However, if this is allowed (possibly is, depends on how dw_i2s_trigger can be called) ... dw_i2s_hw_params i2s_start i2s_stop i2s_start i2s_stop then we need to change i2s_start to unmask interrupts for just the hardware channels being used. I can have a go at producing a patch for that if that's deemed the correct solution. (Looks like we'd need to stash the number of used hardware channels in struct dw_i2s_dev). -- Tixy
Re: [PATCH] staging: android: ion: Set the length of the DMA sg entries in buffer
On Thu, 2016-01-21 at 09:39 -0800, Laura Abbott wrote: > On 01/21/2016 03:57 AM, Jon Medhurst (Tixy) wrote: > > From: Liviu Dudau <liviu.du...@arm.com> > > > > ion_buffer_create() will allocate a buffer and then create a DMA > > mapping for it, but it forgot to set the length of the page entries. > > > > Signed-off-by: Liviu Dudau <liviu.du...@arm.com> > > Signed-off-by: Jon Medhurst <t...@linaro.org> > > --- > > drivers/staging/android/ion/ion.c | 4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/staging/android/ion/ion.c > > b/drivers/staging/android/ion/ion.c > > index e237e9f..df56021 100644 > > --- a/drivers/staging/android/ion/ion.c > > +++ b/drivers/staging/android/ion/ion.c > > @@ -251,8 +251,10 @@ static struct ion_buffer *ion_buffer_create(struct > > ion_heap *heap, > > * memory coming from the heaps is ready for dma, ie if it has a > > * cached mapping that mapping has been invalidated > > */ > > - for_each_sg(buffer->sg_table->sgl, sg, buffer->sg_table->nents, i) > > + for_each_sg(buffer->sg_table->sgl, sg, buffer->sg_table->nents, i) { > > sg_dma_address(sg) = sg_phys(sg); > > + sg_dma_len(sg) = sg->length; > > + } > > mutex_lock(>buffer_lock); > > ion_buffer_add(dev, buffer); > > mutex_unlock(>buffer_lock); > > > > So Ion is really doing it wrong by setting the sg_dma_address manually as > the comment above notes. Ion has moved away from sg_dma_len though > (see 06e0dcaeb4fd72a010a1f5ad0c03abd8e0a58ef9). This isn't technically > a mapping as well. What's broken by not having sg_dma_len set? I fear this could end up being embarrassing... What's broken is that the out-of-tree kernel driver for ARM's Mali GPU is getting passed a dma_buf corresponding to the ION buffer. It is then calling dma_buf_map_attachment [1] on that and then parsing the resultant scatter-gather list to get the physical pages so it can pass them to the GPU hardware. In the process, it is using sg_dma_len() to get the length, which is garbage for ION buffers if ion_buffer_create() doesn't set it. [1] http://git.linaro.org/landing-teams/working/arm/kernel-release.git/blob/9660bff61ab296be02aad111d0bc2b9919493de5:/drivers/gpu/arm/midgard/mali_kbase_jd.c#l333 Now, I just tried making the Mali driver use sg->length rather than sg_dma_len() and, unsurprisingly, that also fixes the problem. So, my questions would be... Is it acceptable for a driver getting a dma_buf to parse the scatter-gather list for that by had? If so, should it use ->length or sg_dma_len() to get the length of each element? If sg_dma_len() is correct or acceptable then it seems to me that the ION code should set that length. Especially as the comment in the code implies it's faking a call to map_sg and grepping the kernel tree for real implementations of that functionality seems to show the dma_address getting set. As you can probably tell, I feel I may be on shaky ground. This is because I don't fully understanding the code and suspecting both the ION and GPU code is rather dodgy (and possibly the bits in between :-) -- Tixy
Re: [PATCH] iommu/arm-smmu: add a shortcut when the @dev_node is NULL
On Wed, 2016-01-20 at 14:46 +, Robin Murphy wrote: > > Of course, the 0.004014s maybe not accurate enough, it is just an > > approximate number. > > A mean and standard deviation of at least, say, 5 runs each with and > without the patch would be considerably more meaningful (even if > still > far from statistically significant). It wouldn't surprise me if replacing the proposed change with an 'asm volatile("nop")' or two also give a boot time delta of several milliseconds (due to change in cache line alignment of functions). I don't believe you can reliably measure such minor changes. It doesn't mean that the proposed change isn't a good addition though, it obviously results in less code getting executed for the cost of one or two instructions for a compare and branch. -- Tixy
Re: [PATCH] iommu/arm-smmu: add a shortcut when the @dev_node is NULL
On Wed, 2016-01-20 at 14:46 +, Robin Murphy wrote: > > Of course, the 0.004014s maybe not accurate enough, it is just an > > approximate number. > > A mean and standard deviation of at least, say, 5 runs each with and > without the patch would be considerably more meaningful (even if > still > far from statistically significant). It wouldn't surprise me if replacing the proposed change with an 'asm volatile("nop")' or two also give a boot time delta of several milliseconds (due to change in cache line alignment of functions). I don't believe you can reliably measure such minor changes. It doesn't mean that the proposed change isn't a good addition though, it obviously results in less code getting executed for the cost of one or two instructions for a compare and branch. -- Tixy
Re: [RFC] kprobe'ing conditionally executed instructions
On Fri, 2015-12-11 at 10:34 +, Russell King - ARM Linux wrote: > On Fri, Dec 11, 2015 at 10:27:13AM +0000, Jon Medhurst (Tixy) wrote: > > On Fri, 2015-12-11 at 00:05 -0500, David Long wrote: > > > There is a moderate amount of code already in kprobes on ARM and the > > > current ARMv8 patch to deal with conditional execution of instructions. > > > One aspect of how this is handled is that instructions that fail their > > > predicate and are not (technically) executed are also not treated as a > > > hit kprobe. Steve Capper has suggested that the probe handling should > > > still take place because we stepped through the instruction even if it > > > was effectively a nop. This would be a significant change in how it > > > currently works on 32-bit ARM > > > > 32-bit ARM uses undefined instructions for kprobe 'breakpoints' and the > > ARM ARM says it's implementation defined behaviour whether these > > generate exceptions or not, i.e. whether the kprobe handler will be > > called. > > There are two classes of undefined instructions. There are those which > fall into the above category, and there are those which are guaranteed > to raise an exception. We should always be using the guaranteed ones, > not the other set. I wonder if I'm going senile or have been subject to having the ARM ARM evolve under me. I could swear we used instructions that were defined as undefined (so to speak) and that conditional versions of undefined instructions were UNPREDICTABLE. However, checking the ARM ARM again I see those instruction encodings are for the BKPT instruction which says: Breakpoint causes a software breakpoint to occur. Breakpoint is always unconditional, even when inside an IT block. So, my previous statements about not being able to implement the proposed kprobe changes consistently aren't true. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] kprobe'ing conditionally executed instructions
On Fri, 2015-12-11 at 00:05 -0500, David Long wrote: > There is a moderate amount of code already in kprobes on ARM and the > current ARMv8 patch to deal with conditional execution of instructions. > One aspect of how this is handled is that instructions that fail their > predicate and are not (technically) executed are also not treated as a > hit kprobe. Steve Capper has suggested that the probe handling should > still take place because we stepped through the instruction even if it > was effectively a nop. This would be a significant change in how it > currently works on 32-bit ARM 32-bit ARM uses undefined instructions for kprobe 'breakpoints' and the ARM ARM says it's implementation defined behaviour whether these generate exceptions or not, i.e. whether the kprobe handler will be called. You could say that we could always use unconditional breakpoints, but this doesn't work with thumb where the instruction could be in an IT block. So, the only way to have consistent behaviour on all platforms is to not call kprobe handlers if condition check fails. Which is the reason for the current implementation's design. Also, if we change the current implementation as suggested, then looking at things from a source code point of view... if (test) foo() else bar(); If you put a probe on the call to foo() and the compiler uses a branch instruction for the test you're never going to hit the probe fortest==false. But if it decides to use conditional instructions it will (on some CPU implementations). And the choice between branch/conditional instructions probably varies between GCC version and kernel configs. So again, IMO, the current kprobes implementation leads to consistency. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] kprobe'ing conditionally executed instructions
On Fri, 2015-12-11 at 10:34 +, Russell King - ARM Linux wrote: > On Fri, Dec 11, 2015 at 10:27:13AM +0000, Jon Medhurst (Tixy) wrote: > > On Fri, 2015-12-11 at 00:05 -0500, David Long wrote: > > > There is a moderate amount of code already in kprobes on ARM and the > > > current ARMv8 patch to deal with conditional execution of instructions. > > > One aspect of how this is handled is that instructions that fail their > > > predicate and are not (technically) executed are also not treated as a > > > hit kprobe. Steve Capper has suggested that the probe handling should > > > still take place because we stepped through the instruction even if it > > > was effectively a nop. This would be a significant change in how it > > > currently works on 32-bit ARM > > > > 32-bit ARM uses undefined instructions for kprobe 'breakpoints' and the > > ARM ARM says it's implementation defined behaviour whether these > > generate exceptions or not, i.e. whether the kprobe handler will be > > called. > > There are two classes of undefined instructions. There are those which > fall into the above category, and there are those which are guaranteed > to raise an exception. We should always be using the guaranteed ones, > not the other set. I wonder if I'm going senile or have been subject to having the ARM ARM evolve under me. I could swear we used instructions that were defined as undefined (so to speak) and that conditional versions of undefined instructions were UNPREDICTABLE. However, checking the ARM ARM again I see those instruction encodings are for the BKPT instruction which says: Breakpoint causes a software breakpoint to occur. Breakpoint is always unconditional, even when inside an IT block. So, my previous statements about not being able to implement the proposed kprobe changes consistently aren't true. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] kprobe'ing conditionally executed instructions
On Fri, 2015-12-11 at 00:05 -0500, David Long wrote: > There is a moderate amount of code already in kprobes on ARM and the > current ARMv8 patch to deal with conditional execution of instructions. > One aspect of how this is handled is that instructions that fail their > predicate and are not (technically) executed are also not treated as a > hit kprobe. Steve Capper has suggested that the probe handling should > still take place because we stepped through the instruction even if it > was effectively a nop. This would be a significant change in how it > currently works on 32-bit ARM 32-bit ARM uses undefined instructions for kprobe 'breakpoints' and the ARM ARM says it's implementation defined behaviour whether these generate exceptions or not, i.e. whether the kprobe handler will be called. You could say that we could always use unconditional breakpoints, but this doesn't work with thumb where the instruction could be in an IT block. So, the only way to have consistent behaviour on all platforms is to not call kprobe handlers if condition check fails. Which is the reason for the current implementation's design. Also, if we change the current implementation as suggested, then looking at things from a source code point of view... if (test) foo() else bar(); If you put a probe on the call to foo() and the compiler uses a branch instruction for the test you're never going to hit the probe fortest==false. But if it decides to use conditional instructions it will (on some CPU implementations). And the choice between branch/conditional instructions probably varies between GCC version and kernel configs. So again, IMO, the current kprobes implementation leads to consistency. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 11/12] cpufreq: arm-big-little: clarify frequency units
On Wed, 2015-12-02 at 22:19 +0100, Ben Gamari wrote: > The frequency units are very confusing in this area as OPPs use Hz > whereas cpufreq uses kHz. Be explicit about this in variable naming. > > Cc: Javier Martinez Canillas > Signed-off-by: Ben Gamari > --- > drivers/cpufreq/arm_big_little.c | 20 ++-- > 1 file changed, 10 insertions(+), 10 deletions(-) > > diff --git a/drivers/cpufreq/arm_big_little.c > b/drivers/cpufreq/arm_big_little.c > index 855599b..2d5761c 100644 > --- a/drivers/cpufreq/arm_big_little.c > +++ b/drivers/cpufreq/arm_big_little.c > @@ -130,14 +130,14 @@ static unsigned int bL_cpufreq_get_rate(unsigned int > cpu) > } > > static int > -bL_cpufreq_set_rate_cluster(u32 cpu, u32 cluster, u32 new_rate) > +bL_cpufreq_set_rate_cluster(u32 cpu, u32 cluster, u32 new_rate_kHz) > { > unsigned long volt = 0, volt_old = 0; > long freq_Hz; > u32 old_rate; IMO variable renaming doesn't seem necessary, if cpufreq uses kHz then in a cpufreq driver adding 'kHz' to variable seems redundant, especially if Hz values like freq_Hz above are named especially to signal their different units. However, if renaming is going to happen it should at least be consistent within the same function i.e. also rename the old old_rate variable above. > int ret; > > - freq_Hz = new_rate * 1000; > + freq_Hz = new_rate_kHz * 1000; > old_rate = clk_get_rate(clk[cluster]) / 1000; > > if (!IS_ERR(reg[cluster])) { > @@ -163,10 +163,10 @@ bL_cpufreq_set_rate_cluster(u32 cpu, u32 cluster, u32 > new_rate) > pr_debug("%s: cpu %d, cluster: %d, %u MHz, %ld mV --> %u MHz, %ld mV\n", > __func__, cpu, cluster, > old_rate / 1000, (volt_old > 0) ? volt_old / 1000 : -1, > - new_rate / 1000, volt ? volt / 1000 : -1); > + new_rate_kHz / 1000, volt ? volt / 1000 : -1); > > /* scaling up? scale voltage before frequency */ > - if (!IS_ERR(reg[cluster]) && new_rate > old_rate) { > + if (!IS_ERR(reg[cluster]) && new_rate_kHz > old_rate) { > ret = regulator_set_voltage_tol(reg[cluster], volt, 0); > if (ret) { > pr_err("%s: cpu: %d, cluster: %d, failed to scale > voltage up: %d\n", > @@ -175,7 +175,7 @@ bL_cpufreq_set_rate_cluster(u32 cpu, u32 cluster, u32 > new_rate) > } > } > > - ret = clk_set_rate(clk[cluster], new_rate * 1000); > + ret = clk_set_rate(clk[cluster], new_rate_kHz * 1000); > if (WARN_ON(ret)) { > pr_err("%s: clk_set_rate failed: %d, cluster: %d\n", > __func__, cluster, ret); > @@ -185,7 +185,7 @@ bL_cpufreq_set_rate_cluster(u32 cpu, u32 cluster, u32 > new_rate) > } > > /* scaling down? scale voltage after frequency */ > - if (!IS_ERR(reg[cluster]) && new_rate < old_rate) { > + if (!IS_ERR(reg[cluster]) && new_rate_kHz < old_rate) { > ret = regulator_set_voltage_tol(reg[cluster], volt, 0); > if (ret) { > pr_err("%s: cpu: %d, cluster: %d, failed to scale > voltage down: %d\n", > @@ -199,7 +199,7 @@ bL_cpufreq_set_rate_cluster(u32 cpu, u32 cluster, u32 > new_rate) > } > > static unsigned int > -bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) > +bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate_kHz) > { > u32 new_rate, prev_rate; Ditto. Rename these too to add '_kHz' ? > int ret; > @@ -209,13 +209,13 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > new_cluster, u32 rate) > > if (bLs) { > prev_rate = per_cpu(cpu_last_req_freq, cpu); > - per_cpu(cpu_last_req_freq, cpu) = rate; > + per_cpu(cpu_last_req_freq, cpu) = rate_kHz; > per_cpu(physical_cluster, cpu) = new_cluster; > > new_rate = find_cluster_maxfreq(new_cluster); > new_rate = ACTUAL_FREQ(new_cluster, new_rate); > } else { > - new_rate = rate; > + new_rate = rate_kHz; > } > > pr_debug("%s: cpu: %d, old cluster: %d, new cluster: %d, freq: %d\n", > @@ -236,7 +236,7 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > new_cluster, u32 rate) > } else if (ret && bLs) { > per_cpu(cpu_last_req_freq, cpu) = prev_rate; > per_cpu(physical_cluster, cpu) = old_cluster; > - } > + } There's a spurious whitespace change here. I know the space you deleted shouldn't have been there, but doing tidyups like that generally isn't done in patches that don't otherwise affect the code in question. > > mutex_unlock(_lock[new_cluster]); > -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 11/12] cpufreq: arm-big-little: clarify frequency units
On Wed, 2015-12-02 at 22:19 +0100, Ben Gamari wrote: > The frequency units are very confusing in this area as OPPs use Hz > whereas cpufreq uses kHz. Be explicit about this in variable naming. > > Cc: Javier Martinez Canillas> Signed-off-by: Ben Gamari > --- > drivers/cpufreq/arm_big_little.c | 20 ++-- > 1 file changed, 10 insertions(+), 10 deletions(-) > > diff --git a/drivers/cpufreq/arm_big_little.c > b/drivers/cpufreq/arm_big_little.c > index 855599b..2d5761c 100644 > --- a/drivers/cpufreq/arm_big_little.c > +++ b/drivers/cpufreq/arm_big_little.c > @@ -130,14 +130,14 @@ static unsigned int bL_cpufreq_get_rate(unsigned int > cpu) > } > > static int > -bL_cpufreq_set_rate_cluster(u32 cpu, u32 cluster, u32 new_rate) > +bL_cpufreq_set_rate_cluster(u32 cpu, u32 cluster, u32 new_rate_kHz) > { > unsigned long volt = 0, volt_old = 0; > long freq_Hz; > u32 old_rate; IMO variable renaming doesn't seem necessary, if cpufreq uses kHz then in a cpufreq driver adding 'kHz' to variable seems redundant, especially if Hz values like freq_Hz above are named especially to signal their different units. However, if renaming is going to happen it should at least be consistent within the same function i.e. also rename the old old_rate variable above. > int ret; > > - freq_Hz = new_rate * 1000; > + freq_Hz = new_rate_kHz * 1000; > old_rate = clk_get_rate(clk[cluster]) / 1000; > > if (!IS_ERR(reg[cluster])) { > @@ -163,10 +163,10 @@ bL_cpufreq_set_rate_cluster(u32 cpu, u32 cluster, u32 > new_rate) > pr_debug("%s: cpu %d, cluster: %d, %u MHz, %ld mV --> %u MHz, %ld mV\n", > __func__, cpu, cluster, > old_rate / 1000, (volt_old > 0) ? volt_old / 1000 : -1, > - new_rate / 1000, volt ? volt / 1000 : -1); > + new_rate_kHz / 1000, volt ? volt / 1000 : -1); > > /* scaling up? scale voltage before frequency */ > - if (!IS_ERR(reg[cluster]) && new_rate > old_rate) { > + if (!IS_ERR(reg[cluster]) && new_rate_kHz > old_rate) { > ret = regulator_set_voltage_tol(reg[cluster], volt, 0); > if (ret) { > pr_err("%s: cpu: %d, cluster: %d, failed to scale > voltage up: %d\n", > @@ -175,7 +175,7 @@ bL_cpufreq_set_rate_cluster(u32 cpu, u32 cluster, u32 > new_rate) > } > } > > - ret = clk_set_rate(clk[cluster], new_rate * 1000); > + ret = clk_set_rate(clk[cluster], new_rate_kHz * 1000); > if (WARN_ON(ret)) { > pr_err("%s: clk_set_rate failed: %d, cluster: %d\n", > __func__, cluster, ret); > @@ -185,7 +185,7 @@ bL_cpufreq_set_rate_cluster(u32 cpu, u32 cluster, u32 > new_rate) > } > > /* scaling down? scale voltage after frequency */ > - if (!IS_ERR(reg[cluster]) && new_rate < old_rate) { > + if (!IS_ERR(reg[cluster]) && new_rate_kHz < old_rate) { > ret = regulator_set_voltage_tol(reg[cluster], volt, 0); > if (ret) { > pr_err("%s: cpu: %d, cluster: %d, failed to scale > voltage down: %d\n", > @@ -199,7 +199,7 @@ bL_cpufreq_set_rate_cluster(u32 cpu, u32 cluster, u32 > new_rate) > } > > static unsigned int > -bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) > +bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate_kHz) > { > u32 new_rate, prev_rate; Ditto. Rename these too to add '_kHz' ? > int ret; > @@ -209,13 +209,13 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > new_cluster, u32 rate) > > if (bLs) { > prev_rate = per_cpu(cpu_last_req_freq, cpu); > - per_cpu(cpu_last_req_freq, cpu) = rate; > + per_cpu(cpu_last_req_freq, cpu) = rate_kHz; > per_cpu(physical_cluster, cpu) = new_cluster; > > new_rate = find_cluster_maxfreq(new_cluster); > new_rate = ACTUAL_FREQ(new_cluster, new_rate); > } else { > - new_rate = rate; > + new_rate = rate_kHz; > } > > pr_debug("%s: cpu: %d, old cluster: %d, new cluster: %d, freq: %d\n", > @@ -236,7 +236,7 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > new_cluster, u32 rate) > } else if (ret && bLs) { > per_cpu(cpu_last_req_freq, cpu) = prev_rate; > per_cpu(physical_cluster, cpu) = old_cluster; > - } > + } There's a spurious whitespace change here. I know the space you deleted shouldn't have been there, but doing tidyups like that generally isn't done in patches that don't otherwise affect the code in question. > > mutex_unlock(_lock[new_cluster]); > -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the
Re: [PATCH] arm64: restore bogomips information in /proc/cpuinfo
On Wed, 2015-11-25 at 13:09 -0500, Nicolas Pitre wrote: > But CPU MHz, when available, has the merit of not being open to > interpretation. Current MHz, maximum MHz or 'turbo' MHz? ;-) -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] arm64: restore bogomips information in /proc/cpuinfo
On Wed, 2015-11-25 at 13:09 -0500, Nicolas Pitre wrote: > But CPU MHz, when available, has the merit of not being open to > interpretation. Current MHz, maximum MHz or 'turbo' MHz? ;-) -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/4] drm: arm: Add DT bindings documentation for HDLCD driver.
On Thu, 2015-11-12 at 10:42 +, Liviu Dudau wrote: > > This is on-chip RAM or nornal system RAM? We already have bindings > for > > both. > > Juno has a set of TLX (ThinLinks) connectors on the board where an > FPGA can be attached. On r1 > the code running on FPGA can even participate as an AXI master with > full coherency. The FPGA > has local memory that we want to share with the HDLCD to be used as a > framebuffer. The HDLCD on the Juno chip or one implemented in the FPGA? I assume you mean the latter but just wanted to check. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 1/4] drm: arm: Add DT bindings documentation for HDLCD driver.
On Thu, 2015-11-12 at 10:42 +, Liviu Dudau wrote: > > This is on-chip RAM or nornal system RAM? We already have bindings > for > > both. > > Juno has a set of TLX (ThinLinks) connectors on the board where an > FPGA can be attached. On r1 > the code running on FPGA can even participate as an AXI master with > full coherency. The FPGA > has local memory that we want to share with the HDLCD to be used as a > framebuffer. The HDLCD on the Juno chip or one implemented in the FPGA? I assume you mean the latter but just wanted to check. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] cpufreq: arm_big_little: fix frequency check when bL switcher is active
On Wed, 2015-10-21 at 15:27 +0530, Viresh Kumar wrote: > On 21-10-15, 10:55, Jon Medhurst (Tixy) wrote: > > The check for correct frequency being set in bL_cpufreq_set_rate is > > broken when the big.LITTLE switcher is active, for two reasons. > > > > 1. The 'new_rate' variable gets overwritten before the test by the > > code calculating the frequency of the old cluster. > > > > 2. The frequency returned by bL_cpufreq_get_rate will be the virtual > > frequency, not the actual one the intended version of new_rate contains. > > > > This means the function always returns an error causing an endless > > stream of: "cpufreq: __target_index: Failed to change cpu frequency: -5" > > > > As the intent is to check for errors that clk_set_rate doesn't report > > lets move the check to immediately after that and directly use > > clk_get_rate, rather than the arm_big_little helpers which only confuse > > matters. Also, update the comment to be hopefully clearer about the > > purpose of the code. > > > > Fixes: 0a95e630b49a ("cpufreq: arm_big_little: check if the frequency is > > set correctly") > > > > Signed-off-by: Jon Medhurst > > Acked-by: Sudeep Holla > > --- > > [...] > > Acked-by: Viresh Kumar > Do I need to send this again with that Ack and Michael's Reviewed-by added? -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2] cpufreq: arm_big_little: fix frequency check when bL switcher is active
On Wed, 2015-10-21 at 15:27 +0530, Viresh Kumar wrote: > On 21-10-15, 10:55, Jon Medhurst (Tixy) wrote: > > The check for correct frequency being set in bL_cpufreq_set_rate is > > broken when the big.LITTLE switcher is active, for two reasons. > > > > 1. The 'new_rate' variable gets overwritten before the test by the > > code calculating the frequency of the old cluster. > > > > 2. The frequency returned by bL_cpufreq_get_rate will be the virtual > > frequency, not the actual one the intended version of new_rate contains. > > > > This means the function always returns an error causing an endless > > stream of: "cpufreq: __target_index: Failed to change cpu frequency: -5" > > > > As the intent is to check for errors that clk_set_rate doesn't report > > lets move the check to immediately after that and directly use > > clk_get_rate, rather than the arm_big_little helpers which only confuse > > matters. Also, update the comment to be hopefully clearer about the > > purpose of the code. > > > > Fixes: 0a95e630b49a ("cpufreq: arm_big_little: check if the frequency is > > set correctly") > > > > Signed-off-by: Jon Medhurst <t...@linaro.org> > > Acked-by: Sudeep Holla <sudeep.ho...@arm.com> > > --- > > [...] > > Acked-by: Viresh Kumar <viresh.ku...@linaro.org> > Do I need to send this again with that Ack and Michael's Reviewed-by added? -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] cpufreq: arm_big_little: fix frequency check when bL switcher is active
The check for correct frequency being set in bL_cpufreq_set_rate is broken when the big.LITTLE switcher is active, for two reasons. 1. The 'new_rate' variable gets overwritten before the test by the code calculating the frequency of the old cluster. 2. The frequency returned by bL_cpufreq_get_rate will be the virtual frequency, not the actual one the intended version of new_rate contains. This means the function always returns an error causing an endless stream of: "cpufreq: __target_index: Failed to change cpu frequency: -5" As the intent is to check for errors that clk_set_rate doesn't report lets move the check to immediately after that and directly use clk_get_rate, rather than the arm_big_little helpers which only confuse matters. Also, update the comment to be hopefully clearer about the purpose of the code. Fixes: 0a95e630b49a ("cpufreq: arm_big_little: check if the frequency is set correctly") Signed-off-by: Jon Medhurst Acked-by: Sudeep Holla --- Changes since V1: - Check rate using clk_get_rate rather than disabling check when bL switcher active Sudeep, I added your Ack from the last comment on the previous patch. This final patch differs from what was discussed only in the commit message and in source comment which is hopefully more clear and is also satisfactory. I've also added Michael Turquette's correct email to the CC this time, rather than his old Linaro address which was bouncing. drivers/cpufreq/arm_big_little.c | 22 +- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c index f1e42f8..c5d256c 100644 --- a/drivers/cpufreq/arm_big_little.c +++ b/drivers/cpufreq/arm_big_little.c @@ -149,6 +149,19 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) __func__, cpu, old_cluster, new_cluster, new_rate); ret = clk_set_rate(clk[new_cluster], new_rate * 1000); + if (!ret) { + /* +* FIXME: clk_set_rate hasn't returned an error here however it +* may be that clk_change_rate failed due to hardware or +* firmware issues and wasn't able to report that due to the +* current design of the clk core layer. To work around this +* problem we will read back the clock rate and check it is +* correct. This needs to be removed once clk core is fixed. +*/ + if (clk_get_rate(clk[new_cluster]) != new_rate * 1000) + ret = -EIO; + } + if (WARN_ON(ret)) { pr_err("clk_set_rate failed: %d, new cluster: %d\n", ret, new_cluster); @@ -189,15 +202,6 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) mutex_unlock(_lock[old_cluster]); } - /* -* FIXME: clk_set_rate has to handle the case where clk_change_rate -* can fail due to hardware or firmware issues. Until the clk core -* layer is fixed, we can check here. In most of the cases we will -* be reading only the cached value anyway. This needs to be removed -* once clk core is fixed. -*/ - if (bL_cpufreq_get_rate(cpu) != new_rate) - return -EIO; return 0; } -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH v2] cpufreq: arm_big_little: fix frequency check when bL switcher is active
The check for correct frequency being set in bL_cpufreq_set_rate is broken when the big.LITTLE switcher is active, for two reasons. 1. The 'new_rate' variable gets overwritten before the test by the code calculating the frequency of the old cluster. 2. The frequency returned by bL_cpufreq_get_rate will be the virtual frequency, not the actual one the intended version of new_rate contains. This means the function always returns an error causing an endless stream of: "cpufreq: __target_index: Failed to change cpu frequency: -5" As the intent is to check for errors that clk_set_rate doesn't report lets move the check to immediately after that and directly use clk_get_rate, rather than the arm_big_little helpers which only confuse matters. Also, update the comment to be hopefully clearer about the purpose of the code. Fixes: 0a95e630b49a ("cpufreq: arm_big_little: check if the frequency is set correctly") Signed-off-by: Jon MedhurstAcked-by: Sudeep Holla --- Changes since V1: - Check rate using clk_get_rate rather than disabling check when bL switcher active Sudeep, I added your Ack from the last comment on the previous patch. This final patch differs from what was discussed only in the commit message and in source comment which is hopefully more clear and is also satisfactory. I've also added Michael Turquette's correct email to the CC this time, rather than his old Linaro address which was bouncing. drivers/cpufreq/arm_big_little.c | 22 +- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c index f1e42f8..c5d256c 100644 --- a/drivers/cpufreq/arm_big_little.c +++ b/drivers/cpufreq/arm_big_little.c @@ -149,6 +149,19 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) __func__, cpu, old_cluster, new_cluster, new_rate); ret = clk_set_rate(clk[new_cluster], new_rate * 1000); + if (!ret) { + /* +* FIXME: clk_set_rate hasn't returned an error here however it +* may be that clk_change_rate failed due to hardware or +* firmware issues and wasn't able to report that due to the +* current design of the clk core layer. To work around this +* problem we will read back the clock rate and check it is +* correct. This needs to be removed once clk core is fixed. +*/ + if (clk_get_rate(clk[new_cluster]) != new_rate * 1000) + ret = -EIO; + } + if (WARN_ON(ret)) { pr_err("clk_set_rate failed: %d, new cluster: %d\n", ret, new_cluster); @@ -189,15 +202,6 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) mutex_unlock(_lock[old_cluster]); } - /* -* FIXME: clk_set_rate has to handle the case where clk_change_rate -* can fail due to hardware or firmware issues. Until the clk core -* layer is fixed, we can check here. In most of the cases we will -* be reading only the cached value anyway. This needs to be removed -* once clk core is fixed. -*/ - if (bL_cpufreq_get_rate(cpu) != new_rate) - return -EIO; return 0; } -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] cpufreq: arm_big_little: fix frequency check when bL switcher is active
On Wed, 2015-10-14 at 09:48 +0100, Sudeep Holla wrote: > > On 14/10/15 08:12, Jon Medhurst (Tixy) wrote: > > On Tue, 2015-10-13 at 11:36 +0100, Sudeep Holla wrote: > >> > >> On 13/10/15 08:19, Jon Medhurst (Tixy) wrote: > > [...] > >>> But then we wouldn't get the WARN_ON and pr_err triggered when we detect > >>> the clock rate isn't set, which surely is half the reason for the check > >>> in the first place? > >>> > >> > >> Not sure if I understand what you mean or may be I was not clear, so > >> thought I will put the delta here. Let me know if and how its still a > >> problem. > >> > >> diff --git i/drivers/cpufreq/arm_big_little.c > >> w/drivers/cpufreq/arm_big_little.c > >> index f1e42f8ce0fc..05e850f80f39 100644 > >> --- i/drivers/cpufreq/arm_big_little.c > >> +++ w/drivers/cpufreq/arm_big_little.c > >> @@ -164,6 +164,16 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > >> new_cluster, u32 rate) > >> > >> mutex_unlock(_lock[new_cluster]); > >> > >> + /* > >> +* FIXME: clk_set_rate has to handle the case where clk_change_rate > >> +* can fail due to hardware or firmware issues. Until the clk core > >> +* layer is fixed, we can check here. In most of the cases we will > >> +* be reading only the cached value anyway. This needs to be > >> removed > >> +* once clk core is fixed. > >> +*/ > >> + if (bL_cpufreq_get_rate(cpu) != new_rate) > >> + return -EIO; > >> + > >> /* Recalc freq for old cluster when switching clusters */ > >> if (old_cluster != new_cluster) { > >> pr_debug("%s: cpu: %d, old cluster: %d, new cluster: > >> %d\n", > > > > That's what I though you meant, and I can't see why you would want to do > > that and bypass the error reporting for clk_get_rate failing. After all, > > the code we're moving around is explicitly there to workaround the fact > > that clk_set_rate doesn't actually pass through all errors, so it's > > doing additional error checking. (At least, that's what the comment > > says). So this looks more logical to me. > > > > OK, I understand what you mean now. I don't have a strong opinion, but > here is the reason why I prefer the approach I said earlier: > clk_set_rate doesn't return error if the h/w or f/w return error which > is usually the last step. So calling clk_get_rate when clk_set_rate > return error quite early makes no sense to me. It doesn't to me either, but my suggested code doesn't do that, it only calls clk_get_rate if the is _no_ error from clk_set_rate, the pseudo code again... ret = clk_set_rate() if(!ret) /* if no error from clk_set_rate */ if(clk_get_rate()!=correct) /* but some additional checks fail */ ret = -EIO; /* then indicate an error anyway */ !ret is ret==0 is 'no error' as the comment says. So the clock framework thinks the rate was set OK and we then use clk_get_rate to see if those unreported h/w or f/w errors mean that it actually wasn't set OK. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] cpufreq: arm_big_little: fix frequency check when bL switcher is active
On Wed, 2015-10-14 at 09:48 +0100, Sudeep Holla wrote: > > On 14/10/15 08:12, Jon Medhurst (Tixy) wrote: > > On Tue, 2015-10-13 at 11:36 +0100, Sudeep Holla wrote: > >> > >> On 13/10/15 08:19, Jon Medhurst (Tixy) wrote: > > [...] > >>> But then we wouldn't get the WARN_ON and pr_err triggered when we detect > >>> the clock rate isn't set, which surely is half the reason for the check > >>> in the first place? > >>> > >> > >> Not sure if I understand what you mean or may be I was not clear, so > >> thought I will put the delta here. Let me know if and how its still a > >> problem. > >> > >> diff --git i/drivers/cpufreq/arm_big_little.c > >> w/drivers/cpufreq/arm_big_little.c > >> index f1e42f8ce0fc..05e850f80f39 100644 > >> --- i/drivers/cpufreq/arm_big_little.c > >> +++ w/drivers/cpufreq/arm_big_little.c > >> @@ -164,6 +164,16 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > >> new_cluster, u32 rate) > >> > >> mutex_unlock(_lock[new_cluster]); > >> > >> + /* > >> +* FIXME: clk_set_rate has to handle the case where clk_change_rate > >> +* can fail due to hardware or firmware issues. Until the clk core > >> +* layer is fixed, we can check here. In most of the cases we will > >> +* be reading only the cached value anyway. This needs to be > >> removed > >> +* once clk core is fixed. > >> +*/ > >> + if (bL_cpufreq_get_rate(cpu) != new_rate) > >> + return -EIO; > >> + > >> /* Recalc freq for old cluster when switching clusters */ > >> if (old_cluster != new_cluster) { > >> pr_debug("%s: cpu: %d, old cluster: %d, new cluster: > >> %d\n", > > > > That's what I though you meant, and I can't see why you would want to do > > that and bypass the error reporting for clk_get_rate failing. After all, > > the code we're moving around is explicitly there to workaround the fact > > that clk_set_rate doesn't actually pass through all errors, so it's > > doing additional error checking. (At least, that's what the comment > > says). So this looks more logical to me. > > > > OK, I understand what you mean now. I don't have a strong opinion, but > here is the reason why I prefer the approach I said earlier: > clk_set_rate doesn't return error if the h/w or f/w return error which > is usually the last step. So calling clk_get_rate when clk_set_rate > return error quite early makes no sense to me. It doesn't to me either, but my suggested code doesn't do that, it only calls clk_get_rate if the is _no_ error from clk_set_rate, the pseudo code again... ret = clk_set_rate() if(!ret) /* if no error from clk_set_rate */ if(clk_get_rate()!=correct) /* but some additional checks fail */ ret = -EIO; /* then indicate an error anyway */ !ret is ret==0 is 'no error' as the comment says. So the clock framework thinks the rate was set OK and we then use clk_get_rate to see if those unreported h/w or f/w errors mean that it actually wasn't set OK. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] cpufreq: arm_big_little: fix frequency check when bL switcher is active
On Tue, 2015-10-13 at 11:36 +0100, Sudeep Holla wrote: > > On 13/10/15 08:19, Jon Medhurst (Tixy) wrote: [...] > > But then we wouldn't get the WARN_ON and pr_err triggered when we detect > > the clock rate isn't set, which surely is half the reason for the check > > in the first place? > > > > Not sure if I understand what you mean or may be I was not clear, so > thought I will put the delta here. Let me know if and how its still a > problem. > > diff --git i/drivers/cpufreq/arm_big_little.c > w/drivers/cpufreq/arm_big_little.c > index f1e42f8ce0fc..05e850f80f39 100644 > --- i/drivers/cpufreq/arm_big_little.c > +++ w/drivers/cpufreq/arm_big_little.c > @@ -164,6 +164,16 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > new_cluster, u32 rate) > > mutex_unlock(_lock[new_cluster]); > > + /* > +* FIXME: clk_set_rate has to handle the case where clk_change_rate > +* can fail due to hardware or firmware issues. Until the clk core > +* layer is fixed, we can check here. In most of the cases we will > +* be reading only the cached value anyway. This needs to be > removed > +* once clk core is fixed. > +*/ > + if (bL_cpufreq_get_rate(cpu) != new_rate) > + return -EIO; > + > /* Recalc freq for old cluster when switching clusters */ > if (old_cluster != new_cluster) { > pr_debug("%s: cpu: %d, old cluster: %d, new cluster: %d\n", That's what I though you meant, and I can't see why you would want to do that and bypass the error reporting for clk_get_rate failing. After all, the code we're moving around is explicitly there to workaround the fact that clk_set_rate doesn't actually pass through all errors, so it's doing additional error checking. (At least, that's what the comment says). So this looks more logical to me. ret = clk_set_rate() if(!ret) /* if no error from clk_set_rate */ if(clk_get_rate()!=correct) /* but some additional checks fail */ ret = -EIO; /* then indicate an error anyway */ if (WARN_ON(ret))/* Warn if error setting rate and */ pr_err("clk_set_rate failed")/* print and error too */ But if people want the if(clk_get_rate()!=correct) after the WARN_ON then lets do that, the important thing is to get the code fixed. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] cpufreq: arm_big_little: fix frequency check when bL switcher is active
On Tue, 2015-10-13 at 11:36 +0100, Sudeep Holla wrote: > > On 13/10/15 08:19, Jon Medhurst (Tixy) wrote: [...] > > But then we wouldn't get the WARN_ON and pr_err triggered when we detect > > the clock rate isn't set, which surely is half the reason for the check > > in the first place? > > > > Not sure if I understand what you mean or may be I was not clear, so > thought I will put the delta here. Let me know if and how its still a > problem. > > diff --git i/drivers/cpufreq/arm_big_little.c > w/drivers/cpufreq/arm_big_little.c > index f1e42f8ce0fc..05e850f80f39 100644 > --- i/drivers/cpufreq/arm_big_little.c > +++ w/drivers/cpufreq/arm_big_little.c > @@ -164,6 +164,16 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > new_cluster, u32 rate) > > mutex_unlock(_lock[new_cluster]); > > + /* > +* FIXME: clk_set_rate has to handle the case where clk_change_rate > +* can fail due to hardware or firmware issues. Until the clk core > +* layer is fixed, we can check here. In most of the cases we will > +* be reading only the cached value anyway. This needs to be > removed > +* once clk core is fixed. > +*/ > + if (bL_cpufreq_get_rate(cpu) != new_rate) > + return -EIO; > + > /* Recalc freq for old cluster when switching clusters */ > if (old_cluster != new_cluster) { > pr_debug("%s: cpu: %d, old cluster: %d, new cluster: %d\n", That's what I though you meant, and I can't see why you would want to do that and bypass the error reporting for clk_get_rate failing. After all, the code we're moving around is explicitly there to workaround the fact that clk_set_rate doesn't actually pass through all errors, so it's doing additional error checking. (At least, that's what the comment says). So this looks more logical to me. ret = clk_set_rate() if(!ret) /* if no error from clk_set_rate */ if(clk_get_rate()!=correct) /* but some additional checks fail */ ret = -EIO; /* then indicate an error anyway */ if (WARN_ON(ret))/* Warn if error setting rate and */ pr_err("clk_set_rate failed")/* print and error too */ But if people want the if(clk_get_rate()!=correct) after the WARN_ON then lets do that, the important thing is to get the code fixed. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] cpufreq: arm_big_little: fix frequency check when bL switcher is active
On Mon, 2015-10-12 at 14:20 +0100, Sudeep Holla wrote: > > On 08/10/15 10:23, Jon Medhurst (Tixy) wrote: > [...] > > > diff --git a/drivers/cpufreq/arm_big_little.c > > b/drivers/cpufreq/arm_big_little.c > > index f1e42f8..59115a4 100644 > > --- a/drivers/cpufreq/arm_big_little.c > > +++ b/drivers/cpufreq/arm_big_little.c > > @@ -149,6 +149,18 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > > new_cluster, u32 rate) > > __func__, cpu, old_cluster, new_cluster, new_rate); > > > > ret = clk_set_rate(clk[new_cluster], new_rate * 1000); > > + if (!ret) { > > + /* > > +* FIXME: clk_set_rate has to handle the case where > > clk_change_rate > > +* can fail due to hardware or firmware issues. Until the clk > > core > > +* layer is fixed, we can check here. In most of the cases we > > will > > +* be reading only the cached value anyway. This needs to be > > removed > > +* once clk core is fixed. > > +*/ > > + if (clk_get_rate(clk[new_cluster]) != new_rate * 1000) > > + ret = -EIO; > > + } > > + > > if (WARN_ON(ret)) { > > pr_err("clk_set_rate failed: %d, new cluster: %d\n", ret, > > new_cluster); > > @@ -189,15 +201,6 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > > new_cluster, u32 rate) > > mutex_unlock(_lock[old_cluster]); > > } > > > > - /* > > -* FIXME: clk_set_rate has to handle the case where clk_change_rate > > -* can fail due to hardware or firmware issues. Until the clk core > > -* layer is fixed, we can check here. In most of the cases we will > > -* be reading only the cached value anyway. This needs to be removed > > -* once clk core is fixed. > > -*/ > > - if (bL_cpufreq_get_rate(cpu) != new_rate) > > - return -EIO; > > return 0; > > } > > > > > > > > > > The above change looks good to me but with minor nit. You can get rid of > if(!ret) check if you move the hunk after if (WARN_ON(ret)) But then we wouldn't get the WARN_ON and pr_err triggered when we detect the clock rate isn't set, which surely is half the reason for the check in the first place? (P.S. I'm on holiday this week without access to my main computer, dev boards or decent internet connectivity). -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] cpufreq: arm_big_little: fix frequency check when bL switcher is active
On Mon, 2015-10-12 at 14:20 +0100, Sudeep Holla wrote: > > On 08/10/15 10:23, Jon Medhurst (Tixy) wrote: > [...] > > > diff --git a/drivers/cpufreq/arm_big_little.c > > b/drivers/cpufreq/arm_big_little.c > > index f1e42f8..59115a4 100644 > > --- a/drivers/cpufreq/arm_big_little.c > > +++ b/drivers/cpufreq/arm_big_little.c > > @@ -149,6 +149,18 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > > new_cluster, u32 rate) > > __func__, cpu, old_cluster, new_cluster, new_rate); > > > > ret = clk_set_rate(clk[new_cluster], new_rate * 1000); > > + if (!ret) { > > + /* > > +* FIXME: clk_set_rate has to handle the case where > > clk_change_rate > > +* can fail due to hardware or firmware issues. Until the clk > > core > > +* layer is fixed, we can check here. In most of the cases we > > will > > +* be reading only the cached value anyway. This needs to be > > removed > > +* once clk core is fixed. > > +*/ > > + if (clk_get_rate(clk[new_cluster]) != new_rate * 1000) > > + ret = -EIO; > > + } > > + > > if (WARN_ON(ret)) { > > pr_err("clk_set_rate failed: %d, new cluster: %d\n", ret, > > new_cluster); > > @@ -189,15 +201,6 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > > new_cluster, u32 rate) > > mutex_unlock(_lock[old_cluster]); > > } > > > > - /* > > -* FIXME: clk_set_rate has to handle the case where clk_change_rate > > -* can fail due to hardware or firmware issues. Until the clk core > > -* layer is fixed, we can check here. In most of the cases we will > > -* be reading only the cached value anyway. This needs to be removed > > -* once clk core is fixed. > > -*/ > > - if (bL_cpufreq_get_rate(cpu) != new_rate) > > - return -EIO; > > return 0; > > } > > > > > > > > > > The above change looks good to me but with minor nit. You can get rid of > if(!ret) check if you move the hunk after if (WARN_ON(ret)) But then we wouldn't get the WARN_ON and pr_err triggered when we detect the clock rate isn't set, which surely is half the reason for the check in the first place? (P.S. I'm on holiday this week without access to my main computer, dev boards or decent internet connectivity). -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] cpufreq: arm_big_little: fix frequency check when bL switcher is active
On Thu, 2015-10-08 at 16:54 +0530, Viresh Kumar wrote: > On 08-10-15, 10:23, Jon Medhurst (Tixy) wrote: > > On Wed, 2015-10-07 at 23:09 +0530, Viresh Kumar wrote: [...] > > > @@ -140,9 +140,11 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > > > new_cluster, u32 rate) > > > per_cpu(physical_cluster, cpu) = new_cluster; > > > > > > new_rate = find_cluster_maxfreq(new_cluster); > > > + target_rate = new_rate; > > This is still a virtual freq ... > > > > new_rate = ACTUAL_FREQ(new_cluster, new_rate); > > And new_rate is the actual freq.. > > > > } else { > > > new_rate = rate; > > > + target_rate = new_rate; > > Here both are actual freqs, and no virtual freq. > > > > } > > > > > > pr_debug("%s: cpu: %d, old cluster: %d, new cluster: %d, freq: %d\n", > > > @@ -196,7 +198,7 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > > > new_cluster, u32 rate) > > >* be reading only the cached value anyway. This needs to be removed > > >* once clk core is fixed. > > >*/ > > > - if (bL_cpufreq_get_rate(cpu) != new_rate) > > > + if (bL_cpufreq_get_rate(cpu) != target_rate) > > > return -EIO; > > > return 0; > > > } > > > > You call that a proper fix? ;-) It's comparing an 'virtual' frequency to > > an 'actual' frequency. > > So, why do you say so? I thought both are virtual freqs only. You are right, I had misread the code. I guess my problem is that I'm actually running the code then when it doesn't work (which it doesn't) going back to try and work out why not. Looking a bit more carefully, the reason your fix doesn't work is that bL_cpufreq_get_rate returns the last requested rate for this CPU, whereas target_rate/new_rate is the maximum rate requested by any CPU on the cluster (which is what we want the hardware set to). > > > If the real intent is to check that clk_set_rate works I would have > > thought the patch below would be correct. But I didn't propose that as > > it's the obvious implementation and I assumed the original patch didn't > > do it that way for a reason. > > Maybe yes. Only Sudeep can tell why he didn't do it that way. But > yeah, the intent was only what the comment says. So sounds like my alternative fix of checking the 'actual' frequency immediately after setting it is probably the way forward, unless Sudeep chimes in with additional info about the issue he was trying to address. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] cpufreq: arm_big_little: fix frequency check when bL switcher is active
On Wed, 2015-10-07 at 23:09 +0530, Viresh Kumar wrote: [...] > And why not fix it properly by doing this: > > diff --git a/drivers/cpufreq/arm_big_little.c > b/drivers/cpufreq/arm_big_little.c > index f1e42f8ce0fc..5b36657a76d6 100644 > --- a/drivers/cpufreq/arm_big_little.c > +++ b/drivers/cpufreq/arm_big_little.c > @@ -128,7 +128,7 @@ static unsigned int bL_cpufreq_get_rate(unsigned int cpu) > static unsigned int > bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) > { > - u32 new_rate, prev_rate; > + u32 new_rate, prev_rate, target_rate; > int ret; > bool bLs = is_bL_switching_enabled(); > > @@ -140,9 +140,11 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > new_cluster, u32 rate) > per_cpu(physical_cluster, cpu) = new_cluster; > > new_rate = find_cluster_maxfreq(new_cluster); > + target_rate = new_rate; > new_rate = ACTUAL_FREQ(new_cluster, new_rate); > } else { > new_rate = rate; > + target_rate = new_rate; > } > > pr_debug("%s: cpu: %d, old cluster: %d, new cluster: %d, freq: %d\n", > @@ -196,7 +198,7 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > new_cluster, u32 rate) >* be reading only the cached value anyway. This needs to be removed >* once clk core is fixed. >*/ > - if (bL_cpufreq_get_rate(cpu) != new_rate) > + if (bL_cpufreq_get_rate(cpu) != target_rate) > return -EIO; > return 0; > } You call that a proper fix? ;-) It's comparing an 'virtual' frequency to an 'actual' frequency. If the real intent is to check that clk_set_rate works I would have thought the patch below would be correct. But I didn't propose that as it's the obvious implementation and I assumed the original patch didn't do it that way for a reason. diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c index f1e42f8..59115a4 100644 --- a/drivers/cpufreq/arm_big_little.c +++ b/drivers/cpufreq/arm_big_little.c @@ -149,6 +149,18 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) __func__, cpu, old_cluster, new_cluster, new_rate); ret = clk_set_rate(clk[new_cluster], new_rate * 1000); + if (!ret) { + /* +* FIXME: clk_set_rate has to handle the case where clk_change_rate +* can fail due to hardware or firmware issues. Until the clk core +* layer is fixed, we can check here. In most of the cases we will +* be reading only the cached value anyway. This needs to be removed +* once clk core is fixed. +*/ + if (clk_get_rate(clk[new_cluster]) != new_rate * 1000) + ret = -EIO; + } + if (WARN_ON(ret)) { pr_err("clk_set_rate failed: %d, new cluster: %d\n", ret, new_cluster); @@ -189,15 +201,6 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) mutex_unlock(_lock[old_cluster]); } - /* -* FIXME: clk_set_rate has to handle the case where clk_change_rate -* can fail due to hardware or firmware issues. Until the clk core -* layer is fixed, we can check here. In most of the cases we will -* be reading only the cached value anyway. This needs to be removed -* once clk core is fixed. -*/ - if (bL_cpufreq_get_rate(cpu) != new_rate) - return -EIO; return 0; } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] cpufreq: arm_big_little: fix frequency check when bL switcher is active
On Wed, 2015-10-07 at 23:09 +0530, Viresh Kumar wrote: [...] > And why not fix it properly by doing this: > > diff --git a/drivers/cpufreq/arm_big_little.c > b/drivers/cpufreq/arm_big_little.c > index f1e42f8ce0fc..5b36657a76d6 100644 > --- a/drivers/cpufreq/arm_big_little.c > +++ b/drivers/cpufreq/arm_big_little.c > @@ -128,7 +128,7 @@ static unsigned int bL_cpufreq_get_rate(unsigned int cpu) > static unsigned int > bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) > { > - u32 new_rate, prev_rate; > + u32 new_rate, prev_rate, target_rate; > int ret; > bool bLs = is_bL_switching_enabled(); > > @@ -140,9 +140,11 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > new_cluster, u32 rate) > per_cpu(physical_cluster, cpu) = new_cluster; > > new_rate = find_cluster_maxfreq(new_cluster); > + target_rate = new_rate; > new_rate = ACTUAL_FREQ(new_cluster, new_rate); > } else { > new_rate = rate; > + target_rate = new_rate; > } > > pr_debug("%s: cpu: %d, old cluster: %d, new cluster: %d, freq: %d\n", > @@ -196,7 +198,7 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > new_cluster, u32 rate) >* be reading only the cached value anyway. This needs to be removed >* once clk core is fixed. >*/ > - if (bL_cpufreq_get_rate(cpu) != new_rate) > + if (bL_cpufreq_get_rate(cpu) != target_rate) > return -EIO; > return 0; > } You call that a proper fix? ;-) It's comparing an 'virtual' frequency to an 'actual' frequency. If the real intent is to check that clk_set_rate works I would have thought the patch below would be correct. But I didn't propose that as it's the obvious implementation and I assumed the original patch didn't do it that way for a reason. diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c index f1e42f8..59115a4 100644 --- a/drivers/cpufreq/arm_big_little.c +++ b/drivers/cpufreq/arm_big_little.c @@ -149,6 +149,18 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) __func__, cpu, old_cluster, new_cluster, new_rate); ret = clk_set_rate(clk[new_cluster], new_rate * 1000); + if (!ret) { + /* +* FIXME: clk_set_rate has to handle the case where clk_change_rate +* can fail due to hardware or firmware issues. Until the clk core +* layer is fixed, we can check here. In most of the cases we will +* be reading only the cached value anyway. This needs to be removed +* once clk core is fixed. +*/ + if (clk_get_rate(clk[new_cluster]) != new_rate * 1000) + ret = -EIO; + } + if (WARN_ON(ret)) { pr_err("clk_set_rate failed: %d, new cluster: %d\n", ret, new_cluster); @@ -189,15 +201,6 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) mutex_unlock(_lock[old_cluster]); } - /* -* FIXME: clk_set_rate has to handle the case where clk_change_rate -* can fail due to hardware or firmware issues. Until the clk core -* layer is fixed, we can check here. In most of the cases we will -* be reading only the cached value anyway. This needs to be removed -* once clk core is fixed. -*/ - if (bL_cpufreq_get_rate(cpu) != new_rate) - return -EIO; return 0; } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] cpufreq: arm_big_little: fix frequency check when bL switcher is active
On Thu, 2015-10-08 at 16:54 +0530, Viresh Kumar wrote: > On 08-10-15, 10:23, Jon Medhurst (Tixy) wrote: > > On Wed, 2015-10-07 at 23:09 +0530, Viresh Kumar wrote: [...] > > > @@ -140,9 +140,11 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > > > new_cluster, u32 rate) > > > per_cpu(physical_cluster, cpu) = new_cluster; > > > > > > new_rate = find_cluster_maxfreq(new_cluster); > > > + target_rate = new_rate; > > This is still a virtual freq ... > > > > new_rate = ACTUAL_FREQ(new_cluster, new_rate); > > And new_rate is the actual freq.. > > > > } else { > > > new_rate = rate; > > > + target_rate = new_rate; > > Here both are actual freqs, and no virtual freq. > > > > } > > > > > > pr_debug("%s: cpu: %d, old cluster: %d, new cluster: %d, freq: %d\n", > > > @@ -196,7 +198,7 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 > > > new_cluster, u32 rate) > > >* be reading only the cached value anyway. This needs to be removed > > >* once clk core is fixed. > > >*/ > > > - if (bL_cpufreq_get_rate(cpu) != new_rate) > > > + if (bL_cpufreq_get_rate(cpu) != target_rate) > > > return -EIO; > > > return 0; > > > } > > > > You call that a proper fix? ;-) It's comparing an 'virtual' frequency to > > an 'actual' frequency. > > So, why do you say so? I thought both are virtual freqs only. You are right, I had misread the code. I guess my problem is that I'm actually running the code then when it doesn't work (which it doesn't) going back to try and work out why not. Looking a bit more carefully, the reason your fix doesn't work is that bL_cpufreq_get_rate returns the last requested rate for this CPU, whereas target_rate/new_rate is the maximum rate requested by any CPU on the cluster (which is what we want the hardware set to). > > > If the real intent is to check that clk_set_rate works I would have > > thought the patch below would be correct. But I didn't propose that as > > it's the obvious implementation and I assumed the original patch didn't > > do it that way for a reason. > > Maybe yes. Only Sudeep can tell why he didn't do it that way. But > yeah, the intent was only what the comment says. So sounds like my alternative fix of checking the 'actual' frequency immediately after setting it is probably the way forward, unless Sudeep chimes in with additional info about the issue he was trying to address. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] cpufreq: arm_big_little: fix frequency check when bL switcher is active
The check for correct frequency being set in bL_cpufreq_set_rate is broken when the big.LITTLE switcher is active, for two reasons. 1. The 'new_rate' variable gets overwritten before the test by the code calculating the frequency of the old cluster. 2. The frequency returned by bL_cpufreq_get_rate will be the virtual frequency, not the actual one the intended version of new_rate contains. This means the function always returns an error causing an endless stream of: "cpufreq: __target_index: Failed to change cpu frequency: -5" As the comment for the test indicates that it is temporary and limited lets not bother fixing it to work when the bL switcher is active and instead just bypass the test in that case. Fixes: 0a95e630b49a ("cpufreq: arm_big_little: check if the frequency is set correctly") Signed-off-by: Jon Medhurst --- drivers/cpufreq/arm_big_little.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c index f1e42f8..71deddb 100644 --- a/drivers/cpufreq/arm_big_little.c +++ b/drivers/cpufreq/arm_big_little.c @@ -196,7 +196,7 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) * be reading only the cached value anyway. This needs to be removed * once clk core is fixed. */ - if (bL_cpufreq_get_rate(cpu) != new_rate) + if (!bLs && bL_cpufreq_get_rate(cpu) != new_rate) return -EIO; return 0; } -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] cpufreq: arm_big_little: fix frequency check when bL switcher is active
The check for correct frequency being set in bL_cpufreq_set_rate is broken when the big.LITTLE switcher is active, for two reasons. 1. The 'new_rate' variable gets overwritten before the test by the code calculating the frequency of the old cluster. 2. The frequency returned by bL_cpufreq_get_rate will be the virtual frequency, not the actual one the intended version of new_rate contains. This means the function always returns an error causing an endless stream of: "cpufreq: __target_index: Failed to change cpu frequency: -5" As the comment for the test indicates that it is temporary and limited lets not bother fixing it to work when the bL switcher is active and instead just bypass the test in that case. Fixes: 0a95e630b49a ("cpufreq: arm_big_little: check if the frequency is set correctly") Signed-off-by: Jon Medhurst--- drivers/cpufreq/arm_big_little.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c index f1e42f8..71deddb 100644 --- a/drivers/cpufreq/arm_big_little.c +++ b/drivers/cpufreq/arm_big_little.c @@ -196,7 +196,7 @@ bL_cpufreq_set_rate(u32 cpu, u32 old_cluster, u32 new_cluster, u32 rate) * be reading only the cached value anyway. This needs to be removed * once clk core is fixed. */ - if (bL_cpufreq_get_rate(cpu) != new_rate) + if (!bLs && bL_cpufreq_get_rate(cpu) != new_rate) return -EIO; return 0; } -- 2.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 1/5] Documentation: add DT bindings for ARM SCPI sensors
On Tue, 2015-09-15 at 17:04 +0100, Punit Agrawal wrote: > "Jon Medhurst (Tixy)" writes: > > > On Tue, 2015-09-15 at 10:37 +0100, Punit Agrawal wrote: > >> "Jon Medhurst (Tixy)" writes: > >> > >> > On Mon, 2015-09-14 at 15:38 +0100, Punit Agrawal wrote: > >> >> Mark Rutland writes: > >> >> > > [...] > >> >> The way the SCP interface is defined, the sensor identifiers are > >> >> contiguous, > >> > > >> > Is there any documentation other than DUI0922A? [1] From what I can seen > >> > that just says it's a 16-bit value and doesn't put any particular > >> > constraints on its value. > >> > >> Although not explicitly stated, if you look at the Get Sensor Capability > >> [2] and Get Sensor Info [3] commands you can indirectly infer that the > >> Sensor IDs are contiguous. > > > > I personally wouldn't even indirectly infer they are contiguous from > > what the document says. If I were implementing the firmware I would feel > > quite in my rights to, for example, use the top 8 bits of the ID for a > > sensor type and the bottom 8 for an index, if that made dispatching of > > requests more efficient. Or if some optional hardware was detected as > > missing, leaving some holes in ID space. > > True. And without a command to convey the list of valid IDs, the > consumer of the API would have to iterate over the entire 16bit space to > locate valid IDs. Or get IDs from device-tree :-) Anyway, I'm not arguing that the IDs shouldn't be 0..N-1, just that it should explicitly documented in the SCPI doc, which we're are in agreement on. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 1/5] Documentation: add DT bindings for ARM SCPI sensors
On Tue, 2015-09-15 at 12:03 +0100, Mark Rutland wrote: > On Tue, Sep 15, 2015 at 11:46:02AM +0100, Jon Medhurst (Tixy) wrote: > > On Tue, 2015-09-15 at 10:37 +0100, Punit Agrawal wrote: > > > "Jon Medhurst (Tixy)" writes: > > > > > > > On Mon, 2015-09-14 at 15:38 +0100, Punit Agrawal wrote: > > > >> Mark Rutland writes: > > > >> > > [...] > > > >> The way the SCP interface is defined, the sensor identifiers are > > > >> contiguous, > > > > > > > > Is there any documentation other than DUI0922A? [1] From what I can seen > > > > that just says it's a 16-bit value and doesn't put any particular > > > > constraints on its value. > > > > > > Although not explicitly stated, if you look at the Get Sensor Capability > > > [2] and Get Sensor Info [3] commands you can indirectly infer that the > > > Sensor IDs are contiguous. > > > > I personally wouldn't even indirectly infer they are contiguous from > > what the document says. If I were implementing the firmware I would feel > > quite in my rights to, for example, use the top 8 bits of the ID for a > > sensor type and the bottom 8 for an index, if that made dispatching of > > requests more efficient. Or if some optional hardware was detected as > > missing, leaving some holes in ID space. > > > > As a specification of a 'standard' the document seems to be rather > > lacking. So, Sensor ID should be documented as being "an unsigned > > integer less than then number of sensors returned by the Get Sensor > > Capability command", or something like that. I guess clocks and other > > devices suffer from similar lack of specificity. > > I think from the PoV of the binding, this doesn't matter. The value is > just an arbitrary opaue token written down in a spec, that the FW > understands how to interpret. True, it's the Linux implementation in following patches that has assumptions, e.g. for loops iterating over id's 0..N-1 > > I only asked about how the IDs were organised because I thought there > was additional translation in the kernel, but this is not the case. > > The only potential problem is bit-width. Punit, I assume the value is > 32-bit (or less) in the messages to the FW? It's 16 bit. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 1/5] Documentation: add DT bindings for ARM SCPI sensors
On Tue, 2015-09-15 at 10:37 +0100, Punit Agrawal wrote: > "Jon Medhurst (Tixy)" writes: > > > On Mon, 2015-09-14 at 15:38 +0100, Punit Agrawal wrote: > >> Mark Rutland writes: > >> [...] > >> The way the SCP interface is defined, the sensor identifiers are > >> contiguous, > > > > Is there any documentation other than DUI0922A? [1] From what I can seen > > that just says it's a 16-bit value and doesn't put any particular > > constraints on its value. > > Although not explicitly stated, if you look at the Get Sensor Capability > [2] and Get Sensor Info [3] commands you can indirectly infer that the > Sensor IDs are contiguous. I personally wouldn't even indirectly infer they are contiguous from what the document says. If I were implementing the firmware I would feel quite in my rights to, for example, use the top 8 bits of the ID for a sensor type and the bottom 8 for an index, if that made dispatching of requests more efficient. Or if some optional hardware was detected as missing, leaving some holes in ID space. As a specification of a 'standard' the document seems to be rather lacking. So, Sensor ID should be documented as being "an unsigned integer less than then number of sensors returned by the Get Sensor Capability command", or something like that. I guess clocks and other devices suffer from similar lack of specificity. > Not the strongest guarantee I know. > > All platforms currently using SCP (Juno R0 and R1) do indeed expose > contiguous identifiers. IMO, Linux drivers should be coded to the standard or written specification (where they are available) not the particular implementations available. > > > > [1] > > http://community.arm.com/servlet/JiveServlet/download/8401-40-18262/DUI0922A_scp_message_interface.pdf > [2] > http://arminfo.emea.arm.com/help/index.jsp?topic=/com.arm.doc.dui0922b/ch03s02s21.html > [3] > http://arminfo.emea.arm.com/help/index.jsp?topic=/com.arm.doc.dui0922b/BABCCCJJ.html I think those links are on ARM's intranet, they return NXDOMAIN for me. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 1/5] Documentation: add DT bindings for ARM SCPI sensors
On Tue, 2015-09-15 at 17:04 +0100, Punit Agrawal wrote: > "Jon Medhurst (Tixy)" <t...@linaro.org> writes: > > > On Tue, 2015-09-15 at 10:37 +0100, Punit Agrawal wrote: > >> "Jon Medhurst (Tixy)" <t...@linaro.org> writes: > >> > >> > On Mon, 2015-09-14 at 15:38 +0100, Punit Agrawal wrote: > >> >> Mark Rutland <mark.rutl...@arm.com> writes: > >> >> > > [...] > >> >> The way the SCP interface is defined, the sensor identifiers are > >> >> contiguous, > >> > > >> > Is there any documentation other than DUI0922A? [1] From what I can seen > >> > that just says it's a 16-bit value and doesn't put any particular > >> > constraints on its value. > >> > >> Although not explicitly stated, if you look at the Get Sensor Capability > >> [2] and Get Sensor Info [3] commands you can indirectly infer that the > >> Sensor IDs are contiguous. > > > > I personally wouldn't even indirectly infer they are contiguous from > > what the document says. If I were implementing the firmware I would feel > > quite in my rights to, for example, use the top 8 bits of the ID for a > > sensor type and the bottom 8 for an index, if that made dispatching of > > requests more efficient. Or if some optional hardware was detected as > > missing, leaving some holes in ID space. > > True. And without a command to convey the list of valid IDs, the > consumer of the API would have to iterate over the entire 16bit space to > locate valid IDs. Or get IDs from device-tree :-) Anyway, I'm not arguing that the IDs shouldn't be 0..N-1, just that it should explicitly documented in the SCPI doc, which we're are in agreement on. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 1/5] Documentation: add DT bindings for ARM SCPI sensors
On Tue, 2015-09-15 at 10:37 +0100, Punit Agrawal wrote: > "Jon Medhurst (Tixy)" <t...@linaro.org> writes: > > > On Mon, 2015-09-14 at 15:38 +0100, Punit Agrawal wrote: > >> Mark Rutland <mark.rutl...@arm.com> writes: > >> [...] > >> The way the SCP interface is defined, the sensor identifiers are > >> contiguous, > > > > Is there any documentation other than DUI0922A? [1] From what I can seen > > that just says it's a 16-bit value and doesn't put any particular > > constraints on its value. > > Although not explicitly stated, if you look at the Get Sensor Capability > [2] and Get Sensor Info [3] commands you can indirectly infer that the > Sensor IDs are contiguous. I personally wouldn't even indirectly infer they are contiguous from what the document says. If I were implementing the firmware I would feel quite in my rights to, for example, use the top 8 bits of the ID for a sensor type and the bottom 8 for an index, if that made dispatching of requests more efficient. Or if some optional hardware was detected as missing, leaving some holes in ID space. As a specification of a 'standard' the document seems to be rather lacking. So, Sensor ID should be documented as being "an unsigned integer less than then number of sensors returned by the Get Sensor Capability command", or something like that. I guess clocks and other devices suffer from similar lack of specificity. > Not the strongest guarantee I know. > > All platforms currently using SCP (Juno R0 and R1) do indeed expose > contiguous identifiers. IMO, Linux drivers should be coded to the standard or written specification (where they are available) not the particular implementations available. > > > > [1] > > http://community.arm.com/servlet/JiveServlet/download/8401-40-18262/DUI0922A_scp_message_interface.pdf > [2] > http://arminfo.emea.arm.com/help/index.jsp?topic=/com.arm.doc.dui0922b/ch03s02s21.html > [3] > http://arminfo.emea.arm.com/help/index.jsp?topic=/com.arm.doc.dui0922b/BABCCCJJ.html I think those links are on ARM's intranet, they return NXDOMAIN for me. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 1/5] Documentation: add DT bindings for ARM SCPI sensors
On Tue, 2015-09-15 at 12:03 +0100, Mark Rutland wrote: > On Tue, Sep 15, 2015 at 11:46:02AM +0100, Jon Medhurst (Tixy) wrote: > > On Tue, 2015-09-15 at 10:37 +0100, Punit Agrawal wrote: > > > "Jon Medhurst (Tixy)" <t...@linaro.org> writes: > > > > > > > On Mon, 2015-09-14 at 15:38 +0100, Punit Agrawal wrote: > > > >> Mark Rutland <mark.rutl...@arm.com> writes: > > > >> > > [...] > > > >> The way the SCP interface is defined, the sensor identifiers are > > > >> contiguous, > > > > > > > > Is there any documentation other than DUI0922A? [1] From what I can seen > > > > that just says it's a 16-bit value and doesn't put any particular > > > > constraints on its value. > > > > > > Although not explicitly stated, if you look at the Get Sensor Capability > > > [2] and Get Sensor Info [3] commands you can indirectly infer that the > > > Sensor IDs are contiguous. > > > > I personally wouldn't even indirectly infer they are contiguous from > > what the document says. If I were implementing the firmware I would feel > > quite in my rights to, for example, use the top 8 bits of the ID for a > > sensor type and the bottom 8 for an index, if that made dispatching of > > requests more efficient. Or if some optional hardware was detected as > > missing, leaving some holes in ID space. > > > > As a specification of a 'standard' the document seems to be rather > > lacking. So, Sensor ID should be documented as being "an unsigned > > integer less than then number of sensors returned by the Get Sensor > > Capability command", or something like that. I guess clocks and other > > devices suffer from similar lack of specificity. > > I think from the PoV of the binding, this doesn't matter. The value is > just an arbitrary opaue token written down in a spec, that the FW > understands how to interpret. True, it's the Linux implementation in following patches that has assumptions, e.g. for loops iterating over id's 0..N-1 > > I only asked about how the IDs were organised because I thought there > was additional translation in the kernel, but this is not the case. > > The only potential problem is bit-width. Punit, I assume the value is > 32-bit (or less) in the messages to the FW? It's 16 bit. -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 1/5] Documentation: add DT bindings for ARM SCPI sensors
On Mon, 2015-09-14 at 15:38 +0100, Punit Agrawal wrote: > Mark Rutland writes: > > >> >> +Sensor bindings for the sensors based on SCPI Message Protocol > >> >> +-- > >> >> +SCPI provides an API to access the various sensors on the SoC. > >> >> + > >> >> +Required properties: > >> >> +- compatible : should be "arm,scpi-sensors". > >> >> +- #thermal-sensor-cells: should be set to 1. This property follows the > >> >> +thermal device tree bindings[2]. > >> > > >> > You need to specify what the valid values for this cell are. > >> > >> The enumeration depends on the number of sensors exported by SCP > >> firmware - which is platform dependent. I could add add something like > >> if you think that is helpful - > >> > >> "Valid cell value is a number between 0..n-1, where n is the number > >> of sensors exported by SCP firmware." > > > > Can the FW identifer space have holes? Or are they always contiguous? > > The way the SCP interface is defined, the sensor identifiers are > contiguous, Is there any documentation other than DUI0922A? [1] From what I can seen that just says it's a 16-bit value and doesn't put any particular constraints on its value. [1] http://community.arm.com/servlet/JiveServlet/download/8401-40-18262/DUI0922A_scp_message_interface.pdf -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v3 1/5] Documentation: add DT bindings for ARM SCPI sensors
On Mon, 2015-09-14 at 15:38 +0100, Punit Agrawal wrote: > Mark Rutlandwrites: > > >> >> +Sensor bindings for the sensors based on SCPI Message Protocol > >> >> +-- > >> >> +SCPI provides an API to access the various sensors on the SoC. > >> >> + > >> >> +Required properties: > >> >> +- compatible : should be "arm,scpi-sensors". > >> >> +- #thermal-sensor-cells: should be set to 1. This property follows the > >> >> +thermal device tree bindings[2]. > >> > > >> > You need to specify what the valid values for this cell are. > >> > >> The enumeration depends on the number of sensors exported by SCP > >> firmware - which is platform dependent. I could add add something like > >> if you think that is helpful - > >> > >> "Valid cell value is a number between 0..n-1, where n is the number > >> of sensors exported by SCP firmware." > > > > Can the FW identifer space have holes? Or are they always contiguous? > > The way the SCP interface is defined, the sensor identifiers are > contiguous, Is there any documentation other than DUI0922A? [1] From what I can seen that just says it's a 16-bit value and doesn't put any particular constraints on its value. [1] http://community.arm.com/servlet/JiveServlet/download/8401-40-18262/DUI0922A_scp_message_interface.pdf -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/4] drm: Add support for ARM's HDLCD controller.
I haven't reviewed the code in detail, just had one comment I alluded to in a private email the other day... On Wed, 2015-08-05 at 15:28 +0100, Liviu Dudau wrote: > diff --git a/drivers/gpu/drm/arm/hdlcd_crtc.c > b/drivers/gpu/drm/arm/hdlcd_crtc.c [...] > +void hdlcd_set_scanout(struct hdlcd_drm_private *hdlcd) > +{ > + struct drm_framebuffer *fb = hdlcd->crtc.primary->fb; > + struct drm_gem_cma_object *gem; > + unsigned int depth, bpp; > + dma_addr_t scanout_start; > + > + drm_fb_get_bpp_depth(fb->pixel_format, , ); > + gem = drm_fb_cma_get_gem_obj(fb, 0); > + > + scanout_start = gem->paddr + fb->offsets[0] + > + (hdlcd->crtc.y * fb->pitches[0]) + (hdlcd->crtc.x * bpp/8); > + > + hdlcd_write(hdlcd, HDLCD_REG_FB_BASE, scanout_start); > +} > + The above function accesses various pointers and values, which presumably all need to be valid and consistent. However... [...] > diff --git a/drivers/gpu/drm/arm/hdlcd_drv.c b/drivers/gpu/drm/arm/hdlcd_drv.c [...] > +static irqreturn_t hdlcd_irq(int irq, void *arg) > +{ > + struct drm_device *dev = arg; > + struct hdlcd_drm_private *hdlcd = dev->dev_private; > + unsigned long irq_status; > + > + irq_status = hdlcd_read(hdlcd, HDLCD_REG_INT_STATUS); > + > +#ifdef CONFIG_DEBUG_FS > + if (irq_status & HDLCD_INTERRUPT_UNDERRUN) { > + atomic_inc(>buffer_underrun_count); > + } > + if (irq_status & HDLCD_INTERRUPT_DMA_END) { > + atomic_inc(>dma_end_count); > + } > + if (irq_status & HDLCD_INTERRUPT_BUS_ERROR) { > + atomic_inc(>bus_error_count); > + } > + if (irq_status & HDLCD_INTERRUPT_VSYNC) { > + atomic_inc(>vsync_count); > + } > +#endif > + if (irq_status & HDLCD_INTERRUPT_VSYNC) { > + struct drm_pending_vblank_event *event; > + unsigned long flags; > + > + hdlcd_set_scanout(hdlcd); hdlcd_set_scanout is being called on every vsync interrupt, can we guarantee that is safe? What if we're in the middle of a page flip or panning operation? Seems to me that there is at least scope for incorrect addresses being calculated leading to a nasty glitch on the screen for one frame. And in the worst case, possibly invalid pointer being dereferenced. So, if scanout needs to be set on every vsync, would it not be safer (and more efficient) to have a single variable storing the value to use during interrupts, and recalculate that value outside of interrupts whenever things are changed? -- Tixy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/4] drm: Add support for ARM's HDLCD controller.
I haven't reviewed the code in detail, just had one comment I alluded to in a private email the other day... On Wed, 2015-08-05 at 15:28 +0100, Liviu Dudau wrote: diff --git a/drivers/gpu/drm/arm/hdlcd_crtc.c b/drivers/gpu/drm/arm/hdlcd_crtc.c [...] +void hdlcd_set_scanout(struct hdlcd_drm_private *hdlcd) +{ + struct drm_framebuffer *fb = hdlcd-crtc.primary-fb; + struct drm_gem_cma_object *gem; + unsigned int depth, bpp; + dma_addr_t scanout_start; + + drm_fb_get_bpp_depth(fb-pixel_format, depth, bpp); + gem = drm_fb_cma_get_gem_obj(fb, 0); + + scanout_start = gem-paddr + fb-offsets[0] + + (hdlcd-crtc.y * fb-pitches[0]) + (hdlcd-crtc.x * bpp/8); + + hdlcd_write(hdlcd, HDLCD_REG_FB_BASE, scanout_start); +} + The above function accesses various pointers and values, which presumably all need to be valid and consistent. However... [...] diff --git a/drivers/gpu/drm/arm/hdlcd_drv.c b/drivers/gpu/drm/arm/hdlcd_drv.c [...] +static irqreturn_t hdlcd_irq(int irq, void *arg) +{ + struct drm_device *dev = arg; + struct hdlcd_drm_private *hdlcd = dev-dev_private; + unsigned long irq_status; + + irq_status = hdlcd_read(hdlcd, HDLCD_REG_INT_STATUS); + +#ifdef CONFIG_DEBUG_FS + if (irq_status HDLCD_INTERRUPT_UNDERRUN) { + atomic_inc(hdlcd-buffer_underrun_count); + } + if (irq_status HDLCD_INTERRUPT_DMA_END) { + atomic_inc(hdlcd-dma_end_count); + } + if (irq_status HDLCD_INTERRUPT_BUS_ERROR) { + atomic_inc(hdlcd-bus_error_count); + } + if (irq_status HDLCD_INTERRUPT_VSYNC) { + atomic_inc(hdlcd-vsync_count); + } +#endif + if (irq_status HDLCD_INTERRUPT_VSYNC) { + struct drm_pending_vblank_event *event; + unsigned long flags; + + hdlcd_set_scanout(hdlcd); hdlcd_set_scanout is being called on every vsync interrupt, can we guarantee that is safe? What if we're in the middle of a page flip or panning operation? Seems to me that there is at least scope for incorrect addresses being calculated leading to a nasty glitch on the screen for one frame. And in the worst case, possibly invalid pointer being dereferenced. So, if scanout needs to be set on every vsync, would it not be safer (and more efficient) to have a single variable storing the value to use during interrupts, and recalculate that value outside of interrupts whenever things are changed? -- Tixy -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/