Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Nicholas Piggin
On Thu Oct 13, 2022 at 10:21 AM AEST, Guenter Roeck wrote:
> On Thu, Oct 13, 2022 at 11:03:34AM +1100, Michael Ellerman wrote:
> > Guenter Roeck  writes:
> > > On Wed, Oct 12, 2022 at 11:20:38AM -0600, Jason A. Donenfeld wrote:
> > >> 
> > >> I've also managed to not hit this bug a few times. When it triggers,
> > >> after "kprobes: kprobe jump-optimization is enabled. All kprobes are
> > >> optimized if possible.", there's a long hang - tens of seconds before it
> > >> continues. When it doesn't trigger, there's no hang at that point in the
> > >> boot process.
> > >> 
> > >
> > > I managed to bisect the problem. See below for results. Reverting the
> > > offending patch fixes the problem for me.
> > 
> > Thanks.
> > 
> > This is probably down to me/us not testing with PREEMPT enabled enough.
> > 
> Not sure. My configuration has
>
> CONFIG_PREEMPT_NONE=y
> # CONFIG_PREEMPT_VOLUNTARY is not set
> # CONFIG_PREEMPT is not set

Okay I reproduced it, just takes a while to hit.

Thanks,
Nick


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Guenter Roeck

On 10/12/22 22:03, Nicholas Piggin wrote:
> On Thu Oct 13, 2022 at 10:21 AM AEST, Guenter Roeck wrote:
> > On Thu, Oct 13, 2022 at 11:03:34AM +1100, Michael Ellerman wrote:
> > > Guenter Roeck  writes:
> > > > On Wed, Oct 12, 2022 at 11:20:38AM -0600, Jason A. Donenfeld wrote:
> > > > >
> > > > > I've also managed to not hit this bug a few times. When it triggers,
> > > > > after "kprobes: kprobe jump-optimization is enabled. All kprobes are
> > > > > optimized if possible.", there's a long hang - tens of seconds before it
> > > > > continues. When it doesn't trigger, there's no hang at that point in the
> > > > > boot process.
> > > > >
> > > >
> > > > I managed to bisect the problem. See below for results. Reverting the
> > > > offending patch fixes the problem for me.
> > >
> > > Thanks.
> > >
> > > This is probably down to me/us not testing with PREEMPT enabled enough.
> > >
> > Not sure. My configuration has
> >
> > CONFIG_PREEMPT_NONE=y
> > # CONFIG_PREEMPT_VOLUNTARY is not set
> > # CONFIG_PREEMPT is not set
>
> Thanks very much for helping with this. The config snippet you posted here
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-October/249758.html
> has CONFIG_PREEMPT=y. How do you turn that into a .config, olddefconfig?
>
> I can't reproduce this so far using your config and qemu command line,
> but the patch you've bisected it to definitely could cause this. I'll
> keep trying...
>

Uuh, sorry, I think I got confused with running multiple bisects on the
same branch, and took the above from a different bisect run. You are
correct, PREEMPT is enabled in the configuration.

Timing is definitely involved; I see the problem more often on a loaded
system. To bisect it, I had to repeat the test for each bisect step
several times (I set the limit to 20 retries; that gave me reliable
results).

Guenter
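
The retry approach described above can be sketched as follows. This is a minimal illustration, not the script actually used for the bisect: `classify_commit`, `boot_test_fn`, `sim_boot` and `MAX_RETRIES` are hypothetical names, and the boot test is simulated by a callback. A commit is marked bad as soon as one run fails, and good only if every run passes.

```c
#include <stdbool.h>

/* Retry limit mentioned in the message above. */
#define MAX_RETRIES 20

/* Hypothetical test hook: returns true if one boot attempt hit the hang. */
typedef bool (*boot_test_fn)(void *ctx);

/*
 * Classify one bisect step for an intermittent failure.
 * Returns true ("bad") on the first failing run, and false ("good")
 * only if every one of max_retries runs passes.
 */
static bool classify_commit(boot_test_fn boot_once, void *ctx, int max_retries)
{
	for (int i = 0; i < max_retries; i++) {
		if (boot_once(ctx))
			return true;
	}
	return false;
}

/* Simulated boot test: fails on the run whose index equals fail_on. */
struct sim {
	int run;
	int fail_on;
};

static bool sim_boot(void *ctx)
{
	struct sim *s = ctx;

	return s->run++ == s->fail_on;
}
```

A "good" verdict from such a loop is only probabilistic: if the hang happens not to trigger in any of the runs, a bad commit is still classified as good, which is why a higher retry limit gives more reliable results.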


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Jason A. Donenfeld
On Thu, Oct 13, 2022 at 03:03:14PM +1000, Nicholas Piggin wrote:
> On Thu Oct 13, 2022 at 10:21 AM AEST, Guenter Roeck wrote:
> > On Thu, Oct 13, 2022 at 11:03:34AM +1100, Michael Ellerman wrote:
> > > Guenter Roeck  writes:
> > > > On Wed, Oct 12, 2022 at 11:20:38AM -0600, Jason A. Donenfeld wrote:
> > > >> 
> > > >> I've also managed to not hit this bug a few times. When it triggers,
> > > >> after "kprobes: kprobe jump-optimization is enabled. All kprobes are
> > > > >> optimized if possible.", there's a long hang - tens of seconds before it
> > > > >> continues. When it doesn't trigger, there's no hang at that point in the
> > > > >> boot process.
> > > >> 
> > > >
> > > > I managed to bisect the problem. See below for results. Reverting the
> > > > offending patch fixes the problem for me.
> > > 
> > > Thanks.
> > > 
> > > This is probably down to me/us not testing with PREEMPT enabled enough.
> > > 
> > Not sure. My configuration has
> >
> > CONFIG_PREEMPT_NONE=y
> > # CONFIG_PREEMPT_VOLUNTARY is not set
> > # CONFIG_PREEMPT is not set
> 
> Thanks very much for helping with this. The config snippet you posted here
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-October/249758.html
> has CONFIG_PREEMPT=y. How do you turn that into a .config, olddefconfig?
> 
> I can't reproduce this so far using your config and qemu command line,
> but the patch you've bisected it to definitely could cause this. I'll
> keep trying...

Voila https://xn--4db.cc/dt00j0mt this repros it for me.

> 
> Thanks,
> Nick
> 
> [...]
> > > > # first bad commit: [e485f6c751e0a969327336c635ca602feea117f0] 
> > > > powerpc/64/interrupt: Fix return to masked context after hard-mask irq 
> > > > becomes pending
> 


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Nicholas Piggin
On Thu Oct 13, 2022 at 4:37 AM AEST, Jason A. Donenfeld wrote:
> On Wed, Oct 12, 2022 at 10:48:26AM -0700, Guenter Roeck wrote:
> > > I've also managed to not hit this bug a few times. When it triggers,
> > > after "kprobes: kprobe jump-optimization is enabled. All kprobes are
> > > optimized if possible.", there's a long hang - tens of seconds before it
> > > continues. When it doesn't trigger, there's no hang at that point in the
> > > boot process.
> > > 
> > 
> > That probably explains why my attempts to bisect the problem were
> > unsuccessful.
>
> So I just did this:
>
> diff --git a/drivers/char/random.c b/drivers/char/random.c
> index 2fe28eeb2f38..2d70bc09db7e 100644
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -1212,6 +1212,7 @@ static void __cold try_to_generate_entropy(void)
>  	struct entropy_timer_state stack;
>  	unsigned int i, num_different = 0;
>  	unsigned long last = random_get_entropy();
> +	return;
>
>  	for (i = 0; i < NUM_TRIAL_SAMPLES - 1; ++i) {
>  		stack.entropy = random_get_entropy();
>
> And then ran it, and now we get the lockup from the idle process:

Yep that rules out the random code. And really if it was calling
schedule() it shouldn't be getting a softlockup anyway.

Thanks,
Nick


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Nicholas Piggin
On Thu Oct 13, 2022 at 2:43 PM AEST, Guenter Roeck wrote:
> On 10/12/22 10:20, Jason A. Donenfeld wrote:
> > On Wed, Oct 12, 2022 at 09:44:52AM -0700, Guenter Roeck wrote:
> >> On Wed, Oct 12, 2022 at 09:49:26AM -0600, Jason A. Donenfeld wrote:
> >>> On Wed, Oct 12, 2022 at 07:18:27AM -0700, Guenter Roeck wrote:
>  NIP [c0031630] .replay_soft_interrupts+0x60/0x300
>  LR [c0031964] .arch_local_irq_restore+0x94/0x1c0
>  Call Trace:
>  [c7df3870] [c0031964] .arch_local_irq_restore+0x94/0x1c0 (unreliable)
>  [c7df38f0] [c0f8a444] .__schedule+0x664/0xa50
>  [c7df39d0] [c0f8a8b0] .schedule+0x80/0x140
>  [c7df3a50] [c092f0dc] .try_to_generate_entropy+0x118/0x174
>  [c7df3b40] [c092e2e4] .urandom_read_iter+0x74/0x140
>  [c7df3bc0] [c03b0044] .vfs_read+0x284/0x2d0
>  [c7df3cd0] [c03b0d2c] .ksys_read+0xdc/0x130
>  [c7df3d80] [c002a88c] .system_call_exception+0x19c/0x330
>  [c7df3e10] [c000c1d4] system_call_common+0xf4/0x258
> >>>
> >>> Obviously the first couple lines of this concern me a bit. But I think
> >>> actually this might just be a catalyst for another bug. You could view
> >>> that function as basically just:
> >>>
> >>>  while (something)
> >>>   schedule();
> >>>
> >>> And I guess in the process of calling the scheduler a lot, which toggles
> >>> interrupts a lot, something got wedged.
> >>>
> >>> Curious, though, I did try to reproduce this, to no avail. My .config is
> >>> https://xn--4db.cc/rBvHWfDZ . What's yours?
> >>>
> >>
> >> Attached. My qemu command line is
> > 
> > Okay, thanks, I reproduced it. In this case, I suspect
> > try_to_generate_entropy() is just the messenger. There's an earlier
> > problem:
> > 
> > BUG: using smp_processor_id() in preemptible [] code: swapper/0/1
> > caller is .__flush_tlb_pending+0x40/0xf0
> > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-28380-gde492c83cae0-dirty #4
> > Hardware name: PowerMac3,1 PPC970FX 0x3c0301 PowerMac
> > Call Trace:
> > [c44c3540] [c0f93ef0] .dump_stack_lvl+0x7c/0xc4 (unreliable)
> > [c44c35d0] [c0fc9550] .check_preemption_disabled+0x140/0x150
> > [c44c3660] [c0073dd0] .__flush_tlb_pending+0x40/0xf0
> > [c44c36f0] [c0334434] .__apply_to_page_range+0x764/0xa30
> > [c44c3840] [c006cad0] .change_memory_attr+0xf0/0x160
> > [c44c38d0] [c02a1d70] .bpf_prog_select_runtime+0x150/0x230
> > [c44c3970] [c0d405d4] .bpf_prepare_filter+0x504/0x6f0
> > [c44c3a30] [c0d4085c] .bpf_prog_create+0x9c/0x140
> > [c44c3ac0] [c2051d9c] .ptp_classifier_init+0x44/0x78
> > [c44c3b50] [c2050f3c] .sock_init+0xe0/0x100
> > [c44c3bd0] [c0010bd4] .do_one_initcall+0xa4/0x438
> > [c44c3cc0] [c2005008] .kernel_init_freeable+0x378/0x428
> > [c44c3da0] [c00113d8] .kernel_init+0x28/0x1a0
> > [c44c3e10] [c000ca3c] .ret_from_kernel_thread+0x58/0x60
> > 
> > This in turn is because __flush_tlb_pending() calls:
> > 
> > static inline int mm_is_thread_local(struct mm_struct *mm)
> > {
> >  return cpumask_equal(mm_cpumask(mm),
> >cpumask_of(smp_processor_id()));
> > }
> > 
> > __flush_tlb_pending() has a comment about this:
> > 
> >   * Must be called from within some kind of spinlock/non-preempt region...
> >   */
> > void __flush_tlb_pending(struct ppc64_tlb_batch *batch)
> > 
> > So I guess that didn't happen for some reason? Maybe this is indicative
> > of some lock imbalance that then gets hit later?
>
> I managed to bisect that problem. Unfortunately it points to the
> scheduler merge. No idea what to do about that. Any ideas?
> I am copying Peter and Ingo for comments.
>

> # first bad commit: [30c37f69abf935b0228b8411713737377d9e] Merge tag 
> 'sched-core-2022-10-07' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

This might be a red herring because I can reproduce without it.
I think we can fix this with some preempt critical sections; they
don't look like too much of a problem.

I don't know why it's not showing up earlier than this release,
I'll look into it a bit more.

Thanks,
Nick
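
The idea of a preempt critical section can be illustrated with a small userspace model. This is neither kernel code nor the eventual fix: `model_preempt_disable()`, `model_preempt_enable()`, `model_smp_processor_id()`, `model_flush_pending()` and the two counters are hypothetical stand-ins for `preempt_disable()`, `preempt_enable()`, the debug check behind the "using smp_processor_id() in preemptible" report, and a flush routine that must run non-preemptibly.

```c
#include <stdio.h>

/* Userspace model only: nesting depth of "preemption disabled" regions. */
static int model_preempt_count;
/* Number of times the CPU id was read while still preemptible. */
static int model_warnings;

static void model_preempt_disable(void) { model_preempt_count++; }
static void model_preempt_enable(void)  { model_preempt_count--; }

/* Stand-in for smp_processor_id() with the debug check enabled. */
static int model_smp_processor_id(void)
{
	if (model_preempt_count == 0) {
		model_warnings++;
		fprintf(stderr, "BUG: using smp_processor_id() in preemptible code\n");
	}
	return 0; /* the model has a single CPU */
}

/* Stand-in for a flush routine that must run in a non-preempt region. */
static void model_flush_pending(void)
{
	(void)model_smp_processor_id();
}

/* Unprotected caller: the check fires. */
static void caller_unprotected(void)
{
	model_flush_pending();
}

/* Caller wrapped in a preempt critical section: the check stays quiet. */
static void caller_protected(void)
{
	model_preempt_disable();
	model_flush_pending();
	model_preempt_enable();
}
```

In this model only the unprotected caller increments `model_warnings`, which mirrors why wrapping the affected call path in a disable/enable pair would silence the report.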


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Nicholas Piggin
On Thu Oct 13, 2022 at 10:21 AM AEST, Guenter Roeck wrote:
> On Thu, Oct 13, 2022 at 11:03:34AM +1100, Michael Ellerman wrote:
> > Guenter Roeck  writes:
> > > On Wed, Oct 12, 2022 at 11:20:38AM -0600, Jason A. Donenfeld wrote:
> > >> 
> > >> I've also managed to not hit this bug a few times. When it triggers,
> > >> after "kprobes: kprobe jump-optimization is enabled. All kprobes are
> > >> optimized if possible.", there's a long hang - tens of seconds before it
> > >> continues. When it doesn't trigger, there's no hang at that point in the
> > >> boot process.
> > >> 
> > >
> > > I managed to bisect the problem. See below for results. Reverting the
> > > offending patch fixes the problem for me.
> > 
> > Thanks.
> > 
> > This is probably down to me/us not testing with PREEMPT enabled enough.
> > 
> Not sure. My configuration has
>
> CONFIG_PREEMPT_NONE=y
> # CONFIG_PREEMPT_VOLUNTARY is not set
> # CONFIG_PREEMPT is not set

Thanks very much for helping with this. The config snippet you posted here
https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-October/249758.html
has CONFIG_PREEMPT=y. How do you turn that into a .config, olddefconfig?

I can't reproduce this so far using your config and qemu command line,
but the patch you've bisected it to definitely could cause this. I'll
keep trying...

Thanks,
Nick

[...]
> > > # first bad commit: [e485f6c751e0a969327336c635ca602feea117f0] 
> > > powerpc/64/interrupt: Fix return to masked context after hard-mask irq 
> > > becomes pending



Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Guenter Roeck

On 10/12/22 10:20, Jason A. Donenfeld wrote:
> On Wed, Oct 12, 2022 at 09:44:52AM -0700, Guenter Roeck wrote:
> > On Wed, Oct 12, 2022 at 09:49:26AM -0600, Jason A. Donenfeld wrote:
> > > On Wed, Oct 12, 2022 at 07:18:27AM -0700, Guenter Roeck wrote:
> > > > NIP [c0031630] .replay_soft_interrupts+0x60/0x300
> > > > LR [c0031964] .arch_local_irq_restore+0x94/0x1c0
> > > > Call Trace:
> > > > [c7df3870] [c0031964] .arch_local_irq_restore+0x94/0x1c0 (unreliable)
> > > > [c7df38f0] [c0f8a444] .__schedule+0x664/0xa50
> > > > [c7df39d0] [c0f8a8b0] .schedule+0x80/0x140
> > > > [c7df3a50] [c092f0dc] .try_to_generate_entropy+0x118/0x174
> > > > [c7df3b40] [c092e2e4] .urandom_read_iter+0x74/0x140
> > > > [c7df3bc0] [c03b0044] .vfs_read+0x284/0x2d0
> > > > [c7df3cd0] [c03b0d2c] .ksys_read+0xdc/0x130
> > > > [c7df3d80] [c002a88c] .system_call_exception+0x19c/0x330
> > > > [c7df3e10] [c000c1d4] system_call_common+0xf4/0x258
> > >
> > > Obviously the first couple lines of this concern me a bit. But I think
> > > actually this might just be a catalyst for another bug. You could view
> > > that function as basically just:
> > >
> > >  while (something)
> > >   schedule();
> > >
> > > And I guess in the process of calling the scheduler a lot, which toggles
> > > interrupts a lot, something got wedged.
> > >
> > > Curious, though, I did try to reproduce this, to no avail. My .config is
> > > https://xn--4db.cc/rBvHWfDZ . What's yours?
> > >
> >
> > Attached. My qemu command line is
>
> Okay, thanks, I reproduced it. In this case, I suspect
> try_to_generate_entropy() is just the messenger. There's an earlier
> problem:
>
> BUG: using smp_processor_id() in preemptible [] code: swapper/0/1
> caller is .__flush_tlb_pending+0x40/0xf0
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-28380-gde492c83cae0-dirty #4
> Hardware name: PowerMac3,1 PPC970FX 0x3c0301 PowerMac
> Call Trace:
> [c44c3540] [c0f93ef0] .dump_stack_lvl+0x7c/0xc4 (unreliable)
> [c44c35d0] [c0fc9550] .check_preemption_disabled+0x140/0x150
> [c44c3660] [c0073dd0] .__flush_tlb_pending+0x40/0xf0
> [c44c36f0] [c0334434] .__apply_to_page_range+0x764/0xa30
> [c44c3840] [c006cad0] .change_memory_attr+0xf0/0x160
> [c44c38d0] [c02a1d70] .bpf_prog_select_runtime+0x150/0x230
> [c44c3970] [c0d405d4] .bpf_prepare_filter+0x504/0x6f0
> [c44c3a30] [c0d4085c] .bpf_prog_create+0x9c/0x140
> [c44c3ac0] [c2051d9c] .ptp_classifier_init+0x44/0x78
> [c44c3b50] [c2050f3c] .sock_init+0xe0/0x100
> [c44c3bd0] [c0010bd4] .do_one_initcall+0xa4/0x438
> [c44c3cc0] [c2005008] .kernel_init_freeable+0x378/0x428
> [c44c3da0] [c00113d8] .kernel_init+0x28/0x1a0
> [c44c3e10] [c000ca3c] .ret_from_kernel_thread+0x58/0x60
>
> This in turn is because __flush_tlb_pending() calls:
>
> static inline int mm_is_thread_local(struct mm_struct *mm)
> {
>  return cpumask_equal(mm_cpumask(mm),
>    cpumask_of(smp_processor_id()));
> }
>
> __flush_tlb_pending() has a comment about this:
>
>   * Must be called from within some kind of spinlock/non-preempt region...
>   */
> void __flush_tlb_pending(struct ppc64_tlb_batch *batch)
>
> So I guess that didn't happen for some reason? Maybe this is indicative
> of some lock imbalance that then gets hit later?


I managed to bisect that problem. Unfortunately it points to the
scheduler merge. No idea what to do about that. Any ideas?
I am copying Peter and Ingo for comments.

Guenter

---
# bad: [1440f576022887004f719883acb094e7e0dd4944] Merge tag 
'mm-hotfixes-stable-2022-10-11' of 
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
# good: [4fe89d07dcc2804c8b562f6c7896a45643d34b2f] Linux 6.0
git bisect start 'HEAD' 'v6.0'
# good: [7171a8da00035e7913c3013ca5fb5beb5b8b22f0] Merge tag 'arm-dt-6.1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect good 7171a8da00035e7913c3013ca5fb5beb5b8b22f0
# good: [f01603979a4afaad7504a728918b678d572cda9e] Merge tag 
'gpio-updates-for-v6.1-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
git bisect good f01603979a4afaad7504a728918b678d572cda9e
# bad: [8aeab132e05fefc3a1a5277878629586bd7a3547] Merge tag 'for_linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
git bisect bad 8aeab132e05fefc3a1a5277878629586bd7a3547
# good: [493ffd6605b2d3d4dc7008ab927dba319f36671f] Merge tag 
'ucount-rlimits-cleanups-for-v5.19' of 
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
git bisect good 493ffd6605b2d3d4dc7008ab927dba319f36671f
# bad: [cdf072acb5baa18e5b05bdf3f13d6481f62396fc] Merge tag 'trace-v6.1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
git bisect bad cdf072acb5baa18e5b05bdf3f13d6481f62396fc
# bad: [55be6084c8e0e0ada9278c2ab60b7a584378efda] Merge tag 
'timers-core-2022-10-05' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 

Re: [PATCH v4 00/16] objtool: Enable and implement --mcount option on powerpc

2022-10-12 Thread Naveen N. Rao

Josh Poimboeuf wrote:
> On Tue, Oct 11, 2022 at 01:20:02PM -0700, Josh Poimboeuf wrote:
> > On Mon, Oct 10, 2022 at 05:19:02PM +0530, Naveen N. Rao wrote:
> > > All the above changes are down to compiler optimizations and shuffling due
> > > to CONFIG_OBJTOOL being enabled and changing annotate_unreachable().
> > >
> > > As such, for this series:
> > > Reviewed-by: Naveen N. Rao
> > > Tested-by: Naveen N. Rao
> > >
> > >
> > > Josh,
> > > Are you ok if this series is taken in through the powerpc tree?
> >
> > Yes, it looks ok to me.  Let me run it through a round of testing.
>
> The testing looked good, so:
>
>   Acked-by: Josh Poimboeuf


Thanks!

FYI: your previous reply (that you would be testing it) didn't hit my 
inbox and it doesn't seem to have hit the list either.



- Naveen


[powerpc:merge] BUILD SUCCESS 0c4c772cd7717acd0e466154ca733eea38895af0

2022-10-12 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git merge
branch HEAD: 0c4c772cd7717acd0e466154ca733eea38895af0  Automatic merge of 'fixes' into merge (2022-10-13 00:56)

elapsed time: 727m

configs tested: 58
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
um         i386_defconfig
um         x86_64_defconfig
x86_64     rhel-8.3-syz
x86_64     rhel-8.3-kunit
x86_64     rhel-8.3-kvm
riscv      randconfig-r042-20221012
arc        randconfig-r043-20221012
x86_64     defconfig
arc        defconfig
s390       randconfig-r044-20221012
alpha      defconfig
powerpc    allnoconfig
x86_64     rhel-8.3
x86_64     randconfig-a013
mips       allyesconfig
x86_64     randconfig-a011
powerpc    allmodconfig
m68k       allyesconfig
m68k       allmodconfig
s390       allmodconfig
x86_64     randconfig-a015
s390       defconfig
x86_64     allyesconfig
x86_64     rhel-8.3-func
arc        allyesconfig
sh         allmodconfig
x86_64     rhel-8.3-kselftests
alpha      allyesconfig
arm        defconfig
i386       randconfig-a001
i386       randconfig-a003
x86_64     randconfig-a004
x86_64     randconfig-a002
i386       randconfig-a005
x86_64     randconfig-a006
i386       defconfig
arm        allyesconfig
arm64      allyesconfig
i386       randconfig-a012
i386       randconfig-a014
i386       randconfig-a016
s390       allyesconfig
i386       allyesconfig
ia64       allmodconfig

clang tested configs:
hexagon    randconfig-r045-20221012
hexagon    randconfig-r041-20221012
x86_64     randconfig-a014
x86_64     randconfig-a012
x86_64     randconfig-a016
i386       randconfig-a002
i386       randconfig-a004
x86_64     randconfig-a001
x86_64     randconfig-a003
x86_64     randconfig-a005
i386       randconfig-a013
i386       randconfig-a006
i386       randconfig-a011
i386       randconfig-a015

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp


[powerpc:fixes-test] BUILD SUCCESS e237506238352f3bfa9cf3983cdab873e35651eb

2022-10-12 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git fixes-test
branch HEAD: e237506238352f3bfa9cf3983cdab873e35651eb  powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs

elapsed time: 728m

configs tested: 2
configs skipped: 96

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
powerpc    allnoconfig
powerpc    allmodconfig

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp


Re: [PATCH v1 3/5] treewide: use get_random_u32() when possible

2022-10-12 Thread Joe Perches
On Wed, 2022-10-12 at 21:29 +0000, David Laight wrote:
> From: Joe Perches
> > Sent: 12 October 2022 20:17
> > 
> > On Wed, 2022-10-05 at 23:48 +0200, Jason A. Donenfeld wrote:
> > > The prandom_u32() function has been a deprecated inline wrapper around
> > > get_random_u32() for several releases now, and compiles down to the
> > > exact same code. Replace the deprecated wrapper with a direct call to
> > > the real function.
> > []
> > > > diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
> > []
> > > @@ -734,7 +734,7 @@ static int send_connect(struct c4iw_ep *ep)
> > > >   &ep->com.remote_addr;
> > >   int ret;
> > >   enum chip_type adapter_type = ep->com.dev->rdev.lldi.adapter_type;
> > > - u32 isn = (prandom_u32() & ~7UL) - 1;
> > > + u32 isn = (get_random_u32() & ~7UL) - 1;
> > 
> > trivia:
> > 
> > There are somewhat odd size mismatches here.
> > 
> > I had to think a tiny bit if random() returned a value from 0 to 7
> > and was promoted to a 64 bit value then truncated to 32 bit.
> > 
> > Perhaps these would be clearer as ~7U and not ~7UL
> 
> That makes no difference - the compiler will generate the same code.

True, more or less.  It's more a question for the reader.

> The real question is WTF is the code doing?

True.

> The '& ~7u' clears the bottom 3 bits.
> The '- 1' then sets the bottom 3 bits and decrements the
> (random) high bits.

Right.

> So is the same as get_random_u32() | 7.

True, it's effectively the same as the upper 29 bits are random
anyway and the bottom 3 bits are always set.

> But I bet the coder had something else in mind.

Likely.

And it was also likely copy/pasted a few times.
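
The bit arithmetic discussed above can be checked with a short sketch. The helper names `isn_ul`, `isn_u` and `isn_equiv` are made up for illustration, and `uint32_t` stands in for the kernel's `u32`. The first two helpers show that the `~7UL` and `~7U` masks give the same 32-bit result; the third shows that the expression clears the low three bits, then sets them again while decrementing the upper 29 bits.

```c
#include <stdint.h>

/* The expression from the diff, using the unsigned long mask ~7UL. */
static uint32_t isn_ul(uint32_t r)
{
	return (uint32_t)((r & ~7UL) - 1);
}

/* The same expression with a 32-bit mask, ~7U. */
static uint32_t isn_u(uint32_t r)
{
	return (r & ~7U) - 1;
}

/*
 * Equivalent form: subtracting 8 decrements the upper 29 bits by one
 * (with wraparound), and OR-ing with 7 sets the low three bits.
 */
static uint32_t isn_equiv(uint32_t r)
{
	return (r - 8) | 7;
}
```

For a given input the result is not bit-for-bit equal to `r | 7`, since the upper bits are decremented. For a uniformly random `r`, though, the upper 29 bits remain uniformly random, so the two expressions produce the same distribution.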


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Guenter Roeck
On Thu, Oct 13, 2022 at 11:03:34AM +1100, Michael Ellerman wrote:
> Guenter Roeck  writes:
> > On Wed, Oct 12, 2022 at 11:20:38AM -0600, Jason A. Donenfeld wrote:
> >> 
> >> I've also managed to not hit this bug a few times. When it triggers,
> >> after "kprobes: kprobe jump-optimization is enabled. All kprobes are
> >> optimized if possible.", there's a long hang - tens of seconds before it
> >> continues. When it doesn't trigger, there's no hang at that point in the
> >> boot process.
> >> 
> >
> > I managed to bisect the problem. See below for results. Reverting the
> > offending patch fixes the problem for me.
> 
> Thanks.
> 
> This is probably down to me/us not testing with PREEMPT enabled enough.
> 
Not sure. My configuration has

CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set

Guenter

> cheers
> 
> > ---
> > # bad: [1440f576022887004f719883acb094e7e0dd4944] Merge tag 
> > 'mm-hotfixes-stable-2022-10-11' of 
> > git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> > # good: [4fe89d07dcc2804c8b562f6c7896a45643d34b2f] Linux 6.0
> > git bisect start 'HEAD' 'v6.0'
> > # good: [7171a8da00035e7913c3013ca5fb5beb5b8b22f0] Merge tag 'arm-dt-6.1' 
> > of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
> > git bisect good 7171a8da00035e7913c3013ca5fb5beb5b8b22f0
> > # good: [f01603979a4afaad7504a728918b678d572cda9e] Merge tag 
> > 'gpio-updates-for-v6.1-rc1' of 
> > git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
> > git bisect good f01603979a4afaad7504a728918b678d572cda9e
> > # bad: [8aeab132e05fefc3a1a5277878629586bd7a3547] Merge tag 'for_linus' of 
> > git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
> > git bisect bad 8aeab132e05fefc3a1a5277878629586bd7a3547
> > # bad: [493ffd6605b2d3d4dc7008ab927dba319f36671f] Merge tag 
> > 'ucount-rlimits-cleanups-for-v5.19' of 
> > git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
> > git bisect bad 493ffd6605b2d3d4dc7008ab927dba319f36671f
> > # good: [0e470763d84dcad27284067647dfb4b1a94dfce0] Merge tag 
> > 'efi-next-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
> > git bisect good 0e470763d84dcad27284067647dfb4b1a94dfce0
> > # bad: [110a58b9f91c66f743c01a2c217243d94c899c23] powerpc/boot: Explicitly 
> > disable usage of SPE instructions
> > git bisect bad 110a58b9f91c66f743c01a2c217243d94c899c23
> > # good: [fdfdcfd504933ed06eb6b4c9df21eede0e213c3e] powerpc/build: put 
> > sys_call_table in .data.rel.ro if RELOCATABLE
> > git bisect good fdfdcfd504933ed06eb6b4c9df21eede0e213c3e
> > # good: [c2e7a19827eec443a7cbe85e8d959052412d6dc3] powerpc: Use generic 
> > fallocate compatibility syscall
> > git bisect good c2e7a19827eec443a7cbe85e8d959052412d6dc3
> > # good: [56adbb7a8b6cc7fc9b940829c38494e53c9e57d1] powerpc/64/interrupt: 
> > Fix false warning in context tracking due to idle state
> > git bisect good 56adbb7a8b6cc7fc9b940829c38494e53c9e57d1
> > # bad: [754f611774e4b9357a944f5b703dd291c85161cf] powerpc/64: switch asm 
> > helpers from GOT to TOC relative addressing
> > git bisect bad 754f611774e4b9357a944f5b703dd291c85161cf
> > # bad: [f7bff6e7759b1abb59334f6448f9ef3172c4c04a] powerpc/64/interrupt: 
> > avoid BUG/WARN recursion in interrupt entry
> > git bisect bad f7bff6e7759b1abb59334f6448f9ef3172c4c04a
> > # bad: [e485f6c751e0a969327336c635ca602feea117f0] powerpc/64/interrupt: Fix 
> > return to masked context after hard-mask irq becomes pending
> > git bisect bad e485f6c751e0a969327336c635ca602feea117f0
> > # good: [799f7063c7645f9a751d17f5dfd73b952f962cd2] powerpc/64: mark irqs 
> > hard disabled in boot paca
> > git bisect good 799f7063c7645f9a751d17f5dfd73b952f962cd2
> > # first bad commit: [e485f6c751e0a969327336c635ca602feea117f0] 
> > powerpc/64/interrupt: Fix return to masked context after hard-mask irq 
> > becomes pending


Re: [PATCH v4 00/16] objtool: Enable and implement --mcount option on powerpc

2022-10-12 Thread Josh Poimboeuf
On Tue, Oct 11, 2022 at 01:20:02PM -0700, Josh Poimboeuf wrote:
> On Mon, Oct 10, 2022 at 05:19:02PM +0530, Naveen N. Rao wrote:
> > All the above changes are down to compiler optimizations and shuffling due
> > to CONFIG_OBJTOOL being enabled and changing annotate_unreachable().
> > 
> > As such, for this series:
> > Reviewed-by: Naveen N. Rao 
> > Tested-by: Naveen N. Rao 
> > 
> > 
> > Josh,
> > Are you ok if this series is taken in through the powerpc tree?
> 
> Yes, it looks ok to me.  Let me run it through a round of testing.

The testing looked good, so:

  Acked-by: Josh Poimboeuf 

-- 
Josh


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Michael Ellerman
Guenter Roeck  writes:
> On Wed, Oct 12, 2022 at 11:20:38AM -0600, Jason A. Donenfeld wrote:
>> 
>> I've also managed to not hit this bug a few times. When it triggers,
>> after "kprobes: kprobe jump-optimization is enabled. All kprobes are
>> optimized if possible.", there's a long hang - tens of seconds before it
>> continues. When it doesn't trigger, there's no hang at that point in the
>> boot process.
>> 
>
> I managed to bisect the problem. See below for results. Reverting the
> offending patch fixes the problem for me.

Thanks.

This is probably down to me/us not testing with PREEMPT enabled enough.

cheers

> ---
> # bad: [1440f576022887004f719883acb094e7e0dd4944] Merge tag 
> 'mm-hotfixes-stable-2022-10-11' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> # good: [4fe89d07dcc2804c8b562f6c7896a45643d34b2f] Linux 6.0
> git bisect start 'HEAD' 'v6.0'
> # good: [7171a8da00035e7913c3013ca5fb5beb5b8b22f0] Merge tag 'arm-dt-6.1' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
> git bisect good 7171a8da00035e7913c3013ca5fb5beb5b8b22f0
> # good: [f01603979a4afaad7504a728918b678d572cda9e] Merge tag 
> 'gpio-updates-for-v6.1-rc1' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
> git bisect good f01603979a4afaad7504a728918b678d572cda9e
> # bad: [8aeab132e05fefc3a1a5277878629586bd7a3547] Merge tag 'for_linus' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
> git bisect bad 8aeab132e05fefc3a1a5277878629586bd7a3547
> # bad: [493ffd6605b2d3d4dc7008ab927dba319f36671f] Merge tag 
> 'ucount-rlimits-cleanups-for-v5.19' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
> git bisect bad 493ffd6605b2d3d4dc7008ab927dba319f36671f
> # good: [0e470763d84dcad27284067647dfb4b1a94dfce0] Merge tag 
> 'efi-next-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
> git bisect good 0e470763d84dcad27284067647dfb4b1a94dfce0
> # bad: [110a58b9f91c66f743c01a2c217243d94c899c23] powerpc/boot: Explicitly 
> disable usage of SPE instructions
> git bisect bad 110a58b9f91c66f743c01a2c217243d94c899c23
> # good: [fdfdcfd504933ed06eb6b4c9df21eede0e213c3e] powerpc/build: put 
> sys_call_table in .data.rel.ro if RELOCATABLE
> git bisect good fdfdcfd504933ed06eb6b4c9df21eede0e213c3e
> # good: [c2e7a19827eec443a7cbe85e8d959052412d6dc3] powerpc: Use generic 
> fallocate compatibility syscall
> git bisect good c2e7a19827eec443a7cbe85e8d959052412d6dc3
> # good: [56adbb7a8b6cc7fc9b940829c38494e53c9e57d1] powerpc/64/interrupt: Fix 
> false warning in context tracking due to idle state
> git bisect good 56adbb7a8b6cc7fc9b940829c38494e53c9e57d1
> # bad: [754f611774e4b9357a944f5b703dd291c85161cf] powerpc/64: switch asm 
> helpers from GOT to TOC relative addressing
> git bisect bad 754f611774e4b9357a944f5b703dd291c85161cf
> # bad: [f7bff6e7759b1abb59334f6448f9ef3172c4c04a] powerpc/64/interrupt: avoid 
> BUG/WARN recursion in interrupt entry
> git bisect bad f7bff6e7759b1abb59334f6448f9ef3172c4c04a
> # bad: [e485f6c751e0a969327336c635ca602feea117f0] powerpc/64/interrupt: Fix 
> return to masked context after hard-mask irq becomes pending
> git bisect bad e485f6c751e0a969327336c635ca602feea117f0
> # good: [799f7063c7645f9a751d17f5dfd73b952f962cd2] powerpc/64: mark irqs hard 
> disabled in boot paca
> git bisect good 799f7063c7645f9a751d17f5dfd73b952f962cd2
> # first bad commit: [e485f6c751e0a969327336c635ca602feea117f0] 
> powerpc/64/interrupt: Fix return to masked context after hard-mask irq 
> becomes pending


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Guenter Roeck
On Wed, Oct 12, 2022 at 11:20:38AM -0600, Jason A. Donenfeld wrote:
> 
> I've also managed to not hit this bug a few times. When it triggers,
> after "kprobes: kprobe jump-optimization is enabled. All kprobes are
> optimized if possible.", there's a long hang - tens of seconds before it
> continues. When it doesn't trigger, there's no hang at that point in the
> boot process.
> 

I managed to bisect the problem. See below for results. Reverting the
offending patch fixes the problem for me.

Guenter

---
# bad: [1440f576022887004f719883acb094e7e0dd4944] Merge tag 'mm-hotfixes-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
# good: [4fe89d07dcc2804c8b562f6c7896a45643d34b2f] Linux 6.0
git bisect start 'HEAD' 'v6.0'
# good: [7171a8da00035e7913c3013ca5fb5beb5b8b22f0] Merge tag 'arm-dt-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
git bisect good 7171a8da00035e7913c3013ca5fb5beb5b8b22f0
# good: [f01603979a4afaad7504a728918b678d572cda9e] Merge tag 'gpio-updates-for-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
git bisect good f01603979a4afaad7504a728918b678d572cda9e
# bad: [8aeab132e05fefc3a1a5277878629586bd7a3547] Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
git bisect bad 8aeab132e05fefc3a1a5277878629586bd7a3547
# bad: [493ffd6605b2d3d4dc7008ab927dba319f36671f] Merge tag 'ucount-rlimits-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
git bisect bad 493ffd6605b2d3d4dc7008ab927dba319f36671f
# good: [0e470763d84dcad27284067647dfb4b1a94dfce0] Merge tag 'efi-next-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
git bisect good 0e470763d84dcad27284067647dfb4b1a94dfce0
# bad: [110a58b9f91c66f743c01a2c217243d94c899c23] powerpc/boot: Explicitly disable usage of SPE instructions
git bisect bad 110a58b9f91c66f743c01a2c217243d94c899c23
# good: [fdfdcfd504933ed06eb6b4c9df21eede0e213c3e] powerpc/build: put sys_call_table in .data.rel.ro if RELOCATABLE
git bisect good fdfdcfd504933ed06eb6b4c9df21eede0e213c3e
# good: [c2e7a19827eec443a7cbe85e8d959052412d6dc3] powerpc: Use generic fallocate compatibility syscall
git bisect good c2e7a19827eec443a7cbe85e8d959052412d6dc3
# good: [56adbb7a8b6cc7fc9b940829c38494e53c9e57d1] powerpc/64/interrupt: Fix false warning in context tracking due to idle state
git bisect good 56adbb7a8b6cc7fc9b940829c38494e53c9e57d1
# bad: [754f611774e4b9357a944f5b703dd291c85161cf] powerpc/64: switch asm helpers from GOT to TOC relative addressing
git bisect bad 754f611774e4b9357a944f5b703dd291c85161cf
# bad: [f7bff6e7759b1abb59334f6448f9ef3172c4c04a] powerpc/64/interrupt: avoid BUG/WARN recursion in interrupt entry
git bisect bad f7bff6e7759b1abb59334f6448f9ef3172c4c04a
# bad: [e485f6c751e0a969327336c635ca602feea117f0] powerpc/64/interrupt: Fix return to masked context after hard-mask irq becomes pending
git bisect bad e485f6c751e0a969327336c635ca602feea117f0
# good: [799f7063c7645f9a751d17f5dfd73b952f962cd2] powerpc/64: mark irqs hard disabled in boot paca
git bisect good 799f7063c7645f9a751d17f5dfd73b952f962cd2
# first bad commit: [e485f6c751e0a969327336c635ca602feea117f0] powerpc/64/interrupt: Fix return to masked context after hard-mask irq becomes pending


Re: [GIT PULL] virtio: fixes, features

2022-10-12 Thread Michael S. Tsirkin
On Wed, Oct 12, 2022 at 11:06:54PM +0200, Arnd Bergmann wrote:
> On Wed, Oct 12, 2022, at 7:22 PM, Linus Torvalds wrote:
> >
> > The NO_IRQ thing is mainly actually defined by a few drivers that just
> > never got converted to the proper world order, and even then you can
> > see the confusion (ie some drivers use "-1", others use "0", and yet
> > others use "((unsigned int)(-1))".
> 
> The last time I looked at removing it for arch/arm/, one problem was
> that there were a number of platforms using IRQ 0 as a valid number.
> We have converted most of them in the meantime, leaving now only
> mach-rpc and mach-footbridge. For the other platforms, we just
> renumbered all interrupts to add one, but footbridge apparently
> relies on hardcoded ISA interrupts in device drivers. For rpc,
> it looks like IRQ 0 (printer) already wouldn't work, and it
> looks like there was never a driver referencing it either.


Do these two boxes even have pci?

> I see that openrisc and parisc also still define NO_IRQ to -1, but at
> least openrisc already relies on 0 being the invalid IRQ (from
> CONFIG_IRQ_DOMAIN), probably parisc as well.
> 
>  Arnd



RE: [PATCH v1 3/5] treewide: use get_random_u32() when possible

2022-10-12 Thread David Laight
From: Joe Perches
> Sent: 12 October 2022 20:17
> 
> On Wed, 2022-10-05 at 23:48 +0200, Jason A. Donenfeld wrote:
> > The prandom_u32() function has been a deprecated inline wrapper around
> > get_random_u32() for several releases now, and compiles down to the
> > exact same code. Replace the deprecated wrapper with a direct call to
> > the real function.
> []
> > diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
> []
> > @@ -734,7 +734,7 @@ static int send_connect(struct c4iw_ep *ep)
> >  &ep->com.remote_addr;
> > int ret;
> > enum chip_type adapter_type = ep->com.dev->rdev.lldi.adapter_type;
> > -   u32 isn = (prandom_u32() & ~7UL) - 1;
> > +   u32 isn = (get_random_u32() & ~7UL) - 1;
> 
> trivia:
> 
> There are somewhat odd size mismatches here.
> 
> I had to think a tiny bit if random() returned a value from 0 to 7
> and was promoted to a 64 bit value then truncated to 32 bit.
> 
> Perhaps these would be clearer as ~7U and not ~7UL

That makes no difference - the compiler will generate the same code.

The real question is WTF is the code doing?
The '& ~7u' clears the bottom 3 bits.
The '- 1' then sets the bottom 3 bits and decrements the
(random) high bits.

So, as a random value, it is the same as get_random_u32() | 7.
But I bet the coder had something else in mind.
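
The bit arithmetic above can be checked outside the kernel. What follows is a minimal user-space sketch, not code from the patch: the function names are made up for illustration and uint32_t stands in for the kernel's u32. It compares the expression from the patch, the ~7U variant, and a closed form. The exact value is (r - 8) | 7, which differs from r | 7 by 8 but has the same distribution when r is uniform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expression as written in the patch: unsigned long mask, then truncation to 32 bits. */
static uint32_t isn_ul(uint32_t r)
{
	return (uint32_t)((r & ~7UL) - 1);
}

/* Variant with a 32-bit mask, as suggested in the review. */
static uint32_t isn_u(uint32_t r)
{
	return (r & ~7U) - 1;
}

/* Closed form: step back by 8, then force the low three bits to 1. */
static uint32_t isn_closed_form(uint32_t r)
{
	return (r - 8U) | 7U;
}

/* Returns 1 if all three forms agree on a spread of inputs, including the wrap-around edges. */
static int isn_forms_agree(void)
{
	static const uint32_t samples[] = {
		0U, 1U, 7U, 8U, 9U, 0x12345678U, 0x7fffffffU,
		0x80000000U, 0xfffffff8U, 0xffffffffU,
	};
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		uint32_t r = samples[i];

		if (isn_ul(r) != isn_u(r))
			return 0;
		if (isn_ul(r) != isn_closed_form(r))
			return 0;
		/* The low three bits are always set. */
		if ((isn_ul(r) & 7U) != 7U)
			return 0;
	}
	return 1;
}
```

The UL and U masks give the same 32-bit result whether unsigned long is 32 or 64 bits wide, because the upper half is discarded on assignment to a 32-bit variable.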

David




Re: [GIT PULL] virtio: fixes, features

2022-10-12 Thread Arnd Bergmann
On Wed, Oct 12, 2022, at 7:22 PM, Linus Torvalds wrote:
>
> The NO_IRQ thing is mainly actually defined by a few drivers that just
> never got converted to the proper world order, and even then you can
> see the confusion (ie some drivers use "-1", others use "0", and yet
> others use "((unsigned int)(-1))".

The last time I looked at removing it for arch/arm/, one problem was
that there were a number of platforms using IRQ 0 as a valid number.
We have converted most of them in the meantime, leaving now only
mach-rpc and mach-footbridge. For the other platforms, we just
renumbered all interrupts to add one, but footbridge apparently
relies on hardcoded ISA interrupts in device drivers. For rpc,
it looks like IRQ 0 (printer) already wouldn't work, and it
looks like there was never a driver referencing it either.

I see that openrisc and parisc also still define NO_IRQ to -1, but at
least openrisc already relies on 0 being the invalid IRQ (from
CONFIG_IRQ_DOMAIN), probably parisc as well.

 Arnd


Re: [PATCH v2] perf: Rewrite core context handling

2022-10-12 Thread Peter Zijlstra
On Wed, Oct 12, 2022 at 02:16:29PM +0200, Peter Zijlstra wrote:

> That's the intent yeah. But due to not always holding ctx->mutex over
> put_pmu_ctx() this might be moot. I'm almost through auditing epc usage
> and I think ctx->lock is sufficient, fingers crossed.

So the very last epc usage threw a spanner into the works and made
things complicated.

Specifically sys_perf_event_open()'s group_leader case uses
event->pmu_ctx while only holding ctx->mutex. Therefore we can't fully
let go of ctx->mutex locking and purely rely on ctx->lock.

Now the good news is that the annoying put_pmu_ctx() without holding
ctx->mutex case doesn't actually matter here. Since we hold a reference
on the group_leader (per the filedesc) the event can't go away,
therefore it must have a pmu_ctx, and then holding ctx->mutex ensures
the pmu_ctx is stable -- iow it serializes against
sys_perf_event_open()'s move_group and perf_pmu_migrate_context()
changing the epc around.

So we're going with the normal mutex+lock for modification rule, but
allow the weird put_pmu_ctx() exception.
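
The "both locks to modify, either lock to read" rule can be sketched in user space as follows. This is only an illustration under stated assumptions, not the perf code: struct demo_ctx and its helpers are hypothetical names, a second pthread mutex stands in for the kernel spinlock ctx->lock, and the put_pmu_ctx() exception is not modelled.

```c
#include <pthread.h>

/*
 * Hypothetical stand-in for a context with two locks. In the kernel,
 * ctx->lock is a spinlock; a second mutex plays that role here so the
 * sketch builds with plain pthreads.
 */
struct demo_ctx {
	pthread_mutex_t mutex;	/* plays the role of ctx->mutex */
	pthread_mutex_t lock;	/* plays the role of ctx->lock */
	int epc;		/* value guarded by the rule */
};

#define DEMO_CTX_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER, 0 }

/* Modification: hold both locks, always taking the outer mutex first. */
static void demo_ctx_set_epc(struct demo_ctx *ctx, int val)
{
	pthread_mutex_lock(&ctx->mutex);
	pthread_mutex_lock(&ctx->lock);
	ctx->epc = val;
	pthread_mutex_unlock(&ctx->lock);
	pthread_mutex_unlock(&ctx->mutex);
}

/* Read from a path that may sleep: the outer mutex alone excludes writers. */
static int demo_ctx_get_epc_mutex(struct demo_ctx *ctx)
{
	int val;

	pthread_mutex_lock(&ctx->mutex);
	val = ctx->epc;
	pthread_mutex_unlock(&ctx->mutex);
	return val;
}

/* Read from a path that must not sleep: the inner lock alone also excludes writers. */
static int demo_ctx_get_epc_lock(struct demo_ctx *ctx)
{
	int val;

	pthread_mutex_lock(&ctx->lock);
	val = ctx->epc;
	pthread_mutex_unlock(&ctx->lock);
	return val;
}
```

Because a writer needs both locks, a reader holding either one is guaranteed not to race with a modification, which is why the two read paths are equally safe.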

I have the below delta.

I'm hoping we can call this done -- I'm going to see if I can bribe Mark
to take a look at the arm64 thing soon and then hopefully queue the
whole thing once -rc1 happens. That should give us a good long soak
until the next merge window.

---
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -826,21 +826,28 @@ struct perf_event {
 };
 
 /*
- *   ,[1:n]-.
+ *   ,---[1:n]--.
  *   V  V
  * perf_event_context <-[1:n]-> perf_event_pmu_context <--- perf_event
  *   ^  ^ | |
  *   `[1:n]-' `-[n:1]-> pmu <-[1:n]-'
  *
  *
- * XXX destroy epc when empty
- *   refcount, !rcu
- *
- * XXX epc locking
- *
- *   event->pmu_ctx            ctx->mutex && inactive
- *   ctx->pmu_ctx_list         ctx->mutex && ctx->lock
- *
+ * struct perf_event_pmu_context lifetime is refcount based and RCU freed
+ * (similar to perf_event_context). Locking is as if it were a member of
+ * perf_event_context; specifically:
+ *
+ *   modification, both: ctx->mutex && ctx->lock
+ *   reading, either:    ctx->mutex || ctx->lock
+ *
+ * There is one exception to this; namely put_pmu_ctx() isn't always called
+ * with ctx->mutex held; this means that as long as we can guarantee the epc
+ * has events the above rules hold.
+ *
+ * Specifically, sys_perf_event_open()'s group_leader case depends on
+ * ctx->mutex pinning the configuration. Since we hold a reference on
+ * group_leader (through the filedesc) it can't go away, therefore its
+ * associated pmu_ctx must exist and cannot change due to ctx->mutex.
  */
 struct perf_event_pmu_context {
struct pmu  *pmu;
@@ -857,6 +864,7 @@ struct perf_event_pmu_context {
unsigned int nr_events;
 
atomic_t refcount; /* event <-> epc */
+   struct rcu_head rcu_head;
 
void *task_ctx_data; /* pmu specific data */
/*
@@ -906,7 +914,7 @@ struct perf_event_context {
int nr_freq;
int rotate_disable;
 
-   refcount_t  refcount;
+   refcount_t  refcount; /* event <-> ctx */
struct task_struct  *task;
 
/*
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1727,6 +1727,10 @@ perf_event_groups_next(struct perf_event
return NULL;
 }
 
+#define perf_event_groups_for_cpu_pmu(event, groups, cpu, pmu) \
+   for (event = perf_event_groups_first(groups, cpu, pmu, NULL);   \
+event; event = perf_event_groups_next(event, pmu))
+
 /*
  * Iterate through the whole groups tree.
  */
@@ -3366,6 +3370,14 @@ static void perf_event_sync_stat(struct
}
 }
 
+#define double_list_for_each_entry(pos1, pos2, head1, head2, member)   \
+   for (pos1 = list_first_entry(head1, typeof(*pos1), member), \
+pos2 = list_first_entry(head2, typeof(*pos2), member); \
+!list_entry_is_head(pos1, head1, member) &&\
+!list_entry_is_head(pos2, head2, member);  \
+pos1 = list_next_entry(pos1, member),  \
+pos2 = list_next_entry(pos2, member))
+
 static void perf_event_swap_task_ctx_data(struct perf_event_context *prev_ctx,
  struct perf_event_context *next_ctx)
 {
@@ -3374,17 +3386,12 @@ static void perf_event_swap_task_ctx_dat
if (!prev_ctx->nr_task_data)
return;
 
-   prev_epc = list_first_entry(&prev_ctx->pmu_ctx_list,
-   struct perf_event_pmu_context,
-

Re: [PATCH v1 3/5] treewide: use get_random_u32() when possible

2022-10-12 Thread Joe Perches
On Wed, 2022-10-05 at 23:48 +0200, Jason A. Donenfeld wrote:
> The prandom_u32() function has been a deprecated inline wrapper around
> get_random_u32() for several releases now, and compiles down to the
> exact same code. Replace the deprecated wrapper with a direct call to
> the real function.
[]
> diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
[]
> @@ -734,7 +734,7 @@ static int send_connect(struct c4iw_ep *ep)
>  &ep->com.remote_addr;
>   int ret;
>   enum chip_type adapter_type = ep->com.dev->rdev.lldi.adapter_type;
> - u32 isn = (prandom_u32() & ~7UL) - 1;
> + u32 isn = (get_random_u32() & ~7UL) - 1;

trivia:

There are somewhat odd size mismatches here.

I had to think a tiny bit if random() returned a value from 0 to 7
and was promoted to a 64 bit value then truncated to 32 bit.

Perhaps these would be clearer as ~7U and not ~7UL

>   struct net_device *netdev;
>   u64 params;
>  
> @@ -2469,7 +2469,7 @@ static int accept_cr(struct c4iw_ep *ep, struct sk_buff *skb,
>   }
>  
>   if (!is_t4(adapter_type)) {
> - u32 isn = (prandom_u32() & ~7UL) - 1;
> + u32 isn = (get_random_u32() & ~7UL) - 1;

etc...

drivers/infiniband/hw/cxgb4/cm.c:   u32 isn = (prandom_u32() & ~7UL) - 1;
drivers/infiniband/hw/cxgb4/cm.c:   u32 isn = (prandom_u32() & ~7UL) - 1;
drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c:rpl5->iss = cpu_to_be32((prandom_u32() & ~7UL) - 1);
drivers/scsi/cxgbi/cxgb4i/cxgb4i.c: u32 isn = (prandom_u32() & ~7UL) - 1;
drivers/scsi/cxgbi/cxgb4i/cxgb4i.c: u32 isn = (prandom_u32() & ~7UL) - 1;
drivers/target/iscsi/cxgbit/cxgbit_cm.c:rpl5->iss = cpu_to_be32((prandom_u32() & ~7UL) - 1);



Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Jason A. Donenfeld
On Wed, Oct 12, 2022 at 10:48:26AM -0700, Guenter Roeck wrote:
> > I've also managed to not hit this bug a few times. When it triggers,
> > after "kprobes: kprobe jump-optimization is enabled. All kprobes are
> > optimized if possible.", there's a long hang - tens of seconds before it
> > continues. When it doesn't trigger, there's no hang at that point in the
> > boot process.
> > 
> 
> That probably explains why my attempts to bisect the problem were
> unsuccessful.

So I just did this:

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 2fe28eeb2f38..2d70bc09db7e 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1212,6 +1212,7 @@ static void __cold try_to_generate_entropy(void)
struct entropy_timer_state stack;
unsigned int i, num_different = 0;
unsigned long last = random_get_entropy();
+   return;

for (i = 0; i < NUM_TRIAL_SAMPLES - 1; ++i) {
stack.entropy = random_get_entropy();

And then ran it, and now we get the lockup from the idle process:

udhcpc: started, v1.33.0
udhcpc: sending discover
watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [swapper/0:0]
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.0.0-28380-gde492c83cae0-dirty #10
Hardware name: PowerMac3,1 PPC970FX 0x3c0301 PowerMac
NIP:  c00300f8 LR: c00304e8 CTR: c001a410
REGS: c28c79a8 TRAP: 0900   Not tainted  (6.0.0-28380-gde492c83cae0-dirty)
MSR:  8000b032   CR: 24088442  XER: 
IRQMASK: 0
GPR00: c00304e8 c28c7b30 c1435500 c28c79a8
GPR04: c13366c0  0010029c 
GPR08: c2d3bbb0  c2883d00 c2915500
GPR12: 44088442 c2e0 0007 02295698
GPR16: 039400e8 02295258 02295660 022953d0
GPR20: 02295b10 022b34d0 02295b38 03945500
GPR24: 03945500 0008 c2883d80 c2883d00
GPR28: c290d0c0 0001 c290d018 c290cc78
NIP [c00300f8] .replay_soft_interrupts+0x28/0x2d0
LR [c00304e8] .arch_local_irq_restore+0x148/0x1a0
Call Trace:
[c28c7b30] [c00304e8] .arch_local_irq_restore+0x148/0x1a0 (unreliable)
[c28c7bb0] [c001a388] .arch_cpu_idle+0xb8/0x140
[c28c7c30] [c0fd4940] .default_idle_call+0x80/0xc8
[c28c7ca0] [c0148480] .do_idle+0x150/0x1a0
[c28c7d50] [c0148748] .cpu_startup_entry+0x38/0x40
[c28c7dd0] [c00113a8] .rest_init+0x168/0x170
[c28c7e60] [c2004224] .arch_post_acpi_subsys_init+0x0/0x24
[c28c7ed0] [c2004ba8] .start_kernel+0x8d0/0x924
[c28c7f90] [c000d4ac] start_here_common+0x1c/0x20
Instruction dump:
6000 6000 7c0802a6 f8010010 f821fe01 6000 6000 38610078
e92d0af8 f92101f8 3920 4803a491 <6000> 3920 e9410180 f92101b0
Kernel panic - not syncing: softlockup: hung tasks
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G L 6.0.0-28380-gde492c83cae0-dirty #10
Hardware name: PowerMac3,1 PPC970FX 0x3c0301 PowerMac
Call Trace:
[c28c74a0] [c0f93b90] .dump_stack_lvl+0x7c/0xc4 (unreliable)
[c28c7530] [c00d2a58] .panic+0x180/0x438
[c28c75e0] [c0232424] .watchdog_timer_fn+0x3a4/0x410
[c28c76a0] [c01cb964] .__hrtimer_run_queues+0x1f4/0x590
[c28c77a0] [c01cc354] .hrtimer_interrupt+0x134/0x300
[c28c7860] [c0021cd4] .timer_interrupt+0x1c4/0x5d0
[c28c7930] [c00302f8] .replay_soft_interrupts+0x228/0x2d0
[c28c7b30] [c00304e8] .arch_local_irq_restore+0x148/0x1a0
[c28c7bb0] [c001a388] .arch_cpu_idle+0xb8/0x140
[c28c7c30] [c0fd4940] .default_idle_call+0x80/0xc8
[c28c7ca0] [c0148480] .do_idle+0x150/0x1a0
[c28c7d50] [c0148748] .cpu_startup_entry+0x38/0x40
[c28c7dd0] [c00113a8] .rest_init+0x168/0x170
[c28c7e60] [c2004224] .arch_post_acpi_subsys_init+0x0/0x24
[c28c7ed0] [c2004ba8] .start_kernel+0x8d0/0x924
[c28c7f90] [c0


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Guenter Roeck
On Wed, Oct 12, 2022 at 11:20:38AM -0600, Jason A. Donenfeld wrote:
> On Wed, Oct 12, 2022 at 09:44:52AM -0700, Guenter Roeck wrote:
> > On Wed, Oct 12, 2022 at 09:49:26AM -0600, Jason A. Donenfeld wrote:
> > > On Wed, Oct 12, 2022 at 07:18:27AM -0700, Guenter Roeck wrote:
> > > > NIP [c0031630] .replay_soft_interrupts+0x60/0x300
> > > > LR [c0031964] .arch_local_irq_restore+0x94/0x1c0
> > > > Call Trace:
> > > > [c7df3870] [c0031964] 
> > > > .arch_local_irq_restore+0x94/0x1c0 (unreliable)
> > > > [c7df38f0] [c0f8a444] .__schedule+0x664/0xa50
> > > > [c7df39d0] [c0f8a8b0] .schedule+0x80/0x140
> > > > [c7df3a50] [c092f0dc] 
> > > > .try_to_generate_entropy+0x118/0x174
> > > > [c7df3b40] [c092e2e4] .urandom_read_iter+0x74/0x140
> > > > [c7df3bc0] [c03b0044] .vfs_read+0x284/0x2d0
> > > > [c7df3cd0] [c03b0d2c] .ksys_read+0xdc/0x130
> > > > [c7df3d80] [c002a88c] .system_call_exception+0x19c/0x330
> > > > [c7df3e10] [c000c1d4] system_call_common+0xf4/0x258
> > > 
> > > Obviously the first couple lines of this concern me a bit. But I think
> > > actually this might just be a catalyst for another bug. You could view
> > > that function as basically just:
> > > 
> > > while (something)
> > >   schedule();
> > > 
> > > And I guess in the process of calling the scheduler a lot, which toggles
> > > interrupts a lot, something got wedged.
> > > 
> > > Curious, though, I did try to reproduce this, to no avail. My .config is
> > > https://xn--4db.cc/rBvHWfDZ . What's yours?
> > > 
> > 
> > Attached. My qemu command line is
> 
> Okay, thanks, I reproduced it. In this case, I suspect
> try_to_generate_entropy() is just the messenger. There's an earlier
> problem:
> 

That problem is not new but has existed for a couple of releases, and has
never caused a hang until now.

> BUG: using smp_processor_id() in preemptible [] code: swapper/0/1
> caller is .__flush_tlb_pending+0x40/0xf0
> CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-28380-gde492c83cae0-dirty #4
> Hardware name: PowerMac3,1 PPC970FX 0x3c0301 PowerMac
> Call Trace:
> [c44c3540] [c0f93ef0] .dump_stack_lvl+0x7c/0xc4 (unreliable)
> [c44c35d0] [c0fc9550] .check_preemption_disabled+0x140/0x150
> [c44c3660] [c0073dd0] .__flush_tlb_pending+0x40/0xf0
> [c44c36f0] [c0334434] .__apply_to_page_range+0x764/0xa30
> [c44c3840] [c006cad0] .change_memory_attr+0xf0/0x160
> [c44c38d0] [c02a1d70] .bpf_prog_select_runtime+0x150/0x230
> [c44c3970] [c0d405d4] .bpf_prepare_filter+0x504/0x6f0
> [c44c3a30] [c0d4085c] .bpf_prog_create+0x9c/0x140
> [c44c3ac0] [c2051d9c] .ptp_classifier_init+0x44/0x78
> [c44c3b50] [c2050f3c] .sock_init+0xe0/0x100
> [c44c3bd0] [c0010bd4] .do_one_initcall+0xa4/0x438
> [c44c3cc0] [c2005008] .kernel_init_freeable+0x378/0x428
> [c44c3da0] [c00113d8] .kernel_init+0x28/0x1a0
> [c44c3e10] [c000ca3c] .ret_from_kernel_thread+0x58/0x60
> 
> This in turn is because __flush_tlb_pending() calls:
> 
> static inline int mm_is_thread_local(struct mm_struct *mm)
> {
> return cpumask_equal(mm_cpumask(mm),
>   cpumask_of(smp_processor_id()));
> }
> 
> __flush_tlb_pending() has a comment about this:
> 
>  * Must be called from within some kind of spinlock/non-preempt region...
>  */
> void __flush_tlb_pending(struct ppc64_tlb_batch *batch)
> 
> So I guess that didn't happen for some reason? Maybe this is indicative
> of some lock imbalance that then gets hit later?
> 
> I've also managed to not hit this bug a few times. When it triggers,
> after "kprobes: kprobe jump-optimization is enabled. All kprobes are
> optimized if possible.", there's a long hang - tens of seconds before it
> continues. When it doesn't trigger, there's no hang at that point in the
> boot process.
> 

That probably explains why my attempts to bisect the problem were
unsuccessful.

Thanks,
Guenter


Re: [GIT PULL] virtio: fixes, features

2022-10-12 Thread Linus Torvalds
On Wed, Oct 12, 2022 at 8:51 AM Michael S. Tsirkin  wrote:
>
> Are you sure?

MichaelE is right.

This is just bogus historical garbage:

> arch/arm/include/asm/irq.h:#ifndef NO_IRQ
> arch/arm/include/asm/irq.h:#define NO_IRQ   ((unsigned int)(-1))

that I've tried to get rid of for years, but for some reason it just won't die.

NO_IRQ should be zero. Or rather, it shouldn't exist at all. It's a bogus thing.

You can see just how bogus it is from grepping for it - the users are
all completely and utterly confused, and all are entirely historical
brokenness.

The correct way to check for "no irq" doesn't use NO_IRQ at all, it just does

if (dev->irq) ...

which is why you will only find a few instances of NO_IRQ in the tree
in the first place.
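
As an illustration of the convention described above, here is a minimal user-space sketch. It is not kernel code: struct demo_dev, demo_dev_has_irq() and DEMO_LEGACY_NO_IRQ are hypothetical names chosen for this example. It shows the plain truth test on the irq field, and why a legacy "-1" sentinel does not mix with it.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a device structure; not a kernel type. */
struct demo_dev {
	unsigned int irq;	/* 0 means "no interrupt assigned" */
};

/* Convention described above: zero is the only invalid IRQ value. */
static bool demo_dev_has_irq(const struct demo_dev *dev)
{
	return dev->irq != 0;
}

/*
 * Legacy-style sentinel, shown only for contrast. Stored in an unsigned
 * field it is non-zero, so the plain truth test treats it as a valid IRQ.
 */
#define DEMO_LEGACY_NO_IRQ ((unsigned int)(-1))
```

A driver that stores the legacy sentinel in the irq field and a caller that tests the field for zero would disagree about whether an interrupt exists, which is the kind of confusion the message describes.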

The NO_IRQ thing is mainly actually defined by a few drivers that just
never got converted to the proper world order, and even then you can
see the confusion (ie some drivers use "-1", others use "0", and yet
others use "((unsigned int)(-1))".

   Linus


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Jason A. Donenfeld
On Wed, Oct 12, 2022 at 09:44:52AM -0700, Guenter Roeck wrote:
> On Wed, Oct 12, 2022 at 09:49:26AM -0600, Jason A. Donenfeld wrote:
> > On Wed, Oct 12, 2022 at 07:18:27AM -0700, Guenter Roeck wrote:
> > > NIP [c0031630] .replay_soft_interrupts+0x60/0x300
> > > LR [c0031964] .arch_local_irq_restore+0x94/0x1c0
> > > Call Trace:
> > > [c7df3870] [c0031964] .arch_local_irq_restore+0x94/0x1c0 
> > > (unreliable)
> > > [c7df38f0] [c0f8a444] .__schedule+0x664/0xa50
> > > [c7df39d0] [c0f8a8b0] .schedule+0x80/0x140
> > > [c7df3a50] [c092f0dc] .try_to_generate_entropy+0x118/0x174
> > > [c7df3b40] [c092e2e4] .urandom_read_iter+0x74/0x140
> > > [c7df3bc0] [c03b0044] .vfs_read+0x284/0x2d0
> > > [c7df3cd0] [c03b0d2c] .ksys_read+0xdc/0x130
> > > [c7df3d80] [c002a88c] .system_call_exception+0x19c/0x330
> > > [c7df3e10] [c000c1d4] system_call_common+0xf4/0x258
> > 
> > Obviously the first couple lines of this concern me a bit. But I think
> > actually this might just be a catalyst for another bug. You could view
> > that function as basically just:
> > 
> > while (something)
> > schedule();
> > 
> > And I guess in the process of calling the scheduler a lot, which toggles
> > interrupts a lot, something got wedged.
> > 
> > Curious, though, I did try to reproduce this, to no avail. My .config is
> > https://xn--4db.cc/rBvHWfDZ . What's yours?
> > 
> 
> Attached. My qemu command line is

Okay, thanks, I reproduced it. In this case, I suspect
try_to_generate_entropy() is just the messenger. There's an earlier
problem:

BUG: using smp_processor_id() in preemptible [] code: swapper/0/1
caller is .__flush_tlb_pending+0x40/0xf0
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-28380-gde492c83cae0-dirty #4
Hardware name: PowerMac3,1 PPC970FX 0x3c0301 PowerMac
Call Trace:
[c44c3540] [c0f93ef0] .dump_stack_lvl+0x7c/0xc4 (unreliable)
[c44c35d0] [c0fc9550] .check_preemption_disabled+0x140/0x150
[c44c3660] [c0073dd0] .__flush_tlb_pending+0x40/0xf0
[c44c36f0] [c0334434] .__apply_to_page_range+0x764/0xa30
[c44c3840] [c006cad0] .change_memory_attr+0xf0/0x160
[c44c38d0] [c02a1d70] .bpf_prog_select_runtime+0x150/0x230
[c44c3970] [c0d405d4] .bpf_prepare_filter+0x504/0x6f0
[c44c3a30] [c0d4085c] .bpf_prog_create+0x9c/0x140
[c44c3ac0] [c2051d9c] .ptp_classifier_init+0x44/0x78
[c44c3b50] [c2050f3c] .sock_init+0xe0/0x100
[c44c3bd0] [c0010bd4] .do_one_initcall+0xa4/0x438
[c44c3cc0] [c2005008] .kernel_init_freeable+0x378/0x428
[c44c3da0] [c00113d8] .kernel_init+0x28/0x1a0
[c44c3e10] [c000ca3c] .ret_from_kernel_thread+0x58/0x60

This in turn is because __flush_tlb_pending() calls:

static inline int mm_is_thread_local(struct mm_struct *mm)
{
return cpumask_equal(mm_cpumask(mm),
  cpumask_of(smp_processor_id()));
}

__flush_tlb_pending() has a comment about this:

 * Must be called from within some kind of spinlock/non-preempt region...
 */
void __flush_tlb_pending(struct ppc64_tlb_batch *batch)

So I guess that didn't happen for some reason? Maybe this is indicative
of some lock imbalance that then gets hit later?

I've also managed to not hit this bug a few times. When it triggers,
after "kprobes: kprobe jump-optimization is enabled. All kprobes are
optimized if possible.", there's a long hang - tens of seconds before it
continues. When it doesn't trigger, there's no hang at that point in the
boot process.

Jason


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Guenter Roeck
On Wed, Oct 12, 2022 at 10:45:46AM -0600, Jason A. Donenfeld wrote:
> On Wed, Oct 12, 2022 at 09:49:26AM -0600, Jason A. Donenfeld wrote:
> > On Wed, Oct 12, 2022 at 07:18:27AM -0700, Guenter Roeck wrote:
> > > NIP [c0031630] .replay_soft_interrupts+0x60/0x300
> > > LR [c0031964] .arch_local_irq_restore+0x94/0x1c0
> > > Call Trace:
> > > [c7df3870] [c0031964] .arch_local_irq_restore+0x94/0x1c0 
> > > (unreliable)
> > > [c7df38f0] [c0f8a444] .__schedule+0x664/0xa50
> > > [c7df39d0] [c0f8a8b0] .schedule+0x80/0x140
> > > [c7df3a50] [c092f0dc] .try_to_generate_entropy+0x118/0x174
> > > [c7df3b40] [c092e2e4] .urandom_read_iter+0x74/0x140
> > > [c7df3bc0] [c03b0044] .vfs_read+0x284/0x2d0
> > > [c7df3cd0] [c03b0d2c] .ksys_read+0xdc/0x130
> > > [c7df3d80] [c002a88c] .system_call_exception+0x19c/0x330
> > > [c7df3e10] [c000c1d4] system_call_common+0xf4/0x258
> > 
> > Obviously the first couple lines of this concern me a bit. But I think
> > actually this might just be a catalyst for another bug. You could view
> > that function as basically just:
> > 
> > while (something)
> > schedule();
> > 
> > And I guess in the process of calling the scheduler a lot, which toggles
> > interrupts a lot, something got wedged.
> > 
> > Curious, though, I did try to reproduce this, to no avail. My .config is
> > https://xn--4db.cc/rBvHWfDZ . What's yours?
> 
> I also just tried using your github linux-build-test scripts as a guide
> for constructing a config -- https://xn--4db.cc/B0HpEQDQ -- and loaded
> up your rootfs over sdhci and such, and still couldn't manage to
> reproduce. I tried commenting out the line "if (!bits)" in
> _credit_init_bits(), so that the rng would never initialize, so that the
> schedule() loop would just keep on running indefinitely, but still no
> dice.
> 
> But also, I'm running Linus' tree. From your log, I see
> "6.0.0-rc2-00163-ga5edf9815dd7". So maybe these bugs got fixed
> elsewhere?
> 

Blame me for not attaching the latest crash report.

Guenter

---
BUG: soft lockup - CPU#0 stuck for 23s! [dd:111]
Modules linked in:
CPU: 0 PID: 111 Comm: dd Not tainted 6.0.0-11414-g49da07006239 #1
Hardware name: PowerMac3,1 PPC970FX 0x3c0301 PowerMac
NIP:  c0031630 LR: c0031964 CTR: 
REGS: c7d5b6a8 TRAP: 0900   Not tainted  (6.0.0-11414-g49da07006239)
MSR:  80009032   CR: 28002228  XER: 
IRQMASK: 0
GPR00: c0031964 c7d5b870 c13e5500 c7d5b6a8
GPR04: c125e1c0  c7d5b814 c291d018
GPR08: c2d4bbb8  c7356400 c2d21098
GPR12: 2800 c2e2 100d32e0 100d32b4
GPR16: 100d3301 100d32b9 100d3358 100d32bf
GPR20: 2000 100d3372 100d331e c7356c18
GPR24:  0e60 0900 0500
GPR28: 0a00 0f00 0002 0003
NIP [c0031630] .replay_soft_interrupts+0x60/0x300
LR [c0031964] .arch_local_irq_restore+0x94/0x1c0
Call Trace:
[c7d5b870] [c0031964] .arch_local_irq_restore+0x94/0x1c0 (unreliable)
[c7d5b8f0] [c0f8bac4] .__schedule+0x664/0xa50
[c7d5b9d0] [c0f8bf30] .schedule+0x80/0x140
[c7d5ba50] [c093085c] .try_to_generate_entropy+0x118/0x174
[c7d5bb40] [c092fa64] .urandom_read_iter+0x74/0x140
[c7d5bbc0] [c03b0044] .vfs_read+0x284/0x2d0
[c7d5bcd0] [c03b0d2c] .ksys_read+0xdc/0x130
[c7d5bd80] [c002a88c] .system_call_exception+0x19c/0x330
[c7d5be10] [c000c1d4] system_call_common+0xf4/0x258
--- interrupt: c00 at 0x7fffb5c9d49c
NIP:  7fffb5c9d49c LR: 1000da90 CTR: 
REGS: c7d5be80 TRAP: 0c00   Not tainted  (6.0.0-11414-g49da07006239)
MSR:  8000f032   CR: 22002422  XER: 
IRQMASK: 0
GPR00: 0003 76dcc220 7fffb5d97300 
GPR04: 101102a0 0020  
GPR08:    
GPR12:  7fffb5e6aac0 100d32e0 100d32b4
GPR16: 100d3301 100d32b9 100d3358 100d32bf
GPR20: 2000 100d3372 100d331e 
GPR24: 7fff 100b3a9c 101102a0 0020
GPR28: 101025c0 0020  
NIP [7fffb5c9d49c] 0x7fffb5c9d49c
LR [1000da90] 0x1000da90
--- interrupt: c00
Instruction dump:
3b600500 3b800a00 3ba00f00 f8010010 f821fdc1 6000 6000 38610078
e92d0af8 f92101f8 3920 48039745 <6000> 3900 e9410180 

Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Jason A. Donenfeld
On Wed, Oct 12, 2022 at 09:49:26AM -0600, Jason A. Donenfeld wrote:
> On Wed, Oct 12, 2022 at 07:18:27AM -0700, Guenter Roeck wrote:
> > NIP [c0031630] .replay_soft_interrupts+0x60/0x300
> > LR [c0031964] .arch_local_irq_restore+0x94/0x1c0
> > Call Trace:
> > [c7df3870] [c0031964] .arch_local_irq_restore+0x94/0x1c0 
> > (unreliable)
> > [c7df38f0] [c0f8a444] .__schedule+0x664/0xa50
> > [c7df39d0] [c0f8a8b0] .schedule+0x80/0x140
> > [c7df3a50] [c092f0dc] .try_to_generate_entropy+0x118/0x174
> > [c7df3b40] [c092e2e4] .urandom_read_iter+0x74/0x140
> > [c7df3bc0] [c03b0044] .vfs_read+0x284/0x2d0
> > [c7df3cd0] [c03b0d2c] .ksys_read+0xdc/0x130
> > [c7df3d80] [c002a88c] .system_call_exception+0x19c/0x330
> > [c7df3e10] [c000c1d4] system_call_common+0xf4/0x258
> 
> Obviously the first couple lines of this concern me a bit. But I think
> actually this might just be a catalyst for another bug. You could view
> that function as basically just:
> 
> while (something)
>   schedule();
> 
> And I guess in the process of calling the scheduler a lot, which toggles
> interrupts a lot, something got wedged.
> 
> Curious, though, I did try to reproduce this, to no avail. My .config is
> https://xn--4db.cc/rBvHWfDZ . What's yours?

I also just tried using your github linux-build-test scripts as a guide
for constructing a config -- https://xn--4db.cc/B0HpEQDQ -- and loaded
up your rootfs over sdhci and such, and still couldn't manage to
reproduce. I tried commenting out the line "if (!bits)" in
_credit_init_bits(), so that the rng would never initialize, so that the
schedule() loop would just keep on running indefinitely, but still no
dice.

But also, I'm running Linus' tree. From your log, I see
"6.0.0-rc2-00163-ga5edf9815dd7". So maybe these bugs got fixed
elsewhere?

Jason


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Guenter Roeck
On Wed, Oct 12, 2022 at 09:49:26AM -0600, Jason A. Donenfeld wrote:
> On Wed, Oct 12, 2022 at 07:18:27AM -0700, Guenter Roeck wrote:
> > NIP [c0031630] .replay_soft_interrupts+0x60/0x300
> > LR [c0031964] .arch_local_irq_restore+0x94/0x1c0
> > Call Trace:
> > [c7df3870] [c0031964] .arch_local_irq_restore+0x94/0x1c0 
> > (unreliable)
> > [c7df38f0] [c0f8a444] .__schedule+0x664/0xa50
> > [c7df39d0] [c0f8a8b0] .schedule+0x80/0x140
> > [c7df3a50] [c092f0dc] .try_to_generate_entropy+0x118/0x174
> > [c7df3b40] [c092e2e4] .urandom_read_iter+0x74/0x140
> > [c7df3bc0] [c03b0044] .vfs_read+0x284/0x2d0
> > [c7df3cd0] [c03b0d2c] .ksys_read+0xdc/0x130
> > [c7df3d80] [c002a88c] .system_call_exception+0x19c/0x330
> > [c7df3e10] [c000c1d4] system_call_common+0xf4/0x258
> 
> Obviously the first couple lines of this concern me a bit. But I think
> actually this might just be a catalyst for another bug. You could view
> that function as basically just:
> 
> while (something)
>   schedule();
> 
> And I guess in the process of calling the scheduler a lot, which toggles
> interrupts a lot, something got wedged.
> 
> Curious, though, I did try to reproduce this, to no avail. My .config is
> https://xn--4db.cc/rBvHWfDZ . What's yours?
> 

Attached. My qemu command line is

qemu-system-ppc64 -M mac99 -cpu ppc64 \
 -m 1G -kernel vmlinux -snapshot -device e1000,netdev=net0 \
 -netdev user,id=net0 -device sdhci-pci -device sd-card,drive=d0 \
 -drive file=/var/cache/buildbot/ppc64/rootfs.ext2,format=raw,if=none,id=d0 \
 -nographic -vga none -monitor null -no-reboot \
 --append "root=/dev/mmcblk0 rootwait console=tty console=ttyS0"

Qemu version is 7.0. The root file system is from
https://github.com/groeck/linux-build-test/tree/master/rootfs/ppc64

I used to have self tests enabled, but with that (specifically, with
CONFIG_STRING_SELFTEST=y) I now get a different hang, so I disabled that
for the time being.

Guenter
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_PREEMPT=y
CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_CGROUPS=y
CONFIG_MEMCG=y
CONFIG_BLK_CGROUP=y
CONFIG_CGROUP_SCHED=y
CONFIG_RT_GROUP_SCHED=y
CONFIG_CGROUP_FREEZER=y
CONFIG_CPUSETS=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_DEBUG=y
CONFIG_NAMESPACES=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_EMBEDDED=y
CONFIG_PROFILING=y
CONFIG_PPC64=y
# CONFIG_PPC_POWERNV is not set
CONFIG_DTL=y
# CONFIG_CPU_IDLE is not set
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_BINFMT_MISC=m
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_XFRM_USER=m
CONFIG_XFRM_SUB_POLICY=y
CONFIG_NET_KEY=m
CONFIG_NET_KEY_MIGRATE=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_BOOTP=y
CONFIG_IP_PNP_RARP=y
CONFIG_NET_IPIP=m
CONFIG_IP_MROUTE=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
CONFIG_SYN_COOKIES=y
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_IPCOMP=m
CONFIG_IPV6_ROUTER_PREF=y
CONFIG_INET6_AH=y
CONFIG_INET6_ESP=y
CONFIG_INET6_IPCOMP=m
CONFIG_IPV6_TUNNEL=m
CONFIG_NETFILTER=y
CONFIG_NF_CONNTRACK=m
CONFIG_NF_CONNTRACK_AMANDA=m
CONFIG_NF_CONNTRACK_FTP=m
CONFIG_NF_CONNTRACK_H323=m
CONFIG_NF_CONNTRACK_IRC=m
CONFIG_NF_CONNTRACK_NETBIOS_NS=m
CONFIG_NF_CONNTRACK_PPTP=m
CONFIG_NF_CONNTRACK_SANE=m
CONFIG_NF_CONNTRACK_SIP=m
CONFIG_NF_CONNTRACK_TFTP=m
CONFIG_NF_CT_NETLINK=m
CONFIG_NETFILTER_XT_TARGET_CLASSIFY=m
CONFIG_NETFILTER_XT_TARGET_CONNMARK=m
CONFIG_NETFILTER_XT_TARGET_DSCP=m
CONFIG_NETFILTER_XT_TARGET_MARK=m
CONFIG_NETFILTER_XT_TARGET_NFLOG=m
CONFIG_NETFILTER_XT_TARGET_NFQUEUE=m
CONFIG_NETFILTER_XT_TARGET_NOTRACK=m
CONFIG_NETFILTER_XT_TARGET_TRACE=m
CONFIG_NETFILTER_XT_TARGET_TCPMSS=m
CONFIG_NETFILTER_XT_MATCH_COMMENT=m
CONFIG_NETFILTER_XT_MATCH_CONNBYTES=m
CONFIG_NETFILTER_XT_MATCH_CONNLIMIT=m
CONFIG_NETFILTER_XT_MATCH_CONNMARK=m
CONFIG_NETFILTER_XT_MATCH_CONNTRACK=m
CONFIG_NETFILTER_XT_MATCH_DCCP=m
CONFIG_NETFILTER_XT_MATCH_DSCP=m
CONFIG_NETFILTER_XT_MATCH_ESP=m
CONFIG_NETFILTER_XT_MATCH_HASHLIMIT=m
CONFIG_NETFILTER_XT_MATCH_HELPER=m
CONFIG_NETFILTER_XT_MATCH_LENGTH=m
CONFIG_NETFILTER_XT_MATCH_LIMIT=m
CONFIG_NETFILTER_XT_MATCH_MAC=m
CONFIG_NETFILTER_XT_MATCH_MARK=m
CONFIG_NETFILTER_XT_MATCH_MULTIPORT=m
CONFIG_NETFILTER_XT_MATCH_POLICY=m
CONFIG_NETFILTER_XT_MATCH_PKTTYPE=m
CONFIG_NETFILTER_XT_MATCH_QUOTA=m
CONFIG_NETFILTER_XT_MATCH_REALM=m
CONFIG_NETFILTER_XT_MATCH_STATE=m
CONFIG_NETFILTER_XT_MATCH_STATISTIC=m
CONFIG_NETFILTER_XT_MATCH_STRING=m
CONFIG_NETFILTER_XT_MATCH_TCPMSS=m
CONFIG_NETFILTER_XT_MATCH_U32=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_AH=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m

Re: [PATCH v4 5/5] drm/ofdrm: Support big-endian scanout buffers

2022-10-12 Thread Michal Suchánek
On Wed, Oct 12, 2022 at 05:59:45PM +0300, Ville Syrjälä wrote:
> On Wed, Oct 12, 2022 at 04:31:14PM +0200, Thomas Zimmermann wrote:
> > Hi
> > 
> > On 12.10.22 at 15:12, Arnd Bergmann wrote:
> > > On Wed, Oct 12, 2022, at 2:00 PM, Thomas Zimmermann wrote:
> > >>
> > >> Could well be. But ofdrm intends to replace offb and this test has
> > >> worked well in offb for almost 15 yrs. If there are bug reports, I'm
> > >> happy to take patches, but until then I see no reason to change it.
> > > 
> > > I wouldn't change the code in offb unless a user reports a bug,
> > > but I don't see a point in adding the same mistake to ofdrm if we
> > > know it can't work on real hardware.
> > 
> > As I said, this has worked with offb and apparently on real hardware. 
> > For all I know, ATI hardware (before it became AMD) was used in PPC 
> > Macintoshes and assumed big-endian access on those machines.
> 
> At least mach64 class hardware has two frame buffer apertures, and
> byte swapping can be configured separately for each. But that means
> you only get correct byte swapping for at most two bpps at the same
> time (and that only if you know which aperture to access each time).
> IIRC Rage 128 already has the surface register stuff where you
> could byte swap a limited set of ranges independently. And old
> mga hardware has just one byte swap setting for the whole frame
> buffer aperture, so only one bpp at a time.
> 
> That kind of horrible limitations of the byte swappers is the
> main reason why I wanted to make drm fourcc endianness explicit.
> Simply assuming host endianness would end in tears on big endian
> as soon as you need to access stuff with two bpps at the same time.
> Much better to just switch off those useless byte swappers and
> swap by hand when necessary.

If you have a hardware-specific driver, sure.

This is a firmware-provided framebuffer, though. You get one framebuffer
address and one endianness - whatever the firmware set up and described in
the DT.
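
As a rough illustration of that point, here is a minimal sketch (not the ofdrm code) of how a driver for a firmware-provided framebuffer might pick its byte order: trust what the DT describes, and otherwise fall back to the kernel's own endianness. The property names `big-endian` and `little-endian` and the helper name are assumptions made for this example.

```python
def scanout_is_big_endian(dt_props, kernel_big_endian):
    """Return True if the firmware framebuffer should be treated as big-endian.

    dt_props: set of property names found on the display node (hypothetical).
    kernel_big_endian: endianness the running kernel was built for.
    """
    if "big-endian" in dt_props:
        return True
    if "little-endian" in dt_props:
        return False
    # Firmware gave no description: assume the framebuffer matches the kernel.
    return kernel_big_endian
```

With an empty property set the result simply follows the kernel, which is the case that cannot be verified without a correct description from the firmware.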

Thanks

Michal


Re: [GIT PULL] virtio: fixes, features

2022-10-12 Thread Michael S. Tsirkin
On Thu, Oct 13, 2022 at 01:33:59AM +1100, Michael Ellerman wrote:
> Michael Ellerman  writes:
> > [ Cc += Bjorn & linux-pci ]
> >
> > "Michael S. Tsirkin"  writes:
> >> On Wed, Oct 12, 2022 at 05:21:24PM +1100, Michael Ellerman wrote:
> >>> "Michael S. Tsirkin"  writes:
> > ...
> >>> > 
> >>> > virtio: fixes, features
> >>> >
> >>> > 9k mtu perf improvements
> >>> > vdpa feature provisioning
> >>> > virtio blk SECURE ERASE support
> >>> >
> >>> > Fixes, cleanups all over the place.
> >>> >
> >>> > Signed-off-by: Michael S. Tsirkin 
> >>> >
> >>> > 
> >>> > Alvaro Karsz (1):
> >>> >   virtio_blk: add SECURE ERASE command support
> >>> >
> >>> > Angus Chen (1):
> >>> >   virtio_pci: don't try to use intxif pin is zero
> >>> 
> >>> This commit breaks virtio_pci for me on powerpc, when running as a qemu
> >>> guest.
> >>> 
> >>> vp_find_vqs() bails out because pci_dev->pin == 0.
> >>> 
> >>> But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would
> >>> succeed if we called it - which is what the code used to do.
> >>> 
> >>> I think this happens because pci_dev->pin is not populated in
> >>> pci_assign_irq().
> >>> 
> >>> I would absolutely believe this is a bug in our PCI code, but I think it
> >>> may also affect other platforms that use of_irq_parse_and_map_pci().
> >>
> >> How about fixing this in of_irq_parse_and_map_pci then?
> >> Something like the below maybe?
> >> 
> >> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> >> index 196834ed44fe..504c4d75c83f 100644
> >> --- a/drivers/pci/of.c
> >> +++ b/drivers/pci/of.c
> >> @@ -446,6 +446,8 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *
> >>if (pin == 0)
> >>return -ENODEV;
> >>  
> >> +  pdev->pin = pin;
> >> +
> >>/* Local interrupt-map in the device node? Use it! */
> >>if (of_get_property(dn, "interrupt-map", NULL)) {
> >>pin = pci_swizzle_interrupt_pin(pdev, pin);
> 
> Backing up a bit. Should the virtio code be looking at pci_dev->pin in
> the first place?
> 
> Shouldn't it be checking pci_dev->irq instead?
> 
> The original commit talks about irq being 0 and colliding with the timer
> interrupt.
> 
> But all (most?) platforms have converged on 0 meaning NO_IRQ since quite
> a few years ago AFAIK.

Are you sure?

arch/arm/include/asm/irq.h:#ifndef NO_IRQ
arch/arm/include/asm/irq.h:#define NO_IRQ   ((unsigned int)(-1))



> And the timer irq == 0 is a special case AIUI:
>   
> https://lore.kernel.org/all/ca+55afwilp1z+2mzkrfsid1wzq0tqkcn8f2e6nl_avr+m1f...@mail.gmail.com/
> 
> cheers



Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Jason A. Donenfeld
On Wed, Oct 12, 2022 at 07:18:27AM -0700, Guenter Roeck wrote:
> NIP [c0031630] .replay_soft_interrupts+0x60/0x300
> LR [c0031964] .arch_local_irq_restore+0x94/0x1c0
> Call Trace:
> [c7df3870] [c0031964] .arch_local_irq_restore+0x94/0x1c0 (unreliable)
> [c7df38f0] [c0f8a444] .__schedule+0x664/0xa50
> [c7df39d0] [c0f8a8b0] .schedule+0x80/0x140
> [c7df3a50] [c092f0dc] .try_to_generate_entropy+0x118/0x174
> [c7df3b40] [c092e2e4] .urandom_read_iter+0x74/0x140
> [c7df3bc0] [c03b0044] .vfs_read+0x284/0x2d0
> [c7df3cd0] [c03b0d2c] .ksys_read+0xdc/0x130
> [c7df3d80] [c002a88c] .system_call_exception+0x19c/0x330
> [c7df3e10] [c000c1d4] system_call_common+0xf4/0x258

Obviously the first couple lines of this concern me a bit. But I think
actually this might just be a catalyst for another bug. You could view
that function as basically just:

while (something)
schedule();

And I guess in the process of calling the scheduler a lot, which toggles
interrupts a lot, something got wedged.

Curious, though, I did try to reproduce this, to no avail. My .config is
https://xn--4db.cc/rBvHWfDZ . What's yours?

Jason


Re: [GIT PULL] virtio: fixes, features

2022-10-12 Thread Michael Ellerman
Michael Ellerman  writes:
> [ Cc += Bjorn & linux-pci ]
>
> "Michael S. Tsirkin"  writes:
>> On Wed, Oct 12, 2022 at 05:21:24PM +1100, Michael Ellerman wrote:
>>> "Michael S. Tsirkin"  writes:
> ...
>>> > 
>>> > virtio: fixes, features
>>> >
>>> > 9k mtu perf improvements
>>> > vdpa feature provisioning
>>> > virtio blk SECURE ERASE support
>>> >
>>> > Fixes, cleanups all over the place.
>>> >
>>> > Signed-off-by: Michael S. Tsirkin 
>>> >
>>> > 
>>> > Alvaro Karsz (1):
>>> >   virtio_blk: add SECURE ERASE command support
>>> >
>>> > Angus Chen (1):
>>> >   virtio_pci: don't try to use intxif pin is zero
>>> 
>>> This commit breaks virtio_pci for me on powerpc, when running as a qemu
>>> guest.
>>> 
>>> vp_find_vqs() bails out because pci_dev->pin == 0.
>>> 
>>> But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would
>>> succeed if we called it - which is what the code used to do.
>>> 
>>> I think this happens because pci_dev->pin is not populated in
>>> pci_assign_irq().
>>> 
>>> I would absolutely believe this is a bug in our PCI code, but I think it
>>> may also affect other platforms that use of_irq_parse_and_map_pci().
>>
>> How about fixing this in of_irq_parse_and_map_pci then?
>> Something like the below maybe?
>> 
>> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
>> index 196834ed44fe..504c4d75c83f 100644
>> --- a/drivers/pci/of.c
>> +++ b/drivers/pci/of.c
>> @@ -446,6 +446,8 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *
>>  if (pin == 0)
>>  return -ENODEV;
>>  
>> +pdev->pin = pin;
>> +
>>  /* Local interrupt-map in the device node? Use it! */
>>  if (of_get_property(dn, "interrupt-map", NULL)) {
>>  pin = pci_swizzle_interrupt_pin(pdev, pin);

Backing up a bit. Should the virtio code be looking at pci_dev->pin in
the first place?

Shouldn't it be checking pci_dev->irq instead?

The original commit talks about irq being 0 and colliding with the timer
interrupt.

But all (most?) platforms have converged on 0 meaning NO_IRQ since quite
a few years ago AFAIK.

And the timer irq == 0 is a special case AIUI:
  
https://lore.kernel.org/all/ca+55afwilp1z+2mzkrfsid1wzq0tqkcn8f2e6nl_avr+m1f...@mail.gmail.com/
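
To spell out the difference between the two checks, a minimal sketch (a toy model, not the virtio code). The class, the helper names and the IRQ number 17 are made up for illustration, and the irq-based check assumes 0 means "no IRQ", which may not hold on every architecture.

```python
from dataclasses import dataclass


@dataclass
class PciDev:
    pin: int  # cached PCI_INTERRUPT_PIN value, 0 = none or never populated
    irq: int  # Linux IRQ number, assumed here that 0 = no IRQ


def intx_usable_by_pin(dev):
    """Check in the style of the commit under discussion: look at dev.pin."""
    return dev.pin != 0


def intx_usable_by_irq(dev):
    """Alternative check: look at dev.irq instead (assumes 0 == NO_IRQ)."""
    return dev.irq != 0


# The powerpc/qemu situation described in this thread:
# pin was never populated, but irq is valid (17 is an arbitrary value).
ppc_guest = PciDev(pin=0, irq=17)
```

For `ppc_guest` the pin-based check refuses INTx while the irq-based check allows it, which matches the behaviour reported above.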

cheers


Re: [PATCH v4 5/5] drm/ofdrm: Support big-endian scanout buffers

2022-10-12 Thread Thomas Zimmermann

Hi

On 12.10.22 at 15:12, Arnd Bergmann wrote:
> On Wed, Oct 12, 2022, at 2:00 PM, Thomas Zimmermann wrote:
>>
>> Could well be. But ofdrm intends to replace offb and this test has
>> worked well in offb for almost 15 yrs. If there are bug reports, I'm
>> happy to take patches, but until then I see no reason to change it.
>
> I wouldn't change the code in offb unless a user reports a bug,
> but I don't see a point in adding the same mistake to ofdrm if we
> know it can't work on real hardware.

As I said, this has worked with offb and apparently on real hardware.
For all I know, ATI hardware (before it became AMD) was used in PPC
Macintoshes and assumed big-endian access on those machines.

> I tried to find out where this is configured in qemu, but it seems
> to depend on the framebuffer backend there: most are always little-endian,
> ati/bochs/vga-pci/virtio-vga are configurable from the guest through
> some register setting, but vga.c picks a default from the
> 'TARGET_WORDS_BIGENDIAN' macro, which I think is set differently
> between qemu-system-ppc64le and qemu-system-ppc64.
>
> If you are using the framebuffer code from vga.c, I would guess
> that you can run a big-endian kernel with qemu-system-ppc64,
> or a little-endian kernel with qemu-system-ppc64le and get the
> correct colors, while running a little-endian kernel with
> qemu-system-ppc64 and vga.c, or using a different framebuffer
> emulation on a big-endian kernel would give you the wrong colors.

If qemu doesn't give us the necessary DT property, it's a qemu bug. In
the absence of the property, picking the kernel's endianness is a
sensible choice.

Best regards
Thomas

> Which combinations did you actually test?
>
>   Arnd


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev




Re: [PATCH v4 5/5] drm/ofdrm: Support big-endian scanout buffers

2022-10-12 Thread Michal Suchánek
Hello,

On Wed, Oct 12, 2022 at 03:12:35PM +0200, Arnd Bergmann wrote:
> On Wed, Oct 12, 2022, at 2:00 PM, Thomas Zimmermann wrote:
> >
> > Could well be. But ofdrm intends to replace offb and this test has
> > worked well in offb for almost 15 yrs. If there are bug reports, I'm 
> > happy to take patches, but until then I see no reason to change it.
> 
> I wouldn't change the code in offb unless a user reports a bug,
> but I don't see a point in adding the same mistake to ofdrm if we
> know it can't work on real hardware.
> 
> I tried to find out where this is configured in qemu, but it seems
> to depend on the framebuffer backend there: most are always little-endian,
> ati/bochs/vga-pci/virtio-vga are configurable from the guest through
> some register setting, but vga.c picks a default from the
> 'TARGET_WORDS_BIGENDIAN' macro, which I think is set differently
> between qemu-system-ppc64le and qemu-system-ppc64.
> 
> If you are using the framebuffer code from vga.c, I would guess
> that you can run a big-endian kernel with qemu-system-ppc64,
> or a little-endian kernel with qemu-system-ppc64le and get the
> correct colors, while running a little-endian kernel with
> qemu-system-ppc64 and vga.c, or using a different framebuffer
> emulation on a big-endian kernel would give you the wrong colors.

Thanks for digging this up.

That makes one thing clear: qemu does not emulate this framebuffer
property correctly, and cannot be relied on for verification.

If you can provide test results from real hardware that show the current
logic is flawed, it should be changed.

In the absence of such test results I think the most reasonable thing is to
keep the logic that nobody has complained about for 10+ years.

Thanks

Michal


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag

2022-10-12 Thread Guenter Roeck
On Sun, Oct 09, 2022 at 10:01:39PM +1100, Michael Ellerman wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
> 
> Hi Linus,
> 
> Please pull powerpc updates for 6.1.
> 
> No conflicts with your tree. There will be a conflict when you merge the
> kbuild tree, due to us renaming head_fsl_booke.S to head_85xx.S. The
> resolution is mostly trivial, linux-next has the correct result if it's
> unclear.
> 

Post-merge problems are much more exciting when trying to run mac99
emulations in qemu.

Enabling KFENCE results in log messages such as


WARNING: inconsistent lock state
6.0.0-rc2-00163-ga5edf9815dd7 #1 Tainted: G N

inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
swapper/0/1 [HC0[0]:SC0[0]:HE1:SE1] takes:
c2734d68 (native_tlbie_lock){+.?.}-{2:2}, at: .native_hpte_updateboltedpp+0x1a4/0x600
{IN-SOFTIRQ-W} state was registered at:
  .lock_acquire+0x20c/0x520
  ._raw_spin_lock+0x4c/0x70
  .native_hpte_invalidate+0x62c/0x840
  .hash__kernel_map_pages+0x450/0x640
  .kfence_protect+0x58/0xc0
  .kfence_guarded_free+0x374/0x5a0
  .__slab_free+0x340/0x670
  .__d_free+0x2c/0x50
  .rcu_core+0x3f4/0x1750
  .__do_softirq+0x1dc/0x7dc
  .do_softirq_own_stack+0x40/0x60
  0xc775bca0
  .irq_exit+0x1e8/0x220
  .timer_interrupt+0x284/0x700
  decrementer_common_virt+0x208/0x210
irq event stamp: 243607
hardirqs last  enabled at (243607): [] .__slab_free+0x324/0x670
hardirqs last disabled at (243606): [] .__slab_free+0x1f4/0x670
softirqs last  enabled at (242982): [] .__do_softirq+0x7ac/0x7dc
softirqs last disabled at (242973): [] .do_softirq_own_stack+0x40/0x60

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(native_tlbie_lock);
  <Interrupt>
    lock(native_tlbie_lock);

 *** DEADLOCK ***

and, indeed, there appear to be various deadlocks.

I had to disable KFENCE to be able to test further (or maybe KFENCE works
and points out the soft lockup problem observed below - hard for me to
determine).

>   powerpc/pseries: Move dtl scanning and steal time accounting to pseries platform

With this patch, CONFIG_DTL must be enabled if CONFIG_PPC_SPLPAR is enabled.
CONFIG_PPC_SPLPAR=y and CONFIG_DTL=n results in build failures due to

irq.c:(.text+0x2798): undefined reference to `.pseries_accumulate_stolen_time'

and many similar errors.

I had to enable CONFIG_DTL explicitly to be able to build my test images.
CONFIG_PPC_SPLPAR now depends on or requires CONFIG_DTL which in turn
depends on CONFIG_DEBUG_FS. That seems odd.
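
The dependency chain described above can be written down as a small check. This is only a sketch of the situation as I observed it, not how Kconfig resolves dependencies; the table and the helper name are made up for illustration.

```python
# Requirements as observed in this report (symbol names without CONFIG_).
REQUIRES = {
    "PPC_SPLPAR": ["DTL"],   # build fails without DTL after the dtl patch
    "DTL": ["DEBUG_FS"],     # DTL in turn depends on DEBUG_FS
}


def missing_dependencies(enabled):
    """Return (symbol, missing requirement) pairs for a set of enabled symbols."""
    missing = []
    for sym in sorted(enabled):
        for dep in REQUIRES.get(sym, []):
            if dep not in enabled:
                missing.append((sym, dep))
    return missing
```

A configuration with only PPC_SPLPAR enabled reports DTL as missing, and enabling DTL then pulls in DEBUG_FS, which is the odd part.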

With all this worked around, I still get soft lockup problems when trying to
boot from SDHCI. I have not been able to bisect this problem.

BUG: soft lockup - CPU#0 stuck for 23s! [dd:111]
Modules linked in:
CPU: 0 PID: 111 Comm: dd Not tainted 6.0.0-10822-g60bb8154d1d7 #1
Hardware name: PowerMac3,1 PPC970FX 0x3c0301 PowerMac
NIP:  c0031630 LR: c0031964 CTR: 
REGS: c7df36a8 TRAP: 0900   Not tainted  (6.0.0-10822-g60bb8154d1d7)
MSR:  8000b032   CR: 28002228  XER: 
IRQMASK: 0
GPR00: c0031964 c7df3870 c13e5500 c7df36a8
GPR04: c125dd80  c7df3814 c291d018
GPR08: c2d4bbb8  c7365100 c2d21098
GPR12: 2800 c2e2 100d32e0 100d32b4
GPR16: 100d3301 100d32b9 100d3358 100d32bf
GPR20: 2000 100d3372 100d331e c7365918
GPR24:  0e60 0900 0500
GPR28: 0a00 0f00 0002 0003
NIP [c0031630] .replay_soft_interrupts+0x60/0x300
LR [c0031964] .arch_local_irq_restore+0x94/0x1c0
Call Trace:
[c7df3870] [c0031964] .arch_local_irq_restore+0x94/0x1c0 (unreliable)
[c7df38f0] [c0f8a444] .__schedule+0x664/0xa50
[c7df39d0] [c0f8a8b0] .schedule+0x80/0x140
[c7df3a50] [c092f0dc] .try_to_generate_entropy+0x118/0x174
[c7df3b40] [c092e2e4] .urandom_read_iter+0x74/0x140
[c7df3bc0] [c03b0044] .vfs_read+0x284/0x2d0
[c7df3cd0] [c03b0d2c] .ksys_read+0xdc/0x130
[c7df3d80] [c002a88c] .system_call_exception+0x19c/0x330
[c7df3e10] [c000c1d4] system_call_common+0xf4/0x258
--- interrupt: c00 at 0x7fff829fd49c
NIP:  7fff829fd49c LR: 1000da90 CTR: 
REGS: c7df3e80 TRAP: 0c00   Not tainted  (6.0.0-10822-g60bb8154d1d7)
MSR:  8000f032   CR: 22002422  XER: 
IRQMASK: 0
GPR00: 0003 7138df70 7fff82af7300 
GPR04: 101102a0 0020  
GPR08:    
GPR12:  7fff82bcaac0 

Re: [PATCH v4 11/16] objtool: Add --mnop as an option to --mcount

2022-10-12 Thread Naveen N. Rao

Josh Poimboeuf wrote:

On Mon, Oct 10, 2022 at 05:07:46PM +0530, Naveen N. Rao wrote:

> +++ b/scripts/Makefile.lib
> @@ -234,6 +234,7 @@ objtool_args = \
>$(if $(CONFIG_HAVE_NOINSTR_HACK), --hacks=noinstr)  \
>$(if $(CONFIG_X86_KERNEL_IBT), --ibt)   \
>$(if $(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL), --mcount) \
> +  $(if $(CONFIG_HAVE_OBJTOOL_NOP_MCOUNT), --mnop) \

This still won't help: for instance, if CONFIG_FTRACE itself is disabled. I
think we should make this depend on CONFIG_FTRACE_MCOUNT_USE_OBJTOOL. The
below change works for me:

diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 54d2d6451bdacc..fd3f55a1fdb7bb 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -245,8 +245,8 @@ objtool_args = \
   $(if $(CONFIG_HAVE_JUMP_LABEL_HACK), --hacks=jump_label)\
   $(if $(CONFIG_HAVE_NOINSTR_HACK), --hacks=noinstr)  \
   $(if $(CONFIG_X86_KERNEL_IBT), --ibt)   \
-   $(if $(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL), --mcount) \
-   $(if $(CONFIG_HAVE_OBJTOOL_NOP_MCOUNT), --mnop) \
+$(if $(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL),   \
+ $(if $(CONFIG_HAVE_OBJTOOL_NOP_MCOUNT), --mcount --mnop, --mcount)) \
   $(if $(CONFIG_UNWINDER_ORC), --orc) \
   $(if $(CONFIG_RETPOLINE), --retpoline)  \
   $(if $(CONFIG_RETHUNK), --rethunk)  \


This has a new conflict, may need something like:

--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -256,6 +256,9 @@ objtool-args-$(CONFIG_HAVE_JUMP_LABEL_HACK) += --hacks=jump_label
 objtool-args-$(CONFIG_HAVE_NOINSTR_HACK)   += --hacks=noinstr
 objtool-args-$(CONFIG_X86_KERNEL_IBT)  += --ibt
 objtool-args-$(CONFIG_FTRACE_MCOUNT_USE_OBJTOOL)   += --mcount
+ifdef CONFIG_FTRACE_MCOUNT_USE_OBJTOOL
+objtool-args-$(CONFIG_HAVE_OBJTOOL_NOP_MCOUNT) += --mnop
+endif
 objtool-args-$(CONFIG_UNWINDER_ORC)+= --orc
 objtool-args-$(CONFIG_RETPOLINE)   += --retpoline
 objtool-args-$(CONFIG_RETHUNK) += --rethunk


Thanks. That's definitely simpler.

I haven't checked if there are any other conflicts with 
tip/objtool/core though. Not sure how to proceed here.
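
For reference, the intended logic of both variants boils down to the following sketch: `--mnop` is only passed when `--mcount` is. The function and the representation of the config as a set of symbol names are illustrative assumptions, not part of the build system.

```python
def objtool_args(config):
    """Build the mcount-related objtool arguments.

    config: set of enabled CONFIG_* symbol names, without the CONFIG_ prefix.
    """
    args = []
    if "FTRACE_MCOUNT_USE_OBJTOOL" in config:
        args.append("--mcount")
        # --mnop is an option to --mcount, so it is nested under it.
        if "HAVE_OBJTOOL_NOP_MCOUNT" in config:
            args.append("--mnop")
    return args
```

With only HAVE_OBJTOOL_NOP_MCOUNT set (for example when ftrace is disabled) the list stays empty, which is the case the first version of the patch got wrong.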



- Naveen


Re: [GIT PULL] virtio: fixes, features

2022-10-12 Thread Michael Ellerman
[ Cc += Bjorn & linux-pci ]

"Michael S. Tsirkin"  writes:
> On Wed, Oct 12, 2022 at 05:21:24PM +1100, Michael Ellerman wrote:
>> "Michael S. Tsirkin"  writes:
...
>> > 
>> > virtio: fixes, features
>> >
>> > 9k mtu perf improvements
>> > vdpa feature provisioning
>> > virtio blk SECURE ERASE support
>> >
>> > Fixes, cleanups all over the place.
>> >
>> > Signed-off-by: Michael S. Tsirkin 
>> >
>> > 
>> > Alvaro Karsz (1):
>> >   virtio_blk: add SECURE ERASE command support
>> >
>> > Angus Chen (1):
>> >   virtio_pci: don't try to use intxif pin is zero
>> 
>> This commit breaks virtio_pci for me on powerpc, when running as a qemu
>> guest.
>> 
>> vp_find_vqs() bails out because pci_dev->pin == 0.
>> 
>> But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would
>> succeed if we called it - which is what the code used to do.
>> 
>> I think this happens because pci_dev->pin is not populated in
>> pci_assign_irq().
>> 
>> I would absolutely believe this is a bug in our PCI code, but I think it
>> may also affect other platforms that use of_irq_parse_and_map_pci().
>
> How about fixing this in of_irq_parse_and_map_pci then?
> Something like the below maybe?
> 
> diff --git a/drivers/pci/of.c b/drivers/pci/of.c
> index 196834ed44fe..504c4d75c83f 100644
> --- a/drivers/pci/of.c
> +++ b/drivers/pci/of.c
> @@ -446,6 +446,8 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *
>   if (pin == 0)
>   return -ENODEV;
>  
> + pdev->pin = pin;
> +
>   /* Local interrupt-map in the device node? Use it! */
>   if (of_get_property(dn, "interrupt-map", NULL)) {
>   pin = pci_swizzle_interrupt_pin(pdev, pin);

That doesn't fix it in all cases, because there's an early return if
there's a struct device_node associated with the pci_dev, before we even
read the pin.

Also the pci_dev is const, and removing the const would propagate to a
few other places.

The other obvious place to fix it would be in pci_assign_irq(), as
below. That fixes this bug for me, but is otherwise very lightly tested.

cheers


diff --git a/drivers/pci/setup-irq.c b/drivers/pci/setup-irq.c
index cc7d26b015f3..0135413b33af 100644
--- a/drivers/pci/setup-irq.c
+++ b/drivers/pci/setup-irq.c
@@ -22,6 +22,15 @@ void pci_assign_irq(struct pci_dev *dev)
int irq = 0;
struct pci_host_bridge *hbrg = pci_find_host_bridge(dev->bus);
 
+   /* Make sure dev->pin is populated */
+   pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
+
+   /* Cope with illegal. */
+   if (pin > 4)
+   pin = 1;
+
+   dev->pin = pin;
+
if (!(hbrg->map_irq)) {
pci_dbg(dev, "runtime IRQ mapping not provided by arch\n");
return;
@@ -34,11 +43,6 @@ void pci_assign_irq(struct pci_dev *dev)
 * time the interrupt line passes through a PCI-PCI bridge we must
 * apply the swizzle function.
 */
-   pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
-   /* Cope with illegal. */
-   if (pin > 4)
-   pin = 1;
-
if (pin) {
/* Follow the chain of bridges, swizzling as we go. */
if (hbrg->swizzle_irq)
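
The effect of moving the read can be modelled in a few lines. This is a toy model of the control flow in the diff above, not the kernel function: the dict standing in for struct pci_dev, the `raw_pin` argument standing in for the config-space read, and the function names are all assumptions for illustration.

```python
def read_pin(raw_pin):
    """Clamp the raw PCI_INTERRUPT_PIN value the way the diff does."""
    return 1 if raw_pin > 4 else raw_pin


def assign_irq_old(dev, raw_pin, has_map_irq):
    """Before the patch: the pin is only a local variable."""
    if not has_map_irq:
        return
    pin = read_pin(raw_pin)  # local only, dev["pin"] is not updated
    # ... swizzling and map_irq would follow here ...


def assign_irq_new(dev, raw_pin, has_map_irq):
    """After the patch: dev["pin"] is populated before the early return."""
    dev["pin"] = read_pin(raw_pin)
    if not has_map_irq:
        return
    # ... swizzling and map_irq would follow here ...
```

In the model, `assign_irq_old` leaves `dev["pin"]` untouched in every case, while `assign_irq_new` sets it even when no map_irq hook is provided.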


Re: [PATCH v4 5/5] drm/ofdrm: Support big-endian scanout buffers

2022-10-12 Thread Arnd Bergmann
On Wed, Oct 12, 2022, at 2:00 PM, Thomas Zimmermann wrote:
>
> Could well be. But ofdrm intends to replace offb and this test has
> worked well in offb for almost 15 yrs. If there are bug reports, I'm 
> happy to take patches, but until then I see no reason to change it.

I wouldn't change the code in offb unless a user reports a bug,
but I don't see a point in adding the same mistake to ofdrm if we
know it can't work on real hardware.

I tried to find out where this is configured in qemu, but it seems
to depend on the framebuffer backend there: most are always little-endian,
ati/bochs/vga-pci/virtio-vga are configurable from the guest through
some register setting, but vga.c picks a default from the
'TARGET_WORDS_BIGENDIAN' macro, which I think is set differently
between qemu-system-ppc64le and qemu-system-ppc64.

If you are using the framebuffer code from vga.c, I would guess
that you can run a big-endian kernel with qemu-system-ppc64,
or a little-endian kernel with qemu-system-ppc64le and get the
correct colors, while running a little-endian kernel with
qemu-system-ppc64 and vga.c, or using a different framebuffer
emulation on a big-endian kernel would give you the wrong colors.
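
To show what "wrong colors" means here, a minimal sketch of a 32-bit XRGB8888 pixel being written with one byte order and read with the other. The helper names are made up for this example; only the standard `struct` module is used.

```python
import struct


def read_with_other_endianness(pixel):
    """Store a 32-bit pixel little-endian, then read it back big-endian."""
    raw = struct.pack("<I", pixel)
    return struct.unpack(">I", raw)[0]


def xrgb_channels(pixel):
    """Split an XRGB8888 value into its (x, r, g, b) bytes."""
    return ((pixel >> 24) & 0xFF, (pixel >> 16) & 0xFF,
            (pixel >> 8) & 0xFF, pixel & 0xFF)


RED = 0x00FF0000
print(xrgb_channels(RED))                              # (0, 255, 0, 0)
print(xrgb_channels(read_with_other_endianness(RED)))  # (0, 0, 255, 0)
```

A red pixel comes out green, and a blue one (0x000000FF) ends up in the unused X byte and shows as black, so a mismatch should be easy to spot in a test.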

Which combinations did you actually test?

 Arnd


Re: [PATCH v2] perf: Rewrite core context handling

2022-10-12 Thread Peter Zijlstra
On Wed, Oct 12, 2022 at 02:09:00PM +0530, Ravi Bangoria wrote:

> > @@ -3366,6 +3370,14 @@ static void perf_event_sync_stat(struct
> > }
> >  }
> >  
> > +#define list_for_each_entry_double(pos1, pos2, head1, head2, member)   \
> > +   for (pos1 = list_first_entry(head1, typeof(*pos1), member), \
> > +pos2 = list_first_entry(head2, typeof(*pos2), member); \
> > +!list_entry_is_head(pos1, head1, member) &&\
> > +!list_entry_is_head(pos2, head2, member);  \
> > +pos1 = list_next_entry(pos1, member),  \
> > +pos2 = list_next_entry(pos2, member))
> > +
> >  static void perf_event_swap_task_ctx_data(struct perf_event_context *prev_ctx,
> >   struct perf_event_context *next_ctx)
> >  {
> > @@ -3374,16 +3386,9 @@ static void perf_event_swap_task_ctx_dat
> > if (!prev_ctx->nr_task_data)
> > return;
> >  
> > -   prev_epc = list_first_entry(&prev_ctx->pmu_ctx_list,
> > -   struct perf_event_pmu_context,
> > -   pmu_ctx_entry);
> > -   next_epc = list_first_entry(&next_ctx->pmu_ctx_list,
> > -   struct perf_event_pmu_context,
> > -   pmu_ctx_entry);
> > -
> > -   while (&prev_epc->pmu_ctx_entry != &prev_ctx->pmu_ctx_list &&
> > -  &next_epc->pmu_ctx_entry != &next_ctx->pmu_ctx_list) {
> > -
> > +   list_for_each_entry_double(prev_epc, next_epc,
> > +  &prev_ctx->pmu_ctx_list, &next_ctx->pmu_ctx_list,
> > +  pmu_ctx_entry) {
> 
> There are more places which can use list_for_each_entry_double().
> I'll fix those.

I've gone and renamed it: double_list_for_each_entry(), but yeah, didn't
look too hard for other users.
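
For anyone following along, the semantics of the macro are: walk two lists in lockstep and stop as soon as either walk gets back to its list head. A minimal sketch of that behaviour on a toy circular list (the `Node` class and the helpers are illustrative, not the kernel's list_head API):

```python
class Node:
    """Element of a circular doubly linked list; the head is a sentinel."""

    def __init__(self, value=None):
        self.value = value
        self.next = self
        self.prev = self


def list_add_tail(new, head):
    """Insert `new` just before `head`, i.e. at the tail of the list."""
    new.prev, new.next = head.prev, head
    head.prev.next = new
    head.prev = new


def make_list(values):
    """Build a list with a sentinel head from an iterable of values."""
    head = Node()
    for v in values:
        list_add_tail(Node(v), head)
    return head


def double_list_for_each_entry(head1, head2):
    """Yield value pairs until either walk returns to its list head."""
    pos1, pos2 = head1.next, head2.next
    while pos1 is not head1 and pos2 is not head2:
        yield pos1.value, pos2.value
        pos1, pos2 = pos1.next, pos2.next
```

If the lists differ in length, the extra entries of the longer one are simply not visited.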

> > @@ -4859,7 +4879,14 @@ static void put_pmu_ctx(struct perf_even
> > if (epc->ctx) {
> > struct perf_event_context *ctx = epc->ctx;
> >  
> > -   // XXX ctx->mutex
> > +   /*
> > +* XXX
> > +*
> > +* lockdep_assert_held(&ctx->mutex);
> > +*
> > +* can't because of the call-site in _free_event()/put_event()
> > +* which isn't always called under ctx->mutex.
> > +*/
> 
> Yes. I came across the same and could not figure out how to solve
> this. So Just kept XXX as is.

Yeah, I can sorta fix it, but it's ugly so there we are.

> >  
> > WARN_ON_ONCE(list_empty(&epc->pmu_ctx_entry));
> > raw_spin_lock_irqsave(&ctx->lock, flags);

> > @@ -12657,6 +12675,13 @@ perf_event_create_kernel_counter(struct
> > goto err_unlock;
> > }
> >  
> > +   pmu_ctx = find_get_pmu_context(pmu, ctx, event);
> > +   if (IS_ERR(pmu_ctx)) {
> > +   err = PTR_ERR(pmu_ctx);
> > +   goto err_unlock;
> > +   }
> > +   event->pmu_ctx = pmu_ctx;
> 
> We should call find_get_pmu_context() with ctx->mutex held and thus
> above perf_event_create_kernel_counter() change. Is my understanding
> correct?

That's the intent yeah. But due to not always holding ctx->mutex over
put_pmu_ctx() this might be moot. I'm almost through auditing epc usage
and I think ctx->lock is sufficient, fingers crossed.

> > +
> > if (!task) {
> > /*
> >  * Check if the @cpu we're creating an event for is online.

> > @@ -12998,7 +13022,7 @@ void perf_event_free_task(struct task_st
> > struct perf_event_context *ctx;
> > struct perf_event *event, *tmp;
> >  
> > -   ctx = rcu_dereference(task->perf_event_ctxp);
> > +   ctx = rcu_access_pointer(task->perf_event_ctxp);
> 
> We dereference ctx pointer but with mutex and lock held. And thus
> rcu_access_pointer() is sufficient. Is my understanding correct?

We do not in fact hold ctx->lock here IIRC; but this is a NULL test, if
it is !NULL we know we have a reference on it and are good.


Re: [PATCH v4 5/5] drm/ofdrm: Support big-endian scanout buffers

2022-10-12 Thread Michal Suchánek
On Wed, Oct 12, 2022 at 10:38:29AM +0200, Arnd Bergmann wrote:
> On Wed, Oct 12, 2022, at 10:27 AM, Thomas Zimmermann wrote:
> > > On 12.10.22 at 09:44, Arnd Bergmann wrote:
> >> On Wed, Oct 12, 2022, at 9:40 AM, Thomas Zimmermann wrote:
> >>> On 12.10.22 at 09:17, Arnd Bergmann wrote:
> >>>> On Wed, Oct 12, 2022, at 8:46 AM, Thomas Zimmermann wrote:
> >>>
> >>>> Does qemu mark the device as having a particular endianness then, or
> >>>> does it switch the layout of the framebuffer to match what the CPU
> >>>> does?
> >>>
> >>> The latter. On neither architecture does qemu expose this flag. The
> >>> default endianness corresponds to the host.
> >> 
> >> "host" as in the machine that qemu runs on, or the machine that is
> >> being emulated? I suppose it would be broken either way, but in the
> >> latter case, we could get away with detecting that the machine is
> >> running under qemu.
> >
> > > Sorry, my mistake. I meant "guest": the endianness of the framebuffer
> > > corresponds to the endianness of the emulated machine.  Given that many
> > graphics cards support LE and BE modes, I assume that this behavior 
> > mimics real-hardware systems.
> 
> Not really: While the hardware may be able to switch between
> the modes, something has to actively set some hardware registers up
> that way, but the offb/ofdrm driver has no interface for interacting
> with that register, and the bootloader or firmware code that knows
> about the register has no information about what kernel it will
> eventually run. This is a bit architecture dependent, as e.g. on
> MIPS, a bi-endian hardware platform has to run a bootloader with the
> same endianness as the kernel, but on arm and powerpc the bootloader
> is usually fixed and the kernel switches to its configured endianness
> in the first few instructions after it gets entered.

But then the firmware knows that the kernel can switch endianness, and the
endianness information should be provided. And maybe that should be emulated
better by qemu. Unfortunately, modern Power servers rarely come with a
graphics card so it's hard to test on real hardware.

Thanks

Michal


Re: [PATCH v4 5/5] drm/ofdrm: Support big-endian scanout buffers

2022-10-12 Thread Thomas Zimmermann

Hi

On 12.10.22 at 10:38, Arnd Bergmann wrote:
> On Wed, Oct 12, 2022, at 10:27 AM, Thomas Zimmermann wrote:
>> On 12.10.22 at 09:44, Arnd Bergmann wrote:
>>> On Wed, Oct 12, 2022, at 9:40 AM, Thomas Zimmermann wrote:
>>>> On 12.10.22 at 09:17, Arnd Bergmann wrote:
>>>>> On Wed, Oct 12, 2022, at 8:46 AM, Thomas Zimmermann wrote:
>>>>>
>>>>> Does qemu mark the device as having a particular endianness then, or
>>>>> does it switch the layout of the framebuffer to match what the CPU
>>>>> does?
>>>>
>>>> The latter. On neither architecture does qemu expose this flag. The
>>>> default endianness corresponds to the host.
>>>
>>> "host" as in the machine that qemu runs on, or the machine that is
>>> being emulated? I suppose it would be broken either way, but in the
>>> latter case, we could get away with detecting that the machine is
>>> running under qemu.
>>
>> Sorry, my mistake. I meant "guest": the endianness of the framebuffer
>> corresponds to the endianness of the emulated machine.  Given that many
>> graphics cards support LE and BE modes, I assume that this behavior
>> mimics real-hardware systems.
>
> Not really: While the hardware may be able to switch between
> the modes, something has to actively set some hardware registers up
> that way, but the offb/ofdrm driver has no interface for interacting
> with that register, and the bootloader or firmware code that knows
> about the register has no information about what kernel it will
> eventually run. This is a bit architecture dependent, as e.g. on
> MIPS, a bi-endian hardware platform has to run a bootloader with the
> same endianness as the kernel, but on arm and powerpc the bootloader
> is usually fixed and the kernel switches to its configured endianness
> in the first few instructions after it gets entered.

Could well be. But ofdrm intends to replace offb and this test has
worked well in offb for almost 15 yrs. If there are bug reports, I'm
happy to take patches, but until then I see no reason to change it.

Best regards
Thomas

>
>   Arnd


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev




Re: [GIT PULL] virtio: fixes, features

2022-10-12 Thread Michael S. Tsirkin
On Wed, Oct 12, 2022 at 05:21:24PM +1100, Michael Ellerman wrote:
> "Michael S. Tsirkin"  writes:
> > The following changes since commit 4fe89d07dcc2804c8b562f6c7896a45643d34b2f:
> >
> >   Linux 6.0 (2022-10-02 14:09:07 -0700)
> >
> > are available in the Git repository at:
> >
> >   https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git 
> > tags/for_linus
> >
> > for you to fetch changes up to 71491c54eafa318fdd24a1f26a1c82b28e1ac21d:
> >
> >   virtio_pci: don't try to use intxif pin is zero (2022-10-07 20:00:44 
> > -0400)
> >
> > 
> > virtio: fixes, features
> >
> > 9k mtu perf improvements
> > vdpa feature provisioning
> > virtio blk SECURE ERASE support
> >
> > Fixes, cleanups all over the place.
> >
> > Signed-off-by: Michael S. Tsirkin 
> >
> > 
> > Alvaro Karsz (1):
> >   virtio_blk: add SECURE ERASE command support
> >
> > Angus Chen (1):
> >   virtio_pci: don't try to use intxif pin is zero
> 
> This commit breaks virtio_pci for me on powerpc, when running as a qemu
> guest.
> 
> vp_find_vqs() bails out because pci_dev->pin == 0.
> 
> But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would
> succeed if we called it - which is what the code used to do.
> 
> I think this happens because pci_dev->pin is not populated in
> pci_assign_irq().
> 
> I would absolutely believe this is bug in our PCI code, but I think it
> may also affect other platforms that use of_irq_parse_and_map_pci().
> 
> cheers

How about fixing this in of_irq_parse_and_map_pci then?
Something like the below maybe?

diff --git a/drivers/pci/of.c b/drivers/pci/of.c
index 196834ed44fe..504c4d75c83f 100644
--- a/drivers/pci/of.c
+++ b/drivers/pci/of.c
@@ -446,6 +446,8 @@ static int of_irq_parse_pci(const struct pci_dev *pdev, struct of_phandle_args *
 	if (pin == 0)
 		return -ENODEV;
 
+	pdev->pin = pin;
+
 	/* Local interrupt-map in the device node? Use it! */
 	if (of_get_property(dn, "interrupt-map", NULL)) {
 		pin = pci_swizzle_interrupt_pin(pdev, pin);



Re: [PATCH 2/2] powerpc: move sync_file_range2 compat definition

2022-10-12 Thread Arnd Bergmann
On Wed, Oct 12, 2022, at 5:53 AM, Nicholas Piggin wrote:
> sync_file_range2 is not a special unaligned-odd-pair calling convention
> syscall, it's just a regular one that does not have a generic compat
> definition. Move it out of sys_ppc32.c and into syscalls.c.
>
> Signed-off-by: Nicholas Piggin 
> ---
> This one doesn't fix anything and is not required for the previous
> fix, so it could be merged later. Now that we've repurposed sys_ppc32.c
> for the difficult syscalls and compat syscalls live all over the kernel
> now anyway, IMO it makes things less confusing to move this.

For this one, I would just move the implementation right next to
sync_file_range2() the same way we define compat_sys_sync_file_range(),
and share it with arm64.

  Arnd


Re: [PATCH 1/2] powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs

2022-10-12 Thread Arnd Bergmann
On Wed, Oct 12, 2022, at 5:53 AM, Nicholas Piggin wrote:
> powerpc 32-bit system call (and function) calling convention for 64-bit
> arguments requires the next available odd-pair (two sequential registers
> with the first being odd-numbered) from the standard register argument
> allocation.
>
> The first argument register is r3, so a 64-bit argument that appears at
> an even position in the argument list must skip a register (unless there
> were preceding 64-bit arguments, which might throw things off). This
> requires non-standard compat definitions to deal with the holes in the
> argument register allocation.
>
> With pt_regs syscall wrappers which use a standard mapper to map pt_regs
> GPRs to function arguments, 32-bit kernels hit the same basic problem,
> the standard definitions don't cope with the unused argument registers.
>
> Fix this by having 32-bit kernels share those syscall definitions with
> compat.
>
> Thanks to Jason for spending a lot of time finding and bisecting this and
> developing a trivial reproducer. The perfect bug report.
>
> Reported-by: Jason A. Donenfeld 
> Signed-off-by: Nicholas Piggin 

Reviewed-by: Arnd Bergmann 

This looks like a good approach to fix the regression. Comments
below only for additional thoughts, don't let that hold up
merging.

> +#ifdef CONFIG_PPC32
> +long sys_ppc_pread64(unsigned int fd,
> +  char __user *ubuf, compat_size_t count,
> +  u32 reg6, u32 pos1, u32 pos2);
> +long sys_ppc_pwrite64(unsigned int fd,
> +   const char __user *ubuf, compat_size_t count,
> +   u32 reg6, u32 pos1, u32 pos2);
> +long sys_ppc_readahead(int fd, u32 r4,
> +u32 offset1, u32 offset2, u32 count);
> +long sys_ppc_truncate64(const char __user *path, u32 reg4,
> + unsigned long len1, unsigned long len2);
> +long sys_ppc_ftruncate64(unsigned int fd, u32 reg4,
> +  unsigned long len1, unsigned long len2);
> +long sys_ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2,
> +  size_t len, int advice);
> +#endif

In general, I would leave out the #ifdef here and always declare
the functions, but it doesn't really matter.
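
For readers following the register-pair discussion, here is a minimal user-space sketch (not kernel code) of the rule the commit message describes. It assumes only what is stated above: arguments start at r3 and a 64-bit argument must start on an odd-numbered register. The helper names assign_regs() and join_u64() are hypothetical, the model does not enforce the limit on available argument registers, and join_u64() assumes the big-endian convention of high word in the lower-numbered register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * User-space model of powerpc 32-bit argument register allocation.
 * widths[i] is 32 or 64. first_reg[i] receives the GPR number that holds
 * argument i (for a 64-bit argument, the first register of the pair).
 * Returns the next free register after the last argument.
 */
static int assign_regs(const int *widths, size_t n, int *first_reg)
{
	int reg = 3;	/* r3 is the first argument register */

	for (size_t i = 0; i < n; i++) {
		assert(widths[i] == 32 || widths[i] == 64);
		if (widths[i] == 64) {
			if (reg % 2 == 0)
				reg++;	/* skip the hole, e.g. r6 for pread64 */
			first_reg[i] = reg;
			reg += 2;
		} else {
			first_reg[i] = reg;
			reg += 1;
		}
	}
	return reg;
}

/* Recombine a 64-bit value from its two 32-bit halves, high word first. */
static uint64_t join_u64(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}
```

With widths {32, 32, 32, 64} for pread64(fd, buf, count, pos), the model places pos in r7:r8 and leaves r6 unused, which lines up with the reg6 padding parameter in the sys_ppc_pread64() declaration quoted above. With {32, 64} for truncate64(path, len), it leaves r4 unused, matching the reg4 parameter.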

>   *
> - * These routines maintain argument size conversion between 32bit and 64bit
> - * environment.
> + * 32-bit system calls with 64-bit arguments pass those in register pairs.
> + * This must be specially dealt with on 64-bit kernels. The compat_arg_u64_dual
> + * in generic compat syscalls is not always usable because the register
> + * pairing is constrained depending on preceding arguments.
> + *
> + * An analogous problem exists on 32-bit kernels with ARCH_HAS_SYSCALL_WRAPPER,
> + * the defined system call functions take the pt_regs as an argument, and there
> + * is a mapping macro which maps registers to arguments
> + * (SC_POWERPC_REGS_TO_ARGS) which also does not deal with these 64-bit
> + * arguments.
> + *
> + * This file contains these system calls.

It would be nice to eventually move these next to the regular system
call definitions, with more generic naming and #ifdef checks. It looks
like these are the exact same ones that we have in
arch/arm64/kernel/sys32.c and arch/mips/kernel/linux32.c,
while the other five (x86, s390, sparc, riscv, parisc) use the
version without padding that was recently added as the generic
compat syscall set.

> @@ -47,7 +57,17 @@
>  #include 
>  #include 
> 
> -COMPAT_SYSCALL_DEFINE6(ppc_pread64,
> +#ifdef CONFIG_PPC32
> +#define PPC32_SYSCALL_DEFINE4	SYSCALL_DEFINE4
> +#define PPC32_SYSCALL_DEFINE5	SYSCALL_DEFINE5
> +#define PPC32_SYSCALL_DEFINE6	SYSCALL_DEFINE6
> +#else
> +#define PPC32_SYSCALL_DEFINE4	COMPAT_SYSCALL_DEFINE4
> +#define PPC32_SYSCALL_DEFINE5	COMPAT_SYSCALL_DEFINE5
> +#define PPC32_SYSCALL_DEFINE6	COMPAT_SYSCALL_DEFINE6
> +#endif

I'm fairly sure what you do here is correct, but I am not convinced
we actually need this as long as none of the syscalls take a signed
'long' argument that requires sign-extension for compat mode but
not native 32-bit kernels.

If we add a generic version, it would be nice to always just
use SYSCALL_DEFINEx instead of COMPAT_SYSCALL_DEFINEx. This would
also simplify the syscall table. Do you see a possible problem with
that?

 Arnd


Re: [PATCH] powerpc/kprobes: Fix null pointer reference in arch_prepare_kprobe()

2022-10-12 Thread Naveen N. Rao

Li Huafei wrote:


>>>   # echo 'p cmdline_proc_show' > kprobe_events
>>>   # echo 'p cmdline_proc_show+16' >> kprobe_events
>>
>> I think we should extend multiple_kprobes selftest to also place
>> contiguous probes to catch such errors.
>
> Yes. But each architecture implementation is different and it looks a
> little difficult to decide which offsets need to be tested.


I don't think we need to be accurate here. A test to simply try putting 
a probe at every byte within the first 256 bytes of a kernel function 
should help catch many such issues. Some of those probes will be 
rejected, but we can ignore errors.



- Naveen
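
A rough sketch of the "probe every byte offset" idea, for illustration only. It just builds the probe definitions in the "p symbol+offset" form used in the commands quoted in this thread. The helper names format_probe() and emit_probe_range() are hypothetical, and how a real selftest would feed each definition to kprobe_events and ignore the rejected ones is deliberately left out. The kernel's ftrace selftests are shell scripts; C is used here only to keep the illustration self-contained.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format one probe definition, "p symbol" for offset 0 or "p symbol+offset"
 * otherwise, into buf. Returns the length written (excluding the NUL), or -1
 * if buf is too small.
 */
static int format_probe(char *buf, size_t size, const char *symbol,
			unsigned int offset)
{
	int n;

	assert(buf != NULL && symbol != NULL);
	if (offset == 0)
		n = snprintf(buf, size, "p %s", symbol);
	else
		n = snprintf(buf, size, "p %s+%u", symbol, offset);

	return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/*
 * Write one definition per byte offset in [0, max_offset) to 'out', one per
 * line. Returns the number of lines written. A real test would expect some
 * of these probes to be rejected and would ignore those errors.
 */
static unsigned int emit_probe_range(FILE *out, const char *symbol,
				     unsigned int max_offset)
{
	char line[128];
	unsigned int written = 0;

	for (unsigned int off = 0; off < max_offset; off++) {
		if (format_probe(line, sizeof(line), symbol, off) < 0)
			continue;	/* skip anything that does not fit */
		if (fprintf(out, "%s\n", line) > 0)
			written++;
	}
	return written;
}
```

Calling emit_probe_range(stdout, "cmdline_proc_show", 256) would print the 256 candidate definitions for the first 256 bytes of that function.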


Re: [PATCH v2] perf: Rewrite core context handling

2022-10-12 Thread Ravi Bangoria
On 11-Oct-22 11:17 PM, Peter Zijlstra wrote:
> On Tue, Oct 11, 2022 at 04:02:56PM +0200, Peter Zijlstra wrote:
>> On Tue, Oct 11, 2022 at 06:49:55PM +0530, Ravi Bangoria wrote:
>>> On 11-Oct-22 4:59 PM, Peter Zijlstra wrote:
>>>> On Sat, Oct 08, 2022 at 11:54:24AM +0530, Ravi Bangoria wrote:
>>>>
>>>>> +static void perf_event_swap_task_ctx_data(struct perf_event_context *prev_ctx,
>>>>> +					  struct perf_event_context *next_ctx)
>>>>> +{
>>>>> +	struct perf_event_pmu_context *prev_epc, *next_epc;
>>>>> +
>>>>> +	if (!prev_ctx->nr_task_data)
>>>>> +		return;
>>>>> +
>>>>> +	prev_epc = list_first_entry(&prev_ctx->pmu_ctx_list,
>>>>> +				    struct perf_event_pmu_context,
>>>>> +				    pmu_ctx_entry);
>>>>> +	next_epc = list_first_entry(&next_ctx->pmu_ctx_list,
>>>>> +				    struct perf_event_pmu_context,
>>>>> +				    pmu_ctx_entry);
>>>>> +
>>>>> +	while (&prev_epc->pmu_ctx_entry != &prev_ctx->pmu_ctx_list &&
>>>>> +	       &next_epc->pmu_ctx_entry != &next_ctx->pmu_ctx_list) {
>>>>> +
>>>>> +		WARN_ON_ONCE(prev_epc->pmu != next_epc->pmu);
>>>>> +
>>>>> +		/*
>>>>> +		 * PMU specific parts of task perf context can require
>>>>> +		 * additional synchronization. As an example of such
>>>>> +		 * synchronization see implementation details of Intel
>>>>> +		 * LBR call stack data profiling;
>>>>> +		 */
>>>>> +		if (prev_epc->pmu->swap_task_ctx)
>>>>> +			prev_epc->pmu->swap_task_ctx(prev_epc, next_epc);
>>>>> +		else
>>>>> +			swap(prev_epc->task_ctx_data, next_epc->task_ctx_data);
>>>>
>>>> Did I forget to advance the iterators here?
>>>
>>> Yeah. Seems so. I overlooked it too.
>>
>> OK; so I'm not slowly going crazy staring at this code ;-) Let me go add
>> it now then. :-)
>>
>> But first I gotta taxi the kids around for a bit, bbl.
> 
> OK, so I've been going over the perf_event_pmu_context life-time thing
> as well, there were a bunch of XXXs there and I'm not sure I'm happy with
> things, but I'd also forgotten most of it.
> 
> Ideally epc works like it's a regular member of ctx -- locking wise that
> is, but I'm not sure we can make that stick -- see the ctx->mutex issues
> we have with put_ctx().
> 
> As such, I'm going to have to re-audit all the epc usage to see if
> pure ctx->lock is sufficient.
> 
> I did make epc RCU freed, because pretty much everything is and that
> was easy enough to make happen -- it means we don't need to worry about
> that.
> 
> But I'm going cross-eyed from staring at this all day, so more tomorrow.
> The below is what I currently have.
> 
> ---
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -833,13 +833,13 @@ struct perf_event {
>   *   `[1:n]-' `-[n:1]-> pmu <-[1:n]-'
>   *
>   *
> - * XXX destroy epc when empty
> - *   refcount, !rcu
> + * epc lifetime is refcount based and RCU freed (similar to perf_event_context).
> + * epc locking is as if it were a member of perf_event_context; specifically:
>   *
> - * XXX epc locking
> + *   modification, both: ctx->mutex && ctx->lock
> + *   reading, either: ctx->mutex || ctx->lock
>   *
> - *   event->pmu_ctx            ctx->mutex && inactive
> - *   ctx->pmu_ctx_list         ctx->mutex && ctx->lock
> + * XXX except this isn't true ... see put_pmu_ctx().
>   *
>   */
>  struct perf_event_pmu_context {
> @@ -857,6 +857,7 @@ struct perf_event_pmu_context {
>  	unsigned int			nr_events;
>  
>  	atomic_t			refcount; /* event <-> epc */
> +	struct rcu_head			rcu_head;
>  
>  	void				*task_ctx_data; /* pmu specific data */
>   /*
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -1727,6 +1727,10 @@ perf_event_groups_next(struct perf_event
>   return NULL;
>  }
>  
> +#define perf_event_groups_for_cpu_pmu(event, groups, cpu, pmu)		\
> +	for (event = perf_event_groups_first(groups, cpu, pmu, NULL);	\
> +	     event; event = perf_event_groups_next(event, pmu))
> +
>  /*
>   * Iterate through the whole groups tree.
>   */
> @@ -3366,6 +3370,14 @@ static void perf_event_sync_stat(struct
>   }
>  }
>  
> +#define list_for_each_entry_double(pos1, pos2, head1, head2, member) \
> + for (pos1 = list_first_entry(head1, typeof(*pos1), member), \
> +  pos2 = list_first_entry(head2, typeof(*pos2), member); \
> +  !list_entry_is_head(pos1, head1, member) &&\
> +  !list_entry_is_head(pos2, head2, member);  \
> +  pos1 = list_next_entry(pos1, member),  \
> +  pos2 = list_next_entry(pos2, member))
> +
>  static void perf_event_swap_task_ctx_data(struct perf_event_context 
> *prev_ctx,
> struct perf_event_context 
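
To make the "advance the iterators" point concrete, here is a small user-space sketch of walking two lists in lockstep. It is not the kernel implementation: struct list_head, list_init(), list_add_tail() and list_entry() are simplified stand-ins for the kernel helpers of the same names, struct epc is a hypothetical reduction of perf_event_pmu_context to the fields used here, and the pmu->swap_task_ctx callback path is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal user-space stand-ins for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for perf_event_pmu_context. */
struct epc {
	struct list_head entry;
	void *task_ctx_data;
};

/*
 * Walk both lists in lockstep and swap task_ctx_data pairwise, stopping as
 * soon as either list is exhausted. Returns the number of pairs swapped.
 */
static int swap_task_ctx_data(struct list_head *prev_list,
			      struct list_head *next_list)
{
	struct list_head *p, *n;
	int swapped = 0;

	assert(prev_list != NULL && next_list != NULL);
	p = prev_list->next;
	n = next_list->next;

	while (p != prev_list && n != next_list) {
		struct epc *pe = list_entry(p, struct epc, entry);
		struct epc *ne = list_entry(n, struct epc, entry);
		void *tmp = pe->task_ctx_data;

		pe->task_ctx_data = ne->task_ctx_data;
		ne->task_ctx_data = tmp;
		swapped++;

		/* Advance both cursors; without this the loop never ends. */
		p = p->next;
		n = n->next;
	}
	return swapped;
}
```

The list_for_each_entry_double() macro in the diff above folds the same two cursor updates into the for-loop header, so the loop body cannot forget them.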

Re: [PATCH v4 5/5] drm/ofdrm: Support big-endian scanout buffers

2022-10-12 Thread Arnd Bergmann
On Wed, Oct 12, 2022, at 10:27 AM, Thomas Zimmermann wrote:
> Am 12.10.22 um 09:44 schrieb Arnd Bergmann:
>> On Wed, Oct 12, 2022, at 9:40 AM, Thomas Zimmermann wrote:
>>> Am 12.10.22 um 09:17 schrieb Arnd Bergmann:
>>>> On Wed, Oct 12, 2022, at 8:46 AM, Thomas Zimmermann wrote:
>>>
>>>> Does qemu mark the device as having a particular endianness then, or
>>>> does it switch the layout of the framebuffer to match what the CPU
>>>> does?
>>>
>>> The latter. On neither architecture does qemu expose this flag. The
>>> default endianess corresponds to the host.
>> 
>> "host" as in the machine that qemu runs on, or the machine that is
>> being emulated? I suppose it would be broken either way, but in the
>> latter case, we could get away with detecting that the machine is
>> running under qemu.
>
> Sorry, my mistake. I meant "guest": the endianess of the framebuffer 
> corresponds to the endianess of the emulated machine.  Given that many 
> graphics cards support LE and BE modes, I assume that this behavior 
> mimics real-hardware systems.

Not really: While the hardware may be able to switch between
the modes, something has to actively set some hardware registers up
that way, but the offb/ofdrm driver has no interface for interacting
with that register, and the bootloader or firmware code that knows
about the register has no information about what kernel it will
eventually run. This is a bit architecture dependent, as e.g. on
MIPS, a bi-endian hardware platform has to run a bootloader with the
same endianness as the kernel, but on arm and powerpc the bootloader
is usually fixed and the kernel switches to its configured endianness
in the first few instructions after it gets entered.

 Arnd


Re: [PATCH v4 5/5] drm/ofdrm: Support big-endian scanout buffers

2022-10-12 Thread Thomas Zimmermann

Hi

On 12.10.22 at 09:44, Arnd Bergmann wrote:
> On Wed, Oct 12, 2022, at 9:40 AM, Thomas Zimmermann wrote:
>> On 12.10.22 at 09:17, Arnd Bergmann wrote:
>>> On Wed, Oct 12, 2022, at 8:46 AM, Thomas Zimmermann wrote:
>>>
>>> Does qemu mark the device as having a particular endianness then, or
>>> does it switch the layout of the framebuffer to match what the CPU
>>> does?
>>
>> The latter. On neither architecture does qemu expose this flag. The
>> default endianness corresponds to the host.
>
> "host" as in the machine that qemu runs on, or the machine that is
> being emulated? I suppose it would be broken either way, but in the
> latter case, we could get away with detecting that the machine is
> running under qemu.

Sorry, my mistake. I meant "guest": the endianness of the framebuffer
corresponds to the endianness of the emulated machine.  Given that many
graphics cards support LE and BE modes, I assume that this behavior
mimics real-hardware systems.

Best regards
Thomas

>
>  Arnd


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev




Re: Issues with the first PowerPC updates for the kernel 6.1

2022-10-12 Thread Andrew Donnellan
On Wed, 2022-10-12 at 08:51 +0200, Christian Zigotzky wrote:
> Hi All,
> 
> I use the Nemo board with a PASemi PA6T CPU and have some issues
> since the first PowerPC updates for the kernel 6.1.
> 
> I successfully compiled the git kernel with the first PowerPC updates
> two days ago.
> 
> Unfortunately this kernel is really dangerous. Many things, for
> example Network Manager and LightDM, don't work anymore and produced
> several gigabytes of config files until the partition was full.
> 
> I deleted some files like the resolv.conf that had a size over 200
> GB!
> 
> Unfortunately, MintPPC was still damaged. For example LightDM doesn't
> work anymore and the MATE desktop doesn't display any icons anymore
> because Caja wasn't able to reserve memory anymore.
> 
> In this case, bisecting isn't an option and I have to wait some
> weeks. It is really difficult to find the issue if the userland gets
> damaged again and again.

Could you try with
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20221012035335.866440-1-npig...@gmail.com/
to see if your issues are related to that?

Andrew

-- 
Andrew Donnellan    OzLabs, ADL Canberra
a...@linux.ibm.com   IBM Australia Limited



Re: [PATCH 1/2] powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs

2022-10-12 Thread Andrew Donnellan
On Wed, 2022-10-12 at 13:53 +1000, Nicholas Piggin wrote:
> powerpc 32-bit system call (and function) calling convention for 64-bit
> arguments requires the next available odd-pair (two sequential registers
> with the first being odd-numbered) from the standard register argument
> allocation.
>
> The first argument register is r3, so a 64-bit argument that appears at
> an even position in the argument list must skip a register (unless there
> were preceding 64-bit arguments, which might throw things off). This
> requires non-standard compat definitions to deal with the holes in the
> argument register allocation.
>
> With pt_regs syscall wrappers which use a standard mapper to map pt_regs
> GPRs to function arguments, 32-bit kernels hit the same basic problem,
> the standard definitions don't cope with the unused argument registers.
>
> Fix this by having 32-bit kernels share those syscall definitions with
> compat.
>
> Thanks to Jason for spending a lot of time finding and bisecting this and
> developing a trivial reproducer. The perfect bug report.
> 
> Reported-by: Jason A. Donenfeld 
> Signed-off-by: Nicholas Piggin 
> ---

Fixes: 7e92e01b72452 ("powerpc: Provide syscall wrapper")

-- 
Andrew Donnellan    OzLabs, ADL Canberra
a...@linux.ibm.com   IBM Australia Limited



Re: [PATCH v4 5/5] drm/ofdrm: Support big-endian scanout buffers

2022-10-12 Thread Arnd Bergmann
On Wed, Oct 12, 2022, at 9:40 AM, Thomas Zimmermann wrote:
> Am 12.10.22 um 09:17 schrieb Arnd Bergmann:
>> On Wed, Oct 12, 2022, at 8:46 AM, Thomas Zimmermann wrote:
>
>> Does qemu mark the device has having a particular endianess then, or
>> does it switch the layout of the framebuffer to match what the CPU
>> does?
>
> The latter. On neither architecture does qemu expose this flag. The 
> default endianess corresponds to the host.

"host" as in the machine that qemu runs on, or the machine that is
being emulated? I suppose it would be broken either way, but in the
latter case, we could get away with detecting that the machine is
running under qemu.

Arnd


Re: [PATCH v4 5/5] drm/ofdrm: Support big-endian scanout buffers

2022-10-12 Thread Thomas Zimmermann

Hi

On 12.10.22 at 09:17, Arnd Bergmann wrote:
> On Wed, Oct 12, 2022, at 8:46 AM, Thomas Zimmermann wrote:
>> On 11.10.22 at 22:06, Arnd Bergmann wrote:
>>> On Tue, Oct 11, 2022, at 1:30 PM, Thomas Zimmermann wrote:
>>>> On 11.10.22 at 09:46, Javier Martinez Canillas wrote:
>>>>>> +static bool display_get_big_endian_of(struct drm_device *dev, struct device_node *of_node)
>>>>>> +{
>>>>>> +	bool big_endian;
>>>>>> +
>>>>>> +#ifdef __BIG_ENDIAN
>>>>>> +	big_endian = true;
>>>>>> +	if (of_get_property(of_node, "little-endian", NULL))
>>>>>> +		big_endian = false;
>>>>>> +#else
>>>>>> +	big_endian = false;
>>>>>> +	if (of_get_property(of_node, "big-endian", NULL))
>>>>>> +		big_endian = true;
>>>>>> +#endif
>>>>>> +
>>>>>> +	return big_endian;
>>>>>> +}
>>>>>> +
>>>>>
>>>>> Ah, I see. The heuristic then is whether the build is BE or LE or if the Device
>>>>> Tree has an explicit node defining the endianness. The patch looks good to me:
>>>>
>>>> Yes. I took this test from offb.
>>>
>>> Has the driver been tested with little-endian kernels though? While
>>> ppc32 kernels are always BE, you can build kernels as either big-endian
>>> or little-endian for most (modern) powerpc64 and arm/arm64 hardware,
>>> and I don't see why that should change the defaults of the driver
>>> when describing the same framebuffer hardware.
>>
>> Yes, I tested this on qemu's ppc64le and ppc64.
>
> Does qemu mark the device as having a particular endianness then, or
> does it switch the layout of the framebuffer to match what the CPU
> does?

The latter. On neither architecture does qemu expose this flag. The
default endianness corresponds to the host.

Best regards
Thomas

>
> I've seen other cases where devices in qemu were defined using an
> arbitrary definition of "cpu-endian", which is generally not how
> real hardware works.
>
>  Arnd


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev




RE: [GIT PULL] virtio: fixes, features

2022-10-12 Thread Angus Chen


> -Original Message-
> From: Michael Ellerman 
> Sent: Wednesday, October 12, 2022 2:21 PM
> To: Michael S. Tsirkin 
> Cc: k...@vger.kernel.org; virtualizat...@lists.linux-foundation.org;
> net...@vger.kernel.org; linux-ker...@vger.kernel.org;
> alvaro.ka...@solid-run.com; Angus Chen ;
> gav...@nvidia.com; jasow...@redhat.com; lingshan@intel.com;
> m...@redhat.com; wangdem...@inspur.com; xiujianf...@huawei.com;
> linuxppc-dev@lists.ozlabs.org; Linus Torvalds 
> Subject: Re: [GIT PULL] virtio: fixes, features
> 
> "Michael S. Tsirkin"  writes:
> > The following changes since commit
> 4fe89d07dcc2804c8b562f6c7896a45643d34b2f:
> >
> >   Linux 6.0 (2022-10-02 14:09:07 -0700)
> >
> > are available in the Git repository at:
> >
> >   https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git
> tags/for_linus
> >
> > for you to fetch changes up to
> 71491c54eafa318fdd24a1f26a1c82b28e1ac21d:
> >
> >   virtio_pci: don't try to use intxif pin is zero (2022-10-07 20:00:44 
> > -0400)
> >
> > 
> > virtio: fixes, features
> >
> > 9k mtu perf improvements
> > vdpa feature provisioning
> > virtio blk SECURE ERASE support
> >
> > Fixes, cleanups all over the place.
> >
> > Signed-off-by: Michael S. Tsirkin 
> >
> > 
> > Alvaro Karsz (1):
> >   virtio_blk: add SECURE ERASE command support
> >
> > Angus Chen (1):
> >   virtio_pci: don't try to use intxif pin is zero
> 
> This commit breaks virtio_pci for me on powerpc, when running as a qemu
> guest.
> 
> vp_find_vqs() bails out because pci_dev->pin == 0.
> 
> But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would
> succeed if we called it - which is what the code used to do.
> 
> I think this happens because pci_dev->pin is not populated in
> pci_assign_irq().
> 
> I would absolutely believe this is bug in our PCI code, but I think it
> may also affect other platforms that use of_irq_parse_and_map_pci().
> 
> cheers
Hi, sorry for replying again. If I change the code like below:
	pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
	if (!pin) {
		warn_on("some thing");
		return 0;
	}
it will fix the original bug.
Or should we populate the pci_dev->pin value correctly according to the PCI spec
for the "Interrupt Pin" register?

I have no idea about it, any suggestions are welcome.
Thank you.


Re: [PATCH v4 5/5] drm/ofdrm: Support big-endian scanout buffers

2022-10-12 Thread Michal Suchánek
On Wed, Oct 12, 2022 at 08:29:39AM +0200, Arnd Bergmann wrote:
> On Tue, Oct 11, 2022, at 11:38 PM, Michal Suchánek wrote:
> > On Tue, Oct 11, 2022 at 10:06:59PM +0200, Arnd Bergmann wrote:
> >> On Tue, Oct 11, 2022, at 1:30 PM, Thomas Zimmermann wrote:
> >> > Am 11.10.22 um 09:46 schrieb Javier Martinez Canillas:
> >> >>> +static bool display_get_big_endian_of(struct drm_device *dev, struct 
> >> >>> device_node *of_node)
> >> >>> +{
> >> >>> +  bool big_endian;
> >> >>> +
> >> >>> +#ifdef __BIG_ENDIAN
> >> >>> +  big_endian = true;
> >> >>> +  if (of_get_property(of_node, "little-endian", NULL))
> >> >>> +  big_endian = false;
> >> >>> +#else
> >> >>> +  big_endian = false;
> >> >>> +  if (of_get_property(of_node, "big-endian", NULL))
> >> >>> +  big_endian = true;
> >> >>> +#endif
> >> >>> +
> >> >>> +  return big_endian;
> >> >>> +}
> >> >>> +
> >> >> 
> >> >> Ah, I see. The heuristic then is whether the build is BE or LE or if 
> >> >> the Device
> >> >> Tree has an explicit node defining the endianess. The patch looks good 
> >> >> to me:
> >> >
> >> > Yes. I took this test from offb.
> >> 
> >> Has the driver been tested with little-endian kernels though? While
> >> ppc32 kernels are always BE, you can build kernels as either big-endian
> >> or little-endian for most (modern) powerpc64 and arm/arm64 hardware,
> >> and I don't see why that should change the defaults of the driver
> >> when describing the same framebuffer hardware.
> >
> > The original code was added with
> > commit 7f29b87a7779 ("powerpc: offb: add support for foreign endianness")
> >
> > The hardware is either big-endian or runtime-switchable-endian.
> 
> Are you referring to CPU hardware or framebuffer hardware here?
CPU hardware
> 
> > It makes
> > sense to assume big-endian when running big-endian and the DT does not
> > specify endian which is likely on a historical system.
> 
> Agreed, assuming big-endian here clearly makes sense.
> 
> > It also makes sense to assume that on system with
> > runtime-switchable-endian the DT specifies the framebuffer endian.
> >
> > If systems that only do little-endian exist or emerge later then it also
> > makes sense to assume that the framebuffer matches the host if not
> > specified.
> >
> > I don't really see a problem here.
> >
> > BTW is this used on arm and on what platform?
> 
> I'm not aware of any users on Arm, most likely they all use
> simplefb/simpledrm or a gpu specific binding. There might be
> users on sparc, but they would obviously be big-endian
> as well.
> 
> > I do not see any bindings in dts.
> 
> Right, that is the real problem I see as well. I found the original
> CHRP binding document at
> https://www.devicetree.org/open-firmware/bindings/devices/html/lfb-1_0d.html
> 
> Unfortunately, this only specifies an 8-bit-per-pixel mode, and the
> multi-byte pixel support that was added in linux-2.1.125 was
> probably powermac specific without a public specification.
> 
> I think ideally we should add a binding document that describes what
> the driver actually expects, but in this case I would just drop the
> #ifdef check and always assume the framebuffer is big-endian unless
> the "little-endian" property is set, in order to have a sensible
> definition that does not depend on what OS (i.e. Linux
> CONFIG_CPU_BIG_ENDIAN) you are running.
> 
>Arnd
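
As a sketch of the alternative described in the quoted text (drop the #ifdef and treat the framebuffer as big-endian unless "little-endian" is set), the following user-space model may help. It is not driver code: struct fake_node and node_has_prop() are hypothetical stand-ins for a device-tree node and a property lookup, and display_is_big_endian() is an illustrative name, not a function from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for a device-tree node: a
 * NULL-terminated list of property names. Real driver code would query
 * struct device_node instead.
 */
struct fake_node {
	const char *const *props;
};

static bool node_has_prop(const struct fake_node *node, const char *name)
{
	assert(node != NULL && name != NULL);
	if (!node->props)
		return false;
	for (size_t i = 0; node->props[i]; i++)
		if (strcmp(node->props[i], name) == 0)
			return true;
	return false;
}

/*
 * Variant without the #ifdef: the framebuffer is treated as big-endian
 * unless the node carries a "little-endian" property, regardless of how
 * the kernel itself was built.
 */
static bool display_is_big_endian(const struct fake_node *node)
{
	return !node_has_prop(node, "little-endian");
}
```

The point of this shape is that the result depends only on the node's properties, so a big-endian and a little-endian kernel looking at the same node reach the same answer.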


RE: [GIT PULL] virtio: fixes, features

2022-10-12 Thread Angus Chen


> -Original Message-
> From: Michael Ellerman 
> Sent: Wednesday, October 12, 2022 2:21 PM
> To: Michael S. Tsirkin 
> Cc: k...@vger.kernel.org; virtualizat...@lists.linux-foundation.org;
> net...@vger.kernel.org; linux-ker...@vger.kernel.org;
> alvaro.ka...@solid-run.com; Angus Chen ;
> gav...@nvidia.com; jasow...@redhat.com; lingshan@intel.com;
> m...@redhat.com; wangdem...@inspur.com; xiujianf...@huawei.com;
> linuxppc-dev@lists.ozlabs.org; Linus Torvalds 
> Subject: Re: [GIT PULL] virtio: fixes, features
> 
> "Michael S. Tsirkin"  writes:
> > The following changes since commit
> 4fe89d07dcc2804c8b562f6c7896a45643d34b2f:
> >
> >   Linux 6.0 (2022-10-02 14:09:07 -0700)
> >
> > are available in the Git repository at:
> >
> >   https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git
> tags/for_linus
> >
> > for you to fetch changes up to
> 71491c54eafa318fdd24a1f26a1c82b28e1ac21d:
> >
> >   virtio_pci: don't try to use intxif pin is zero (2022-10-07 20:00:44 
> > -0400)
> >
> > 
> > virtio: fixes, features
> >
> > 9k mtu perf improvements
> > vdpa feature provisioning
> > virtio blk SECURE ERASE support
> >
> > Fixes, cleanups all over the place.
> >
> > Signed-off-by: Michael S. Tsirkin 
> >
> > 
> > Alvaro Karsz (1):
> >   virtio_blk: add SECURE ERASE command support
> >
> > Angus Chen (1):
> >   virtio_pci: don't try to use intxif pin is zero
> 
> This commit breaks virtio_pci for me on powerpc, when running as a qemu
> guest.
> 
> vp_find_vqs() bails out because pci_dev->pin == 0.
> 
> But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would
> succeed if we called it - which is what the code used to do.
> 
> I think this happens because pci_dev->pin is not populated in
> pci_assign_irq().
Yes, you are right.
> 
> I would absolutely believe this is bug in our PCI code, but I think it
> may also affect other platforms that use of_irq_parse_and_map_pci().
> 
Should I just revert or submit a new version?
> cheers


Re: [PATCH v4 5/5] drm/ofdrm: Support big-endian scanout buffers

2022-10-12 Thread Arnd Bergmann
On Wed, Oct 12, 2022, at 8:46 AM, Thomas Zimmermann wrote:
> Am 11.10.22 um 22:06 schrieb Arnd Bergmann:
>> On Tue, Oct 11, 2022, at 1:30 PM, Thomas Zimmermann wrote:
>>> Am 11.10.22 um 09:46 schrieb Javier Martinez Canillas:
>>>>> +static bool display_get_big_endian_of(struct drm_device *dev, struct device_node *of_node)
>>>>> +{
>>>>> +	bool big_endian;
>>>>> +
>>>>> +#ifdef __BIG_ENDIAN
>>>>> +	big_endian = true;
>>>>> +	if (of_get_property(of_node, "little-endian", NULL))
>>>>> +		big_endian = false;
>>>>> +#else
>>>>> +	big_endian = false;
>>>>> +	if (of_get_property(of_node, "big-endian", NULL))
>>>>> +		big_endian = true;
>>>>> +#endif
>>>>> +
>>>>> +	return big_endian;
>>>>> +}
>>>>> +
>>>>
>>>> Ah, I see. The heuristic then is whether the build is BE or LE or if the Device
>>>> Tree has an explicit node defining the endianness. The patch looks good to me:
>>>
>>> Yes. I took this test from offb.
>> 
>> Has the driver been tested with little-endian kernels though? While
>> ppc32 kernels are always BE, you can build kernels as either big-endian
>> or little-endian for most (modern) powerpc64 and arm/arm64 hardware,
>> and I don't see why that should change the defaults of the driver
>> when describing the same framebuffer hardware.
>
> Yes, I tested this on qemu's ppc64le and ppc64.

Does qemu mark the device as having a particular endianness then, or
does it switch the layout of the framebuffer to match what the CPU
does?

I've seen other cases where devices in qemu were defined using an
arbitrary definition of "cpu-endian", which is generally not how
real hardware works.

Arnd


Issues with the first PowerPC updates for the kernel 6.1

2022-10-12 Thread Christian Zigotzky
Hi All,

I use the Nemo board with a PASemi PA6T CPU and have some issues since the 
first PowerPC updates for the kernel 6.1.

I successfully compiled the git kernel with the first PowerPC updates two days 
ago.

Unfortunately this kernel is really dangerous. Many things, for example Network
Manager and LightDM, don't work anymore and produced several gigabytes of config
files until the partition was full.

I deleted some files like the resolv.conf that had a size over 200 GB!

Unfortunately, MintPPC was still damaged. For example LightDM doesn't work 
anymore and the MATE desktop doesn't display any icons anymore because Caja 
wasn't able to reserve memory anymore.

In this case, bisecting isn't an option and I have to wait some weeks. It is
really difficult to find the issue if the userland gets damaged again and again.

Cheers,
Christian


Re: [PATCH v4 5/5] drm/ofdrm: Support big-endian scanout buffers

2022-10-12 Thread Thomas Zimmermann

Hi

Am 11.10.22 um 22:06 schrieb Arnd Bergmann:

On Tue, Oct 11, 2022, at 1:30 PM, Thomas Zimmermann wrote:

Am 11.10.22 um 09:46 schrieb Javier Martinez Canillas:

+static bool display_get_big_endian_of(struct drm_device *dev, struct device_node *of_node)
+{
+   bool big_endian;
+
+#ifdef __BIG_ENDIAN
+   big_endian = true;
+   if (of_get_property(of_node, "little-endian", NULL))
+       big_endian = false;
+#else
+   big_endian = false;
+   if (of_get_property(of_node, "big-endian", NULL))
+       big_endian = true;
+#endif
+
+   return big_endian;
+}
+


Ah, I see. The heuristic then is whether the build is BE or LE or if the Device
Tree has an explicit node defining the endianness. The patch looks good to me:


Yes. I took this test from offb.


Has the driver been tested with little-endian kernels though? While
ppc32 kernels are always BE, you can build kernels as either big-endian
or little-endian for most (modern) powerpc64 and arm/arm64 hardware,
and I don't see why that should change the defaults of the driver
when describing the same framebuffer hardware.


Yes, I tested this on qemu's ppc64le and ppc64.

Best regards
Thomas



I could understand having a default to e.g. big-endian on all powerpc and
a default for little-endian on all arm, but having it tied to the
way the kernel is built seems wrong, and doesn't make sense in a
DT binding either.

   Arnd


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev




Re: [PATCH v4 5/5] drm/ofdrm: Support big-endian scanout buffers

2022-10-12 Thread Arnd Bergmann
On Tue, Oct 11, 2022, at 11:38 PM, Michal Suchánek wrote:
> On Tue, Oct 11, 2022 at 10:06:59PM +0200, Arnd Bergmann wrote:
>> On Tue, Oct 11, 2022, at 1:30 PM, Thomas Zimmermann wrote:
>> > On 11.10.22 at 09:46, Javier Martinez Canillas wrote:
>> >>> +static bool display_get_big_endian_of(struct drm_device *dev, struct device_node *of_node)
>> >>> +{
>> >>> +   bool big_endian;
>> >>> +
>> >>> +#ifdef __BIG_ENDIAN
>> >>> +   big_endian = true;
>> >>> +   if (of_get_property(of_node, "little-endian", NULL))
>> >>> +       big_endian = false;
>> >>> +#else
>> >>> +   big_endian = false;
>> >>> +   if (of_get_property(of_node, "big-endian", NULL))
>> >>> +       big_endian = true;
>> >>> +#endif
>> >>> +
>> >>> +   return big_endian;
>> >>> +}
>> >>> +
>> >> 
>> >> Ah, I see. The heuristic then is whether the build is BE or LE or if the Device
>> >> Tree has an explicit node defining the endianness. The patch looks good to me:
>> >
>> > Yes. I took this test from offb.
>> 
>> Has the driver been tested with little-endian kernels though? While
>> ppc32 kernels are always BE, you can build kernels as either big-endian
>> or little-endian for most (modern) powerpc64 and arm/arm64 hardware,
>> and I don't see why that should change the defaults of the driver
>> when describing the same framebuffer hardware.
>
> The original code was added with
> commit 7f29b87a7779 ("powerpc: offb: add support for foreign endianness")
>
> The hardware is either big-endian or runtime-switchable-endian.

Are you referring to CPU hardware or framebuffer hardware here?

> It makes
> sense to assume big-endian when running big-endian and the DT does not
> specify the endianness, which is likely on a historical system.

Agreed, assuming big-endian here clearly makes sense.

> It also makes sense to assume that on a system with
> runtime-switchable endianness the DT specifies the framebuffer endianness.
>
> If systems that only do little-endian exist or emerge later then it also
> makes sense to assume that the framebuffer matches the host if not
> specified.
>
> I don't really see a problem here.
>
> BTW is this used on arm and on what platform?

I'm not aware of any users on Arm, most likely they all use
simplefb/simpledrm or a gpu specific binding. There might be
users on sparc, but they would obviously be big-endian
as well.

> I do not see any bindings in dts.

Right, that is the real problem I see as well. I found the original
CHRP binding document at
https://www.devicetree.org/open-firmware/bindings/devices/html/lfb-1_0d.html

Unfortunately, this only specifies an 8-bit-per-pixel mode, and the
multi-byte pixel support that was added in linux-2.1.125 was
probably powermac specific without a public specification.

I think ideally we should add a binding document that describes what
the driver actually expects, but in this case I would just drop the
#ifdef check and always assume the framebuffer is big-endian unless
the "little-endian" property is set, in order to have a sensible
definition that does not depend on what OS (i.e. Linux
CONFIG_CPU_BIG_ENDIAN) you are running.

   Arnd
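
The difference between the heuristic in the quoted patch and the suggestion
above can be sketched as a small userspace model. This is not driver code:
struct model_node and model_has_property() are hypothetical stand-ins for
struct device_node and of_get_property(), and the kernel's build-time
__BIG_ENDIAN check is modelled as a plain boolean parameter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device tree node: NULL-terminated property names. */
struct model_node {
	const char *const *props;
};

/* Hypothetical stand-in for of_get_property(): true if the property exists. */
static bool model_has_property(const struct model_node *node, const char *name)
{
	for (size_t i = 0; node->props && node->props[i]; i++) {
		if (strcmp(node->props[i], name) == 0)
			return true;
	}
	return false;
}

/* Quoted patch: the default follows the kernel build, the DT may override it. */
static bool big_endian_per_patch(const struct model_node *node, bool kernel_is_be)
{
	if (kernel_is_be)
		return !model_has_property(node, "little-endian");
	return model_has_property(node, "big-endian");
}

/* Suggestion: big-endian unless "little-endian" is set, whatever the kernel build. */
static bool big_endian_per_proposal(const struct model_node *node)
{
	return !model_has_property(node, "little-endian");
}
```

For a node with neither property, the first function gives a different answer
on a little-endian kernel than on a big-endian one, while the second always
reports big-endian. That is the dependency on the running OS that the
suggestion is meant to remove.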


Re: [GIT PULL] virtio: fixes, features

2022-10-12 Thread Michael Ellerman
"Michael S. Tsirkin"  writes:
> The following changes since commit 4fe89d07dcc2804c8b562f6c7896a45643d34b2f:
>
>   Linux 6.0 (2022-10-02 14:09:07 -0700)
>
> are available in the Git repository at:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
>
> for you to fetch changes up to 71491c54eafa318fdd24a1f26a1c82b28e1ac21d:
>
>   virtio_pci: don't try to use intxif pin is zero (2022-10-07 20:00:44 -0400)
>
> 
> virtio: fixes, features
>
> 9k mtu perf improvements
> vdpa feature provisioning
> virtio blk SECURE ERASE support
>
> Fixes, cleanups all over the place.
>
> Signed-off-by: Michael S. Tsirkin 
>
> 
> Alvaro Karsz (1):
>   virtio_blk: add SECURE ERASE command support
>
> Angus Chen (1):
>   virtio_pci: don't try to use intxif pin is zero

This commit breaks virtio_pci for me on powerpc, when running as a qemu
guest.

vp_find_vqs() bails out because pci_dev->pin == 0.

But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would
succeed if we called it - which is what the code used to do.

I think this happens because pci_dev->pin is not populated in
pci_assign_irq().

I would absolutely believe this is a bug in our PCI code, but I think it
may also affect other platforms that use of_irq_parse_and_map_pci().

cheers
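
The failure mode described above can be sketched as a reduced userspace
model. This is only an illustration of the reported control flow, not the
virtio_pci source: struct model_pci_dev, the function names and the -ENODEV
return values are assumptions made for the example.

```c
#include <errno.h>

/* Hypothetical, reduced stand-in for the two struct pci_dev fields discussed. */
struct model_pci_dev {
	unsigned int irq;	/* 0 means no interrupt was mapped */
	unsigned char pin;	/* 0 means no INTx pin was recorded */
};

/* Stand-in for the INTx fallback: succeeds whenever an IRQ number is available. */
static int model_find_vqs_intx(const struct model_pci_dev *dev)
{
	return dev->irq ? 0 : -ENODEV;
}

/* Fallback as it used to behave: always try INTx. */
static int model_fallback_before(const struct model_pci_dev *dev)
{
	return model_find_vqs_intx(dev);
}

/* Fallback with the new check: give up early when no pin is recorded. */
static int model_fallback_after(const struct model_pci_dev *dev)
{
	if (!dev->pin)
		return -ENODEV;
	return model_find_vqs_intx(dev);
}
```

With irq populated and pin left at zero, which is the state reported for the
powerpc qemu guest, the older flow succeeds while the newer one returns an
error before INTx is ever tried.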


Re: [PATCH] powerpc/kprobes: Fix null pointer reference in arch_prepare_kprobe()

2022-10-12 Thread Li Huafei


On 2022/9/30 17:47, Naveen N. Rao wrote:
> Li Huafei wrote:
>> I found a null pointer reference in arch_prepare_kprobe():
> 
> Good find!
> 

Hi Naveen,

Thank you for the review.

>>
>>   # echo 'p cmdline_proc_show' > kprobe_events
>>   # echo 'p cmdline_proc_show+16' >> kprobe_events
> 
> I think we should extend multiple_kprobes selftest to also place
> contiguous probes to catch such errors.
> 
Yes. But each architecture implementation is different and it looks a
little difficult to decide which offsets need to be tested.

>>   [   67.278533][  T122] Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
>>   [   67.279326][  T122] BUG: Kernel NULL pointer dereference on read
>> at 0x
>>   [   67.279738][  T122] Faulting instruction address: 0xc0050bfc
>>   [   67.280486][  T122] Oops: Kernel access of bad area, sig: 11 [#1]
>>   [   67.280846][  T122] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048
>> NUMA PowerNV
>>   [   67.281435][  T122] Modules linked in:
>>   [   67.281903][  T122] CPU: 0 PID: 122 Comm: sh Not tainted
>> 6.0.0-rc3-7-gdcf8e5633e2e #10
>>   [   67.282547][  T122] NIP:  c0050bfc LR: c0050bec
>> CTR:5bdc
>>   [   67.282920][  T122] REGS: c000348475b0 TRAP: 0300   Not
>> tainted (6.0.0-rc3-7-gdcf8e5633e2e)
>>   [   67.283424][  T122] MSR:  90009033
>>  CR: 88002444  XER: 20040006
>>   [   67.284023][  T122] CFAR: c022d100 DAR: 
>> DSISR: 4000 IRQMASK: 0
>>   [   67.284023][  T122] GPR00: c0050bec c00034847850
>> c13f6100 c1fb7718
>>   [   67.284023][  T122] GPR04: c0515c10 c0e5fe08
>> c133da60 c4839300
>>   [   67.284023][  T122] GPR08: c14ffb98 
>> c0515c0c c0e18576
>>   [   67.284023][  T122] GPR12: c0e60170 c15a
>> 0001155e0460 
>>   [   67.284023][  T122] GPR16:  7fffe8eeb3c8
>> 000116320728 
>>   [   67.284023][  T122] GPR20: 000116320720 
>> c12fa918 0006
>>   [   67.284023][  T122] GPR24: c14ffb98 c11ed360
>>  c1fb7928
>>   [   67.284023][  T122] GPR28:  
>> 7c0802a6 c1fb7918
>>   [   67.287799][  T122] NIP [c0050bfc]
>> arch_prepare_kprobe+0x10c/0x2d0
>>   [   67.288490][  T122] LR [c0050bec]
>> arch_prepare_kprobe+0xfc/0x2d0
>>   [   67.289025][  T122] Call Trace:
>>   [   67.289268][  T122] [c00034847850] [c12f77a0]
>> 0xc12f77a0 (unreliable)
>>   [   67.28][  T122] [c000348478d0] [c0231320]
>> register_kprobe+0x3c0/0x7a0
>>   [   67.290439][  T122] [c00034847940] [c02938c0]
>> __register_trace_kprobe+0x140/0x1a0
>>   [   67.290898][  T122] [c000348479b0] [c02944c4]
>> __trace_kprobe_create+0x794/0x1040
>>   [   67.291330][  T122] [c00034847b60] [c02a1614]
>> trace_probe_create+0xc4/0xe0
>>   [   67.291717][  T122] [c00034847bb0] [c029363c]
>> create_or_delete_trace_kprobe+0x2c/0x80
>>   [   67.292158][  T122] [c00034847bd0] [c0264420]
>> trace_parse_run_command+0xf0/0x210
>>   [   67.292611][  T122] [c00034847c70] [c02934a0]
>> probes_write+0x20/0x40
>>   [   67.292996][  T122] [c00034847c90] [c045e98c]
>> vfs_write+0xfc/0x450
>>   [   67.293356][  T122] [c00034847d50] [c045eec4]
>> ksys_write+0x84/0x140
>>   [   67.293716][  T122] [c00034847da0] [c002e4fc]
>> system_call_exception+0x17c/0x3a0
>>   [   67.294186][  T122] [c00034847e10] [c000c0e8]
>> system_call_vectored_common+0xe8/0x278
>>   [   67.294680][  T122] --- interrupt: 3000 at 0x7fffa5682de0
>>   [   67.294937][  T122] NIP:  7fffa5682de0 LR: 
>> CTR:
>>   [   67.295313][  T122] REGS: c00034847e80 TRAP: 3000   Not
>> tainted (6.0.0-rc3-7-gdcf8e5633e2e)
>>   [   67.295725][  T122] MSR:  9280f033
>>   CR: 44002408  XER: 
>>   [   67.296291][  T122] IRQMASK: 0
>>   [   67.296291][  T122] GPR00: 0004 7fffe8eeaec0
>> 7fffa5757300 0001
>>   [   67.296291][  T122] GPR04: 000116329c60 0017
>> 00116329 
>>   [   67.296291][  T122] GPR08: 0006 
>>  
>>   [   67.296291][  T122] GPR12:  7fffa580ac60
>> 0001155e0460 
>>   [   67.296291][  T122] GPR16:  7fffe8eeb3c8
>> 000116320728 
>>   [   67.296291][  T122] GPR20: 000116320720 
>>  0002
>>   [   67.296291][  T122] GPR24: 0001163206f0 0020
>> 7fffe8eeafa0 0001
>>   [   67.296291][  T122] GPR28:  0017
>> 000116329c60