Re: [PATCH v5 0/7] treewide cleanup of random integer usage

2022-10-08 Thread Kees Cook
On Fri, Oct 07, 2022 at 11:53:52PM -0600, Jason A. Donenfeld wrote:
> This is a five part treewide cleanup of random integer handling. The
> rules for random integers are:

Reviewing the delta between of my .cocci rules and your v5, everything
matches, except for get_random_int() conversions for files not in
your tree:

diff --git a/drivers/gpu/drm/tests/drm_buddy_test.c 
b/drivers/gpu/drm/tests/drm_buddy_test.c
index 7a2b2d6bc3fe..62f69589a72d 100644
--- a/drivers/gpu/drm/tests/drm_buddy_test.c
+++ b/drivers/gpu/drm/tests/drm_buddy_test.c
@@ -729,7 +729,7 @@ static void drm_test_buddy_alloc_limit(struct kunit *test)
 static int drm_buddy_init_test(struct kunit *test)
 {
while (!random_seed)
-   random_seed = get_random_int();
+   random_seed = get_random_u32();
 
return 0;
 }
diff --git a/drivers/gpu/drm/tests/drm_mm_test.c 
b/drivers/gpu/drm/tests/drm_mm_test.c
index 659d1af4dca7..c4b66eeae203 100644
--- a/drivers/gpu/drm/tests/drm_mm_test.c
+++ b/drivers/gpu/drm/tests/drm_mm_test.c
@@ -2212,7 +2212,7 @@ static void drm_test_mm_color_evict_range(struct kunit 
*test)
 static int drm_mm_init_test(struct kunit *test)
 {
while (!random_seed)
-   random_seed = get_random_int();
+   random_seed = get_random_u32();
 
return 0;
 }

So, I guess I mean to say that "prandom: remove unused functions" is
going to cause some pain. :) Perhaps don't push that to -next, and do a
final pass next merge window to catch any new stuff, and then send those
updates and the removal before -rc1 closes?

-- 
Kees Cook


Re: [PATCH v3 3/5] treewide: use get_random_u32() when possible

2022-10-08 Thread Jason A. Donenfeld
On Sat, Oct 08, 2022 at 09:53:33PM +, David Laight wrote:
> From: Jason A. Donenfeld
> > Sent: 07 October 2022 18:56
> ...
> > > Given these kinds of less mechanical changes, it may make sense to split
> > > these from the "trivial" conversions in a treewide patch. The chance of
> > > needing a revert from the simple 1:1 conversions is much lower than the
> > > need to revert by-hand changes.
> > >
> > > The Cocci script I suggested in my v1 review gets 80% of the first
> > > patch, for example.
> > 
> > I'll split things up into a mechanical step and a non-mechanical step.
> > Good idea.
> 
> I'd also do something about the 'get_random_int() & 3' cases.
> (ie remainder by 2^n-1)
> These can be converted to 'get_random_u8() & 3' (etc).
> So they only need one random byte (not 4) and no multiply.
> 
> Possibly something based on (the quickly typed, and not C):
> #define get_random_below(val) [
>   if (builtin_constant(val))
>   BUILD_BUG_ON(!val || val > 0x1ull)
>   if (!(val & (val - 1)) {
>   if (val <= 0x100)
>   return get_random_u8() & (val - 1);
>   if (val <= 0x1)
>   return get_random_u16() & (val - 1);
>   return get_random_u32() & (val - 1);
>   }
>   }
>   BUILD_BUG_ON(sizeof (val) > 4);
>   return ((u64)get_random_u32() * val) >> 32;

This is already how the prandom_u32_max() implementation works, as
suggested in the cover letter. The multiplication by constants in it
reduces to bit shifts and you already get all the manual masking
possible.

> get_random_below() is a much better name than prandom_u32_max().

Yes, but that name is reserved for when I succeed at making a function
that bounds with a uniform distribution. prandom_u32_max()'s
distribution is non-uniform since it doesn't do rejection sampling. Work
in progress is on 
https://git.zx2c4.com/linux-rng/commit/?h=jd/get_random_u32_below .
But out of common respect for this already huge thread with a massive
CC list, if you want to bikeshed my WIP stuff, please start a new thread
for that and not bog this one down. IOW, no need to reply here directly.
That'd annoy me.

Jason


Re: [PATCH v4 4/6] treewide: use get_random_u32() when possible

2022-10-08 Thread Jason A. Donenfeld
On Sat, Oct 8, 2022 at 4:37 PM Jason A. Donenfeld  wrote:
>
> On Sat, Oct 08, 2022 at 10:18:45PM +, David Laight wrote:
> > From: Jason A. Donenfeld
> > > Sent: 07 October 2022 19:01
> > >
> > > The prandom_u32() function has been a deprecated inline wrapper around
> > > get_random_u32() for several releases now, and compiles down to the
> > > exact same code. Replace the deprecated wrapper with a direct call to
> > > the real function. The same also applies to get_random_int(), which is
> > > just a wrapper around get_random_u32().
> > >
> > ...
> > > diff --git a/net/802/garp.c b/net/802/garp.c
> > > index f6012f8e59f0..c1bb67e25430 100644
> > > --- a/net/802/garp.c
> > > +++ b/net/802/garp.c
> > > @@ -407,7 +407,7 @@ static void garp_join_timer_arm(struct garp_applicant 
> > > *app)
> > >  {
> > > unsigned long delay;
> > >
> > > -   delay = (u64)msecs_to_jiffies(garp_join_time) * prandom_u32() >> 32;
> > > +   delay = (u64)msecs_to_jiffies(garp_join_time) * get_random_u32() >> 
> > > 32;
> > > mod_timer(>join_timer, jiffies + delay);
> > >  }
> > >
> > > diff --git a/net/802/mrp.c b/net/802/mrp.c
> > > index 35e04cc5390c..3e9fe9f5d9bf 100644
> > > --- a/net/802/mrp.c
> > > +++ b/net/802/mrp.c
> > > @@ -592,7 +592,7 @@ static void mrp_join_timer_arm(struct mrp_applicant 
> > > *app)
> > >  {
> > > unsigned long delay;
> > >
> > > -   delay = (u64)msecs_to_jiffies(mrp_join_time) * prandom_u32() >> 32;
> > > +   delay = (u64)msecs_to_jiffies(mrp_join_time) * get_random_u32() >> 32;
> > > mod_timer(>join_timer, jiffies + delay);
> > >  }
> > >
> >
> > Aren't those:
> >   delay = prandom_u32_max(msecs_to_jiffies(xxx_join_time));
>
> Probably, but too involved and peculiar for this cleanup.
>
> Feel free to send a particular patch to that maintainer.

I guess the cocci patch looks like this, so maybe I'll put that in 1/7
if I respin this.

@@
expression E;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u64;
@@
- ((u64)(E) * get_random_u32() >> 32)
+ prandom_u32_max(E)


[powerpc:next] BUILD SUCCESS 376b3275c19f83d373e841e9af2d7658693190b9

2022-10-08 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
next
branch HEAD: 376b3275c19f83d373e841e9af2d7658693190b9  KVM: PPC: Book3S HV: Fix 
stack frame regs marker

elapsed time: 735m

configs tested: 149
configs skipped: 8

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
um i386_defconfig
um   x86_64_defconfig
i386defconfig
x86_64  defconfig
alpha   defconfig
arc defconfig
s390defconfig
i386 randconfig-a011-20221003
i386 randconfig-a015-20221003
i386 randconfig-a016-20221003
i386 randconfig-a014-20221003
i386 randconfig-a012-20221003
x86_64   rhel-8.3
x86_64  rhel-8.3-func
x86_64   randconfig-a011-20221003
s390 allmodconfig
x86_64   randconfig-a014-20221003
x86_64randconfig-a013
x86_64rhel-8.3-kselftests
x86_64   randconfig-a012-20221003
arm defconfig
x86_64   randconfig-a013-20221003
x86_64   randconfig-a015-20221003
x86_64   randconfig-a016-20221003
i386 randconfig-a013-20221003
x86_64randconfig-a011
x86_64   allyesconfig
s390 allyesconfig
x86_64   rhel-8.3-kvm
powerpc   allnoconfig
m68k allmodconfig
x86_64   rhel-8.3-syz
arc  allyesconfig
powerpc  allmodconfig
x86_64 rhel-8.3-kunit
alphaallyesconfig
m68k allyesconfig
mips allyesconfig
x86_64randconfig-a015
i386 allyesconfig
arc  randconfig-r043-20221002
sh  sh7785lcr_32bit_defconfig
sh   allmodconfig
riscvrandconfig-r042-20221007
riscvrandconfig-r042-20221003
arc  randconfig-r043-20221007
ia64 allmodconfig
csky  allnoconfig
arc   allnoconfig
sparc64  alldefconfig
arc  randconfig-r043-20221008
powerpc ep8248e_defconfig
arm  badge4_defconfig
sh  r7780mp_defconfig
arc  randconfig-r043-20221003
mips db1xxx_defconfig
arcvdk_hs38_defconfig
alpha allnoconfig
ia64defconfig
riscv allnoconfig
powerpc redwood_defconfig
armclps711x_defconfig
sh   se7751_defconfig
arcnsim_700_defconfig
arm64allyesconfig
arm pxa_defconfig
arm  allyesconfig
nios2   defconfig
s390 randconfig-r044-20221007
s390 randconfig-r044-20221003
m68kstmark2_defconfig
sh  kfr2r09_defconfig
powerpcklondike_defconfig
armoxnas_v6_defconfig
powerpc sequoia_defconfig
um   alldefconfig
riscvrandconfig-r042-20221009
arc  randconfig-r043-20221009
s390 randconfig-r044-20221009
shdreamcast_defconfig
s390  debug_defconfig
powerpc asp8347_defconfig
nios2alldefconfig
powerpc wii_defconfig
sparc   sparc32_defconfig
m68k  atari_defconfig
sh   se7780_defconfig
x86_64   alldefconfig
i386  randconfig-c001
mips randconfig-c004-20221002
i386  debian-10.3-kvm
i386debian-10.3-kunit
i386 debian-10.3-func
powerpc ps3_defconfig
m68kmvme147_defconfig
armspear6xx_defconfig
powerpc rainier_defconfig
riscvnommu_virt_defconfig
riscv  rv32_defconfig
riscvnommu_k210_defconfig
i386

Re: [PATCH v4 4/6] treewide: use get_random_u32() when possible

2022-10-08 Thread Jason A. Donenfeld
On Sat, Oct 08, 2022 at 10:18:45PM +, David Laight wrote:
> From: Jason A. Donenfeld
> > Sent: 07 October 2022 19:01
> > 
> > The prandom_u32() function has been a deprecated inline wrapper around
> > get_random_u32() for several releases now, and compiles down to the
> > exact same code. Replace the deprecated wrapper with a direct call to
> > the real function. The same also applies to get_random_int(), which is
> > just a wrapper around get_random_u32().
> > 
> ...
> > diff --git a/net/802/garp.c b/net/802/garp.c
> > index f6012f8e59f0..c1bb67e25430 100644
> > --- a/net/802/garp.c
> > +++ b/net/802/garp.c
> > @@ -407,7 +407,7 @@ static void garp_join_timer_arm(struct garp_applicant 
> > *app)
> >  {
> > unsigned long delay;
> > 
> > -   delay = (u64)msecs_to_jiffies(garp_join_time) * prandom_u32() >> 32;
> > +   delay = (u64)msecs_to_jiffies(garp_join_time) * get_random_u32() >> 32;
> > mod_timer(>join_timer, jiffies + delay);
> >  }
> > 
> > diff --git a/net/802/mrp.c b/net/802/mrp.c
> > index 35e04cc5390c..3e9fe9f5d9bf 100644
> > --- a/net/802/mrp.c
> > +++ b/net/802/mrp.c
> > @@ -592,7 +592,7 @@ static void mrp_join_timer_arm(struct mrp_applicant 
> > *app)
> >  {
> > unsigned long delay;
> > 
> > -   delay = (u64)msecs_to_jiffies(mrp_join_time) * prandom_u32() >> 32;
> > +   delay = (u64)msecs_to_jiffies(mrp_join_time) * get_random_u32() >> 32;
> > mod_timer(>join_timer, jiffies + delay);
> >  }
> > 
> 
> Aren't those:
>   delay = prandom_u32_max(msecs_to_jiffies(xxx_join_time));

Probably, but too involved and peculiar for this cleanup.

Feel free to send a particular patch to that maintainer.

> 
>   David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 
> 1PT, UK
> Registration No: 1397386 (Wales)


Re: [PATCH v4 2/6] treewide: use prandom_u32_max() when possible

2022-10-08 Thread Jason A. Donenfeld
On Sat, Oct 08, 2022 at 10:08:03PM +, David Laight wrote:
> From: Jason A. Donenfeld
> > Sent: 07 October 2022 19:01
> > 
> > Rather than incurring a division or requesting too many random bytes for
> > the given range, use the prandom_u32_max() function, which only takes
> > the minimum required bytes from the RNG and avoids divisions.
> > 
> ...
> > --- a/lib/cmdline_kunit.c
> > +++ b/lib/cmdline_kunit.c
> > @@ -76,7 +76,7 @@ static void cmdline_test_lead_int(struct kunit *test)
> > int rc = cmdline_test_values[i];
> > int offset;
> > 
> > -   sprintf(in, "%u%s", prandom_u32_max(256), str);
> > +   sprintf(in, "%u%s", get_random_int() % 256, str);
> > /* Only first '-' after the number will advance the pointer */
> > offset = strlen(in) - strlen(str) + !!(rc == 2);
> > cmdline_do_one_test(test, in, rc, offset);
> > @@ -94,7 +94,7 @@ static void cmdline_test_tail_int(struct kunit *test)
> > int rc = strcmp(str, "") ? (strcmp(str, "-") ? 0 : 1) : 1;
> > int offset;
> > 
> > -   sprintf(in, "%s%u", str, prandom_u32_max(256));
> > +   sprintf(in, "%s%u", str, get_random_int() % 256);
> > /*
> >  * Only first and leading '-' not followed by integer
> >  * will advance the pointer.
> 
> Something has gone backwards here
> And get_random_u8() looks a better fit.

Wrong patch version.

> 
>   David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 
> 1PT, UK
> Registration No: 1397386 (Wales)


RE: [PATCH v4 4/6] treewide: use get_random_u32() when possible

2022-10-08 Thread David Laight
From: Jason A. Donenfeld
> Sent: 07 October 2022 19:01
> 
> The prandom_u32() function has been a deprecated inline wrapper around
> get_random_u32() for several releases now, and compiles down to the
> exact same code. Replace the deprecated wrapper with a direct call to
> the real function. The same also applies to get_random_int(), which is
> just a wrapper around get_random_u32().
> 
...
> diff --git a/net/802/garp.c b/net/802/garp.c
> index f6012f8e59f0..c1bb67e25430 100644
> --- a/net/802/garp.c
> +++ b/net/802/garp.c
> @@ -407,7 +407,7 @@ static void garp_join_timer_arm(struct garp_applicant 
> *app)
>  {
>   unsigned long delay;
> 
> - delay = (u64)msecs_to_jiffies(garp_join_time) * prandom_u32() >> 32;
> + delay = (u64)msecs_to_jiffies(garp_join_time) * get_random_u32() >> 32;
>   mod_timer(>join_timer, jiffies + delay);
>  }
> 
> diff --git a/net/802/mrp.c b/net/802/mrp.c
> index 35e04cc5390c..3e9fe9f5d9bf 100644
> --- a/net/802/mrp.c
> +++ b/net/802/mrp.c
> @@ -592,7 +592,7 @@ static void mrp_join_timer_arm(struct mrp_applicant *app)
>  {
>   unsigned long delay;
> 
> - delay = (u64)msecs_to_jiffies(mrp_join_time) * prandom_u32() >> 32;
> + delay = (u64)msecs_to_jiffies(mrp_join_time) * get_random_u32() >> 32;
>   mod_timer(>join_timer, jiffies + delay);
>  }
> 

Aren't those:
delay = prandom_u32_max(msecs_to_jiffies(xxx_join_time));

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH v4 2/6] treewide: use prandom_u32_max() when possible

2022-10-08 Thread David Laight
From: Jason A. Donenfeld
> Sent: 07 October 2022 19:01
> 
> Rather than incurring a division or requesting too many random bytes for
> the given range, use the prandom_u32_max() function, which only takes
> the minimum required bytes from the RNG and avoids divisions.
> 
...
> --- a/lib/cmdline_kunit.c
> +++ b/lib/cmdline_kunit.c
> @@ -76,7 +76,7 @@ static void cmdline_test_lead_int(struct kunit *test)
>   int rc = cmdline_test_values[i];
>   int offset;
> 
> - sprintf(in, "%u%s", prandom_u32_max(256), str);
> + sprintf(in, "%u%s", get_random_int() % 256, str);
>   /* Only first '-' after the number will advance the pointer */
>   offset = strlen(in) - strlen(str) + !!(rc == 2);
>   cmdline_do_one_test(test, in, rc, offset);
> @@ -94,7 +94,7 @@ static void cmdline_test_tail_int(struct kunit *test)
>   int rc = strcmp(str, "") ? (strcmp(str, "-") ? 0 : 1) : 1;
>   int offset;
> 
> - sprintf(in, "%s%u", str, prandom_u32_max(256));
> + sprintf(in, "%s%u", str, get_random_int() % 256);
>   /*
>* Only first and leading '-' not followed by integer
>* will advance the pointer.

Something has gone backwards here
And get_random_u8() looks a better fit.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH v3 3/5] treewide: use get_random_u32() when possible

2022-10-08 Thread David Laight
From: Jason A. Donenfeld
> Sent: 07 October 2022 18:56
...
> > Given these kinds of less mechanical changes, it may make sense to split
> > these from the "trivial" conversions in a treewide patch. The chance of
> > needing a revert from the simple 1:1 conversions is much lower than the
> > need to revert by-hand changes.
> >
> > The Cocci script I suggested in my v1 review gets 80% of the first
> > patch, for example.
> 
> I'll split things up into a mechanical step and a non-mechanical step.
> Good idea.

I'd also do something about the 'get_random_int() & 3' cases.
(ie remainder by 2^n-1)
These can be converted to 'get_random_u8() & 3' (etc).
So they only need one random byte (not 4) and no multiply.

Possibly something based on (the quickly typed, and not C):
#define get_random_below(val) [
if (builtin_constant(val))
BUILD_BUG_ON(!val || val > 0x1ull)
if (!(val & (val - 1)) {
if (val <= 0x100)
return get_random_u8() & (val - 1);
if (val <= 0x1)
return get_random_u16() & (val - 1);
return get_random_u32() & (val - 1);
}
}
BUILD_BUG_ON(sizeof (val) > 4);
return ((u64)get_random_u32() * val) >> 32;
}

get_random_below() is a much better name than prandom_u32_max().

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


Re: [PATCH v5 0/7] treewide cleanup of random integer usage

2022-10-08 Thread Yury Norov
On Fri, Oct 07, 2022 at 11:53:52PM -0600, Jason A. Donenfeld wrote:
> Changes v4->v5:
> - Coccinelle is now used for as much mechanical aspects as possible,
>   with mechanical parts split off from non-mechanical parts. This should
>   drastically reduce the amount of code that needs to be reviewed
>   carefully. Each commit mentions now if it was done by hand or is
>   mechanical.
> 
> Hi folks,
> 
> This is a five part treewide cleanup of random integer handling. The
> rules for random integers are:
> 
> - If you want a secure or an insecure random u64, use get_random_u64().
> - If you want a secure or an insecure random u32, use get_random_u32().
>   * The old function prandom_u32() has been deprecated for a while now
> and is just a wrapper around get_random_u32(). Same for
> get_random_int().
> - If you want a secure or an insecure random u16, use get_random_u16().
> - If you want a secure or an insecure random u8, use get_random_u8().
> - If you want secure or insecure random bytes, use get_random_bytes().
>   * The old function prandom_bytes() has been deprecated for a while now
> and has long been a wrapper around get_random_bytes().
> - If you want a non-uniform random u32, u16, or u8 bounded by a certain
>   open interval maximum, use prandom_u32_max().
>   * I say "non-uniform", because it doesn't do any rejection sampling or
> divisions. Hence, it stays within the prandom_* namespace.
> 
> These rules ought to be applied uniformly, so that we can clean up the
> deprecated functions, and earn the benefits of using the modern
> functions. In particular, in addition to the boring substitutions, this
> patchset accomplishes a few nice effects:
> 
> - By using prandom_u32_max() with an upper-bound that the compiler can
>   prove at compile-time is ≤65536 or ≤256, internally get_random_u16()
>   or get_random_u8() is used, which wastes fewer batched random bytes,
>   and hence has higher throughput.
> 
> - By using prandom_u32_max() instead of %, when the upper-bound is not a
>   constant, division is still avoided, because prandom_u32_max() uses
>   a faster multiplication-based trick instead.
> 
> - By using get_random_u16() or get_random_u8() in cases where the return
>   value is intended to indeed be a u16 or a u8, we waste fewer batched
>   random bytes, and hence have higher throughput.
> 
> So, based on those rules and benefits from following them, this patchset
> breaks down into the following five steps:
> 
> 1) Replace `prandom_u32() % max` and variants thereof with
>prandom_u32_max(max).
> 
>* Part 1 is done with Coccinelle. Part 2 is done by hand.
> 
> 2) Replace `(type)get_random_u32()` and variants thereof with
>get_random_u16() or get_random_u8(). I took the pains to actually
>look and see what every lvalue type was across the entire tree.
> 
>* Part 1 is done with Coccinelle. Part 2 is done by hand.
> 
> 3) Replace remaining deprecated uses of prandom_u32() and
>get_random_int() with get_random_u32(). 
> 
>* A boring search and replace operation.
> 
> 4) Replace remaining deprecated uses of prandom_bytes() with
>get_random_bytes().
> 
>* A boring search and replace operation.
> 
> 5) Remove the deprecated and now-unused prandom_u32() and
>prandom_bytes() inline wrapper functions.
> 
>* Just deleting code and updating comments.
> 
> I was thinking of taking this through my random.git tree (on which this
> series is currently based) and submitting it near the end of the merge
> window, or waiting for the very end of the 6.1 cycle when there will be
> the fewest new patches brewing. If somebody with some treewide-cleanup
> experience might share some wisdom about what the best timing usually
> winds up being, I'm all ears.
> 
> Please take a look! The number of lines touched is quite small, so this
> should be reviewable, and as much as is possible has been pushed into
> Coccinelle scripts.

For the series:
Reviewed-by: Yury Norov 

Although, looking at it, I have a feeling that kernel needs to drop all
fixed-size random APIs like get_random_uXX() or get_random_int(), because
people will continue using the 'get_random_int() % num' carelessly.

Thanks,
Yury


Re: [PATCH v5 0/7] treewide cleanup of random integer usage

2022-10-08 Thread Greg Kroah-Hartman
On Fri, Oct 07, 2022 at 11:53:52PM -0600, Jason A. Donenfeld wrote:
> Changes v4->v5:
> - Coccinelle is now used for as much mechanical aspects as possible,
>   with mechanical parts split off from non-mechanical parts. This should
>   drastically reduce the amount of code that needs to be reviewed
>   carefully. Each commit mentions now if it was done by hand or is
>   mechanical.

All look good to me, thanks for the cleanups.

Reviewed-by: Greg Kroah-Hartman 


Re: [PATCH] soc: fsl: qe: Add check for ioremap

2022-10-08 Thread kernel test robot
Hi Jiasheng,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on soc/for-next]
[also build test ERROR on linus/master v6.0 next-20221007]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Jiasheng-Jiang/soc-fsl-qe-Add-check-for-ioremap/20221008-104034
base:   https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git for-next
config: powerpc-mpc866_ads_defconfig
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 
791a7ae1ba3efd6bca96338e10ffde557ba83920)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install powerpc cross compiling tool for clang build
# apt-get install binutils-powerpc-linux-gnu
# 
https://github.com/intel-lab-lkp/linux/commit/cd2197ddf995fc2d10fbe8639c1ddc70e6551aeb
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review 
Jiasheng-Jiang/soc-fsl-qe-Add-check-for-ioremap/20221008-104034
git checkout cd2197ddf995fc2d10fbe8639c1ddc70e6551aeb
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 
O=build_dir ARCH=powerpc SHELL=/bin/bash drivers/net/ethernet/freescale/fs_enet/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   In file included from 
drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c:26:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:640:
   arch/powerpc/include/asm/io-defs.h:45:1: warning: performing pointer 
arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(insw, (unsigned long p, void *b, unsigned long c),
   ^~~
   arch/powerpc/include/asm/io.h:637:3: note: expanded from macro 
'DEF_PCI_AC_NORET'
   __do_##name al; \
   ^~
   :27:1: note: expanded from here
   __do_insw
   ^
   arch/powerpc/include/asm/io.h:578:56: note: expanded from macro '__do_insw'
   #define __do_insw(p, b, n)  readsw((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
  ~^
   In file included from 
drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c:26:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:640:
   arch/powerpc/include/asm/io-defs.h:47:1: warning: performing pointer 
arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(insl, (unsigned long p, void *b, unsigned long c),
   ^~~
   arch/powerpc/include/asm/io.h:637:3: note: expanded from macro 
'DEF_PCI_AC_NORET'
   __do_##name al; \
   ^~
   :29:1: note: expanded from here
   __do_insl
   ^
   arch/powerpc/include/asm/io.h:579:56: note: expanded from macro '__do_insl'
   #define __do_insl(p, b, n)  readsl((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
  ~^
   In file included from 
drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c:26:
   In file included from include/linux/interrupt.h:11:
   In file included from include/linux/hardirq.h:11:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:640:
   arch/powerpc/include/asm/io-defs.h:49:1: warning: performing pointer 
arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(outsb, (unsigned long p, const void *b, unsigned long c),
   ^~
   arch/powerpc/include/asm/io.h:637:3: note: expanded from macro 
'DEF_PCI_AC_NORET'
   __do_##name al; \
   ^

Re: [RFC PATCH 0/5] Generic IPI sending tracepoint

2022-10-08 Thread Steven Rostedt
On Fri, 7 Oct 2022 17:01:33 -0300
Marcelo Tosatti  wrote:

> > As for the targeted CPUs, the existing tracepoint does export them, albeit 
> > in
> > cpumask form, which is quite inconvenient from a tooling perspective. For
> > instance, as far as I'm aware, it's not possible to do event filtering on a
> > cpumask via trace-cmd.  
> 
> https://man7.org/linux/man-pages/man1/trace-cmd-set.1.html
> 
>-f filter
>Specify a filter for the previous event. This must come after
>a -e. This will filter what events get recorded based on the
>content of the event. Filtering is passed to the kernel
>directly so what filtering is allowed may depend on what
>version of the kernel you have. Basically, it will let you
>use C notation to check if an event should be processed or
>not.
> 
>==, >=, <=, >, <, &, |, && and ||
> 
>The above are usually safe to use to compare fields.

We could always add an "isset(x)" filter ;-)

-- Steve


Re: [RFC PATCH 4/5] irq_work: Trace calls to arch_irq_work_raise()

2022-10-08 Thread Steven Rostedt
On Fri,  7 Oct 2022 16:45:32 +0100
Valentin Schneider  wrote:
>  }
>  
> +static inline void irq_work_raise(void)
> +{
> + if (arch_irq_work_has_interrupt())
> + trace_ipi_send_cpu(_RET_IP_, smp_processor_id());

To save on the branch, let's make the above:

if (trace_ipi_send_cpu_enabled() && arch_irq_work_has_interrupt())

As the "trace_*_enabled()" is a static branch that will make it a nop
when the tracepoint is not enabled.

-- Steve


> +
> + arch_irq_work_raise();
> +}
> +
>  /* Enqueue on current CPU, work must already be claimed and preempt disabled 
> */


Re: [PATCH v4 2/6] treewide: use prandom_u32_max() when possible

2022-10-08 Thread Andy Shevchenko
On Fri, Oct 07, 2022 at 08:50:43PM -0700, Kees Cook wrote:
> On October 7, 2022 7:21:28 PM PDT, "Jason A. Donenfeld"  
> wrote:
> >On Fri, Oct 07, 2022 at 03:47:44PM -0700, Kees Cook wrote:
> >> On Fri, Oct 07, 2022 at 12:01:03PM -0600, Jason A. Donenfeld wrote:

...

> >> These are more fun, but Coccinelle can still do them with a little
> >> Pythonic help:
> >> 
> >> // Find a potential literal
> >> @literal_mask@
> >> expression LITERAL;
> >> identifier randfunc =~ "get_random_int|prandom_u32|get_random_u32";
> >> position p;
> >> @@
> >> 
> >> (randfunc()@p & (LITERAL))
> >> 
> >> // Add one to the literal.
> >> @script:python add_one@
> >> literal << literal_mask.LITERAL;
> >> RESULT;
> >> @@
> >> 
> >> if literal.startswith('0x'):
> >> value = int(literal, 16) + 1
> >> coccinelle.RESULT = cocci.make_expr("0x%x" % (value))
> >> elif literal[0] in '123456789':
> >> value = int(literal, 10) + 1
> >> coccinelle.RESULT = cocci.make_expr("%d" % (value))
> >> else:
> >> print("I don't know how to handle: %s" % (literal))

Wouldn't Python take care about (known) prefixes itself?

try:
x = int(literal)
except ValueError as ex:
print(..., ex.error)

> >> // Replace the literal mask with the calculated result.
> >> @plus_one@
> >> expression literal_mask.LITERAL;
> >> position literal_mask.p;
> >> expression add_one.RESULT;
> >> identifier FUNC;
> >> @@
> >> 
> >> -   (FUNC()@p & (LITERAL))
> >> +   prandom_u32_max(RESULT)
> >
> >Oh that's pretty cool. I can do the saturation check in python, since
> >`value` holds the parsed result. Neat.
> 
> It is (at least how I have it here) just the string, so YMMV.

...

> >Thanks a bunch for the guidance.
> 
> Sure thing! I was pleased to figure out how to do the python bit.

I believe it can be optimized

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH v4 2/6] treewide: use prandom_u32_max() when possible

2022-10-08 Thread Julia Lawall
> >> @minus_one@
> >> expression FULL;
> >> @@
> >>
> >> - (get_random_int() & ((FULL) - 1)
> >> + prandom_u32_max(FULL)
> >
> >Ahh, well, okay, this is the example I mentioned above. Only works if
> >FULL is saturated. Any clever way to get coccinelle to prove that? Can
> >it look at the value of constants?
>
> I'm not sure if Cocci will do that without a lot of work. The literals trick 
> I used below would need a lot of fanciness. :)

If FULL is an arbitrary expression, it would not be easy to automate.  If
it is a constant then you can use python/ocaml to analyze its value.  But
if it's a #define constant then you would need a previous rule to match the
#define and find that value.

For LITERAL, I think you could just do constant int LITERAL; for the
metavariable declaration.

julia


Re: [PATCH v3] i2c/pasemi: PASemi I2C controller IRQ enablement

2022-10-08 Thread Sven Peter
Hi,

> On 7. Oct 2022, at 02:43, Arminder Singh  wrote:
> 
> This patch adds IRQ support to the PASemi I2C controller driver to 
> increase the performace of I2C transactions on platforms with PASemi I2C 
> controllers. While primarily intended for Apple silicon platforms, this 
> patch should also help in enabling IRQ support for older PASemi hardware 
> as well should the need arise.
> 
> Signed-off-by: Arminder Singh 
> ---
> This version of the patch has been tested on an M1 Ultra Mac Studio,
> as well as an M1 MacBook Pro, and userspace launches successfully
> while using the IRQ path for I2C transactions.

I think Wolfram suggested to keep this in the commit message. If in doubt 
listen to him and not me because he’s much more experienced with the kernel 
than I am ;)

> 
> This version of the patch only contains fixes to the whitespace and
> alignment issues found in v2 of the patch, and as such the testing that
> Christian Zigotsky did on PASemi hardware for v2 of the patch also applies
> to this version of the patch as well.
> (See v2 patch email thread for the "Tested-by" tag)

You can just collect and keep those tags above your signed off by if you only 
change things like whitespaces.

> 
> v2 to v3 changes:
> - Fixed some whitespace and alignment issues found in v2 of the patch
> 
> v1 to v2 changes:
> - moved completion setup from pasemi_platform_i2c_probe to
>   pasemi_i2c_common_probe to allow PASemi and Apple platforms to share
>   common completion setup code in case PASemi hardware gets IRQ support
>   added
> - initialized the status variable in pasemi_smb_waitready when going down
>   the non-IRQ path
> - removed an unnecessary cast of dev_id in the IRQ handler
> - fixed alignment of struct member names in i2c-pasemi-core.h
>   (addresses Christophe's feedback in the original submission)
> - IRQs are now disabled after the wait_for_completion_timeout call
>   instead of inside the IRQ handler
>   (prevents the IRQ from going off after the completion times out)
> - changed the request_irq call to a devm_request_irq call to obviate
>   the need for a remove function and a free_irq call
>   (thanks to Sven for pointing this out in the original submission)
> - added a reinit_completion call to pasemi_reset 
>   as a failsafe to prevent missed interrupts from causing the completion
>   to never complete (thanks to Arnd Bergmann for pointing this out)
> - removed the bitmask variable in favor of just using the value
>   directly (it wasn't used anywhere else)
> 
> v2 linked here: 
> https://lore.kernel.org/linux-i2c/mn2pr01mb535821c8058c7814b2f8eedf9f...@mn2pr01mb5358.prod.exchangelabs.com/
> v1 linked here: 
> https://lore.kernel.org/linux-i2c/mn2pr01mb535838492432c910f2381f929f...@mn2pr01mb5358.prod.exchangelabs.com/T/#m11b3504c2667517aad7521514c99ca0e07a9381f
> 
> Hopefully the patch is good to go this time around!
> 
> drivers/i2c/busses/i2c-pasemi-core.c | 29 
> drivers/i2c/busses/i2c-pasemi-core.h |  5 
> drivers/i2c/busses/i2c-pasemi-platform.c |  6 +
> 3 files changed, 36 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-pasemi-core.c 
> b/drivers/i2c/busses/i2c-pasemi-core.c
> index 9028ffb58cc0..4855144b370e 100644
> --- a/drivers/i2c/busses/i2c-pasemi-core.c
> +++ b/drivers/i2c/busses/i2c-pasemi-core.c
> @@ -21,6 +21,7 @@
> #define REG_MTXFIFO0x00
> #define REG_MRXFIFO0x04
> #define REG_SMSTA0x14
> +#define REG_IMASK0x18
> #define REG_CTL0x1c
> #define REG_REV0x28
> 
> @@ -66,6 +67,7 @@ static void pasemi_reset(struct pasemi_smbus *smbus)
>val |= CTL_EN;
> 
>reg_write(smbus, REG_CTL, val);
> +reinit_completion(>irq_completion);
> }
> 
> static void pasemi_smb_clear(struct pasemi_smbus *smbus)
> @@ -81,11 +83,18 @@ static int pasemi_smb_waitready(struct pasemi_smbus 
> *smbus)
>int timeout = 10;
>unsigned int status;
> 
> -status = reg_read(smbus, REG_SMSTA);
> -
> -while (!(status & SMSTA_XEN) && timeout--) {
> -msleep(1);
> +if (smbus->use_irq) {
> +reinit_completion(>irq_completion);
> +reg_write(smbus, REG_IMASK, SMSTA_XEN | SMSTA_MTN);
> +wait_for_completion_timeout(>irq_completion, 
> msecs_to_jiffies(10));
> +reg_write(smbus, REG_IMASK, 0);
>status = reg_read(smbus, REG_SMSTA);
> +} else {
> +status = reg_read(smbus, REG_SMSTA);
> +while (!(status & SMSTA_XEN) && timeout--) {
> +msleep(1);
> +status = reg_read(smbus, REG_SMSTA);
> +}
>}
> 
>/* Got NACK? */
> @@ -344,10 +353,14 @@ int pasemi_i2c_common_probe(struct pasemi_smbus *smbus)
> 
>/* set up the sysfs linkage to our parent device */
>smbus->adapter.dev.parent = smbus->dev;
> +smbus->use_irq = 0;
> +init_completion(>irq_completion);
> 
>if (smbus->hw_rev != PASEMI_HW_REV_PCI)
>smbus->hw_rev = reg_read(smbus, REG_REV);
> 
> +reg_write(smbus, REG_IMASK, 

[PATCH v5 7/7] prandom: remove unused functions

2022-10-08 Thread Jason A. Donenfeld
With no callers left of prandom_u32() and prandom_bytes(), as well as
get_random_int(), remove these deprecated wrappers, in favor of
get_random_u32() and get_random_bytes().

Reviewed-by: Kees Cook 
Signed-off-by: Jason A. Donenfeld 
---
 drivers/char/random.c   | 11 +--
 include/linux/prandom.h | 12 
 include/linux/random.h  |  5 -
 3 files changed, 5 insertions(+), 23 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 01acf235f263..2fe28eeb2f38 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -97,7 +97,7 @@ MODULE_PARM_DESC(ratelimit_disable, "Disable random ratelimit 
suppression");
  * Returns whether or not the input pool has been seeded and thus guaranteed
  * to supply cryptographically secure random numbers. This applies to: the
  * /dev/urandom device, the get_random_bytes function, and the get_random_{u8,
- * u16,u32,u64,int,long} family of functions.
+ * u16,u32,u64,long} family of functions.
  *
  * Returns: true if the input pool has been seeded.
  *  false if the input pool has not been seeded.
@@ -161,15 +161,14 @@ EXPORT_SYMBOL(wait_for_random_bytes);
  * u16 get_random_u16()
  * u32 get_random_u32()
  * u64 get_random_u64()
- * unsigned int get_random_int()
  * unsigned long get_random_long()
  *
  * These interfaces will return the requested number of random bytes
  * into the given buffer or as a return value. This is equivalent to
- * a read from /dev/urandom. The u8, u16, u32, u64, int, and long
- * family of functions may be higher performance for one-off random
- * integers, because they do a bit of buffering and do not invoke
- * reseeding until the buffer is emptied.
+ * a read from /dev/urandom. The u8, u16, u32, u64, long family of
+ * functions may be higher performance for one-off random integers,
+ * because they do a bit of buffering and do not invoke reseeding
+ * until the buffer is emptied.
  *
  */
 
diff --git a/include/linux/prandom.h b/include/linux/prandom.h
index 78db003bc290..e0a0759dd09c 100644
--- a/include/linux/prandom.h
+++ b/include/linux/prandom.h
@@ -12,18 +12,6 @@
 #include 
 #include 
 
-/* Deprecated: use get_random_u32 instead. */
-static inline u32 prandom_u32(void)
-{
-   return get_random_u32();
-}
-
-/* Deprecated: use get_random_bytes instead. */
-static inline void prandom_bytes(void *buf, size_t nbytes)
-{
-   return get_random_bytes(buf, nbytes);
-}
-
 struct rnd_state {
__u32 s1, s2, s3, s4;
 };
diff --git a/include/linux/random.h b/include/linux/random.h
index 08322f700cdc..147a5e0d0b8e 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -42,10 +42,6 @@ u8 get_random_u8(void);
 u16 get_random_u16(void);
 u32 get_random_u32(void);
 u64 get_random_u64(void);
-static inline unsigned int get_random_int(void)
-{
-   return get_random_u32();
-}
 static inline unsigned long get_random_long(void)
 {
 #if BITS_PER_LONG == 64
@@ -100,7 +96,6 @@ declare_get_random_var_wait(u8, u8)
 declare_get_random_var_wait(u16, u16)
 declare_get_random_var_wait(u32, u32)
 declare_get_random_var_wait(u64, u32)
-declare_get_random_var_wait(int, unsigned int)
 declare_get_random_var_wait(long, unsigned long)
 #undef declare_get_random_var
 
-- 
2.37.3



[PATCH v5 6/7] treewide: use get_random_bytes when possible

2022-10-08 Thread Jason A. Donenfeld
The prandom_bytes() function has been a deprecated inline wrapper around
get_random_bytes() for several releases now, and compiles down to the
exact same code. Replace the deprecated wrapper with a direct call to
the real function. This was done as a basic find and replace.

Reviewed-by: Kees Cook 
Reviewed-by: Christophe Leroy  # powerpc
Signed-off-by: Jason A. Donenfeld 
---
 arch/powerpc/crypto/crc-vpmsum_test.c   |  2 +-
 block/blk-crypto-fallback.c |  2 +-
 crypto/async_tx/raid6test.c |  2 +-
 drivers/dma/dmatest.c   |  2 +-
 drivers/mtd/nand/raw/nandsim.c  |  2 +-
 drivers/mtd/tests/mtd_nandecctest.c |  2 +-
 drivers/mtd/tests/speedtest.c   |  2 +-
 drivers/mtd/tests/stresstest.c  |  2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c   |  2 +-
 drivers/net/ethernet/rocker/rocker_main.c   |  2 +-
 drivers/net/wireguard/selftest/allowedips.c | 12 ++--
 fs/ubifs/debug.c|  2 +-
 kernel/kcsan/selftest.c |  2 +-
 lib/random32.c  |  2 +-
 lib/test_objagg.c   |  2 +-
 lib/uuid.c  |  2 +-
 net/ipv4/route.c|  2 +-
 net/mac80211/rc80211_minstrel_ht.c  |  2 +-
 net/sched/sch_pie.c |  2 +-
 19 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/crypto/crc-vpmsum_test.c 
b/arch/powerpc/crypto/crc-vpmsum_test.c
index c1c1ef9457fb..273c527868db 100644
--- a/arch/powerpc/crypto/crc-vpmsum_test.c
+++ b/arch/powerpc/crypto/crc-vpmsum_test.c
@@ -82,7 +82,7 @@ static int __init crc_test_init(void)
 
if (len <= offset)
continue;
-   prandom_bytes(data, len);
+   get_random_bytes(data, len);
len -= offset;
 
crypto_shash_update(crct10dif_shash, data+offset, len);
diff --git a/block/blk-crypto-fallback.c b/block/blk-crypto-fallback.c
index 621abd1b0e4d..ad9844c5b40c 100644
--- a/block/blk-crypto-fallback.c
+++ b/block/blk-crypto-fallback.c
@@ -539,7 +539,7 @@ static int blk_crypto_fallback_init(void)
if (blk_crypto_fallback_inited)
return 0;
 
-   prandom_bytes(blank_key, BLK_CRYPTO_MAX_KEY_SIZE);
+   get_random_bytes(blank_key, BLK_CRYPTO_MAX_KEY_SIZE);
 
err = bioset_init(_bio_split, 64, 0, 0);
if (err)
diff --git a/crypto/async_tx/raid6test.c b/crypto/async_tx/raid6test.c
index c9d218e53bcb..f74505f2baf0 100644
--- a/crypto/async_tx/raid6test.c
+++ b/crypto/async_tx/raid6test.c
@@ -37,7 +37,7 @@ static void makedata(int disks)
int i;
 
for (i = 0; i < disks; i++) {
-   prandom_bytes(page_address(data[i]), PAGE_SIZE);
+   get_random_bytes(page_address(data[i]), PAGE_SIZE);
dataptrs[i] = data[i];
dataoffs[i] = 0;
}
diff --git a/drivers/dma/dmatest.c b/drivers/dma/dmatest.c
index 9fe2ae794316..ffe621695e47 100644
--- a/drivers/dma/dmatest.c
+++ b/drivers/dma/dmatest.c
@@ -312,7 +312,7 @@ static unsigned long dmatest_random(void)
 {
unsigned long buf;
 
-   prandom_bytes(, sizeof(buf));
+   get_random_bytes(, sizeof(buf));
return buf;
 }
 
diff --git a/drivers/mtd/nand/raw/nandsim.c b/drivers/mtd/nand/raw/nandsim.c
index 4bdaf4aa7007..c941a5a41ea6 100644
--- a/drivers/mtd/nand/raw/nandsim.c
+++ b/drivers/mtd/nand/raw/nandsim.c
@@ -1393,7 +1393,7 @@ static int ns_do_read_error(struct nandsim *ns, int num)
unsigned int page_no = ns->regs.row;
 
if (ns_read_error(page_no)) {
-   prandom_bytes(ns->buf.byte, num);
+   get_random_bytes(ns->buf.byte, num);
NS_WARN("simulating read error in page %u\n", page_no);
return 1;
}
diff --git a/drivers/mtd/tests/mtd_nandecctest.c 
b/drivers/mtd/tests/mtd_nandecctest.c
index 1c7201b0f372..440988562cfd 100644
--- a/drivers/mtd/tests/mtd_nandecctest.c
+++ b/drivers/mtd/tests/mtd_nandecctest.c
@@ -266,7 +266,7 @@ static int nand_ecc_test_run(const size_t size)
goto error;
}
 
-   prandom_bytes(correct_data, size);
+   get_random_bytes(correct_data, size);
ecc_sw_hamming_calculate(correct_data, size, correct_ecc, sm_order);
for (i = 0; i < ARRAY_SIZE(nand_ecc_test); i++) {
nand_ecc_test[i].prepare(error_data, error_ecc,
diff --git a/drivers/mtd/tests/speedtest.c b/drivers/mtd/tests/speedtest.c
index c9ec7086bfa1..075bce32caa5 100644
--- a/drivers/mtd/tests/speedtest.c
+++ b/drivers/mtd/tests/speedtest.c
@@ -223,7 +223,7 @@ static int __init mtd_speedtest_init(void)
if (!iobuf)
goto out;
 
-   prandom_bytes(iobuf, mtd->erasesize);
+   get_random_bytes(iobuf, mtd->erasesize);
 
bbt = kzalloc(ebcnt, GFP_KERNEL);

[PATCH v5 5/7] treewide: use get_random_u32() when possible

2022-10-08 Thread Jason A. Donenfeld
The prandom_u32() function has been a deprecated inline wrapper around
get_random_u32() for several releases now, and compiles down to the
exact same code. Replace the deprecated wrapper with a direct call to
the real function. The same also applies to get_random_int(), which is
just a wrapper around get_random_u32(). This was done as a basic find
and replace.

Reviewed-by: Kees Cook 
Acked-by: Toke Høiland-Jørgensen  # for sch_cake
Acked-by: Chuck Lever  # for nfsd
Reviewed-by: Jan Kara  # for ext4
Acked-by: Mika Westerberg  # for thunderbolt
Acked-by: Darrick J. Wong  # for xfs
Signed-off-by: Jason A. Donenfeld 
---
 Documentation/networking/filter.rst|  2 +-
 arch/parisc/kernel/process.c   |  2 +-
 arch/parisc/kernel/sys_parisc.c|  4 ++--
 arch/s390/mm/mmap.c|  2 +-
 arch/x86/kernel/cpu/amd.c  |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c|  6 +++---
 drivers/gpu/drm/i915/selftests/i915_selftest.c |  2 +-
 drivers/gpu/drm/selftests/test-drm_buddy.c |  2 +-
 drivers/gpu/drm/selftests/test-drm_mm.c|  2 +-
 drivers/infiniband/hw/cxgb4/cm.c   |  4 ++--
 drivers/infiniband/hw/hfi1/tid_rdma.c  |  2 +-
 drivers/infiniband/hw/mlx4/mad.c   |  2 +-
 drivers/infiniband/ulp/ipoib/ipoib_cm.c|  2 +-
 drivers/md/raid5-cache.c   |  2 +-
 .../media/test-drivers/vivid/vivid-touch-cap.c |  4 ++--
 drivers/misc/habanalabs/gaudi2/gaudi2.c|  2 +-
 drivers/mtd/nand/raw/nandsim.c |  2 +-
 drivers/net/bonding/bond_main.c|  2 +-
 drivers/net/ethernet/broadcom/cnic.c   |  2 +-
 .../chelsio/inline_crypto/chtls/chtls_cm.c |  2 +-
 drivers/net/ethernet/rocker/rocker_main.c  |  6 +++---
 .../wireless/broadcom/brcm80211/brcmfmac/pno.c |  2 +-
 .../net/wireless/marvell/mwifiex/cfg80211.c|  4 ++--
 .../net/wireless/microchip/wilc1000/cfg80211.c |  2 +-
 .../net/wireless/quantenna/qtnfmac/cfg80211.c  |  2 +-
 drivers/net/wireless/ti/wlcore/main.c  |  2 +-
 drivers/nvme/common/auth.c |  2 +-
 drivers/scsi/cxgbi/cxgb4i/cxgb4i.c |  4 ++--
 drivers/target/iscsi/cxgbit/cxgbit_cm.c|  2 +-
 drivers/thunderbolt/xdomain.c  |  2 +-
 drivers/video/fbdev/uvesafb.c  |  2 +-
 fs/exfat/inode.c   |  2 +-
 fs/ext4/ialloc.c   |  2 +-
 fs/ext4/ioctl.c|  4 ++--
 fs/ext4/mmp.c  |  2 +-
 fs/f2fs/namei.c|  2 +-
 fs/fat/inode.c |  2 +-
 fs/nfsd/nfs4state.c|  4 ++--
 fs/ntfs3/fslog.c   |  6 +++---
 fs/ubifs/journal.c |  2 +-
 fs/xfs/libxfs/xfs_ialloc.c |  2 +-
 fs/xfs/xfs_icache.c|  2 +-
 fs/xfs/xfs_log.c   |  2 +-
 include/net/netfilter/nf_queue.h   |  2 +-
 include/net/red.h  |  2 +-
 include/net/sock.h |  2 +-
 kernel/bpf/bloom_filter.c  |  2 +-
 kernel/bpf/core.c  |  2 +-
 kernel/bpf/hashtab.c   |  2 +-
 kernel/bpf/verifier.c  |  2 +-
 kernel/kcsan/selftest.c|  2 +-
 lib/random32.c |  2 +-
 lib/reed_solomon/test_rslib.c  |  6 +++---
 lib/test_fprobe.c  |  2 +-
 lib/test_kprobes.c |  2 +-
 lib/test_min_heap.c|  6 +++---
 lib/test_rhashtable.c  |  6 +++---
 mm/shmem.c |  2 +-
 mm/slab.c  |  2 +-
 net/802/garp.c |  2 +-
 net/802/mrp.c  |  2 +-
 net/core/pktgen.c  |  4 ++--
 net/ipv4/route.c   |  2 +-
 net/ipv4/tcp_cdg.c |  2 +-
 net/ipv4/udp.c |  2 +-
 net/ipv6/ip6_flowlabel.c   |  2 +-
 net/ipv6/output_core.c |  2 +-
 net/netfilter/ipvs/ip_vs_conn.c|  2 +-
 net/netfilter/xt_statistic.c   |  2 +-
 net/openvswitch/actions.c  |  2 +-
 net/sched/sch_cake.c   |  2 +-
 net/sched/sch_netem.c  | 18 +-
 net/sunrpc/auth_gss/gss_krb5_wrap.c|  4 ++--
 net/sunrpc/xprt.c  |  2 +-
 net/unix/af_unix.c |  2 +-
 75 files changed, 104 insertions(+), 104 deletions(-)

diff --git a/Documentation/networking/filter.rst 

[PATCH v5 4/7] treewide: use get_random_{u8,u16}() when possible, part 2

2022-10-08 Thread Jason A. Donenfeld
Rather than truncate a 32-bit value to a 16-bit value or an 8-bit value,
simply use the get_random_{u8,u16}() functions, which are faster than
wasting the additional bytes from a 32-bit value. This was done by hand,
identifying all of the places where one of the random integer functions
was used in a non-32-bit context.

Reviewed-by: Kees Cook 
Signed-off-by: Jason A. Donenfeld 
---
 arch/s390/kernel/process.c  | 2 +-
 lib/test_vmalloc.c  | 2 +-
 net/ipv4/ip_output.c| 2 +-
 net/netfilter/nf_nat_core.c | 4 ++--
 net/rds/bind.c  | 2 +-
 net/sched/sch_sfb.c | 2 +-
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/s390/kernel/process.c b/arch/s390/kernel/process.c
index 5ec78555dd2e..42af4b3aa02b 100644
--- a/arch/s390/kernel/process.c
+++ b/arch/s390/kernel/process.c
@@ -230,7 +230,7 @@ unsigned long arch_align_stack(unsigned long sp)
 
 static inline unsigned long brk_rnd(void)
 {
-   return (get_random_int() & BRK_RND_MASK) << PAGE_SHIFT;
+   return (get_random_u16() & BRK_RND_MASK) << PAGE_SHIFT;
 }
 
 unsigned long arch_randomize_brk(struct mm_struct *mm)
diff --git a/lib/test_vmalloc.c b/lib/test_vmalloc.c
index a26bbbf20e62..cf7780572f5b 100644
--- a/lib/test_vmalloc.c
+++ b/lib/test_vmalloc.c
@@ -80,7 +80,7 @@ static int random_size_align_alloc_test(void)
int i;
 
for (i = 0; i < test_loop_count; i++) {
-   rnd = prandom_u32();
+   rnd = get_random_u8();
 
/*
 * Maximum 1024 pages, if PAGE_SIZE is 4096.
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 04e2034f2f8e..a4fbdbff14b3 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -172,7 +172,7 @@ int ip_build_and_send_pkt(struct sk_buff *skb, const struct 
sock *sk,
 * Avoid using the hashed IP ident generator.
 */
if (sk->sk_protocol == IPPROTO_TCP)
-   iph->id = (__force __be16)prandom_u32();
+   iph->id = (__force __be16)get_random_u16();
else
__ip_select_ident(net, iph, 1);
}
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index 7981be526f26..57c7686ac485 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -468,7 +468,7 @@ static void nf_nat_l4proto_unique_tuple(struct 
nf_conntrack_tuple *tuple,
if (range->flags & NF_NAT_RANGE_PROTO_OFFSET)
off = (ntohs(*keyptr) - ntohs(range->base_proto.all));
else
-   off = prandom_u32();
+   off = get_random_u16();
 
attempts = range_size;
if (attempts > max_attempts)
@@ -490,7 +490,7 @@ static void nf_nat_l4proto_unique_tuple(struct 
nf_conntrack_tuple *tuple,
if (attempts >= range_size || attempts < 16)
return;
attempts /= 2;
-   off = prandom_u32();
+   off = get_random_u16();
goto another_round;
 }
 
diff --git a/net/rds/bind.c b/net/rds/bind.c
index 5b5fb4ca8d3e..97a29172a8ee 100644
--- a/net/rds/bind.c
+++ b/net/rds/bind.c
@@ -104,7 +104,7 @@ static int rds_add_bound(struct rds_sock *rs, const struct 
in6_addr *addr,
return -EINVAL;
last = rover;
} else {
-   rover = max_t(u16, prandom_u32(), 2);
+   rover = max_t(u16, get_random_u16(), 2);
last = rover - 1;
}
 
diff --git a/net/sched/sch_sfb.c b/net/sched/sch_sfb.c
index 2829455211f8..7eb70acb4d58 100644
--- a/net/sched/sch_sfb.c
+++ b/net/sched/sch_sfb.c
@@ -379,7 +379,7 @@ static int sfb_enqueue(struct sk_buff *skb, struct Qdisc 
*sch,
goto enqueue;
}
 
-   r = prandom_u32() & SFB_MAX_PROB;
+   r = get_random_u16() & SFB_MAX_PROB;
 
if (unlikely(r < p_min)) {
if (unlikely(p_min > SFB_MAX_PROB / 2)) {
-- 
2.37.3



[PATCH v5 3/7] treewide: use get_random_{u8,u16}() when possible, part 1

2022-10-08 Thread Jason A. Donenfeld
Rather than truncate a 32-bit value to a 16-bit value or an 8-bit value,
simply use the get_random_{u8,u16}() functions, which are faster than
wasting the additional bytes from a 32-bit value. This was done
mechanically with this coccinelle script:

@@
expression E;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u16;
typedef u8;
@@
(
- (get_random_u32() & 0x)
+ get_random_u16()
|
- (get_random_u32() & 0xff)
+ get_random_u8()
|
- (get_random_u32() % 65536)
+ get_random_u16()
|
- (get_random_u32() % 256)
+ get_random_u8()
|
- (get_random_u32() >> 16)
+ get_random_u16()
|
- (get_random_u32() >> 24)
+ get_random_u8()
|
- (u16)get_random_u32()
+ get_random_u16()
|
- (u8)get_random_u32()
+ get_random_u8()
|
- prandom_u32_max(65536)
+ get_random_u16()
|
- prandom_u32_max(256)
+ get_random_u8()
|
- E->inet_id = get_random_u32()
+ E->inet_id = get_random_u16()
)

@@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u16;
identifier v;
@@
- u16 v = get_random_u32();
+ u16 v = get_random_u16();

@@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u8;
identifier v;
@@
- u8 v = get_random_u32();
+ u8 v = get_random_u8();

// Find a potential literal
@literal_mask@
expression LITERAL;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
position p;
@@

((T)get_random_u32()@p & (LITERAL))

// Examine limits
@script:python add_one@
literal << literal_mask.LITERAL;
RESULT;
@@

value = None
if literal.startswith('0x'):
value = int(literal, 16)
elif literal[0] in '123456789':
value = int(literal, 10)
if value is None:
print("I don't know how to handle %s" % (literal))
cocci.include_match(False)
elif value < 256:
coccinelle.RESULT = cocci.make_ident("get_random_u8")
elif value < 65536:
coccinelle.RESULT = cocci.make_ident("get_random_u16")
else:
print("Skipping large mask of %s" % (literal))
cocci.include_match(False)

// Replace the literal mask with the calculated result.
@plus_one@
expression literal_mask.LITERAL;
position literal_mask.p;
identifier add_one.RESULT;
identifier FUNC;
@@

-   (FUNC()@p & (LITERAL))
+   (RESULT() & LITERAL)

Reviewed-by: Kees Cook 
Acked-by: Toke Høiland-Jørgensen  # for sch_cake
Signed-off-by: Jason A. Donenfeld 
---
 arch/arm/kernel/signal.c  | 2 +-
 arch/arm64/kernel/syscall.c   | 2 +-
 crypto/testmgr.c  | 8 
 drivers/media/common/v4l2-tpg/v4l2-tpg-core.c | 2 +-
 drivers/media/test-drivers/vivid/vivid-radio-rx.c | 4 ++--
 .../net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c   | 2 +-
 drivers/net/hamradio/baycom_epp.c | 2 +-
 drivers/net/hamradio/hdlcdrv.c| 2 +-
 drivers/net/hamradio/yam.c| 2 +-
 drivers/net/wireguard/selftest/allowedips.c   | 4 ++--
 drivers/net/wireless/st/cw1200/wsm.c  | 2 +-
 drivers/scsi/lpfc/lpfc_hbadisc.c  | 6 +++---
 lib/cmdline_kunit.c   | 4 ++--
 net/dccp/ipv4.c   | 4 ++--
 net/ipv4/datagram.c   | 2 +-
 net/ipv4/tcp_ipv4.c   | 4 ++--
 net/mac80211/scan.c   | 2 +-
 net/sched/sch_cake.c  | 6 +++---
 net/sctp/socket.c | 2 +-
 19 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/arch/arm/kernel/signal.c b/arch/arm/kernel/signal.c
index ea128e32e8ca..e07f359254c3 100644
--- a/arch/arm/kernel/signal.c
+++ b/arch/arm/kernel/signal.c
@@ -655,7 +655,7 @@ struct page *get_signal_page(void)
 PAGE_SIZE / sizeof(u32));
 
/* Give the signal return code some randomness */
-   offset = 0x200 + (get_random_int() & 0x7fc);
+   offset = 0x200 + (get_random_u16() & 0x7fc);
signal_return_offset = offset;
 
/* Copy signal return handlers into the page */
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 733451fe7e41..d72e8f23422d 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -67,7 +67,7 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int 
scno,
 *
 * The resulting 5 bits of entropy is seen in SP[8:4].
 */
-   choose_random_kstack_offset(get_random_int() & 0x1FF);
+   choose_random_kstack_offset(get_random_u16() & 0x1FF);
 }
 
 static inline bool has_syscall_work(unsigned long flags)
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index be45217acde4..981c637fa2ed 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -927,7 +927,7 @@ static void generate_random_bytes(u8 *buf, size_t count)
 

[PATCH v5 2/7] treewide: use prandom_u32_max() when possible, part 2

2022-10-08 Thread Jason A. Donenfeld
Rather than incurring a division or requesting too many random bytes for
the given range, use the prandom_u32_max() function, which only takes
the minimum required bytes from the RNG and avoids divisions. This was
done by hand, covering things that coccinelle could not do on its own.

Reviewed-by: Kees Cook 
Reviewed-by: Jan Kara  # for ext2, ext4, and sbitmap
Signed-off-by: Jason A. Donenfeld 
---
 fs/ext2/ialloc.c   |  3 +--
 fs/ext4/ialloc.c   |  5 ++---
 lib/sbitmap.c  |  2 +-
 lib/test_vmalloc.c | 17 -
 4 files changed, 8 insertions(+), 19 deletions(-)

diff --git a/fs/ext2/ialloc.c b/fs/ext2/ialloc.c
index 998dd2ac8008..f4944c4dee60 100644
--- a/fs/ext2/ialloc.c
+++ b/fs/ext2/ialloc.c
@@ -277,8 +277,7 @@ static int find_group_orlov(struct super_block *sb, struct 
inode *parent)
int best_ndir = inodes_per_group;
int best_group = -1;
 
-   group = prandom_u32();
-   parent_group = (unsigned)group % ngroups;
+   parent_group = prandom_u32_max(ngroups);
for (i = 0; i < ngroups; i++) {
group = (parent_group + i) % ngroups;
desc = ext2_get_group_desc (sb, group, NULL);
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index f73e5eb43eae..36d5bc595cc2 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -463,10 +463,9 @@ static int find_group_orlov(struct super_block *sb, struct 
inode *parent,
hinfo.hash_version = DX_HASH_HALF_MD4;
hinfo.seed = sbi->s_hash_seed;
ext4fs_dirhash(parent, qstr->name, qstr->len, );
-   grp = hinfo.hash;
+   parent_group = hinfo.hash % ngroups;
} else
-   grp = prandom_u32();
-   parent_group = (unsigned)grp % ngroups;
+   parent_group = prandom_u32_max(ngroups);
for (i = 0; i < ngroups; i++) {
g = (parent_group + i) % ngroups;
get_orlov_stats(sb, g, flex_size, );
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index c4f04edf3ee9..ef0661504561 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -21,7 +21,7 @@ static int init_alloc_hint(struct sbitmap *sb, gfp_t flags)
int i;
 
for_each_possible_cpu(i)
-   *per_cpu_ptr(sb->alloc_hint, i) = prandom_u32() % depth;
+   *per_cpu_ptr(sb->alloc_hint, i) = 
prandom_u32_max(depth);
}
return 0;
 }
diff --git a/lib/test_vmalloc.c b/lib/test_vmalloc.c
index 4f2f2d1bac56..a26bbbf20e62 100644
--- a/lib/test_vmalloc.c
+++ b/lib/test_vmalloc.c
@@ -151,9 +151,7 @@ static int random_size_alloc_test(void)
int i;
 
for (i = 0; i < test_loop_count; i++) {
-   n = prandom_u32();
-   n = (n % 100) + 1;
-
+   n = prandom_u32_max(100) + 1;
p = vmalloc(n * PAGE_SIZE);
 
if (!p)
@@ -293,16 +291,12 @@ pcpu_alloc_test(void)
return -1;
 
for (i = 0; i < 35000; i++) {
-   unsigned int r;
-
-   r = prandom_u32();
-   size = (r % (PAGE_SIZE / 4)) + 1;
+   size = prandom_u32_max(PAGE_SIZE / 4) + 1;
 
/*
 * Maximum PAGE_SIZE
 */
-   r = prandom_u32();
-   align = 1 << ((r % 11) + 1);
+   align = 1 << (prandom_u32_max(11) + 1);
 
pcpu[i] = __alloc_percpu(size, align);
if (!pcpu[i])
@@ -393,14 +387,11 @@ static struct test_driver {
 
 static void shuffle_array(int *arr, int n)
 {
-   unsigned int rnd;
int i, j;
 
for (i = n - 1; i > 0; i--)  {
-   rnd = prandom_u32();
-
/* Cut the range. */
-   j = rnd % i;
+   j = prandom_u32_max(i);
 
/* Swap indexes. */
swap(arr[i], arr[j]);
-- 
2.37.3



[PATCH v5 1/7] treewide: use prandom_u32_max() when possible, part 1

2022-10-08 Thread Jason A. Donenfeld
Rather than incurring a division or requesting too many random bytes for
the given range, use the prandom_u32_max() function, which only takes
the minimum required bytes from the RNG and avoids divisions. This was
done mechanically with this coccinelle script:

@basic@
expression E;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
@@
(
- ((T)get_random_u32() % (E))
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ((E) - 1))
+ prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2)
|
- ((T)get_random_u32() & ~PAGE_MASK)
+ prandom_u32_max(PAGE_SIZE)
)

@multi_line@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
identifier RAND;
expression E;
@@

-   RAND = get_random_u32();
... when != RAND
-   RAND %= (E);
+   RAND = prandom_u32_max(E);

// Find a potential literal
@literal_mask@
expression LITERAL;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
position p;
@@

((T)get_random_u32()@p & (LITERAL))

// Add one to the literal.
@script:python add_one@
literal << literal_mask.LITERAL;
RESULT;
@@

value = None
if literal.startswith('0x'):
value = int(literal, 16)
elif literal[0] in '123456789':
value = int(literal, 10)
if value is None:
print("I don't know how to handle %s" % (literal))
cocci.include_match(False)
elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 
2**16 - 1 or value == 2**8 - 1:
print("Skipping 0x%x for cleanup elsewhere" % (value))
cocci.include_match(False)
elif value & (value + 1) != 0:
print("Skipping 0x%x because it's not a power of two minus one" % 
(value))
cocci.include_match(False)
elif literal.startswith('0x'):
coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1))
else:
coccinelle.RESULT = cocci.make_expr("%d" % (value + 1))

// Replace the literal mask with the calculated result.
@plus_one@
expression literal_mask.LITERAL;
position literal_mask.p;
expression add_one.RESULT;
identifier FUNC;
@@

-   (FUNC()@p & (LITERAL))
+   prandom_u32_max(RESULT)

@collapse_ret@
type T;
identifier VAR;
expression E;
@@

 {
-   T VAR;
-   VAR = (E);
-   return VAR;
+   return E;
 }

@drop_var@
type T;
identifier VAR;
@@

 {
-   T VAR;
... when != VAR
 }

Reviewed-by: Kees Cook 
Reviewed-by: KP Singh 
Reviewed-by: Jan Kara  # for ext4 and sbitmap
Reviewed-by: Christoph Böhmwalder  # for drbd
Acked-by: Ulf Hansson  # for mmc
Acked-by: Darrick J. Wong  # for xfs
Signed-off-by: Jason A. Donenfeld 
---
 arch/arm/kernel/process.c |  2 +-
 arch/arm64/kernel/process.c   |  2 +-
 arch/loongarch/kernel/process.c   |  2 +-
 arch/loongarch/kernel/vdso.c  |  2 +-
 arch/mips/kernel/process.c|  2 +-
 arch/mips/kernel/vdso.c   |  2 +-
 arch/parisc/kernel/vdso.c |  2 +-
 arch/powerpc/kernel/process.c |  2 +-
 arch/s390/kernel/process.c|  2 +-
 arch/s390/kernel/vdso.c   |  2 +-
 arch/sparc/vdso/vma.c |  2 +-
 arch/um/kernel/process.c  |  2 +-
 arch/x86/entry/vdso/vma.c |  2 +-
 arch/x86/kernel/module.c  |  2 +-
 arch/x86/kernel/process.c |  2 +-
 arch/x86/mm/pat/cpa-test.c|  4 +-
 crypto/testmgr.c  | 86 +--
 drivers/block/drbd/drbd_receiver.c|  4 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  2 +-
 drivers/infiniband/core/cma.c |  2 +-
 drivers/infiniband/hw/cxgb4/id_table.c|  4 +-
 drivers/infiniband/hw/hns/hns_roce_ah.c   |  5 +-
 drivers/infiniband/ulp/rtrs/rtrs-clt.c|  3 +-
 drivers/md/bcache/request.c   |  2 +-
 .../test-drivers/vivid/vivid-touch-cap.c  |  2 +-
 drivers/mmc/core/core.c   |  4 +-
 drivers/mmc/host/dw_mmc.c |  2 +-
 drivers/mtd/nand/raw/nandsim.c|  4 +-
 drivers/mtd/tests/mtd_nandecctest.c   | 10 +--
 drivers/mtd/tests/stresstest.c| 17 +---
 drivers/mtd/ubi/debug.c   |  2 +-
 drivers/mtd/ubi/debug.h   |  6 +-
 drivers/net/ethernet/broadcom/cnic.c  |  3 +-
 .../chelsio/inline_crypto/chtls/chtls_io.c|  4 +-
 drivers/net/hamradio/baycom_epp.c |  2 +-
 drivers/net/hamradio/hdlcdrv.c|  2 +-
 drivers/net/hamradio/yam.c|  2 +-
 drivers/net/phy/at803x.c  |  2 +-
 .../broadcom/brcm80211/brcmfmac/p2p.c |  2 +-
 .../net/wireless/intel/iwlwifi/mvm/mac-ctxt.c |  2 +-
 drivers/scsi/fcoe/fcoe_ctlr.c |  4 +-
 drivers/scsi/qedi/qedi_main.c |  2 +-
 fs/ceph/inode.c   |  2 +-
 

[PATCH v5 0/7] treewide cleanup of random integer usage

2022-10-08 Thread Jason A. Donenfeld
Changes v4->v5:
- Coccinelle is now used for as much mechanical aspects as possible,
  with mechanical parts split off from non-mechanical parts. This should
  drastically reduce the amount of code that needs to be reviewed
  carefully. Each commit mentions now if it was done by hand or is
  mechanical.

Hi folks,

This is a five part treewide cleanup of random integer handling. The
rules for random integers are:

- If you want a secure or an insecure random u64, use get_random_u64().
- If you want a secure or an insecure random u32, use get_random_u32().
  * The old function prandom_u32() has been deprecated for a while now
and is just a wrapper around get_random_u32(). Same for
get_random_int().
- If you want a secure or an insecure random u16, use get_random_u16().
- If you want a secure or an insecure random u8, use get_random_u8().
- If you want secure or insecure random bytes, use get_random_bytes().
  * The old function prandom_bytes() has been deprecated for a while now
and has long been a wrapper around get_random_bytes().
- If you want a non-uniform random u32, u16, or u8 bounded by a certain
  open interval maximum, use prandom_u32_max().
  * I say "non-uniform", because it doesn't do any rejection sampling or
divisions. Hence, it stays within the prandom_* namespace.

These rules ought to be applied uniformly, so that we can clean up the
deprecated functions, and earn the benefits of using the modern
functions. In particular, in addition to the boring substitutions, this
patchset accomplishes a few nice effects:

- By using prandom_u32_max() with an upper-bound that the compiler can
  prove at compile-time is ≤65536 or ≤256, internally get_random_u16()
  or get_random_u8() is used, which wastes fewer batched random bytes,
  and hence has higher throughput.

- By using prandom_u32_max() instead of %, when the upper-bound is not a
  constant, division is still avoided, because prandom_u32_max() uses
  a faster multiplication-based trick instead.

- By using get_random_u16() or get_random_u8() in cases where the return
  value is intended to indeed be a u16 or a u8, we waste fewer batched
  random bytes, and hence have higher throughput.

So, based on those rules and benefits from following them, this patchset
breaks down into the following five steps:

1) Replace `prandom_u32() % max` and variants thereof with
   prandom_u32_max(max).

   * Part 1 is done with Coccinelle. Part 2 is done by hand.

2) Replace `(type)get_random_u32()` and variants thereof with
   get_random_u16() or get_random_u8(). I took the pains to actually
   look and see what every lvalue type was across the entire tree.

   * Part 1 is done with Coccinelle. Part 2 is done by hand.

3) Replace remaining deprecated uses of prandom_u32() and
   get_random_int() with get_random_u32(). 

   * A boring search and replace operation.

4) Replace remaining deprecated uses of prandom_bytes() with
   get_random_bytes().

   * A boring search and replace operation.

5) Remove the deprecated and now-unused prandom_u32() and
   prandom_bytes() inline wrapper functions.

   * Just deleting code and updating comments.

I was thinking of taking this through my random.git tree (on which this
series is currently based) and submitting it near the end of the merge
window, or waiting for the very end of the 6.1 cycle when there will be
the fewest new patches brewing. If somebody with some treewide-cleanup
experience might share some wisdom about what the best timing usually
winds up being, I'm all ears.

Please take a look! The number of lines touched is quite small, so this
should be reviewable, and as much as is possible has been pushed into
Coccinelle scripts.

Thanks,
Jason

Cc: Andreas Noever 
Cc: Andrew Morton 
Cc: Andy Shevchenko 
Cc: Borislav Petkov 
Cc: Catalin Marinas 
Cc: Christoph Böhmwalder 
Cc: Christoph Hellwig 
Cc: Christophe Leroy 
Cc: Daniel Borkmann 
Cc: Dave Airlie 
Cc: Dave Hansen 
Cc: David S. Miller 
Cc: Eric Dumazet 
Cc: Florian Westphal 
Cc: Greg Kroah-Hartman ,
Cc: H. Peter Anvin 
Cc: Heiko Carstens 
Cc: Helge Deller 
Cc: Herbert Xu 
Cc: Huacai Chen 
Cc: Hugh Dickins 
Cc: Jakub Kicinski 
Cc: James E.J. Bottomley 
Cc: Jan Kara 
Cc: Jason Gunthorpe 
Cc: Jens Axboe 
Cc: Johannes Berg 
Cc: Jonathan Corbet 
Cc: Jozsef Kadlecsik 
Cc: KP Singh 
Cc: Kees Cook 
Cc: Marco Elver 
Cc: Mauro Carvalho Chehab 
Cc: Michael Ellerman 
Cc: Pablo Neira Ayuso 
Cc: Paolo Abeni 
Cc: Peter Zijlstra 
Cc: Richard Weinberger 
Cc: Russell King 
Cc: Theodore Ts'o 
Cc: Thomas Bogendoerfer 
Cc: Thomas Gleixner 
Cc: Thomas Graf 
Cc: Ulf Hansson 
Cc: Vignesh Raghavendra 
Cc: WANG Xuerui 
Cc: Will Deacon 
Cc: Yury Norov 
Cc: dri-de...@lists.freedesktop.org
Cc: kasan-...@googlegroups.com
Cc: kernel-janit...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-bl...@vger.kernel.org
Cc: linux-cry...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-fsde...@vger.kernel.org
Cc: linux-me...@vger.kernel.org
Cc: 

Re: [PATCH v4 2/6] treewide: use prandom_u32_max() when possible

2022-10-08 Thread Kees Cook
[resending because I failed to CC]

On October 7, 2022 7:21:28 PM PDT, "Jason A. Donenfeld"  wrote:
>On Fri, Oct 07, 2022 at 03:47:44PM -0700, Kees Cook wrote:
>> On Fri, Oct 07, 2022 at 12:01:03PM -0600, Jason A. Donenfeld wrote:
>> > Rather than incurring a division or requesting too many random bytes for
>> > the given range, use the prandom_u32_max() function, which only takes
>> > the minimum required bytes from the RNG and avoids divisions.
>> 
>> I actually meant splitting the by-hand stuff by subsystem, but nearly
>> all of these can be done mechanically too, so it shouldn't be bad. Notes
>> below...
>
>Oh, cool, more coccinelle. You're basically giving me a class on these
>recipes. Much appreciated.

You're welcome! This was a fun exercise. :)

>
>> > [...]
>> > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
>> > index 92bcc1768f0b..87203429f802 100644
>> > --- a/arch/arm64/kernel/process.c
>> > +++ b/arch/arm64/kernel/process.c
>> > @@ -595,7 +595,7 @@ unsigned long __get_wchan(struct task_struct *p)
>> >  unsigned long arch_align_stack(unsigned long sp)
>> >  {
>> >if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
>> > -  sp -= get_random_int() & ~PAGE_MASK;
>> > +  sp -= prandom_u32_max(PAGE_SIZE);
>> >return sp & ~0xf;
>> >  }
>> >  
>> 
>> @mask@
>> expression MASK;
>> @@
>> 
>> - (get_random_int() & ~(MASK))
>> + prandom_u32_max(MASK)
>
>Not quite! PAGE_MASK != PAGE_SIZE. In this case, things get a litle
>more complicated where you can do:
>
>get_random_int() & MASK == prandom_u32_max(MASK + 1)
>*only if all the top bits of MASK are set* That is, if MASK one less

Oh whoops! Yes, right, I totally misread SIZE as MASK.

>than a power of two. Or if MASK & (MASK + 1) == 0.
>
>(If those top bits aren't set, you can technically do
>prandom_u32_max(MASK >> n + 1) << n. That'd be a nice thing to work out.
>But yeesh, maybe a bit much for the time being and probably a bit beyond
>coccinelle.)
>
>This case here, though, is a bit more special, where we can just rely on
>an obvious given kernel identity. Namely, PAGE_MASK == ~(PAGE_SIZE - 1).
>So ~PAGE_MASK == PAGE_SIZE - 1.
>So get_random_int() & ~PAGE_MASK == prandom_u32_max(PAGE_SIZE - 1 + 1).
>So get_random_int() & ~PAGE_MASK == prandom_u32_max(PAGE_SIZE).
>
>And most importantly, this makes the code more readable, since everybody
>knows what bounding by PAGE_SIZE means, where as what on earth is
>happening with the &~PAGE_MASK thing. So it's a good change. I'll try to
>teach coccinelle about that special case.

Yeah, it should be possible to just check for the literal.

>
>
>
>> > diff --git a/arch/loongarch/kernel/vdso.c b/arch/loongarch/kernel/vdso.c
>> > index f32c38abd791..8c9826062652 100644
>> > --- a/arch/loongarch/kernel/vdso.c
>> > +++ b/arch/loongarch/kernel/vdso.c
>> > @@ -78,7 +78,7 @@ static unsigned long vdso_base(void)
>> >unsigned long base = STACK_TOP;
>> >  
>> >if (current->flags & PF_RANDOMIZE) {
>> > -  base += get_random_int() & (VDSO_RANDOMIZE_SIZE - 1);
>> > +  base += prandom_u32_max(VDSO_RANDOMIZE_SIZE);
>> >base = PAGE_ALIGN(base);
>> >}
>> >  
>> 
>> @minus_one@
>> expression FULL;
>> @@
>> 
>> - (get_random_int() & ((FULL) - 1)
>> + prandom_u32_max(FULL)
>
>Ahh, well, okay, this is the example I mentioned above. Only works if
>FULL is saturated. Any clever way to get coccinelle to prove that? Can
>it look at the value of constants?

I'm not sure if Cocci will do that without a lot of work. The literals trick I 
used below would need a lot of fanciness. :)

>
>> 
>> > diff --git a/arch/parisc/kernel/vdso.c b/arch/parisc/kernel/vdso.c
>> > index 63dc44c4c246..47e5960a2f96 100644
>> > --- a/arch/parisc/kernel/vdso.c
>> > +++ b/arch/parisc/kernel/vdso.c
>> > @@ -75,7 +75,7 @@ int arch_setup_additional_pages(struct linux_binprm 
>> > *bprm,
>> >  
>> >map_base = mm->mmap_base;
>> >if (current->flags & PF_RANDOMIZE)
>> > -  map_base -= (get_random_int() & 0x1f) * PAGE_SIZE;
>> > +  map_base -= prandom_u32_max(0x20) * PAGE_SIZE;
>> >  
>> >vdso_text_start = get_unmapped_area(NULL, map_base, vdso_text_len, 0, 
>> > 0);
>> >  
>> 
>> These are more fun, but Coccinelle can still do them with a little
>> Pythonic help:
>> 
>> // Find a potential literal
>> @literal_mask@
>> expression LITERAL;
>> identifier randfunc =~ "get_random_int|prandom_u32|get_random_u32";
>> position p;
>> @@
>> 
>> (randfunc()@p & (LITERAL))
>> 
>> // Add one to the literal.
>> @script:python add_one@
>> literal << literal_mask.LITERAL;
>> RESULT;
>> @@
>> 
>> if literal.startswith('0x'):
>> value = int(literal, 16) + 1
>> coccinelle.RESULT = cocci.make_expr("0x%x" % (value))
>> elif literal[0] in '123456789':
>> value = int(literal, 10) + 1
>> coccinelle.RESULT = cocci.make_expr("%d" % (value))
>> else:
>> print("I don't know how to handle: %s" % (literal))
>> 
>> // 

Re: [PATCH v4 2/6] treewide: use prandom_u32_max() when possible

2022-10-08 Thread Jason A. Donenfeld
On Fri, Oct 07, 2022 at 03:47:44PM -0700, Kees Cook wrote:
> > diff --git a/lib/test_vmalloc.c b/lib/test_vmalloc.c
> > index 4f2f2d1bac56..56ffaa8dd3f6 100644
> > --- a/lib/test_vmalloc.c
> > +++ b/lib/test_vmalloc.c
> > @@ -151,9 +151,7 @@ static int random_size_alloc_test(void)
> > int i;
> >  
> > for (i = 0; i < test_loop_count; i++) {
> > -   n = prandom_u32();
> > -   n = (n % 100) + 1;
> > -
> > +   n = prandom_u32_max(n % 100) + 1;
> > p = vmalloc(n * PAGE_SIZE);
> >  
> > if (!p)
> 
> This looks wrong. Cocci says:
> 
> -   n = prandom_u32();
> -   n = (n % 100) + 1;
> +   n = prandom_u32_max(100) + 1;

I agree that's wrong, but what rule did you use to make Cocci generate
that?

Jason