Re: [PATCH] hwrng: core - Add WARN_ON for buggy read return values

2025-04-06 Thread Herbert Xu
On Tue, Oct 08, 2024 at 02:28:27AM +0300, Jarkko Sakkinen wrote:
>
> I guess it is TPM2_StartAuthSession returning TPM_RC_VALUE. Probably
> James should look into this as the bus encryption code is clearly
> tripping here.

OK I've just got a confirmation from the lkp that the tpm RNG is
indeed buggy:

https://lore.kernel.org/oe-lkp/202503261331.d388f82a-...@intel.com/

Quoting from the kmsg:

kern  :err   : [   28.766528] tpm tpm0: A TPM error (2307) occurred start auth 
session
kern  :info  : [   28.775547] resctrl: L3 allocation detected
kern  :warn  : [   28.777572] [ cut here ]
kern  :info  : [   28.780433] resctrl: MB allocation detected
kern  :warn  : [   28.785745] WARNING: CPU: 46 PID: 576 at 
drivers/char/hw_random/core.c:188 hwrng_fillfn+0x19c/0x1f0

...

kern  :warn  : [   28.891243] RAX: 0903 RBX: ff110001077f6760 RCX: 
ffa00ecb7e57

So the return value was 0x903, way bigger than what we asked for
(0x40).

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



Re: [PATCH] hwrng: core - Add WARN_ON for buggy read return values

2024-10-07 Thread Jarkko Sakkinen
On Fri, 2024-09-27 at 08:42 +0800, Herbert Xu wrote:
> On Tue, Sep 24, 2024 at 08:43:00PM +0300, Jarkko Sakkinen wrote:
> > 
> > Without any traces that would provide more information I don't see
> > the smoking gun.
> 
> I haven't confirmed that it's definitely the tpm2 driver, it's just
> based on the backtrace.  Hopefully my patch will confirm it one way
> or the other.  Here is the backtrace:

Agreed.

> 
> [  100.784159] vmd :c2:00.5: Bound to PCI domain 10002 
> [  100.786209] Monitor-Mwait will be used to enter C-1 state 
> [  100.786225] Monitor-Mwait will be used to enter C-2 state 
> [  100.786244] ACPI: \_SB_.SCK0.C000: Found 2 idle states 
> [  100.823093] input: Power Button as
> /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0 
> [  100.823636] ACPI: button: Power Button [PWRF] 
> [  100.905756] ERST: Error Record Serialization Table (ERST) support
> is initialized. 
> [  100.905858] pstore: Using crash dump compression: deflate 
> [  100.905861] pstore: Registered erst as persistent store backend 
> [  100.907044] Serial: 8250/16550 driver, 4 ports, IRQ sharing
> enabled 
> [  100.908305] 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud =
> 115200) is a 16550A 
> [  100.926608] 00:04: ttyS1 at I/O 0x2f8 (irq = 3, base_baud =
> 115200) is a 16550A 
> [  100.942953] Non-volatile memory driver v1.3 
> [  100.947908] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0x1B, rev-id
> 22) 
> [  101.226913] ACPI: bus type drm_connector registered 
> [  101.229708] alg: ecdh-nist-p256 (ecdh-nist-p256-generic) is
> disabled due to FIPS 
> [  101.229745] tpm tpm0: crypto ecdh allocation failed 
> [  101.236311] tpm tpm0: A TPM error (708) occurred start auth

I guess it is TPM2_StartAuthSession returning TPM_RC_VALUE. Probably
James should look into this as the bus encryption code is clearly
tripping here.

I'm on second week on a new job so cannot promise any bandwidth
yet this week. Earliest next week...

BR, Jarkko



Re: [PATCH] hwrng: core - Add WARN_ON for buggy read return values

2024-09-26 Thread Herbert Xu
On Tue, Sep 24, 2024 at 08:43:00PM +0300, Jarkko Sakkinen wrote:
>
> Without any traces that would provide more information I don't see
> the smoking gun.

I haven't confirmed that it's definitely the tpm2 driver, it's just
based on the backtrace.  Hopefully my patch will confirm it one way
or the other.  Here is the backtrace:

[  100.784159] vmd :c2:00.5: Bound to PCI domain 10002 
[  100.786209] Monitor-Mwait will be used to enter C-1 state 
[  100.786225] Monitor-Mwait will be used to enter C-2 state 
[  100.786244] ACPI: \_SB_.SCK0.C000: Found 2 idle states 
[  100.823093] input: Power Button as 
/devices/LNXSYSTM:00/LNXPWRBN:00/input/input0 
[  100.823636] ACPI: button: Power Button [PWRF] 
[  100.905756] ERST: Error Record Serialization Table (ERST) support is 
initialized. 
[  100.905858] pstore: Using crash dump compression: deflate 
[  100.905861] pstore: Registered erst as persistent store backend 
[  100.907044] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled 
[  100.908305] 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 
16550A 
[  100.926608] 00:04: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 
16550A 
[  100.942953] Non-volatile memory driver v1.3 
[  100.947908] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0x1B, rev-id 22) 
[  101.226913] ACPI: bus type drm_connector registered 
[  101.229708] alg: ecdh-nist-p256 (ecdh-nist-p256-generic) is disabled due to 
FIPS 
[  101.229745] tpm tpm0: crypto ecdh allocation failed 
[  101.236311] tpm tpm0: A TPM error (708) occurred start auth session 
[  101.238797] 
== 
[  101.238800] BUG: KASAN: slab-out-of-bounds in blake2s_update+0x135/0x2b0 
[  101.238808] Read of size 44 at addr ff11000167334d98 by task hwrng/318 
[  101.238811]  
[  101.238813] CPU: 26 UID: 0 PID: 318 Comm: hwrng Not tainted 
6.11.0-0.rc5.22.el10.x86_64+debug #1 
[  101.238818] Hardware name: Supermicro SSG-110P-NTR10-EI018/X12SPO-NTF, BIOS 
1.3 05/20/2022 
[  101.238820] Call Trace: 
[  101.238823]   
[  101.238826]  dump_stack_lvl+0x6f/0xb0 
[  101.238833]  ? blake2s_update+0x135/0x2b0 
[  101.238836]  print_address_description.constprop.0+0x88/0x330 
[  101.238843]  ? blake2s_update+0x135/0x2b0 
[  101.238847]  print_report+0x108/0x209 
[  101.238851]  ? blake2s_update+0x135/0x2b0 
[  101.238855]  ? __virt_addr_valid+0x20b/0x440 
[  101.238859]  ? blake2s_update+0x135/0x2b0 
[  101.238863]  kasan_report+0xa8/0xe0 
[  101.238868]  ? blake2s_update+0x135/0x2b0 
[  101.238874]  kasan_check_range+0x10f/0x1f0 
[  101.238879]  __asan_memcpy+0x23/0x60 
[  101.238883]  blake2s_update+0x135/0x2b0 
[  101.238887]  add_hwgenerator_randomness+0x3d/0xe0 
[  101.238895]  hwrng_fillfn+0x144/0x270 
[  101.238900]  ? __pfx_hwrng_fillfn+0x10/0x10 
[  101.238904]  kthread+0x2d2/0x3a0 
[  101.238908]  ? __pfx_kthread+0x10/0x10 
[  101.238912]  ret_from_fork+0x31/0x70 
[  101.238917]  ? __pfx_kthread+0x10/0x10 
[  101.238920]  ret_from_fork_asm+0x1a/0x30 
[  101.238929]   
[  101.238931]  
[  101.238932] Allocated by task 1: 
[  101.238934]  kasan_save_stack+0x30/0x50 
[  101.238937]  kasan_save_track+0x14/0x30 
[  101.238940]  __kasan_kmalloc+0x8f/0xa0 
[  101.238942]  __kmalloc_noprof+0x1fe/0x410 
[  101.238947]  kobj_map+0x7e/0x6d0 
[  101.238951]  cdev_add+0x92/0x180 
[  101.238954]  tty_cdev_add+0x17a/0x280 
[  101.238957]  tty_register_device_attr+0x401/0x740 
[  101.238960]  tty_register_driver+0x381/0x6f0 
[  101.238963]  vty_init+0x2c1/0x2f0 
[  101.238967]  tty_init+0x13b/0x150 
[  101.238970]  do_one_initcall+0x11c/0x5c0 
[  101.238975]  do_initcalls+0x1b4/0x1f0 
[  101.238980]  kernel_init_freeable+0x4ae/0x520 
[  101.238984]  kernel_init+0x1c/0x150 
[  101.238988]  ret_from_fork+0x31/0x70 
[  101.238991]  ret_from_fork_asm+0x1a/0x30 
[  101.238994]  
[  101.238995] The buggy address belongs to the object at ff11000167334d80 
[  101.238995]  which belongs to the cache kmalloc-64 of size 64 
[  101.238998] The buggy address is located 24 bytes inside of 
[  101.238998]  allocated 56-byte region [ff11000167334d80, ff11000167334db8) 
[  101.239002]  
[  101.239003] The buggy address belongs to the physical page: 
[  101.239004] page: refcount:1 mapcount:0 mapping: index:0x0 
pfn:0x167334 
[  101.239008] flags: 0x17c000(node=0|zone=2|lastcpupid=0x1f) 
[  101.239012] page_type: 0xfdff(slab) 
[  101.239016] raw: 0017c000 ff1100010003c8c0 dead0122 
 
[  101.239019] raw:  00200020 0001fdff 
 
[  101.239021] page dumped because: kasan: bad access detected 
[  101.239023]  
[  101.239024] Memory state around the buggy address: 
[  101.239025]  ff11000167334c80: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc 
fc 
[  101.239028]  ff11000167334d00: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc 
fc 
[  101.239030] >ff11000167334d80: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc 
fc 
[  101.239031]

Re: [PATCH] hwrng: core - Add WARN_ON for buggy read return values

2024-09-24 Thread Jarkko Sakkinen
On Tue Sep 24, 2024 at 7:05 PM EEST, Jarkko Sakkinen wrote:
> On Tue Sep 24, 2024 at 1:32 AM EEST, Herbert Xu wrote:
> > On Mon, Sep 23, 2024 at 04:48:27PM +0200, Greg KH wrote:
> > >
> > > Please see:
> > >   
> > > https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on
> > > which describes that.  We should make it more explicit that any WARN()
> > > or WARN_ON() calls that can be hit by user interactions somehow, will
> > > end up getting a CVE id when we fix it up to not do so.
> >
> > If the aformentioned WARN_ON hits, then the driver has probabaly
> > already done a buffer overrun so it's a CVE anyway.
>
> We'll see I finally got into testing this. Sorry for latencies, I'm
> switching jobs and unfortunately German Post Office lost my priority
> mail containing contracts (sent them from Finland to Berlin) so have
> been signing, scanning etc. the whole day :-) My last week in the
> current job, and next week is the first in the new job, so this
> week is a bit bumpy.

I get nothing with this:

diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
index aba024cbe7c5..856a8356d971 100644
--- a/drivers/char/tpm/tpm2-cmd.c
+++ b/drivers/char/tpm/tpm2-cmd.c
@@ -341,12 +341,15 @@ int tpm2_get_random(struct tpm_chip *chip, u8 *dest, 
size_t max)

dest_ptr += recd;
total += recd;
+
+   WARN_ON(num_bytes < recd);
num_bytes -= recd;
} while (retries-- && total < max);

tpm_buf_destroy(&buf);
tpm2_end_auth_session(chip);

+   WARN_ON(total > max);
return total ? total : -EIO;
 out:
tpm_buf_destroy(&buf);

[WARN_ON()'s here are only for the temporary diff]

Call stack:

1. tpm2_get_random():
   
https://elixir.bootlin.com/linux/v6.11-rc7/source/drivers/char/tpm/tpm2-cmd.c#L281
2. tpm_get_random():
   
https://elixir.bootlin.com/linux/v6.11-rc7/source/drivers/char/tpm/tpm-interface.c#L430
3. tpm_hwrng_read():
   
https://elixir.bootlin.com/linux/v6.11-rc7/source/drivers/char/tpm/tpm-chip.c#L524

Everything seems to have also appropriate range checks.

Without any traces that would provide more information I don't see
the smoking gun.

BR, Jarkko



Re: [PATCH] hwrng: core - Add WARN_ON for buggy read return values

2024-09-24 Thread Jarkko Sakkinen
On Tue Sep 24, 2024 at 1:32 AM EEST, Herbert Xu wrote:
> On Mon, Sep 23, 2024 at 04:48:27PM +0200, Greg KH wrote:
> >
> > Please see:
> > 
> > https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on
> > which describes that.  We should make it more explicit that any WARN()
> > or WARN_ON() calls that can be hit by user interactions somehow, will
> > end up getting a CVE id when we fix it up to not do so.
>
> If the aformentioned WARN_ON hits, then the driver has probabaly
> already done a buffer overrun so it's a CVE anyway.

We'll see I finally got into testing this. Sorry for latencies, I'm
switching jobs and unfortunately German Post Office lost my priority
mail containing contracts (sent them from Finland to Berlin) so have
been signing, scanning etc. the whole day :-) My last week in the
current job, and next week is the first in the new job, so this
week is a bit bumpy.

BR, Jarkko



Re: [PATCH] hwrng: core - Add WARN_ON for buggy read return values

2024-09-23 Thread Herbert Xu
On Mon, Sep 23, 2024 at 04:48:27PM +0200, Greg KH wrote:
>
> Please see:
>   
> https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on
> which describes that.  We should make it more explicit that any WARN()
> or WARN_ON() calls that can be hit by user interactions somehow, will
> end up getting a CVE id when we fix it up to not do so.

If the aformentioned WARN_ON hits, then the driver has probabaly
already done a buffer overrun so it's a CVE anyway.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



Re: [PATCH] hwrng: core - Add WARN_ON for buggy read return values

2024-09-23 Thread Jarkko Sakkinen
On Mon Sep 23, 2024 at 5:48 PM EEST, Greg KH wrote:
> On Mon, Sep 23, 2024 at 05:31:47PM +0300, Jarkko Sakkinen wrote:
> > On Mon Sep 23, 2024 at 12:26 PM EEST, Herbert Xu wrote:
> >  > > +
> > > > > + err = rng->read(rng, buffer, size, wait);
> > > > > + if (WARN_ON_ONCE(err > 0 && err > size))
> > > > 
> > > > Are you sure you want to use WARN_ON_ONCE here instead of
> > > > pr_warn_once()? I.e. is it worth of taking down the whole
> > > > kernel?
> > >
> > > Absolutely.  If this triggers it's a serious kernel bug and we
> > > should gather as much information as possible.  pr_warn_once is
> > > not the same thing as WARN_ON_ONCE in terms of what it prints.
> > 
> > Personally I allow the use of WARN only as the last resort.
> > 
> > If you need stack printout you can always use dump_stack().
> > 
> > >
> > > If people want to turn WARNs into BUGs, then they've only got
> > > themselves to blame when the kernel goes down.  On the other
> > > hand perhaps they *do* want this to panic and we should hand
> > > it to them.
> > 
> > Actually when you turn on "panic_on_warn" the user expectation is and
> > should be that the sites where WARN is used have been hand picked with
> > consideration so that panic happens for a reason.
> > 
> > This has also been denoted repeatedly by Greg:
> > 
> > https://lore.kernel.org/linux-cve-announce/2024061828-CVE-2024-36975-6719@gregkh/
> > 
> > I should check this somewhere but actually these days a wrongly chosen
> > WARN() might lead to CVE entry. That fix was by me but I never created
> > the CVE.
> > 
> > Greg, did we have something under Documentation/ that would fully
> > address the use of WARN?
>
> Please see:
>   
> https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on
> which describes that.  We should make it more explicit that any WARN()
> or WARN_ON() calls that can be hit by user interactions somehow, will
> end up getting a CVE id when we fix it up to not do so.

I bookmarked this thanks :-)

Herbert, I'll do comprehensive testing tmrw by adding some invariants to
tpm2_get_random(). I'd really love to reimplement it because the current
implementation frankly sucks (and it's by me) but yep, we nee to fix it
first and foremost.

>
> thanks,
>
> greg k-h


BR, Jarkko



Re: [PATCH] hwrng: core - Add WARN_ON for buggy read return values

2024-09-23 Thread Greg KH
On Mon, Sep 23, 2024 at 05:31:47PM +0300, Jarkko Sakkinen wrote:
> On Mon Sep 23, 2024 at 12:26 PM EEST, Herbert Xu wrote:
>  > > +
> > > > +   err = rng->read(rng, buffer, size, wait);
> > > > +   if (WARN_ON_ONCE(err > 0 && err > size))
> > > 
> > > Are you sure you want to use WARN_ON_ONCE here instead of
> > > pr_warn_once()? I.e. is it worth of taking down the whole
> > > kernel?
> >
> > Absolutely.  If this triggers it's a serious kernel bug and we
> > should gather as much information as possible.  pr_warn_once is
> > not the same thing as WARN_ON_ONCE in terms of what it prints.
> 
> Personally I allow the use of WARN only as the last resort.
> 
> If you need stack printout you can always use dump_stack().
> 
> >
> > If people want to turn WARNs into BUGs, then they've only got
> > themselves to blame when the kernel goes down.  On the other
> > hand perhaps they *do* want this to panic and we should hand
> > it to them.
> 
> Actually when you turn on "panic_on_warn" the user expectation is and
> should be that the sites where WARN is used have been hand picked with
> consideration so that panic happens for a reason.
> 
> This has also been denoted repeatedly by Greg:
> 
> https://lore.kernel.org/linux-cve-announce/2024061828-CVE-2024-36975-6719@gregkh/
> 
> I should check this somewhere but actually these days a wrongly chosen
> WARN() might lead to CVE entry. That fix was by me but I never created
> the CVE.
> 
> Greg, did we have something under Documentation/ that would fully
> address the use of WARN?

Please see:

https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on
which describes that.  We should make it more explicit that any WARN()
or WARN_ON() calls that can be hit by user interactions somehow, will
end up getting a CVE id when we fix it up to not do so.

thanks,

greg k-h



Re: [PATCH] hwrng: core - Add WARN_ON for buggy read return values

2024-09-23 Thread Jarkko Sakkinen
On Mon Sep 23, 2024 at 5:31 PM EEST, Jarkko Sakkinen wrote:
> Greg, did we have something under Documentation/ that would fully
> address the use of WARN?

... personally I think that even if there was a separate document, this
should be somehow addressed already in SubmittingPatches so that it
can't be possibly missed by anyone.

BR, Jarkko



Re: [PATCH] hwrng: core - Add WARN_ON for buggy read return values

2024-09-23 Thread Jarkko Sakkinen
On Mon Sep 23, 2024 at 12:26 PM EEST, Herbert Xu wrote:
 > > +
> > > + err = rng->read(rng, buffer, size, wait);
> > > + if (WARN_ON_ONCE(err > 0 && err > size))
> > 
> > Are you sure you want to use WARN_ON_ONCE here instead of
> > pr_warn_once()? I.e. is it worth of taking down the whole
> > kernel?
>
> Absolutely.  If this triggers it's a serious kernel bug and we
> should gather as much information as possible.  pr_warn_once is
> not the same thing as WARN_ON_ONCE in terms of what it prints.

Personally I allow the use of WARN only as the last resort.

If you need stack printout you can always use dump_stack().

>
> If people want to turn WARNs into BUGs, then they've only got
> themselves to blame when the kernel goes down.  On the other
> hand perhaps they *do* want this to panic and we should hand
> it to them.

Actually when you turn on "panic_on_warn" the user expectation is and
should be that the sites where WARN is used have been hand picked with
consideration so that panic happens for a reason.

This has also been denoted repeatedly by Greg:

https://lore.kernel.org/linux-cve-announce/2024061828-CVE-2024-36975-6719@gregkh/

I should check this somewhere but actually these days a wrongly chosen
WARN() might lead to CVE entry. That fix was by me but I never created
the CVE.

Greg, did we have something under Documentation/ that would fully
address the use of WARN?

BR, Jarkko



Re: [PATCH] hwrng: core - Add WARN_ON for buggy read return values

2024-09-23 Thread Herbert Xu
On Mon, Sep 23, 2024 at 10:52:49AM +0300, Jarkko Sakkinen wrote:
> On Mon Sep 23, 2024 at 9:05 AM EEST, Herbert Xu wrote:
> > Dear TPM maintainers:
> 
> There's really only just me (for past 10 years). Maybe that should be
> updatred.

:)

> > BUG_ON(!mutex_is_locked(&reading_mutex));
> > -   if (rng->read)
> > -   return rng->read(rng, (void *)buffer, size, wait);
> > +   if (rng->read) {
> > +   int err;
> > +
> > +   err = rng->read(rng, buffer, size, wait);
> > +   if (WARN_ON_ONCE(err > 0 && err > size))
> 
> Are you sure you want to use WARN_ON_ONCE here instead of
> pr_warn_once()? I.e. is it worth of taking down the whole
> kernel?

Absolutely.  If this triggers it's a serious kernel bug and we
should gather as much information as possible.  pr_warn_once is
not the same thing as WARN_ON_ONCE in terms of what it prints.

If people want to turn WARNs into BUGs, then they've only got
themselves to blame when the kernel goes down.  On the other
hand perhaps they *do* want this to panic and we should hand
it to them.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



Re: [PATCH] hwrng: core - Add WARN_ON for buggy read return values

2024-09-23 Thread Jarkko Sakkinen
On Mon Sep 23, 2024 at 11:07 AM EEST, Jarkko Sakkinen wrote:
> On Mon Sep 23, 2024 at 10:52 AM EEST, Jarkko Sakkinen wrote:
> > On Mon Sep 23, 2024 at 9:05 AM EEST, Herbert Xu wrote:
> > > Dear TPM maintainers:
> >
> > There's really only just me (for past 10 years). Maybe that should be
> > updatred.
> >
> > >
> > > Please have a look at the tpm hwrng driver because it appears to
> > > be returning a length longer than the buffer length that we gave
> > > it.  In particular, tpm2 appears to be the culprit (though I didn't
> > > really check tpm1 at all so it could also be buggy).
> > >
> > > The following patch hopefully should confirm that this is indeed
> > > caused by TPM and not some other HWRNG driver.
> >
> >
> >
> >
> > >
> > > ---8<---
> > > If a buggy driver returns a length that is longer than the size
> > > of the buffer provided to it, then this may lead to a buffer overread
> > > in the caller.
> > >
> > > Stop this by adding a check for it in the hwrng core.
> > >
> > > Reported-by: Guangwu Zhang 
> > > Signed-off-by: Herbert Xu 
> > >
> > > diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> > > index 57c51efa5613..018316f54621 100644
> > > --- a/drivers/char/hw_random/core.c
> > > +++ b/drivers/char/hw_random/core.c
> > > @@ -181,8 +181,15 @@ static inline int rng_get_data(struct hwrng *rng, u8 
> > > *buffer, size_t size,
> > >   int present;
> > >  
> > >   BUG_ON(!mutex_is_locked(&reading_mutex));
> > > - if (rng->read)
> > > - return rng->read(rng, (void *)buffer, size, wait);
> > > + if (rng->read) {
> > > + int err;
> > > +
> > > + err = rng->read(rng, buffer, size, wait);
> > > + if (WARN_ON_ONCE(err > 0 && err > size))
> >
> > Are you sure you want to use WARN_ON_ONCE here instead of
> > pr_warn_once()? I.e. is it worth of taking down the whole
> > kernel?
>
> I looked at tpm2_get_random() and it is pretty inefficient (not same
> as saying it has a bug). I'd love to reimplement it.
>
> We would be better of it would pull random let's say with 256 byte
> or 512 byte chunks and cache that internal to tpm_chip. Then the
> requests would be served from that pre-fetched pool and not do
> round-trip to TPM every single time.
>
> This would improve overall kernel performance given the reduced
> number of wait states. I wonder if anyone knows what would be a
> good fetch size that will work on most TPM2 chips?

Herbert, I'm going to go to gym but I could help you to debug the
issue you're seeing with a patch from tpm2_get_random(). We need
to fix that one first ofc (as you never should build on top of
broken code). Once back from gym and grocery shopping I'll bake
a debugging patch.

BR, Jarkko



Re: [PATCH] hwrng: core - Add WARN_ON for buggy read return values

2024-09-23 Thread Jarkko Sakkinen
On Mon Sep 23, 2024 at 10:52 AM EEST, Jarkko Sakkinen wrote:
> On Mon Sep 23, 2024 at 9:05 AM EEST, Herbert Xu wrote:
> > Dear TPM maintainers:
>
> There's really only just me (for past 10 years). Maybe that should be
> updatred.
>
> >
> > Please have a look at the tpm hwrng driver because it appears to
> > be returning a length longer than the buffer length that we gave
> > it.  In particular, tpm2 appears to be the culprit (though I didn't
> > really check tpm1 at all so it could also be buggy).
> >
> > The following patch hopefully should confirm that this is indeed
> > caused by TPM and not some other HWRNG driver.
>
>
>
>
> >
> > ---8<---
> > If a buggy driver returns a length that is longer than the size
> > of the buffer provided to it, then this may lead to a buffer overread
> > in the caller.
> >
> > Stop this by adding a check for it in the hwrng core.
> >
> > Reported-by: Guangwu Zhang 
> > Signed-off-by: Herbert Xu 
> >
> > diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> > index 57c51efa5613..018316f54621 100644
> > --- a/drivers/char/hw_random/core.c
> > +++ b/drivers/char/hw_random/core.c
> > @@ -181,8 +181,15 @@ static inline int rng_get_data(struct hwrng *rng, u8 
> > *buffer, size_t size,
> > int present;
> >  
> > BUG_ON(!mutex_is_locked(&reading_mutex));
> > -   if (rng->read)
> > -   return rng->read(rng, (void *)buffer, size, wait);
> > +   if (rng->read) {
> > +   int err;
> > +
> > +   err = rng->read(rng, buffer, size, wait);
> > +   if (WARN_ON_ONCE(err > 0 && err > size))
>
> Are you sure you want to use WARN_ON_ONCE here instead of
> pr_warn_once()? I.e. is it worth of taking down the whole
> kernel?

I looked at tpm2_get_random() and it is pretty inefficient (not same
as saying it has a bug). I'd love to reimplement it.

We would be better of it would pull random let's say with 256 byte
or 512 byte chunks and cache that internal to tpm_chip. Then the
requests would be served from that pre-fetched pool and not do
round-trip to TPM every single time.

This would improve overall kernel performance given the reduced
number of wait states. I wonder if anyone knows what would be a
good fetch size that will work on most TPM2 chips?

BR, Jarkko



Re: [PATCH] hwrng: core - Add WARN_ON for buggy read return values

2024-09-23 Thread Jarkko Sakkinen
On Mon Sep 23, 2024 at 9:05 AM EEST, Herbert Xu wrote:
> Dear TPM maintainers:

There's really only just me (for past 10 years). Maybe that should be
updatred.

>
> Please have a look at the tpm hwrng driver because it appears to
> be returning a length longer than the buffer length that we gave
> it.  In particular, tpm2 appears to be the culprit (though I didn't
> really check tpm1 at all so it could also be buggy).
>
> The following patch hopefully should confirm that this is indeed
> caused by TPM and not some other HWRNG driver.




>
> ---8<---
> If a buggy driver returns a length that is longer than the size
> of the buffer provided to it, then this may lead to a buffer overread
> in the caller.
>
> Stop this by adding a check for it in the hwrng core.
>
> Reported-by: Guangwu Zhang 
> Signed-off-by: Herbert Xu 
>
> diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> index 57c51efa5613..018316f54621 100644
> --- a/drivers/char/hw_random/core.c
> +++ b/drivers/char/hw_random/core.c
> @@ -181,8 +181,15 @@ static inline int rng_get_data(struct hwrng *rng, u8 
> *buffer, size_t size,
>   int present;
>  
>   BUG_ON(!mutex_is_locked(&reading_mutex));
> - if (rng->read)
> - return rng->read(rng, (void *)buffer, size, wait);
> + if (rng->read) {
> + int err;
> +
> + err = rng->read(rng, buffer, size, wait);
> + if (WARN_ON_ONCE(err > 0 && err > size))

Are you sure you want to use WARN_ON_ONCE here instead of
pr_warn_once()? I.e. is it worth of taking down the whole
kernel?

> + err = size;
> +
> + return err;
> + }
>  
>   if (rng->data_present)
>   present = rng->data_present(rng, wait);

BR, Jarkko