Re: [PATCH] lib/mpi: call cond_resched() from mpi_powm() loop
On Fri, Nov 10, 2017 at 10:37:30PM +1100, Herbert Xu wrote: > On Mon, Nov 06, 2017 at 10:19:51PM -0800, Eric Biggers wrote: > > From: Eric Biggers> > > > On a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the > > largest permitted inputs (16384 bits), the kernel spends 10+ seconds > > doing modular exponentiation in mpi_powm() without rescheduling. If all > > threads do it, it locks up the system. Moreover, it can cause > > rcu_sched-stall warnings. > > > > Notwithstanding the insanity of doing this calculation in kernel mode > > rather than in userspace, fix it by calling cond_resched() as each bit > > from the exponent is processed. It's still noninterruptible, but at > > least it's preemptible now. > > > > Cc: sta...@vger.kernel.org # v4.12+ > > Signed-off-by: Eric Biggers > > Patch applied. Thanks. > -- If it's not too late can you fix the stable line to be just Cc: sta...@vger.kernel.org As Mat pointed out KEYCTL_DH_COMPUTE was actually introduced in v4.7. Also I think the code is also reachable through RSA by adding an x509 certificate using the "asymmetric" key type, although that appears to be limited to 4096-bit inputs rather than 16384 bits. Eric
Re: [PATCH] lib/mpi: call cond_resched() from mpi_powm() loop
On Fri, Nov 10, 2017 at 10:37:30PM +1100, Herbert Xu wrote: > On Mon, Nov 06, 2017 at 10:19:51PM -0800, Eric Biggers wrote: > > From: Eric Biggers > > > > On a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the > > largest permitted inputs (16384 bits), the kernel spends 10+ seconds > > doing modular exponentiation in mpi_powm() without rescheduling. If all > > threads do it, it locks up the system. Moreover, it can cause > > rcu_sched-stall warnings. > > > > Notwithstanding the insanity of doing this calculation in kernel mode > > rather than in userspace, fix it by calling cond_resched() as each bit > > from the exponent is processed. It's still noninterruptible, but at > > least it's preemptible now. > > > > Cc: sta...@vger.kernel.org # v4.12+ > > Signed-off-by: Eric Biggers > > Patch applied. Thanks. > -- If it's not too late can you fix the stable line to be just Cc: sta...@vger.kernel.org As Mat pointed out KEYCTL_DH_COMPUTE was actually introduced in v4.7. Also I think the code is also reachable through RSA by adding an x509 certificate using the "asymmetric" key type, although that appears to be limited to 4096-bit inputs rather than 16384 bits. Eric
Re: [PATCH] lib/mpi: call cond_resched() from mpi_powm() loop
On Mon, Nov 06, 2017 at 10:19:51PM -0800, Eric Biggers wrote: > From: Eric Biggers> > On a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the > largest permitted inputs (16384 bits), the kernel spends 10+ seconds > doing modular exponentiation in mpi_powm() without rescheduling. If all > threads do it, it locks up the system. Moreover, it can cause > rcu_sched-stall warnings. > > Notwithstanding the insanity of doing this calculation in kernel mode > rather than in userspace, fix it by calling cond_resched() as each bit > from the exponent is processed. It's still noninterruptible, but at > least it's preemptible now. > > Cc: sta...@vger.kernel.org # v4.12+ > Signed-off-by: Eric Biggers Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] lib/mpi: call cond_resched() from mpi_powm() loop
On Mon, Nov 06, 2017 at 10:19:51PM -0800, Eric Biggers wrote: > From: Eric Biggers > > On a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the > largest permitted inputs (16384 bits), the kernel spends 10+ seconds > doing modular exponentiation in mpi_powm() without rescheduling. If all > threads do it, it locks up the system. Moreover, it can cause > rcu_sched-stall warnings. > > Notwithstanding the insanity of doing this calculation in kernel mode > rather than in userspace, fix it by calling cond_resched() as each bit > from the exponent is processed. It's still noninterruptible, but at > least it's preemptible now. > > Cc: sta...@vger.kernel.org # v4.12+ > Signed-off-by: Eric Biggers Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] lib/mpi: call cond_resched() from mpi_powm() loop
On Tue, Nov 07, 2017 at 10:38:30AM -0800, Mat Martineau wrote: > > Eric, > > On Mon, 6 Nov 2017, Eric Biggers wrote: > > >From: Eric Biggers> > > >On a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the > >largest permitted inputs (16384 bits), the kernel spends 10+ seconds > >doing modular exponentiation in mpi_powm() without rescheduling. If all > >threads do it, it locks up the system. Moreover, it can cause > >rcu_sched-stall warnings. > > > >Notwithstanding the insanity of doing this calculation in kernel mode > >rather than in userspace, fix it by calling cond_resched() as each bit > >from the exponent is processed. It's still noninterruptible, but at > >least it's preemptible now. > > cond_resched() is in the outer loop and gets called every > BITS_PER_LONG bits. That seems to be often enough for the system > that was taking 10+ seconds, and might be ok for slower processors. > > Was your intent to call cond_resched() for every bit as you > described in the commit message? > You're right, the cond_resched() is actually once per "limb", not once per bit. With the largest permitted inputs (16384 bits), each limb of the exponent takes about 38 milliseconds on an x86_64 CPU. Therefore on some other CPUs it will probably take 100+ milliseconds, which is much too long. So I guess it should do cond_resched() for each bit. I'll send a revised patch... Eric
Re: [PATCH] lib/mpi: call cond_resched() from mpi_powm() loop
On Tue, Nov 07, 2017 at 10:38:30AM -0800, Mat Martineau wrote: > > Eric, > > On Mon, 6 Nov 2017, Eric Biggers wrote: > > >From: Eric Biggers > > > >On a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the > >largest permitted inputs (16384 bits), the kernel spends 10+ seconds > >doing modular exponentiation in mpi_powm() without rescheduling. If all > >threads do it, it locks up the system. Moreover, it can cause > >rcu_sched-stall warnings. > > > >Notwithstanding the insanity of doing this calculation in kernel mode > >rather than in userspace, fix it by calling cond_resched() as each bit > >from the exponent is processed. It's still noninterruptible, but at > >least it's preemptible now. > > cond_resched() is in the outer loop and gets called every > BITS_PER_LONG bits. That seems to be often enough for the system > that was taking 10+ seconds, and might be ok for slower processors. > > Was your intent to call cond_resched() for every bit as you > described in the commit message? > You're right, the cond_resched() is actually once per "limb", not once per bit. With the largest permitted inputs (16384 bits), each limb of the exponent takes about 38 milliseconds on an x86_64 CPU. Therefore on some other CPUs it will probably take 100+ milliseconds, which is much too long. So I guess it should do cond_resched() for each bit. I'll send a revised patch... Eric
Re: [PATCH] lib/mpi: call cond_resched() from mpi_powm() loop
Eric, On Mon, 6 Nov 2017, Eric Biggers wrote: From: Eric BiggersOn a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the largest permitted inputs (16384 bits), the kernel spends 10+ seconds doing modular exponentiation in mpi_powm() without rescheduling. If all threads do it, it locks up the system. Moreover, it can cause rcu_sched-stall warnings. Notwithstanding the insanity of doing this calculation in kernel mode rather than in userspace, fix it by calling cond_resched() as each bit from the exponent is processed. It's still noninterruptible, but at least it's preemptible now. cond_resched() is in the outer loop and gets called every BITS_PER_LONG bits. That seems to be often enough for the system that was taking 10+ seconds, and might be ok for slower processors. Was your intent to call cond_resched() for every bit as you described in the commit message? Thanks for the fix. Mat Cc: sta...@vger.kernel.org # v4.12+ Signed-off-by: Eric Biggers --- lib/mpi/mpi-pow.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/lib/mpi/mpi-pow.c b/lib/mpi/mpi-pow.c index e24388a863a7..f089a52dbbdb 100644 --- a/lib/mpi/mpi-pow.c +++ b/lib/mpi/mpi-pow.c @@ -26,6 +26,7 @@ * however I decided to publish this code under the plain GPL. */ +#include #include #include "mpi-internal.h" #include "longlong.h" @@ -263,6 +264,8 @@ int mpi_powm(MPI res, MPI base, MPI exp, MPI mod) break; e = ep[i]; c = BITS_PER_MPI_LIMB; + + cond_resched(); } /* We shifted MOD, the modulo reduction argument, left MOD_SHIFT_CNT -- 2.15.0 -- Mat Martineau Intel OTC
Re: [PATCH] lib/mpi: call cond_resched() from mpi_powm() loop
Eric, On Mon, 6 Nov 2017, Eric Biggers wrote: From: Eric Biggers On a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the largest permitted inputs (16384 bits), the kernel spends 10+ seconds doing modular exponentiation in mpi_powm() without rescheduling. If all threads do it, it locks up the system. Moreover, it can cause rcu_sched-stall warnings. Notwithstanding the insanity of doing this calculation in kernel mode rather than in userspace, fix it by calling cond_resched() as each bit from the exponent is processed. It's still noninterruptible, but at least it's preemptible now. cond_resched() is in the outer loop and gets called every BITS_PER_LONG bits. That seems to be often enough for the system that was taking 10+ seconds, and might be ok for slower processors. Was your intent to call cond_resched() for every bit as you described in the commit message? Thanks for the fix. Mat Cc: sta...@vger.kernel.org # v4.12+ Signed-off-by: Eric Biggers --- lib/mpi/mpi-pow.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/lib/mpi/mpi-pow.c b/lib/mpi/mpi-pow.c index e24388a863a7..f089a52dbbdb 100644 --- a/lib/mpi/mpi-pow.c +++ b/lib/mpi/mpi-pow.c @@ -26,6 +26,7 @@ * however I decided to publish this code under the plain GPL. */ +#include #include #include "mpi-internal.h" #include "longlong.h" @@ -263,6 +264,8 @@ int mpi_powm(MPI res, MPI base, MPI exp, MPI mod) break; e = ep[i]; c = BITS_PER_MPI_LIMB; + + cond_resched(); } /* We shifted MOD, the modulo reduction argument, left MOD_SHIFT_CNT -- 2.15.0 -- Mat Martineau Intel OTC
[PATCH] lib/mpi: call cond_resched() from mpi_powm() loop
From: Eric BiggersOn a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the largest permitted inputs (16384 bits), the kernel spends 10+ seconds doing modular exponentiation in mpi_powm() without rescheduling. If all threads do it, it locks up the system. Moreover, it can cause rcu_sched-stall warnings. Notwithstanding the insanity of doing this calculation in kernel mode rather than in userspace, fix it by calling cond_resched() as each bit from the exponent is processed. It's still noninterruptible, but at least it's preemptible now. Cc: sta...@vger.kernel.org # v4.12+ Signed-off-by: Eric Biggers --- lib/mpi/mpi-pow.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/lib/mpi/mpi-pow.c b/lib/mpi/mpi-pow.c index e24388a863a7..f089a52dbbdb 100644 --- a/lib/mpi/mpi-pow.c +++ b/lib/mpi/mpi-pow.c @@ -26,6 +26,7 @@ * however I decided to publish this code under the plain GPL. */ +#include #include #include "mpi-internal.h" #include "longlong.h" @@ -263,6 +264,8 @@ int mpi_powm(MPI res, MPI base, MPI exp, MPI mod) break; e = ep[i]; c = BITS_PER_MPI_LIMB; + + cond_resched(); } /* We shifted MOD, the modulo reduction argument, left MOD_SHIFT_CNT -- 2.15.0
[PATCH] lib/mpi: call cond_resched() from mpi_powm() loop
From: Eric Biggers On a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the largest permitted inputs (16384 bits), the kernel spends 10+ seconds doing modular exponentiation in mpi_powm() without rescheduling. If all threads do it, it locks up the system. Moreover, it can cause rcu_sched-stall warnings. Notwithstanding the insanity of doing this calculation in kernel mode rather than in userspace, fix it by calling cond_resched() as each bit from the exponent is processed. It's still noninterruptible, but at least it's preemptible now. Cc: sta...@vger.kernel.org # v4.12+ Signed-off-by: Eric Biggers --- lib/mpi/mpi-pow.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/lib/mpi/mpi-pow.c b/lib/mpi/mpi-pow.c index e24388a863a7..f089a52dbbdb 100644 --- a/lib/mpi/mpi-pow.c +++ b/lib/mpi/mpi-pow.c @@ -26,6 +26,7 @@ * however I decided to publish this code under the plain GPL. */ +#include #include #include "mpi-internal.h" #include "longlong.h" @@ -263,6 +264,8 @@ int mpi_powm(MPI res, MPI base, MPI exp, MPI mod) break; e = ep[i]; c = BITS_PER_MPI_LIMB; + + cond_resched(); } /* We shifted MOD, the modulo reduction argument, left MOD_SHIFT_CNT -- 2.15.0