Re: new warning on sysrq kernel crash trigger
On Thu, Dec 17, 2015 at 9:28 AM, Greg KH wrote: > On Wed, Dec 16, 2015 at 11:25:21AM -0500, Rik van Riel wrote: >> On 12/15/2015 07:52 PM, Ani Sinha wrote: >> > Rik, should I send a separate email with the patch or you are OK >> > with what I sent in the email? Are you queueing up my patch for >> > applying upstream? >> >> I don't have a git tree for people to pull from, and >> it looks like the tty & sysrq maintainers are Greg KH >> and Jiri Slaby. >> >> Greg, Jiri, where do you prefer Ani sends the patch >> for inclusion, or should it go in through Paul's tree? > > I don't care which, either is fine for me. I have sent just the patch to the relevant folks again with CC linux-kernel@. Hope this is now all set for pulling upstream. Let me know if I need to do anything else. thanks ani -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Wed, Dec 16, 2015 at 11:25:21AM -0500, Rik van Riel wrote: > On 12/15/2015 07:52 PM, Ani Sinha wrote: > > Rik, should I send a separate email with the patch or you are OK > > with what I sent in the email? Are you queueing up my patch for > > applying upstream? > > I don't have a git tree for people to pull from, and > it looks like the tty & sysrq maintainers are Greg KH > and Jiri Slaby. > > Greg, Jiri, where do you prefer Ani sends the patch > for inclusion, or should it go in through Paul's tree? I don't care which, either is fine for me. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Wed, Dec 16, 2015 at 11:25:21AM -0500, Rik van Riel wrote: > On 12/15/2015 07:52 PM, Ani Sinha wrote: > > Rik, should I send a separate email with the patch or you are OK > > with what I sent in the email? Are you queueing up my patch for > > applying upstream? > > I don't have a git tree for people to pull from, and > it looks like the tty & sysrq maintainers are Greg KH > and Jiri Slaby. > > Greg, Jiri, where do you prefer Ani sends the patch > for inclusion, or should it go in through Paul's tree? I don't care which, either is fine for me. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Thu, Dec 17, 2015 at 9:28 AM, Greg KHwrote: > On Wed, Dec 16, 2015 at 11:25:21AM -0500, Rik van Riel wrote: >> On 12/15/2015 07:52 PM, Ani Sinha wrote: >> > Rik, should I send a separate email with the patch or you are OK >> > with what I sent in the email? Are you queueing up my patch for >> > applying upstream? >> >> I don't have a git tree for people to pull from, and >> it looks like the tty & sysrq maintainers are Greg KH >> and Jiri Slaby. >> >> Greg, Jiri, where do you prefer Ani sends the patch >> for inclusion, or should it go in through Paul's tree? > > I don't care which, either is fine for me. I have sent just the patch to the relevant folks again with CC linux-kernel@. Hope this is now all set for pulling upstream. Let me know if I need to do anything else. thanks ani -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On 12/15/2015 07:52 PM, Ani Sinha wrote: > Rik, should I send a separate email with the patch or you are OK > with what I sent in the email? Are you queueing up my patch for > applying upstream? I don't have a git tree for people to pull from, and it looks like the tty & sysrq maintainers are Greg KH and Jiri Slaby. Greg, Jiri, where do you prefer Ani sends the patch for inclusion, or should it go in through Paul's tree? -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQEcBAEBCAAGBQJWcZBxAAoJEM553pKExN6DsFAH/jl6UdCd4vj6ovzCHDr9lWZL C/I0DwCDRx5VxvyyiiiQz49yWjSSZue7ZAeis42YoJ89apHh3jwYGqUc8WrHz0j1 DVwPMk6DjiInTK2dIsyVVeMxCSr6wk6NDvC8/KwownBK9OvcI20bEfBLdjRUj4Y0 ySe92VStk3n9GIez9M2XAfPV9ADWcUbN6KNkqbKYf9h0qgl3h+9ZhvsiQHPOEdnG +dsD/FVwnVYDQOdwWroZHi0UmorHS6gQbEHHO851xIkKIztMGY00CnvJOehdJWW8 BNc1pTAUpWiPvDddzjhmGdwEx5kSp/y3JwLu5BJTfVuNHc2Ss9KdZCYNmbpggHk= =dSvj -END PGP SIGNATURE- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On 12/14/2015 07:14 PM, Anirban Sinha wrote: > > > On Mon, 14 Dec 2015, Rik van Riel wrote: > >> On 12/14/2015 11:24 AM, Ani Sinha wrote: >>> Rik, any comments? >> >> Another good option is to simply ignore this warning, or drop the >> rcu_read_lock before doing the alt-syrsq-c action. >> >> After all, alt-sysrq-c is "crash the system, take a crash dump", >> which is not an action the system ever returns from. >> > > Yea I thought about this idea previously but then discarded it > thinking it would be too hacky. Here's the cooked up patch. I hope > this can be approved for mainline soon (I'm on vacation and working > just on this issue remotely) : > > From 105ff3ffce380650b3d58b3594a9be47bd604b28 Mon Sep 17 00:00:00 > 2001 From: Ani Sinha Date: Mon, 14 Dec 2015 > 14:55:08 -0800 Subject: [PATCH 1/1] Fix 'sleeping function called > from invalid context' warning in sysrq generated crash. > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") replaced > spin_lock_irqsave() calls with rcu_read_lock() calls in sysrq. > Since rcu_read_lock() does not disable preemption, > faulthandler_disabled() in __do_page_fault() in x86/fault.c returns > false. When the code later calls might_sleep() in the pagefault > handler, we get the following warning: > > BUG: sleeping function called from invalid context at > ../arch/x86/mm/fault.c:1187 in_atomic(): 0, irqs_disabled(): 0, > pid: 4706, name: bash Preemption disabled at:[] > printk+0x48/0x4a > > To fix this, we release the RCU read lock before we crash. > > Tested this patch on linux 3.18 by booting off one of our boards. > > Fixes: 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > > Signed-off-by: Ani Sinha Reviewed-by: Rik van Riel -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQEcBAEBCAAGBQJWcY++AAoJEM553pKExN6DMsEIAIgRI2dlnimHDR30BWhAhj1m rPlG3zEKsilR5/MjD3y/LZqIqG2PmMEpIGajeTOu5O9cZhIyon/6snHTST36kN2Y 2CMCdUYNTQtDLpg8RoFsu8cvL4gBdi4J+o/U4E8gFXn6MqNsk3U0Dow/BJl1dPAm V2/aN2K6od3+HU0q3ZJGfcnc4SipkAnA3nmrh5OntXLtZBfye6ge7UONxLzBI2vR +7sGTd3ebKd9AZlYevZQxnSaeJbikGJoCwreqMVTueX8fbhvvReo/f6OfnXF6HaF vDK6lle/BFuHYb11/cWonSuKcphpAOfvX+n90BtbBMedUKNlGvLBBH55feIbOpw= =BP+x -END PGP SIGNATURE- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On 12/14/2015 07:14 PM, Anirban Sinha wrote: > > > On Mon, 14 Dec 2015, Rik van Riel wrote: > >> On 12/14/2015 11:24 AM, Ani Sinha wrote: >>> Rik, any comments? >> >> Another good option is to simply ignore this warning, or drop the >> rcu_read_lock before doing the alt-syrsq-c action. >> >> After all, alt-sysrq-c is "crash the system, take a crash dump", >> which is not an action the system ever returns from. >> > > Yea I thought about this idea previously but then discarded it > thinking it would be too hacky. Here's the cooked up patch. I hope > this can be approved for mainline soon (I'm on vacation and working > just on this issue remotely) : > > From 105ff3ffce380650b3d58b3594a9be47bd604b28 Mon Sep 17 00:00:00 > 2001 From: Ani SinhaDate: Mon, 14 Dec 2015 > 14:55:08 -0800 Subject: [PATCH 1/1] Fix 'sleeping function called > from invalid context' warning in sysrq generated crash. > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") replaced > spin_lock_irqsave() calls with rcu_read_lock() calls in sysrq. > Since rcu_read_lock() does not disable preemption, > faulthandler_disabled() in __do_page_fault() in x86/fault.c returns > false. When the code later calls might_sleep() in the pagefault > handler, we get the following warning: > > BUG: sleeping function called from invalid context at > ../arch/x86/mm/fault.c:1187 in_atomic(): 0, irqs_disabled(): 0, > pid: 4706, name: bash Preemption disabled at:[] > printk+0x48/0x4a > > To fix this, we release the RCU read lock before we crash. > > Tested this patch on linux 3.18 by booting off one of our boards. > > Fixes: 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > > Signed-off-by: Ani Sinha Reviewed-by: Rik van Riel -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQEcBAEBCAAGBQJWcY++AAoJEM553pKExN6DMsEIAIgRI2dlnimHDR30BWhAhj1m rPlG3zEKsilR5/MjD3y/LZqIqG2PmMEpIGajeTOu5O9cZhIyon/6snHTST36kN2Y 2CMCdUYNTQtDLpg8RoFsu8cvL4gBdi4J+o/U4E8gFXn6MqNsk3U0Dow/BJl1dPAm V2/aN2K6od3+HU0q3ZJGfcnc4SipkAnA3nmrh5OntXLtZBfye6ge7UONxLzBI2vR +7sGTd3ebKd9AZlYevZQxnSaeJbikGJoCwreqMVTueX8fbhvvReo/f6OfnXF6HaF vDK6lle/BFuHYb11/cWonSuKcphpAOfvX+n90BtbBMedUKNlGvLBBH55feIbOpw= =BP+x -END PGP SIGNATURE- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 On 12/15/2015 07:52 PM, Ani Sinha wrote: > Rik, should I send a separate email with the patch or you are OK > with what I sent in the email? Are you queueing up my patch for > applying upstream? I don't have a git tree for people to pull from, and it looks like the tty & sysrq maintainers are Greg KH and Jiri Slaby. Greg, Jiri, where do you prefer Ani sends the patch for inclusion, or should it go in through Paul's tree? -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQEcBAEBCAAGBQJWcZBxAAoJEM553pKExN6DsFAH/jl6UdCd4vj6ovzCHDr9lWZL C/I0DwCDRx5VxvyyiiiQz49yWjSSZue7ZAeis42YoJ89apHh3jwYGqUc8WrHz0j1 DVwPMk6DjiInTK2dIsyVVeMxCSr6wk6NDvC8/KwownBK9OvcI20bEfBLdjRUj4Y0 ySe92VStk3n9GIez9M2XAfPV9ADWcUbN6KNkqbKYf9h0qgl3h+9ZhvsiQHPOEdnG +dsD/FVwnVYDQOdwWroZHi0UmorHS6gQbEHHO851xIkKIztMGY00CnvJOehdJWW8 BNc1pTAUpWiPvDddzjhmGdwEx5kSp/y3JwLu5BJTfVuNHc2Ss9KdZCYNmbpggHk= =dSvj -END PGP SIGNATURE- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
Rik, should I send a separate email with the patch or you are OK with what I sent in the email? Are you queueing up my patch for applying upstream? On Tue, Dec 15, 2015 at 5:44 AM, Anirban Sinha wrote: > > > On Mon, 14 Dec 2015, Rik van Riel wrote: > >> On 12/14/2015 11:24 AM, Ani Sinha wrote: >> > Rik, any comments? >> >> Another good option is to simply ignore this warning, or drop >> the rcu_read_lock before doing the alt-syrsq-c action. >> >> After all, alt-sysrq-c is "crash the system, take a crash dump", >> which is not an action the system ever returns from. >> > > Yea I thought about this idea previously but then discarded it thinking it > would be too hacky. Here's the cooked up patch. I hope this can be > approved for mainline soon (I'm on vacation and working just on this issue > remotely) : > > From 105ff3ffce380650b3d58b3594a9be47bd604b28 Mon Sep 17 00:00:00 2001 > From: Ani Sinha > Date: Mon, 14 Dec 2015 14:55:08 -0800 > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > warning in sysrq generated crash. > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > replaced spin_lock_irqsave() calls with > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > disable preemption, faulthandler_disabled() in > __do_page_fault() in x86/fault.c returns false. When the code > later calls might_sleep() in the pagefault handler, we get the > following warning: > > BUG: sleeping function called from invalid context at > ../arch/x86/mm/fault.c:1187 > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > Preemption disabled at:[] printk+0x48/0x4a > > To fix this, we release the RCU read lock before we crash. > > Tested this patch on linux 3.18 by booting off one of our boards. > > Fixes: 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > > Signed-off-by: Ani Sinha > --- > drivers/tty/sysrq.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c > index 5381a72..08987ad 100644 > --- a/drivers/tty/sysrq.c > +++ b/drivers/tty/sysrq.c > @@ -133,6 +133,12 @@ static void sysrq_handle_crash(int key) > { > char *killer = NULL; > > + /* we need to release the RCU read lock here, > + otherwise we get an annoying > + 'BUG: sleeping function called from invalid context' > + complaint from the kernel before the panic. > + */ > + rcu_read_unlock(); > panic_on_oops = 1; /* force panic */ > wmb(); > *killer = 1; > -- > 1.8.1.4 > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
Rik, should I send a separate email with the patch or you are OK with what I sent in the email? Are you queueing up my patch for applying upstream? On Tue, Dec 15, 2015 at 5:44 AM, Anirban Sinhawrote: > > > On Mon, 14 Dec 2015, Rik van Riel wrote: > >> On 12/14/2015 11:24 AM, Ani Sinha wrote: >> > Rik, any comments? >> >> Another good option is to simply ignore this warning, or drop >> the rcu_read_lock before doing the alt-syrsq-c action. >> >> After all, alt-sysrq-c is "crash the system, take a crash dump", >> which is not an action the system ever returns from. >> > > Yea I thought about this idea previously but then discarded it thinking it > would be too hacky. Here's the cooked up patch. I hope this can be > approved for mainline soon (I'm on vacation and working just on this issue > remotely) : > > From 105ff3ffce380650b3d58b3594a9be47bd604b28 Mon Sep 17 00:00:00 2001 > From: Ani Sinha > Date: Mon, 14 Dec 2015 14:55:08 -0800 > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > warning in sysrq generated crash. > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > replaced spin_lock_irqsave() calls with > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > disable preemption, faulthandler_disabled() in > __do_page_fault() in x86/fault.c returns false. When the code > later calls might_sleep() in the pagefault handler, we get the > following warning: > > BUG: sleeping function called from invalid context at > ../arch/x86/mm/fault.c:1187 > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > Preemption disabled at:[] printk+0x48/0x4a > > To fix this, we release the RCU read lock before we crash. > > Tested this patch on linux 3.18 by booting off one of our boards. > > Fixes: 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > > Signed-off-by: Ani Sinha > --- > drivers/tty/sysrq.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c > index 5381a72..08987ad 100644 > --- a/drivers/tty/sysrq.c > +++ b/drivers/tty/sysrq.c > @@ -133,6 +133,12 @@ static void sysrq_handle_crash(int key) > { > char *killer = NULL; > > + /* we need to release the RCU read lock here, > + otherwise we get an annoying > + 'BUG: sleeping function called from invalid context' > + complaint from the kernel before the panic. > + */ > + rcu_read_unlock(); > panic_on_oops = 1; /* force panic */ > wmb(); > *killer = 1; > -- > 1.8.1.4 > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Mon, 14 Dec 2015, Rik van Riel wrote: > On 12/14/2015 11:24 AM, Ani Sinha wrote: > > Rik, any comments? > > Another good option is to simply ignore this warning, or drop > the rcu_read_lock before doing the alt-syrsq-c action. > > After all, alt-sysrq-c is "crash the system, take a crash dump", > which is not an action the system ever returns from. > Yea I thought about this idea previously but then discarded it thinking it would be too hacky. Here's the cooked up patch. I hope this can be approved for mainline soon (I'm on vacation and working just on this issue remotely) : >From 105ff3ffce380650b3d58b3594a9be47bd604b28 Mon Sep 17 00:00:00 2001 From: Ani Sinha Date: Mon, 14 Dec 2015 14:55:08 -0800 Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' warning in sysrq generated crash. Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") replaced spin_lock_irqsave() calls with rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not disable preemption, faulthandler_disabled() in __do_page_fault() in x86/fault.c returns false. When the code later calls might_sleep() in the pagefault handler, we get the following warning: BUG: sleeping function called from invalid context at ../arch/x86/mm/fault.c:1187 in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash Preemption disabled at:[] printk+0x48/0x4a To fix this, we release the RCU read lock before we crash. Tested this patch on linux 3.18 by booting off one of our boards. Fixes: 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") Signed-off-by: Ani Sinha --- drivers/tty/sysrq.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c index 5381a72..08987ad 100644 --- a/drivers/tty/sysrq.c +++ b/drivers/tty/sysrq.c @@ -133,6 +133,12 @@ static void sysrq_handle_crash(int key) { char *killer = NULL; + /* we need to release the RCU read lock here, + otherwise we get an annoying + 'BUG: sleeping function called from invalid context' + complaint from the kernel before the panic. + */ + rcu_read_unlock(); panic_on_oops = 1; /* force panic */ wmb(); *killer = 1; -- 1.8.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On 12/14/2015 11:24 AM, Ani Sinha wrote: > Rik, any comments? Another good option is to simply ignore this warning, or drop the rcu_read_lock before doing the alt-syrsq-c action. After all, alt-sysrq-c is "crash the system, take a crash dump", which is not an action the system ever returns from. static struct sysrq_key_op sysrq_crash_op = { .handler= sysrq_handle_crash, .help_msg = "crash(c)", .action_msg = "Trigger a crash", .enable_mask= SYSRQ_ENABLE_DUMP, }; -- All rights reversed -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
Rik, any comments? On Sat, Dec 12, 2015 at 6:33 AM, Paul E. McKenney wrote: > On Fri, Dec 11, 2015 at 04:16:37PM -0800, Ani Sinha wrote: >> On Fri, 11 Dec 2015, Paul E. McKenney wrote: >> > On Fri, Dec 11, 2015 at 05:10:43PM -0500, Rik van Riel wrote: >> > > On 12/11/2015 03:44 PM, Ani Sinha wrote: >> > > > >> > > > >> > > > On Thu, 10 Dec 2015, Paul E. McKenney wrote: >> > > > >> > > >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: >> > > >>> Hi guys >> > > >>> >> > > >>> I am noticing a new warning in linux 3.18 which we did not see before >> > > >>> in linux 3.4 : >> > > >>> >> > > >>> bash-4.1# echo c > /proc/sysrq-trigger >> > > >>> [ 978.807185] BUG: sleeping function called from invalid context at >> > > >>> ../arch/x86/mm/fault.c:1187 >> > > >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: >> > > >>> bash >> > > >>> [ 978.987358] Preemption disabled at:[] >> > > >>> printk+0x48/0x4a >> > > >>> >> > > >>> >> > > >>> I have bisected this to the following change : >> > > >>> >> > > >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 >> > > >>> Author: Rik van Riel >> > > >>> Date: Fri Jun 6 14:38:13 2014 -0700 >> > > >>> >> > > >>> sysrq: rcu-ify __handle_sysrq >> > > >>> >> > > >>> >> > > >>> the rcu_read_lock() in handle_sysrq() bumps up >> > > >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it >> > > >>> calls might_sleep() in x86/mm/fault.c line 1191, >> > > >>> preempt_count_equals(0) returns false and hence the warning is >> > > >>> printed. >> > > >>> >> > > >>> One way to handle this would be to do something like this: >> > > >>> >> > > >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c >> > > >>> index eef44d9..d4dbe22 100644 >> > > >>> --- a/arch/x86/mm/fault.c >> > > >>> +++ b/arch/x86/mm/fault.c >> > > >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned >> > > >>> long error_code, >> > > >>> * If we're in an interrupt, have no user context or are running >> > > >>> * in a region with pagefaults disabled then we must not take the >> > > >>> fault >> > > >>> */ >> > > >>> - if (unlikely(faulthandler_disabled() || !mm)) { >> > > >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || >> > > >>> !mm)) { >> > > >> >> > > >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then >> > > >> rcu_preempt_depth() unconditionally returns zero. And if >> > > >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see >> > > >> the might_sleep() splat. >> > > >> >> > > >> Maybe use SRCU instead of RCU for this purpose? >> > > >> >> > > > >> > > > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 >> > > > From: Ani Sinha >> > > > Date: Fri, 11 Dec 2015 12:07:42 -0800 >> > > > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid >> > > > context' >> > > > warning in sysrq generated crash. >> > > > >> > > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") >> > > > replaced spin_lock_irqsave() calls with >> > > > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not >> > > > disable preemption, faulthandler_disabled() in >> > > > __do_page_fault() in x86/fault.c returns false. When the code >> > > > later calls might_sleep() in the pagefault handler, we get the >> > > > following warning: >> > > > >> > > > BUG: sleeping function called from invalid context at >> > > > ../arch/x86/mm/fault.c:1187 >> > > > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash >> > > > Preemption disabled at:[] printk+0x48/0x4a >> > > > >> > > > To fix this, replace RCU call in handle_sysrq() to use SRCU. >> > > >> > > The sysrq code can be called from irq context. >> > > >> > > Trying to use SRCU from an irq context sounds like it could >> > > be a bad idea, though admittedly I do not know enough about >> > > SRCU to know for sure :) >> > >> > Indeed, not the best idea! ;-) >> > >> > I could imagine something like this: >> > >> > if (in_irq()) >> > rcu_read_lock(); >> > else >> > idx = srcu_read_lock(_rcu); >> > >> > And ditto for unlock. Then, for the update: >> > >> > synchronize_rcu_mult(call_rcu, call_sysrq_srcu); >> > >> > Where: >> > >> > static void call_sysrq_srcu(struct rcu_head *head, rcu_callback_t func) >> > { >> > call_srcu(_rcu, head, func); >> > } >> > >> >> >From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 >> From: Ani Sinha >> Date: Fri, 11 Dec 2015 12:07:42 -0800 >> Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' >> warning in sysrq generated crash. >> >> Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") >> replaced spin_lock_irqsave() calls with >> rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not >> disable preemption, faulthandler_disabled() in >> __do_page_fault() in x86/fault.c returns false. When the code >> later calls might_sleep() in the pagefault handler, we get the >> following
Re: new warning on sysrq kernel crash trigger
On Mon, 14 Dec 2015, Rik van Riel wrote: > On 12/14/2015 11:24 AM, Ani Sinha wrote: > > Rik, any comments? > > Another good option is to simply ignore this warning, or drop > the rcu_read_lock before doing the alt-syrsq-c action. > > After all, alt-sysrq-c is "crash the system, take a crash dump", > which is not an action the system ever returns from. > Yea I thought about this idea previously but then discarded it thinking it would be too hacky. Here's the cooked up patch. I hope this can be approved for mainline soon (I'm on vacation and working just on this issue remotely) : >From 105ff3ffce380650b3d58b3594a9be47bd604b28 Mon Sep 17 00:00:00 2001 From: Ani SinhaDate: Mon, 14 Dec 2015 14:55:08 -0800 Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' warning in sysrq generated crash. Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") replaced spin_lock_irqsave() calls with rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not disable preemption, faulthandler_disabled() in __do_page_fault() in x86/fault.c returns false. When the code later calls might_sleep() in the pagefault handler, we get the following warning: BUG: sleeping function called from invalid context at ../arch/x86/mm/fault.c:1187 in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash Preemption disabled at:[] printk+0x48/0x4a To fix this, we release the RCU read lock before we crash. Tested this patch on linux 3.18 by booting off one of our boards. Fixes: 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") Signed-off-by: Ani Sinha --- drivers/tty/sysrq.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c index 5381a72..08987ad 100644 --- a/drivers/tty/sysrq.c +++ b/drivers/tty/sysrq.c @@ -133,6 +133,12 @@ static void sysrq_handle_crash(int key) { char *killer = NULL; + /* we need to release the RCU read lock here, + otherwise we get an annoying + 'BUG: sleeping function called from invalid context' + complaint from the kernel before the panic. + */ + rcu_read_unlock(); panic_on_oops = 1; /* force panic */ wmb(); *killer = 1; -- 1.8.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
Rik, any comments? On Sat, Dec 12, 2015 at 6:33 AM, Paul E. McKenneywrote: > On Fri, Dec 11, 2015 at 04:16:37PM -0800, Ani Sinha wrote: >> On Fri, 11 Dec 2015, Paul E. McKenney wrote: >> > On Fri, Dec 11, 2015 at 05:10:43PM -0500, Rik van Riel wrote: >> > > On 12/11/2015 03:44 PM, Ani Sinha wrote: >> > > > >> > > > >> > > > On Thu, 10 Dec 2015, Paul E. McKenney wrote: >> > > > >> > > >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: >> > > >>> Hi guys >> > > >>> >> > > >>> I am noticing a new warning in linux 3.18 which we did not see before >> > > >>> in linux 3.4 : >> > > >>> >> > > >>> bash-4.1# echo c > /proc/sysrq-trigger >> > > >>> [ 978.807185] BUG: sleeping function called from invalid context at >> > > >>> ../arch/x86/mm/fault.c:1187 >> > > >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: >> > > >>> bash >> > > >>> [ 978.987358] Preemption disabled at:[] >> > > >>> printk+0x48/0x4a >> > > >>> >> > > >>> >> > > >>> I have bisected this to the following change : >> > > >>> >> > > >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 >> > > >>> Author: Rik van Riel >> > > >>> Date: Fri Jun 6 14:38:13 2014 -0700 >> > > >>> >> > > >>> sysrq: rcu-ify __handle_sysrq >> > > >>> >> > > >>> >> > > >>> the rcu_read_lock() in handle_sysrq() bumps up >> > > >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it >> > > >>> calls might_sleep() in x86/mm/fault.c line 1191, >> > > >>> preempt_count_equals(0) returns false and hence the warning is >> > > >>> printed. >> > > >>> >> > > >>> One way to handle this would be to do something like this: >> > > >>> >> > > >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c >> > > >>> index eef44d9..d4dbe22 100644 >> > > >>> --- a/arch/x86/mm/fault.c >> > > >>> +++ b/arch/x86/mm/fault.c >> > > >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned >> > > >>> long error_code, >> > > >>> * If we're in an interrupt, have no user context or are running >> > > >>> * in a region with pagefaults disabled then we must not take the >> > > >>> fault >> > > >>> */ >> > > >>> - if (unlikely(faulthandler_disabled() || !mm)) { >> > > >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || >> > > >>> !mm)) { >> > > >> >> > > >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then >> > > >> rcu_preempt_depth() unconditionally returns zero. And if >> > > >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see >> > > >> the might_sleep() splat. >> > > >> >> > > >> Maybe use SRCU instead of RCU for this purpose? >> > > >> >> > > > >> > > > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 >> > > > From: Ani Sinha >> > > > Date: Fri, 11 Dec 2015 12:07:42 -0800 >> > > > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid >> > > > context' >> > > > warning in sysrq generated crash. >> > > > >> > > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") >> > > > replaced spin_lock_irqsave() calls with >> > > > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not >> > > > disable preemption, faulthandler_disabled() in >> > > > __do_page_fault() in x86/fault.c returns false. When the code >> > > > later calls might_sleep() in the pagefault handler, we get the >> > > > following warning: >> > > > >> > > > BUG: sleeping function called from invalid context at >> > > > ../arch/x86/mm/fault.c:1187 >> > > > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash >> > > > Preemption disabled at:[] printk+0x48/0x4a >> > > > >> > > > To fix this, replace RCU call in handle_sysrq() to use SRCU. >> > > >> > > The sysrq code can be called from irq context. >> > > >> > > Trying to use SRCU from an irq context sounds like it could >> > > be a bad idea, though admittedly I do not know enough about >> > > SRCU to know for sure :) >> > >> > Indeed, not the best idea! ;-) >> > >> > I could imagine something like this: >> > >> > if (in_irq()) >> > rcu_read_lock(); >> > else >> > idx = srcu_read_lock(_rcu); >> > >> > And ditto for unlock. Then, for the update: >> > >> > synchronize_rcu_mult(call_rcu, call_sysrq_srcu); >> > >> > Where: >> > >> > static void call_sysrq_srcu(struct rcu_head *head, rcu_callback_t func) >> > { >> > call_srcu(_rcu, head, func); >> > } >> > >> >> >From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 >> From: Ani Sinha >> Date: Fri, 11 Dec 2015 12:07:42 -0800 >> Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' >> warning in sysrq generated crash. >> >> Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") >> replaced spin_lock_irqsave() calls with >> rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not >> disable preemption, faulthandler_disabled() in >> __do_page_fault() in x86/fault.c returns false. When the code >>
Re: new warning on sysrq kernel crash trigger
On 12/14/2015 11:24 AM, Ani Sinha wrote: > Rik, any comments? Another good option is to simply ignore this warning, or drop the rcu_read_lock before doing the alt-syrsq-c action. After all, alt-sysrq-c is "crash the system, take a crash dump", which is not an action the system ever returns from. static struct sysrq_key_op sysrq_crash_op = { .handler= sysrq_handle_crash, .help_msg = "crash(c)", .action_msg = "Trigger a crash", .enable_mask= SYSRQ_ENABLE_DUMP, }; -- All rights reversed -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Fri, Dec 11, 2015 at 04:16:37PM -0800, Ani Sinha wrote: > On Fri, 11 Dec 2015, Paul E. McKenney wrote: > > On Fri, Dec 11, 2015 at 05:10:43PM -0500, Rik van Riel wrote: > > > On 12/11/2015 03:44 PM, Ani Sinha wrote: > > > > > > > > > > > > On Thu, 10 Dec 2015, Paul E. McKenney wrote: > > > > > > > >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: > > > >>> Hi guys > > > >>> > > > >>> I am noticing a new warning in linux 3.18 which we did not see before > > > >>> in linux 3.4 : > > > >>> > > > >>> bash-4.1# echo c > /proc/sysrq-trigger > > > >>> [ 978.807185] BUG: sleeping function called from invalid context at > > > >>> ../arch/x86/mm/fault.c:1187 > > > >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: > > > >>> bash > > > >>> [ 978.987358] Preemption disabled at:[] > > > >>> printk+0x48/0x4a > > > >>> > > > >>> > > > >>> I have bisected this to the following change : > > > >>> > > > >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 > > > >>> Author: Rik van Riel > > > >>> Date: Fri Jun 6 14:38:13 2014 -0700 > > > >>> > > > >>> sysrq: rcu-ify __handle_sysrq > > > >>> > > > >>> > > > >>> the rcu_read_lock() in handle_sysrq() bumps up > > > >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it > > > >>> calls might_sleep() in x86/mm/fault.c line 1191, > > > >>> preempt_count_equals(0) returns false and hence the warning is > > > >>> printed. > > > >>> > > > >>> One way to handle this would be to do something like this: > > > >>> > > > >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > > > >>> index eef44d9..d4dbe22 100644 > > > >>> --- a/arch/x86/mm/fault.c > > > >>> +++ b/arch/x86/mm/fault.c > > > >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned > > > >>> long error_code, > > > >>> * If we're in an interrupt, have no user context or are running > > > >>> * in a region with pagefaults disabled then we must not take the > > > >>> fault > > > >>> */ > > > >>> - if (unlikely(faulthandler_disabled() || !mm)) { > > > >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || > > > >>> !mm)) { > > > >> > > > >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then > > > >> rcu_preempt_depth() unconditionally returns zero. And if > > > >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see > > > >> the might_sleep() splat. > > > >> > > > >> Maybe use SRCU instead of RCU for this purpose? > > > >> > > > > > > > > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 > > > > From: Ani Sinha > > > > Date: Fri, 11 Dec 2015 12:07:42 -0800 > > > > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > > > > warning in sysrq generated crash. > > > > > > > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > > > > replaced spin_lock_irqsave() calls with > > > > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > > > > disable preemption, faulthandler_disabled() in > > > > __do_page_fault() in x86/fault.c returns false. When the code > > > > later calls might_sleep() in the pagefault handler, we get the > > > > following warning: > > > > > > > > BUG: sleeping function called from invalid context at > > > > ../arch/x86/mm/fault.c:1187 > > > > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > > > > Preemption disabled at:[] printk+0x48/0x4a > > > > > > > > To fix this, replace RCU call in handle_sysrq() to use SRCU. > > > > > > The sysrq code can be called from irq context. > > > > > > Trying to use SRCU from an irq context sounds like it could > > > be a bad idea, though admittedly I do not know enough about > > > SRCU to know for sure :) > > > > Indeed, not the best idea! ;-) > > > > I could imagine something like this: > > > > if (in_irq()) > > rcu_read_lock(); > > else > > idx = srcu_read_lock(_rcu); > > > > And ditto for unlock. Then, for the update: > > > > synchronize_rcu_mult(call_rcu, call_sysrq_srcu); > > > > Where: > > > > static void call_sysrq_srcu(struct rcu_head *head, rcu_callback_t func) > > { > > call_srcu(_rcu, head, func); > > } > > > > >From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 > From: Ani Sinha > Date: Fri, 11 Dec 2015 12:07:42 -0800 > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > warning in sysrq generated crash. > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > replaced spin_lock_irqsave() calls with > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > disable preemption, faulthandler_disabled() in > __do_page_fault() in x86/fault.c returns false. When the code > later calls might_sleep() in the pagefault handler, we get the > following warning: > > BUG: sleeping function called from invalid context at > ../arch/x86/mm/fault.c:1187 > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > Preemption disabled at:[]
Re: new warning on sysrq kernel crash trigger
On Fri, 11 Dec 2015, Paul E. McKenney wrote: > On Fri, Dec 11, 2015 at 05:10:43PM -0500, Rik van Riel wrote: > > On 12/11/2015 03:44 PM, Ani Sinha wrote: > > > > > > > > > On Thu, 10 Dec 2015, Paul E. McKenney wrote: > > > > > >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: > > >>> Hi guys > > >>> > > >>> I am noticing a new warning in linux 3.18 which we did not see before > > >>> in linux 3.4 : > > >>> > > >>> bash-4.1# echo c > /proc/sysrq-trigger > > >>> [ 978.807185] BUG: sleeping function called from invalid context at > > >>> ../arch/x86/mm/fault.c:1187 > > >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > > >>> [ 978.987358] Preemption disabled at:[] > > >>> printk+0x48/0x4a > > >>> > > >>> > > >>> I have bisected this to the following change : > > >>> > > >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 > > >>> Author: Rik van Riel > > >>> Date: Fri Jun 6 14:38:13 2014 -0700 > > >>> > > >>> sysrq: rcu-ify __handle_sysrq > > >>> > > >>> > > >>> the rcu_read_lock() in handle_sysrq() bumps up > > >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it > > >>> calls might_sleep() in x86/mm/fault.c line 1191, > > >>> preempt_count_equals(0) returns false and hence the warning is > > >>> printed. > > >>> > > >>> One way to handle this would be to do something like this: > > >>> > > >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > > >>> index eef44d9..d4dbe22 100644 > > >>> --- a/arch/x86/mm/fault.c > > >>> +++ b/arch/x86/mm/fault.c > > >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned > > >>> long error_code, > > >>> * If we're in an interrupt, have no user context or are running > > >>> * in a region with pagefaults disabled then we must not take the fault > > >>> */ > > >>> - if (unlikely(faulthandler_disabled() || !mm)) { > > >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { > > >> > > >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then > > >> rcu_preempt_depth() unconditionally returns zero. And if > > >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see > > >> the might_sleep() splat. > > >> > > >> Maybe use SRCU instead of RCU for this purpose? > > >> > > > > > > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 > > > From: Ani Sinha > > > Date: Fri, 11 Dec 2015 12:07:42 -0800 > > > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > > > warning in sysrq generated crash. > > > > > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > > > replaced spin_lock_irqsave() calls with > > > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > > > disable preemption, faulthandler_disabled() in > > > __do_page_fault() in x86/fault.c returns false. When the code > > > later calls might_sleep() in the pagefault handler, we get the > > > following warning: > > > > > > BUG: sleeping function called from invalid context at > > > ../arch/x86/mm/fault.c:1187 > > > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > > > Preemption disabled at:[] printk+0x48/0x4a > > > > > > To fix this, replace RCU call in handle_sysrq() to use SRCU. > > > > The sysrq code can be called from irq context. > > > > Trying to use SRCU from an irq context sounds like it could > > be a bad idea, though admittedly I do not know enough about > > SRCU to know for sure :) > > Indeed, not the best idea! ;-) > > I could imagine something like this: > > if (in_irq()) > rcu_read_lock(); > else > idx = srcu_read_lock(_rcu); > > And ditto for unlock. Then, for the update: > > synchronize_rcu_mult(call_rcu, call_sysrq_srcu); > > Where: > > static void call_sysrq_srcu(struct rcu_head *head, rcu_callback_t func) > { > call_srcu(_rcu, head, func); > } > >From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 From: Ani Sinha Date: Fri, 11 Dec 2015 12:07:42 -0800 Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' warning in sysrq generated crash. Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") replaced spin_lock_irqsave() calls with rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not disable preemption, faulthandler_disabled() in __do_page_fault() in x86/fault.c returns false. When the code later calls might_sleep() in the pagefault handler, we get the following warning: BUG: sleeping function called from invalid context at ../arch/x86/mm/fault.c:1187 in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash Preemption disabled at:[] printk+0x48/0x4a To fix this, replace RCU call in handle_sysrq() to use SRCU in non-irq context. Tested this patch on linux 3.18 by booting off one of our boards. Fixes: 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") Signed-off-by: Ani Sinha --- diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c index
Re: new warning on sysrq kernel crash trigger
I backported your ee376dbdf277 ("rcu: Consolidate rcu_synchronize and wakeme_after_rcu()" & ec90a194ae2cb8b8e("rcu: Create a synchronize_rcu_mult()") and tested this on our 3.18 kernel running on our board. The sysrq kernel crash seems to have been fixed (behavior as per our old 3.4 kernel). I will send in a patch as per your former suggestion ... On Fri, Dec 11, 2015 at 4:02 PM, Paul E. McKenney wrote: > On Fri, Dec 11, 2015 at 03:41:04PM -0800, Ani Sinha wrote: >> On Fri, Dec 11, 2015 at 2:27 PM, Paul E. McKenney >> wrote: >> > On Fri, Dec 11, 2015 at 05:10:43PM -0500, Rik van Riel wrote: >> >> On 12/11/2015 03:44 PM, Ani Sinha wrote: >> >> > >> >> > >> >> > On Thu, 10 Dec 2015, Paul E. McKenney wrote: >> >> > >> >> >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: >> >> >>> Hi guys >> >> >>> >> >> >>> I am noticing a new warning in linux 3.18 which we did not see before >> >> >>> in linux 3.4 : >> >> >>> >> >> >>> bash-4.1# echo c > /proc/sysrq-trigger >> >> >>> [ 978.807185] BUG: sleeping function called from invalid context at >> >> >>> ../arch/x86/mm/fault.c:1187 >> >> >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: >> >> >>> bash >> >> >>> [ 978.987358] Preemption disabled at:[] >> >> >>> printk+0x48/0x4a >> >> >>> >> >> >>> >> >> >>> I have bisected this to the following change : >> >> >>> >> >> >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 >> >> >>> Author: Rik van Riel >> >> >>> Date: Fri Jun 6 14:38:13 2014 -0700 >> >> >>> >> >> >>> sysrq: rcu-ify __handle_sysrq >> >> >>> >> >> >>> >> >> >>> the rcu_read_lock() in handle_sysrq() bumps up >> >> >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it >> >> >>> calls might_sleep() in x86/mm/fault.c line 1191, >> >> >>> preempt_count_equals(0) returns false and hence the warning is >> >> >>> printed. >> >> >>> >> >> >>> One way to handle this would be to do something like this: >> >> >>> >> >> >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c >> >> >>> index eef44d9..d4dbe22 100644 >> >> >>> --- a/arch/x86/mm/fault.c >> >> >>> +++ b/arch/x86/mm/fault.c >> >> >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned >> >> >>> long error_code, >> >> >>> * If we're in an interrupt, have no user context or are running >> >> >>> * in a region with pagefaults disabled then we must not take the >> >> >>> fault >> >> >>> */ >> >> >>> - if (unlikely(faulthandler_disabled() || !mm)) { >> >> >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || >> >> >>> !mm)) { >> >> >> >> >> >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then >> >> >> rcu_preempt_depth() unconditionally returns zero. And if >> >> >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see >> >> >> the might_sleep() splat. >> >> >> >> >> >> Maybe use SRCU instead of RCU for this purpose? >> >> >> >> >> > >> >> > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 >> >> > From: Ani Sinha >> >> > Date: Fri, 11 Dec 2015 12:07:42 -0800 >> >> > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' >> >> > warning in sysrq generated crash. >> >> > >> >> > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") >> >> > replaced spin_lock_irqsave() calls with >> >> > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not >> >> > disable preemption, faulthandler_disabled() in >> >> > __do_page_fault() in x86/fault.c returns false. When the code >> >> > later calls might_sleep() in the pagefault handler, we get the >> >> > following warning: >> >> > >> >> > BUG: sleeping function called from invalid context at >> >> > ../arch/x86/mm/fault.c:1187 >> >> > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash >> >> > Preemption disabled at:[] printk+0x48/0x4a >> >> > >> >> > To fix this, replace RCU call in handle_sysrq() to use SRCU. >> >> >> >> The sysrq code can be called from irq context. >> >> >> >> Trying to use SRCU from an irq context sounds like it could >> >> be a bad idea, though admittedly I do not know enough about >> >> SRCU to know for sure :) >> > >> > Indeed, not the best idea! ;-) >> > >> > I could imagine something like this: >> > >> > if (in_irq()) >> > rcu_read_lock(); >> > else >> > idx = srcu_read_lock(_rcu); >> > >> > And ditto for unlock. Then, for the update: >> > >> > synchronize_rcu_mult(call_rcu, call_sysrq_srcu); >> >> This won't work on 3.18 as this api was introduced in linux 4.3. > > Then do this: > > synchronize_rcu(); > synchronize_srcu(_rcu); > >> > Where: >> > >> > static void call_sysrq_srcu(struct rcu_head *head, rcu_callback_t >> > func) >> > { >> > call_srcu(_rcu, head, func); >> > } >> > >> > Here I presume that the page-fault code avoids the might_sleep if invoked >> > from irq context. >> >> Quick look at the code seems to indicate that this is true. > >
Re: new warning on sysrq kernel crash trigger
On Fri, Dec 11, 2015 at 03:41:04PM -0800, Ani Sinha wrote: > On Fri, Dec 11, 2015 at 2:27 PM, Paul E. McKenney > wrote: > > On Fri, Dec 11, 2015 at 05:10:43PM -0500, Rik van Riel wrote: > >> On 12/11/2015 03:44 PM, Ani Sinha wrote: > >> > > >> > > >> > On Thu, 10 Dec 2015, Paul E. McKenney wrote: > >> > > >> >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: > >> >>> Hi guys > >> >>> > >> >>> I am noticing a new warning in linux 3.18 which we did not see before > >> >>> in linux 3.4 : > >> >>> > >> >>> bash-4.1# echo c > /proc/sysrq-trigger > >> >>> [ 978.807185] BUG: sleeping function called from invalid context at > >> >>> ../arch/x86/mm/fault.c:1187 > >> >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: > >> >>> bash > >> >>> [ 978.987358] Preemption disabled at:[] > >> >>> printk+0x48/0x4a > >> >>> > >> >>> > >> >>> I have bisected this to the following change : > >> >>> > >> >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 > >> >>> Author: Rik van Riel > >> >>> Date: Fri Jun 6 14:38:13 2014 -0700 > >> >>> > >> >>> sysrq: rcu-ify __handle_sysrq > >> >>> > >> >>> > >> >>> the rcu_read_lock() in handle_sysrq() bumps up > >> >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it > >> >>> calls might_sleep() in x86/mm/fault.c line 1191, > >> >>> preempt_count_equals(0) returns false and hence the warning is > >> >>> printed. > >> >>> > >> >>> One way to handle this would be to do something like this: > >> >>> > >> >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > >> >>> index eef44d9..d4dbe22 100644 > >> >>> --- a/arch/x86/mm/fault.c > >> >>> +++ b/arch/x86/mm/fault.c > >> >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned > >> >>> long error_code, > >> >>> * If we're in an interrupt, have no user context or are running > >> >>> * in a region with pagefaults disabled then we must not take the > >> >>> fault > >> >>> */ > >> >>> - if (unlikely(faulthandler_disabled() || !mm)) { > >> >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) > >> >>> { > >> >> > >> >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then > >> >> rcu_preempt_depth() unconditionally returns zero. And if > >> >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see > >> >> the might_sleep() splat. > >> >> > >> >> Maybe use SRCU instead of RCU for this purpose? > >> >> > >> > > >> > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 > >> > From: Ani Sinha > >> > Date: Fri, 11 Dec 2015 12:07:42 -0800 > >> > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > >> > warning in sysrq generated crash. > >> > > >> > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > >> > replaced spin_lock_irqsave() calls with > >> > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > >> > disable preemption, faulthandler_disabled() in > >> > __do_page_fault() in x86/fault.c returns false. When the code > >> > later calls might_sleep() in the pagefault handler, we get the > >> > following warning: > >> > > >> > BUG: sleeping function called from invalid context at > >> > ../arch/x86/mm/fault.c:1187 > >> > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > >> > Preemption disabled at:[] printk+0x48/0x4a > >> > > >> > To fix this, replace RCU call in handle_sysrq() to use SRCU. > >> > >> The sysrq code can be called from irq context. > >> > >> Trying to use SRCU from an irq context sounds like it could > >> be a bad idea, though admittedly I do not know enough about > >> SRCU to know for sure :) > > > > Indeed, not the best idea! ;-) > > > > I could imagine something like this: > > > > if (in_irq()) > > rcu_read_lock(); > > else > > idx = srcu_read_lock(_rcu); > > > > And ditto for unlock. Then, for the update: > > > > synchronize_rcu_mult(call_rcu, call_sysrq_srcu); > > This won't work on 3.18 as this api was introduced in linux 4.3. Then do this: synchronize_rcu(); synchronize_srcu(_rcu); > > Where: > > > > static void call_sysrq_srcu(struct rcu_head *head, rcu_callback_t > > func) > > { > > call_srcu(_rcu, head, func); > > } > > > > Here I presume that the page-fault code avoids the might_sleep if invoked > > from irq context. > > Quick look at the code seems to indicate that this is true. Good! ;-) Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Fri, Dec 11, 2015 at 2:27 PM, Paul E. McKenney wrote: > On Fri, Dec 11, 2015 at 05:10:43PM -0500, Rik van Riel wrote: >> On 12/11/2015 03:44 PM, Ani Sinha wrote: >> > >> > >> > On Thu, 10 Dec 2015, Paul E. McKenney wrote: >> > >> >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: >> >>> Hi guys >> >>> >> >>> I am noticing a new warning in linux 3.18 which we did not see before >> >>> in linux 3.4 : >> >>> >> >>> bash-4.1# echo c > /proc/sysrq-trigger >> >>> [ 978.807185] BUG: sleeping function called from invalid context at >> >>> ../arch/x86/mm/fault.c:1187 >> >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash >> >>> [ 978.987358] Preemption disabled at:[] >> >>> printk+0x48/0x4a >> >>> >> >>> >> >>> I have bisected this to the following change : >> >>> >> >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 >> >>> Author: Rik van Riel >> >>> Date: Fri Jun 6 14:38:13 2014 -0700 >> >>> >> >>> sysrq: rcu-ify __handle_sysrq >> >>> >> >>> >> >>> the rcu_read_lock() in handle_sysrq() bumps up >> >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it >> >>> calls might_sleep() in x86/mm/fault.c line 1191, >> >>> preempt_count_equals(0) returns false and hence the warning is >> >>> printed. >> >>> >> >>> One way to handle this would be to do something like this: >> >>> >> >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c >> >>> index eef44d9..d4dbe22 100644 >> >>> --- a/arch/x86/mm/fault.c >> >>> +++ b/arch/x86/mm/fault.c >> >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned >> >>> long error_code, >> >>> * If we're in an interrupt, have no user context or are running >> >>> * in a region with pagefaults disabled then we must not take the fault >> >>> */ >> >>> - if (unlikely(faulthandler_disabled() || !mm)) { >> >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { >> >> >> >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then >> >> rcu_preempt_depth() unconditionally returns zero. And if >> >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see >> >> the might_sleep() splat. >> >> >> >> Maybe use SRCU instead of RCU for this purpose? >> >> >> > >> > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 >> > From: Ani Sinha >> > Date: Fri, 11 Dec 2015 12:07:42 -0800 >> > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' >> > warning in sysrq generated crash. >> > >> > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") >> > replaced spin_lock_irqsave() calls with >> > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not >> > disable preemption, faulthandler_disabled() in >> > __do_page_fault() in x86/fault.c returns false. When the code >> > later calls might_sleep() in the pagefault handler, we get the >> > following warning: >> > >> > BUG: sleeping function called from invalid context at >> > ../arch/x86/mm/fault.c:1187 >> > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash >> > Preemption disabled at:[] printk+0x48/0x4a >> > >> > To fix this, replace RCU call in handle_sysrq() to use SRCU. >> >> The sysrq code can be called from irq context. >> >> Trying to use SRCU from an irq context sounds like it could >> be a bad idea, though admittedly I do not know enough about >> SRCU to know for sure :) > > Indeed, not the best idea! ;-) > > I could imagine something like this: > > if (in_irq()) > rcu_read_lock(); > else > idx = srcu_read_lock(_rcu); > > And ditto for unlock. Then, for the update: > > synchronize_rcu_mult(call_rcu, call_sysrq_srcu); This won't work on 3.18 as this api was introduced in linux 4.3. > > Where: > > static void call_sysrq_srcu(struct rcu_head *head, rcu_callback_t > func) > { > call_srcu(_rcu, head, func); > } > > Here I presume that the page-fault code avoids the might_sleep if invoked > from irq context. Quick look at the code seems to indicate that this is true. > > Thoughts? > > Thanx, Paul > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Fri, Dec 11, 2015 at 05:10:43PM -0500, Rik van Riel wrote: > On 12/11/2015 03:44 PM, Ani Sinha wrote: > > > > > > On Thu, 10 Dec 2015, Paul E. McKenney wrote: > > > >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: > >>> Hi guys > >>> > >>> I am noticing a new warning in linux 3.18 which we did not see before > >>> in linux 3.4 : > >>> > >>> bash-4.1# echo c > /proc/sysrq-trigger > >>> [ 978.807185] BUG: sleeping function called from invalid context at > >>> ../arch/x86/mm/fault.c:1187 > >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > >>> [ 978.987358] Preemption disabled at:[] > >>> printk+0x48/0x4a > >>> > >>> > >>> I have bisected this to the following change : > >>> > >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 > >>> Author: Rik van Riel > >>> Date: Fri Jun 6 14:38:13 2014 -0700 > >>> > >>> sysrq: rcu-ify __handle_sysrq > >>> > >>> > >>> the rcu_read_lock() in handle_sysrq() bumps up > >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it > >>> calls might_sleep() in x86/mm/fault.c line 1191, > >>> preempt_count_equals(0) returns false and hence the warning is > >>> printed. > >>> > >>> One way to handle this would be to do something like this: > >>> > >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > >>> index eef44d9..d4dbe22 100644 > >>> --- a/arch/x86/mm/fault.c > >>> +++ b/arch/x86/mm/fault.c > >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned > >>> long error_code, > >>> * If we're in an interrupt, have no user context or are running > >>> * in a region with pagefaults disabled then we must not take the fault > >>> */ > >>> - if (unlikely(faulthandler_disabled() || !mm)) { > >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { > >> > >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then > >> rcu_preempt_depth() unconditionally returns zero. And if > >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see > >> the might_sleep() splat. > >> > >> Maybe use SRCU instead of RCU for this purpose? > >> > > > > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 > > From: Ani Sinha > > Date: Fri, 11 Dec 2015 12:07:42 -0800 > > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > > warning in sysrq generated crash. > > > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > > replaced spin_lock_irqsave() calls with > > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > > disable preemption, faulthandler_disabled() in > > __do_page_fault() in x86/fault.c returns false. When the code > > later calls might_sleep() in the pagefault handler, we get the > > following warning: > > > > BUG: sleeping function called from invalid context at > > ../arch/x86/mm/fault.c:1187 > > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > > Preemption disabled at:[] printk+0x48/0x4a > > > > To fix this, replace RCU call in handle_sysrq() to use SRCU. > > The sysrq code can be called from irq context. > > Trying to use SRCU from an irq context sounds like it could > be a bad idea, though admittedly I do not know enough about > SRCU to know for sure :) Indeed, not the best idea! ;-) I could imagine something like this: if (in_irq()) rcu_read_lock(); else idx = srcu_read_lock(_rcu); And ditto for unlock. Then, for the update: synchronize_rcu_mult(call_rcu, call_sysrq_srcu); Where: static void call_sysrq_srcu(struct rcu_head *head, rcu_callback_t func) { call_srcu(_rcu, head, func); } Here I presume that the page-fault code avoids the might_sleep if invoked from irq context. Thoughts? Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On 12/11/2015 03:44 PM, Ani Sinha wrote: > > > On Thu, 10 Dec 2015, Paul E. McKenney wrote: > >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: >>> Hi guys >>> >>> I am noticing a new warning in linux 3.18 which we did not see before >>> in linux 3.4 : >>> >>> bash-4.1# echo c > /proc/sysrq-trigger >>> [ 978.807185] BUG: sleeping function called from invalid context at >>> ../arch/x86/mm/fault.c:1187 >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash >>> [ 978.987358] Preemption disabled at:[] printk+0x48/0x4a >>> >>> >>> I have bisected this to the following change : >>> >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 >>> Author: Rik van Riel >>> Date: Fri Jun 6 14:38:13 2014 -0700 >>> >>> sysrq: rcu-ify __handle_sysrq >>> >>> >>> the rcu_read_lock() in handle_sysrq() bumps up >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it >>> calls might_sleep() in x86/mm/fault.c line 1191, >>> preempt_count_equals(0) returns false and hence the warning is >>> printed. >>> >>> One way to handle this would be to do something like this: >>> >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c >>> index eef44d9..d4dbe22 100644 >>> --- a/arch/x86/mm/fault.c >>> +++ b/arch/x86/mm/fault.c >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned >>> long error_code, >>> * If we're in an interrupt, have no user context or are running >>> * in a region with pagefaults disabled then we must not take the fault >>> */ >>> - if (unlikely(faulthandler_disabled() || !mm)) { >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { >> >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then >> rcu_preempt_depth() unconditionally returns zero. And if >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see >> the might_sleep() splat. >> >> Maybe use SRCU instead of RCU for this purpose? >> > > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 > From: Ani Sinha > Date: Fri, 11 Dec 2015 12:07:42 -0800 > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > warning in sysrq generated crash. > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > replaced spin_lock_irqsave() calls with > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > disable preemption, faulthandler_disabled() in > __do_page_fault() in x86/fault.c returns false. When the code > later calls might_sleep() in the pagefault handler, we get the > following warning: > > BUG: sleeping function called from invalid context at > ../arch/x86/mm/fault.c:1187 > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > Preemption disabled at:[] printk+0x48/0x4a > > To fix this, replace RCU call in handle_sysrq() to use SRCU. The sysrq code can be called from irq context. Trying to use SRCU from an irq context sounds like it could be a bad idea, though admittedly I do not know enough about SRCU to know for sure :) -- All rights reversed -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Fri, Dec 11, 2015 at 12:44:13PM -0800, Ani Sinha wrote: > > > On Thu, 10 Dec 2015, Paul E. McKenney wrote: > > > On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: > > > Hi guys > > > > > > I am noticing a new warning in linux 3.18 which we did not see before > > > in linux 3.4 : > > > > > > bash-4.1# echo c > /proc/sysrq-trigger > > > [ 978.807185] BUG: sleeping function called from invalid context at > > > ../arch/x86/mm/fault.c:1187 > > > [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > > > [ 978.987358] Preemption disabled at:[] > > > printk+0x48/0x4a > > > > > > > > > I have bisected this to the following change : > > > > > > commit 984d74a72076a12b400339973e8c98fd2fcd90e5 > > > Author: Rik van Riel > > > Date: Fri Jun 6 14:38:13 2014 -0700 > > > > > > sysrq: rcu-ify __handle_sysrq > > > > > > > > > the rcu_read_lock() in handle_sysrq() bumps up > > > current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it > > > calls might_sleep() in x86/mm/fault.c line 1191, > > > preempt_count_equals(0) returns false and hence the warning is > > > printed. > > > > > > One way to handle this would be to do something like this: > > > > > > diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > > > index eef44d9..d4dbe22 100644 > > > --- a/arch/x86/mm/fault.c > > > +++ b/arch/x86/mm/fault.c > > > @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned > > > long error_code, > > > * If we're in an interrupt, have no user context or are running > > > * in a region with pagefaults disabled then we must not take the fault > > > */ > > > - if (unlikely(faulthandler_disabled() || !mm)) { > > > + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { > > > > This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then > > rcu_preempt_depth() unconditionally returns zero. And if > > CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see > > the might_sleep() splat. > > > > Maybe use SRCU instead of RCU for this purpose? > > > > >From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 > From: Ani Sinha > Date: Fri, 11 Dec 2015 12:07:42 -0800 > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > warning in sysrq generated crash. > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > replaced spin_lock_irqsave() calls with > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > disable preemption, faulthandler_disabled() in > __do_page_fault() in x86/fault.c returns false. When the code > later calls might_sleep() in the pagefault handler, we get the > following warning: > > BUG: sleeping function called from invalid context at > ../arch/x86/mm/fault.c:1187 > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > Preemption disabled at:[] printk+0x48/0x4a > > To fix this, replace RCU call in handle_sysrq() to use SRCU. > > Tested this patch on linux 3.18 by booting off one of our boards. > > Fixes: 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > > Signed-off-by: Ani Sinha Hello, Ani, This patch looks incomplete. The synchronize_rcu() that Rik added in __sysrq_swap_key_ops() needs to become synchronize_srcu(). Which means that it needs to use the sysrq_rcu structure, which means that this structure cannot be local to __handle_sysrq(). Please see below... > --- > drivers/tty/sysrq.c | 8 +--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c > index 5381a72..904865f 100644 > --- a/drivers/tty/sysrq.c > +++ b/drivers/tty/sysrq.c > @@ -519,10 +519,12 @@ void __handle_sysrq(int key, bool check_mask) > { > struct sysrq_key_op *op_p; > int orig_log_level; > - int i; > + int i, idx; > + struct srcu_struct sysrq_rcu; > > + init_srcu_struct(_rcu); Use DEFINE_STATIC_SRCU() to define sysrq_rcu at the global level, and then get rid of the two lines above. Thanx, Paul > rcu_sysrq_start(); > - rcu_read_lock(); > + idx = srcu_read_lock(_rcu); > /* >* Raise the apparent loglevel to maximum so that the sysrq header >* is shown to provide the user with positive feedback. We do not > @@ -564,7 +566,7 @@ void __handle_sysrq(int key, bool check_mask) > pr_cont("\n"); > console_loglevel = orig_log_level; > } > - rcu_read_unlock(); > + srcu_read_unlock(_rcu, idx); > rcu_sysrq_end(); > } > > -- > 1.8.1.4 > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Thu, 10 Dec 2015, Paul E. McKenney wrote: > On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: > > Hi guys > > > > I am noticing a new warning in linux 3.18 which we did not see before > > in linux 3.4 : > > > > bash-4.1# echo c > /proc/sysrq-trigger > > [ 978.807185] BUG: sleeping function called from invalid context at > > ../arch/x86/mm/fault.c:1187 > > [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > > [ 978.987358] Preemption disabled at:[] printk+0x48/0x4a > > > > > > I have bisected this to the following change : > > > > commit 984d74a72076a12b400339973e8c98fd2fcd90e5 > > Author: Rik van Riel > > Date: Fri Jun 6 14:38:13 2014 -0700 > > > > sysrq: rcu-ify __handle_sysrq > > > > > > the rcu_read_lock() in handle_sysrq() bumps up > > current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it > > calls might_sleep() in x86/mm/fault.c line 1191, > > preempt_count_equals(0) returns false and hence the warning is > > printed. > > > > One way to handle this would be to do something like this: > > > > diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > > index eef44d9..d4dbe22 100644 > > --- a/arch/x86/mm/fault.c > > +++ b/arch/x86/mm/fault.c > > @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned > > long error_code, > > * If we're in an interrupt, have no user context or are running > > * in a region with pagefaults disabled then we must not take the fault > > */ > > - if (unlikely(faulthandler_disabled() || !mm)) { > > + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { > > This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then > rcu_preempt_depth() unconditionally returns zero. And if > CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see > the might_sleep() splat. > > Maybe use SRCU instead of RCU for this purpose? > >From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 From: Ani Sinha Date: Fri, 11 Dec 2015 12:07:42 -0800 Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' warning in sysrq generated crash. Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") replaced spin_lock_irqsave() calls with rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not disable preemption, faulthandler_disabled() in __do_page_fault() in x86/fault.c returns false. When the code later calls might_sleep() in the pagefault handler, we get the following warning: BUG: sleeping function called from invalid context at ../arch/x86/mm/fault.c:1187 in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash Preemption disabled at:[] printk+0x48/0x4a To fix this, replace RCU call in handle_sysrq() to use SRCU. Tested this patch on linux 3.18 by booting off one of our boards. Fixes: 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") Signed-off-by: Ani Sinha --- drivers/tty/sysrq.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c index 5381a72..904865f 100644 --- a/drivers/tty/sysrq.c +++ b/drivers/tty/sysrq.c @@ -519,10 +519,12 @@ void __handle_sysrq(int key, bool check_mask) { struct sysrq_key_op *op_p; int orig_log_level; - int i; + int i, idx; + struct srcu_struct sysrq_rcu; + init_srcu_struct(_rcu); rcu_sysrq_start(); - rcu_read_lock(); + idx = srcu_read_lock(_rcu); /* * Raise the apparent loglevel to maximum so that the sysrq header * is shown to provide the user with positive feedback. We do not @@ -564,7 +566,7 @@ void __handle_sysrq(int key, bool check_mask) pr_cont("\n"); console_loglevel = orig_log_level; } - rcu_read_unlock(); + srcu_read_unlock(_rcu, idx); rcu_sysrq_end(); } -- 1.8.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
Well I can certainly send a patch but I wonder if simply using SRCU for this one instance in Rik's original patch will not break anything else. Rik, please provide your thoughts. On Thu, Dec 10, 2015 at 9:26 PM, Paul E. McKenney wrote: > On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: >> Hi guys >> >> I am noticing a new warning in linux 3.18 which we did not see before >> in linux 3.4 : >> >> bash-4.1# echo c > /proc/sysrq-trigger >> [ 978.807185] BUG: sleeping function called from invalid context at >> ../arch/x86/mm/fault.c:1187 >> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash >> [ 978.987358] Preemption disabled at:[] printk+0x48/0x4a >> >> >> I have bisected this to the following change : >> >> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 >> Author: Rik van Riel >> Date: Fri Jun 6 14:38:13 2014 -0700 >> >> sysrq: rcu-ify __handle_sysrq >> >> >> the rcu_read_lock() in handle_sysrq() bumps up >> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it >> calls might_sleep() in x86/mm/fault.c line 1191, >> preempt_count_equals(0) returns false and hence the warning is >> printed. >> >> One way to handle this would be to do something like this: >> >> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c >> index eef44d9..d4dbe22 100644 >> --- a/arch/x86/mm/fault.c >> +++ b/arch/x86/mm/fault.c >> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned >> long error_code, >> * If we're in an interrupt, have no user context or are running >> * in a region with pagefaults disabled then we must not take the fault >> */ >> - if (unlikely(faulthandler_disabled() || !mm)) { >> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { > > This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then > rcu_preempt_depth() unconditionally returns zero. And if > CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see > the might_sleep() splat. > > Maybe use SRCU instead of RCU for this purpose? > > Thanx, Paul > >> bad_area_nosemaphore(regs, error_code, address); >> return; >> } >> >> I am wondering if this would be the right approach. I have tested that >> this patch does indeed suppress the warning. If you guys agree, I will >> send a patch. It's true that this is a trivial issue since we are >> intentionally crashing the kernel but in our case, this additional >> complaint from the kernel is confusing our test scripts and they are >> generating false positives. >> > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Fri, Dec 11, 2015 at 12:44:13PM -0800, Ani Sinha wrote: > > > On Thu, 10 Dec 2015, Paul E. McKenney wrote: > > > On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: > > > Hi guys > > > > > > I am noticing a new warning in linux 3.18 which we did not see before > > > in linux 3.4 : > > > > > > bash-4.1# echo c > /proc/sysrq-trigger > > > [ 978.807185] BUG: sleeping function called from invalid context at > > > ../arch/x86/mm/fault.c:1187 > > > [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > > > [ 978.987358] Preemption disabled at:[] > > > printk+0x48/0x4a > > > > > > > > > I have bisected this to the following change : > > > > > > commit 984d74a72076a12b400339973e8c98fd2fcd90e5 > > > Author: Rik van Riel> > > Date: Fri Jun 6 14:38:13 2014 -0700 > > > > > > sysrq: rcu-ify __handle_sysrq > > > > > > > > > the rcu_read_lock() in handle_sysrq() bumps up > > > current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it > > > calls might_sleep() in x86/mm/fault.c line 1191, > > > preempt_count_equals(0) returns false and hence the warning is > > > printed. > > > > > > One way to handle this would be to do something like this: > > > > > > diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > > > index eef44d9..d4dbe22 100644 > > > --- a/arch/x86/mm/fault.c > > > +++ b/arch/x86/mm/fault.c > > > @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned > > > long error_code, > > > * If we're in an interrupt, have no user context or are running > > > * in a region with pagefaults disabled then we must not take the fault > > > */ > > > - if (unlikely(faulthandler_disabled() || !mm)) { > > > + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { > > > > This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then > > rcu_preempt_depth() unconditionally returns zero. And if > > CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see > > the might_sleep() splat. > > > > Maybe use SRCU instead of RCU for this purpose? > > > > >From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 > From: Ani Sinha > Date: Fri, 11 Dec 2015 12:07:42 -0800 > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > warning in sysrq generated crash. > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > replaced spin_lock_irqsave() calls with > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > disable preemption, faulthandler_disabled() in > __do_page_fault() in x86/fault.c returns false. When the code > later calls might_sleep() in the pagefault handler, we get the > following warning: > > BUG: sleeping function called from invalid context at > ../arch/x86/mm/fault.c:1187 > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > Preemption disabled at:[] printk+0x48/0x4a > > To fix this, replace RCU call in handle_sysrq() to use SRCU. > > Tested this patch on linux 3.18 by booting off one of our boards. > > Fixes: 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > > Signed-off-by: Ani Sinha Hello, Ani, This patch looks incomplete. The synchronize_rcu() that Rik added in __sysrq_swap_key_ops() needs to become synchronize_srcu(). Which means that it needs to use the sysrq_rcu structure, which means that this structure cannot be local to __handle_sysrq(). Please see below... > --- > drivers/tty/sysrq.c | 8 +--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c > index 5381a72..904865f 100644 > --- a/drivers/tty/sysrq.c > +++ b/drivers/tty/sysrq.c > @@ -519,10 +519,12 @@ void __handle_sysrq(int key, bool check_mask) > { > struct sysrq_key_op *op_p; > int orig_log_level; > - int i; > + int i, idx; > + struct srcu_struct sysrq_rcu; > > + init_srcu_struct(_rcu); Use DEFINE_STATIC_SRCU() to define sysrq_rcu at the global level, and then get rid of the two lines above. Thanx, Paul > rcu_sysrq_start(); > - rcu_read_lock(); > + idx = srcu_read_lock(_rcu); > /* >* Raise the apparent loglevel to maximum so that the sysrq header >* is shown to provide the user with positive feedback. We do not > @@ -564,7 +566,7 @@ void __handle_sysrq(int key, bool check_mask) > pr_cont("\n"); > console_loglevel = orig_log_level; > } > - rcu_read_unlock(); > + srcu_read_unlock(_rcu, idx); > rcu_sysrq_end(); > } > > -- > 1.8.1.4 > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Fri, Dec 11, 2015 at 2:27 PM, Paul E. McKenneywrote: > On Fri, Dec 11, 2015 at 05:10:43PM -0500, Rik van Riel wrote: >> On 12/11/2015 03:44 PM, Ani Sinha wrote: >> > >> > >> > On Thu, 10 Dec 2015, Paul E. McKenney wrote: >> > >> >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: >> >>> Hi guys >> >>> >> >>> I am noticing a new warning in linux 3.18 which we did not see before >> >>> in linux 3.4 : >> >>> >> >>> bash-4.1# echo c > /proc/sysrq-trigger >> >>> [ 978.807185] BUG: sleeping function called from invalid context at >> >>> ../arch/x86/mm/fault.c:1187 >> >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash >> >>> [ 978.987358] Preemption disabled at:[] >> >>> printk+0x48/0x4a >> >>> >> >>> >> >>> I have bisected this to the following change : >> >>> >> >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 >> >>> Author: Rik van Riel >> >>> Date: Fri Jun 6 14:38:13 2014 -0700 >> >>> >> >>> sysrq: rcu-ify __handle_sysrq >> >>> >> >>> >> >>> the rcu_read_lock() in handle_sysrq() bumps up >> >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it >> >>> calls might_sleep() in x86/mm/fault.c line 1191, >> >>> preempt_count_equals(0) returns false and hence the warning is >> >>> printed. >> >>> >> >>> One way to handle this would be to do something like this: >> >>> >> >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c >> >>> index eef44d9..d4dbe22 100644 >> >>> --- a/arch/x86/mm/fault.c >> >>> +++ b/arch/x86/mm/fault.c >> >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned >> >>> long error_code, >> >>> * If we're in an interrupt, have no user context or are running >> >>> * in a region with pagefaults disabled then we must not take the fault >> >>> */ >> >>> - if (unlikely(faulthandler_disabled() || !mm)) { >> >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { >> >> >> >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then >> >> rcu_preempt_depth() unconditionally returns zero. And if >> >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see >> >> the might_sleep() splat. >> >> >> >> Maybe use SRCU instead of RCU for this purpose? >> >> >> > >> > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 >> > From: Ani Sinha >> > Date: Fri, 11 Dec 2015 12:07:42 -0800 >> > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' >> > warning in sysrq generated crash. >> > >> > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") >> > replaced spin_lock_irqsave() calls with >> > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not >> > disable preemption, faulthandler_disabled() in >> > __do_page_fault() in x86/fault.c returns false. When the code >> > later calls might_sleep() in the pagefault handler, we get the >> > following warning: >> > >> > BUG: sleeping function called from invalid context at >> > ../arch/x86/mm/fault.c:1187 >> > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash >> > Preemption disabled at:[] printk+0x48/0x4a >> > >> > To fix this, replace RCU call in handle_sysrq() to use SRCU. >> >> The sysrq code can be called from irq context. >> >> Trying to use SRCU from an irq context sounds like it could >> be a bad idea, though admittedly I do not know enough about >> SRCU to know for sure :) > > Indeed, not the best idea! ;-) > > I could imagine something like this: > > if (in_irq()) > rcu_read_lock(); > else > idx = srcu_read_lock(_rcu); > > And ditto for unlock. Then, for the update: > > synchronize_rcu_mult(call_rcu, call_sysrq_srcu); This won't work on 3.18 as this api was introduced in linux 4.3. > > Where: > > static void call_sysrq_srcu(struct rcu_head *head, rcu_callback_t > func) > { > call_srcu(_rcu, head, func); > } > > Here I presume that the page-fault code avoids the might_sleep if invoked > from irq context. Quick look at the code seems to indicate that this is true. > > Thoughts? > > Thanx, Paul > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Fri, Dec 11, 2015 at 03:41:04PM -0800, Ani Sinha wrote: > On Fri, Dec 11, 2015 at 2:27 PM, Paul E. McKenney >wrote: > > On Fri, Dec 11, 2015 at 05:10:43PM -0500, Rik van Riel wrote: > >> On 12/11/2015 03:44 PM, Ani Sinha wrote: > >> > > >> > > >> > On Thu, 10 Dec 2015, Paul E. McKenney wrote: > >> > > >> >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: > >> >>> Hi guys > >> >>> > >> >>> I am noticing a new warning in linux 3.18 which we did not see before > >> >>> in linux 3.4 : > >> >>> > >> >>> bash-4.1# echo c > /proc/sysrq-trigger > >> >>> [ 978.807185] BUG: sleeping function called from invalid context at > >> >>> ../arch/x86/mm/fault.c:1187 > >> >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: > >> >>> bash > >> >>> [ 978.987358] Preemption disabled at:[] > >> >>> printk+0x48/0x4a > >> >>> > >> >>> > >> >>> I have bisected this to the following change : > >> >>> > >> >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 > >> >>> Author: Rik van Riel > >> >>> Date: Fri Jun 6 14:38:13 2014 -0700 > >> >>> > >> >>> sysrq: rcu-ify __handle_sysrq > >> >>> > >> >>> > >> >>> the rcu_read_lock() in handle_sysrq() bumps up > >> >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it > >> >>> calls might_sleep() in x86/mm/fault.c line 1191, > >> >>> preempt_count_equals(0) returns false and hence the warning is > >> >>> printed. > >> >>> > >> >>> One way to handle this would be to do something like this: > >> >>> > >> >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > >> >>> index eef44d9..d4dbe22 100644 > >> >>> --- a/arch/x86/mm/fault.c > >> >>> +++ b/arch/x86/mm/fault.c > >> >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned > >> >>> long error_code, > >> >>> * If we're in an interrupt, have no user context or are running > >> >>> * in a region with pagefaults disabled then we must not take the > >> >>> fault > >> >>> */ > >> >>> - if (unlikely(faulthandler_disabled() || !mm)) { > >> >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) > >> >>> { > >> >> > >> >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then > >> >> rcu_preempt_depth() unconditionally returns zero. And if > >> >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see > >> >> the might_sleep() splat. > >> >> > >> >> Maybe use SRCU instead of RCU for this purpose? > >> >> > >> > > >> > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 > >> > From: Ani Sinha > >> > Date: Fri, 11 Dec 2015 12:07:42 -0800 > >> > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > >> > warning in sysrq generated crash. > >> > > >> > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > >> > replaced spin_lock_irqsave() calls with > >> > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > >> > disable preemption, faulthandler_disabled() in > >> > __do_page_fault() in x86/fault.c returns false. When the code > >> > later calls might_sleep() in the pagefault handler, we get the > >> > following warning: > >> > > >> > BUG: sleeping function called from invalid context at > >> > ../arch/x86/mm/fault.c:1187 > >> > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > >> > Preemption disabled at:[] printk+0x48/0x4a > >> > > >> > To fix this, replace RCU call in handle_sysrq() to use SRCU. > >> > >> The sysrq code can be called from irq context. > >> > >> Trying to use SRCU from an irq context sounds like it could > >> be a bad idea, though admittedly I do not know enough about > >> SRCU to know for sure :) > > > > Indeed, not the best idea! ;-) > > > > I could imagine something like this: > > > > if (in_irq()) > > rcu_read_lock(); > > else > > idx = srcu_read_lock(_rcu); > > > > And ditto for unlock. Then, for the update: > > > > synchronize_rcu_mult(call_rcu, call_sysrq_srcu); > > This won't work on 3.18 as this api was introduced in linux 4.3. Then do this: synchronize_rcu(); synchronize_srcu(_rcu); > > Where: > > > > static void call_sysrq_srcu(struct rcu_head *head, rcu_callback_t > > func) > > { > > call_srcu(_rcu, head, func); > > } > > > > Here I presume that the page-fault code avoids the might_sleep if invoked > > from irq context. > > Quick look at the code seems to indicate that this is true. Good! ;-) Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
I backported your ee376dbdf277 ("rcu: Consolidate rcu_synchronize and wakeme_after_rcu()" & ec90a194ae2cb8b8e("rcu: Create a synchronize_rcu_mult()") and tested this on our 3.18 kernel running on our board. The sysrq kernel crash seems to have been fixed (behavior as per our old 3.4 kernel). I will send in a patch as per your former suggestion ... On Fri, Dec 11, 2015 at 4:02 PM, Paul E. McKenneywrote: > On Fri, Dec 11, 2015 at 03:41:04PM -0800, Ani Sinha wrote: >> On Fri, Dec 11, 2015 at 2:27 PM, Paul E. McKenney >> wrote: >> > On Fri, Dec 11, 2015 at 05:10:43PM -0500, Rik van Riel wrote: >> >> On 12/11/2015 03:44 PM, Ani Sinha wrote: >> >> > >> >> > >> >> > On Thu, 10 Dec 2015, Paul E. McKenney wrote: >> >> > >> >> >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: >> >> >>> Hi guys >> >> >>> >> >> >>> I am noticing a new warning in linux 3.18 which we did not see before >> >> >>> in linux 3.4 : >> >> >>> >> >> >>> bash-4.1# echo c > /proc/sysrq-trigger >> >> >>> [ 978.807185] BUG: sleeping function called from invalid context at >> >> >>> ../arch/x86/mm/fault.c:1187 >> >> >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: >> >> >>> bash >> >> >>> [ 978.987358] Preemption disabled at:[] >> >> >>> printk+0x48/0x4a >> >> >>> >> >> >>> >> >> >>> I have bisected this to the following change : >> >> >>> >> >> >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 >> >> >>> Author: Rik van Riel >> >> >>> Date: Fri Jun 6 14:38:13 2014 -0700 >> >> >>> >> >> >>> sysrq: rcu-ify __handle_sysrq >> >> >>> >> >> >>> >> >> >>> the rcu_read_lock() in handle_sysrq() bumps up >> >> >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it >> >> >>> calls might_sleep() in x86/mm/fault.c line 1191, >> >> >>> preempt_count_equals(0) returns false and hence the warning is >> >> >>> printed. >> >> >>> >> >> >>> One way to handle this would be to do something like this: >> >> >>> >> >> >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c >> >> >>> index eef44d9..d4dbe22 100644 >> >> >>> --- a/arch/x86/mm/fault.c >> >> >>> +++ b/arch/x86/mm/fault.c >> >> >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned >> >> >>> long error_code, >> >> >>> * If we're in an interrupt, have no user context or are running >> >> >>> * in a region with pagefaults disabled then we must not take the >> >> >>> fault >> >> >>> */ >> >> >>> - if (unlikely(faulthandler_disabled() || !mm)) { >> >> >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || >> >> >>> !mm)) { >> >> >> >> >> >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then >> >> >> rcu_preempt_depth() unconditionally returns zero. And if >> >> >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see >> >> >> the might_sleep() splat. >> >> >> >> >> >> Maybe use SRCU instead of RCU for this purpose? >> >> >> >> >> > >> >> > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 >> >> > From: Ani Sinha >> >> > Date: Fri, 11 Dec 2015 12:07:42 -0800 >> >> > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' >> >> > warning in sysrq generated crash. >> >> > >> >> > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") >> >> > replaced spin_lock_irqsave() calls with >> >> > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not >> >> > disable preemption, faulthandler_disabled() in >> >> > __do_page_fault() in x86/fault.c returns false. When the code >> >> > later calls might_sleep() in the pagefault handler, we get the >> >> > following warning: >> >> > >> >> > BUG: sleeping function called from invalid context at >> >> > ../arch/x86/mm/fault.c:1187 >> >> > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash >> >> > Preemption disabled at:[] printk+0x48/0x4a >> >> > >> >> > To fix this, replace RCU call in handle_sysrq() to use SRCU. >> >> >> >> The sysrq code can be called from irq context. >> >> >> >> Trying to use SRCU from an irq context sounds like it could >> >> be a bad idea, though admittedly I do not know enough about >> >> SRCU to know for sure :) >> > >> > Indeed, not the best idea! ;-) >> > >> > I could imagine something like this: >> > >> > if (in_irq()) >> > rcu_read_lock(); >> > else >> > idx = srcu_read_lock(_rcu); >> > >> > And ditto for unlock. Then, for the update: >> > >> > synchronize_rcu_mult(call_rcu, call_sysrq_srcu); >> >> This won't work on 3.18 as this api was introduced in linux 4.3. > > Then do this: > > synchronize_rcu(); > synchronize_srcu(_rcu); > >> > Where: >> > >> > static void call_sysrq_srcu(struct rcu_head *head, rcu_callback_t >> > func) >> > { >> > call_srcu(_rcu, head, func); >> > } >> > >> > Here I presume that the page-fault code avoids the might_sleep if invoked >> >
Re: new warning on sysrq kernel crash trigger
On 12/11/2015 03:44 PM, Ani Sinha wrote: > > > On Thu, 10 Dec 2015, Paul E. McKenney wrote: > >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: >>> Hi guys >>> >>> I am noticing a new warning in linux 3.18 which we did not see before >>> in linux 3.4 : >>> >>> bash-4.1# echo c > /proc/sysrq-trigger >>> [ 978.807185] BUG: sleeping function called from invalid context at >>> ../arch/x86/mm/fault.c:1187 >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash >>> [ 978.987358] Preemption disabled at:[] printk+0x48/0x4a >>> >>> >>> I have bisected this to the following change : >>> >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 >>> Author: Rik van Riel>>> Date: Fri Jun 6 14:38:13 2014 -0700 >>> >>> sysrq: rcu-ify __handle_sysrq >>> >>> >>> the rcu_read_lock() in handle_sysrq() bumps up >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it >>> calls might_sleep() in x86/mm/fault.c line 1191, >>> preempt_count_equals(0) returns false and hence the warning is >>> printed. >>> >>> One way to handle this would be to do something like this: >>> >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c >>> index eef44d9..d4dbe22 100644 >>> --- a/arch/x86/mm/fault.c >>> +++ b/arch/x86/mm/fault.c >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned >>> long error_code, >>> * If we're in an interrupt, have no user context or are running >>> * in a region with pagefaults disabled then we must not take the fault >>> */ >>> - if (unlikely(faulthandler_disabled() || !mm)) { >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { >> >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then >> rcu_preempt_depth() unconditionally returns zero. And if >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see >> the might_sleep() splat. >> >> Maybe use SRCU instead of RCU for this purpose? >> > > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 > From: Ani Sinha > Date: Fri, 11 Dec 2015 12:07:42 -0800 > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > warning in sysrq generated crash. > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > replaced spin_lock_irqsave() calls with > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > disable preemption, faulthandler_disabled() in > __do_page_fault() in x86/fault.c returns false. When the code > later calls might_sleep() in the pagefault handler, we get the > following warning: > > BUG: sleeping function called from invalid context at > ../arch/x86/mm/fault.c:1187 > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > Preemption disabled at:[] printk+0x48/0x4a > > To fix this, replace RCU call in handle_sysrq() to use SRCU. The sysrq code can be called from irq context. Trying to use SRCU from an irq context sounds like it could be a bad idea, though admittedly I do not know enough about SRCU to know for sure :) -- All rights reversed -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Fri, 11 Dec 2015, Paul E. McKenney wrote: > On Fri, Dec 11, 2015 at 05:10:43PM -0500, Rik van Riel wrote: > > On 12/11/2015 03:44 PM, Ani Sinha wrote: > > > > > > > > > On Thu, 10 Dec 2015, Paul E. McKenney wrote: > > > > > >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: > > >>> Hi guys > > >>> > > >>> I am noticing a new warning in linux 3.18 which we did not see before > > >>> in linux 3.4 : > > >>> > > >>> bash-4.1# echo c > /proc/sysrq-trigger > > >>> [ 978.807185] BUG: sleeping function called from invalid context at > > >>> ../arch/x86/mm/fault.c:1187 > > >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > > >>> [ 978.987358] Preemption disabled at:[] > > >>> printk+0x48/0x4a > > >>> > > >>> > > >>> I have bisected this to the following change : > > >>> > > >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 > > >>> Author: Rik van Riel> > >>> Date: Fri Jun 6 14:38:13 2014 -0700 > > >>> > > >>> sysrq: rcu-ify __handle_sysrq > > >>> > > >>> > > >>> the rcu_read_lock() in handle_sysrq() bumps up > > >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it > > >>> calls might_sleep() in x86/mm/fault.c line 1191, > > >>> preempt_count_equals(0) returns false and hence the warning is > > >>> printed. > > >>> > > >>> One way to handle this would be to do something like this: > > >>> > > >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > > >>> index eef44d9..d4dbe22 100644 > > >>> --- a/arch/x86/mm/fault.c > > >>> +++ b/arch/x86/mm/fault.c > > >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned > > >>> long error_code, > > >>> * If we're in an interrupt, have no user context or are running > > >>> * in a region with pagefaults disabled then we must not take the fault > > >>> */ > > >>> - if (unlikely(faulthandler_disabled() || !mm)) { > > >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { > > >> > > >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then > > >> rcu_preempt_depth() unconditionally returns zero. And if > > >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see > > >> the might_sleep() splat. > > >> > > >> Maybe use SRCU instead of RCU for this purpose? > > >> > > > > > > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 > > > From: Ani Sinha > > > Date: Fri, 11 Dec 2015 12:07:42 -0800 > > > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > > > warning in sysrq generated crash. > > > > > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > > > replaced spin_lock_irqsave() calls with > > > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > > > disable preemption, faulthandler_disabled() in > > > __do_page_fault() in x86/fault.c returns false. When the code > > > later calls might_sleep() in the pagefault handler, we get the > > > following warning: > > > > > > BUG: sleeping function called from invalid context at > > > ../arch/x86/mm/fault.c:1187 > > > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > > > Preemption disabled at:[] printk+0x48/0x4a > > > > > > To fix this, replace RCU call in handle_sysrq() to use SRCU. > > > > The sysrq code can be called from irq context. > > > > Trying to use SRCU from an irq context sounds like it could > > be a bad idea, though admittedly I do not know enough about > > SRCU to know for sure :) > > Indeed, not the best idea! ;-) > > I could imagine something like this: > > if (in_irq()) > rcu_read_lock(); > else > idx = srcu_read_lock(_rcu); > > And ditto for unlock. Then, for the update: > > synchronize_rcu_mult(call_rcu, call_sysrq_srcu); > > Where: > > static void call_sysrq_srcu(struct rcu_head *head, rcu_callback_t func) > { > call_srcu(_rcu, head, func); > } > >From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 From: Ani Sinha Date: Fri, 11 Dec 2015 12:07:42 -0800 Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' warning in sysrq generated crash. Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") replaced spin_lock_irqsave() calls with rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not disable preemption, faulthandler_disabled() in __do_page_fault() in x86/fault.c returns false. When the code later calls might_sleep() in the pagefault handler, we get the following warning: BUG: sleeping function called from invalid context at ../arch/x86/mm/fault.c:1187 in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash Preemption disabled at:[] printk+0x48/0x4a To fix this, replace RCU call in handle_sysrq() to use SRCU in non-irq context. Tested this patch on linux 3.18 by booting off one of our boards. Fixes: 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") Signed-off-by: Ani Sinha --- diff
Re: new warning on sysrq kernel crash trigger
On Fri, Dec 11, 2015 at 05:10:43PM -0500, Rik van Riel wrote: > On 12/11/2015 03:44 PM, Ani Sinha wrote: > > > > > > On Thu, 10 Dec 2015, Paul E. McKenney wrote: > > > >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: > >>> Hi guys > >>> > >>> I am noticing a new warning in linux 3.18 which we did not see before > >>> in linux 3.4 : > >>> > >>> bash-4.1# echo c > /proc/sysrq-trigger > >>> [ 978.807185] BUG: sleeping function called from invalid context at > >>> ../arch/x86/mm/fault.c:1187 > >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > >>> [ 978.987358] Preemption disabled at:[] > >>> printk+0x48/0x4a > >>> > >>> > >>> I have bisected this to the following change : > >>> > >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 > >>> Author: Rik van Riel> >>> Date: Fri Jun 6 14:38:13 2014 -0700 > >>> > >>> sysrq: rcu-ify __handle_sysrq > >>> > >>> > >>> the rcu_read_lock() in handle_sysrq() bumps up > >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it > >>> calls might_sleep() in x86/mm/fault.c line 1191, > >>> preempt_count_equals(0) returns false and hence the warning is > >>> printed. > >>> > >>> One way to handle this would be to do something like this: > >>> > >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > >>> index eef44d9..d4dbe22 100644 > >>> --- a/arch/x86/mm/fault.c > >>> +++ b/arch/x86/mm/fault.c > >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned > >>> long error_code, > >>> * If we're in an interrupt, have no user context or are running > >>> * in a region with pagefaults disabled then we must not take the fault > >>> */ > >>> - if (unlikely(faulthandler_disabled() || !mm)) { > >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { > >> > >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then > >> rcu_preempt_depth() unconditionally returns zero. And if > >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see > >> the might_sleep() splat. > >> > >> Maybe use SRCU instead of RCU for this purpose? > >> > > > > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 > > From: Ani Sinha > > Date: Fri, 11 Dec 2015 12:07:42 -0800 > > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > > warning in sysrq generated crash. > > > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > > replaced spin_lock_irqsave() calls with > > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > > disable preemption, faulthandler_disabled() in > > __do_page_fault() in x86/fault.c returns false. When the code > > later calls might_sleep() in the pagefault handler, we get the > > following warning: > > > > BUG: sleeping function called from invalid context at > > ../arch/x86/mm/fault.c:1187 > > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > > Preemption disabled at:[] printk+0x48/0x4a > > > > To fix this, replace RCU call in handle_sysrq() to use SRCU. > > The sysrq code can be called from irq context. > > Trying to use SRCU from an irq context sounds like it could > be a bad idea, though admittedly I do not know enough about > SRCU to know for sure :) Indeed, not the best idea! ;-) I could imagine something like this: if (in_irq()) rcu_read_lock(); else idx = srcu_read_lock(_rcu); And ditto for unlock. Then, for the update: synchronize_rcu_mult(call_rcu, call_sysrq_srcu); Where: static void call_sysrq_srcu(struct rcu_head *head, rcu_callback_t func) { call_srcu(_rcu, head, func); } Here I presume that the page-fault code avoids the might_sleep if invoked from irq context. Thoughts? Thanx, Paul -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Fri, Dec 11, 2015 at 04:16:37PM -0800, Ani Sinha wrote: > On Fri, 11 Dec 2015, Paul E. McKenney wrote: > > On Fri, Dec 11, 2015 at 05:10:43PM -0500, Rik van Riel wrote: > > > On 12/11/2015 03:44 PM, Ani Sinha wrote: > > > > > > > > > > > > On Thu, 10 Dec 2015, Paul E. McKenney wrote: > > > > > > > >> On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: > > > >>> Hi guys > > > >>> > > > >>> I am noticing a new warning in linux 3.18 which we did not see before > > > >>> in linux 3.4 : > > > >>> > > > >>> bash-4.1# echo c > /proc/sysrq-trigger > > > >>> [ 978.807185] BUG: sleeping function called from invalid context at > > > >>> ../arch/x86/mm/fault.c:1187 > > > >>> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: > > > >>> bash > > > >>> [ 978.987358] Preemption disabled at:[] > > > >>> printk+0x48/0x4a > > > >>> > > > >>> > > > >>> I have bisected this to the following change : > > > >>> > > > >>> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 > > > >>> Author: Rik van Riel> > > >>> Date: Fri Jun 6 14:38:13 2014 -0700 > > > >>> > > > >>> sysrq: rcu-ify __handle_sysrq > > > >>> > > > >>> > > > >>> the rcu_read_lock() in handle_sysrq() bumps up > > > >>> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it > > > >>> calls might_sleep() in x86/mm/fault.c line 1191, > > > >>> preempt_count_equals(0) returns false and hence the warning is > > > >>> printed. > > > >>> > > > >>> One way to handle this would be to do something like this: > > > >>> > > > >>> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > > > >>> index eef44d9..d4dbe22 100644 > > > >>> --- a/arch/x86/mm/fault.c > > > >>> +++ b/arch/x86/mm/fault.c > > > >>> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned > > > >>> long error_code, > > > >>> * If we're in an interrupt, have no user context or are running > > > >>> * in a region with pagefaults disabled then we must not take the > > > >>> fault > > > >>> */ > > > >>> - if (unlikely(faulthandler_disabled() || !mm)) { > > > >>> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || > > > >>> !mm)) { > > > >> > > > >> This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then > > > >> rcu_preempt_depth() unconditionally returns zero. And if > > > >> CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see > > > >> the might_sleep() splat. > > > >> > > > >> Maybe use SRCU instead of RCU for this purpose? > > > >> > > > > > > > > From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 > > > > From: Ani Sinha > > > > Date: Fri, 11 Dec 2015 12:07:42 -0800 > > > > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > > > > warning in sysrq generated crash. > > > > > > > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > > > > replaced spin_lock_irqsave() calls with > > > > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > > > > disable preemption, faulthandler_disabled() in > > > > __do_page_fault() in x86/fault.c returns false. When the code > > > > later calls might_sleep() in the pagefault handler, we get the > > > > following warning: > > > > > > > > BUG: sleeping function called from invalid context at > > > > ../arch/x86/mm/fault.c:1187 > > > > in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > > > > Preemption disabled at:[] printk+0x48/0x4a > > > > > > > > To fix this, replace RCU call in handle_sysrq() to use SRCU. > > > > > > The sysrq code can be called from irq context. > > > > > > Trying to use SRCU from an irq context sounds like it could > > > be a bad idea, though admittedly I do not know enough about > > > SRCU to know for sure :) > > > > Indeed, not the best idea! ;-) > > > > I could imagine something like this: > > > > if (in_irq()) > > rcu_read_lock(); > > else > > idx = srcu_read_lock(_rcu); > > > > And ditto for unlock. Then, for the update: > > > > synchronize_rcu_mult(call_rcu, call_sysrq_srcu); > > > > Where: > > > > static void call_sysrq_srcu(struct rcu_head *head, rcu_callback_t func) > > { > > call_srcu(_rcu, head, func); > > } > > > > >From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 > From: Ani Sinha > Date: Fri, 11 Dec 2015 12:07:42 -0800 > Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' > warning in sysrq generated crash. > > Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") > replaced spin_lock_irqsave() calls with > rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not > disable preemption, faulthandler_disabled() in > __do_page_fault() in x86/fault.c returns false. When the code > later calls might_sleep() in the pagefault handler, we get the > following warning: > > BUG: sleeping function called from invalid context at > ../arch/x86/mm/fault.c:1187 > in_atomic(): 0, irqs_disabled(): 0, pid:
Re: new warning on sysrq kernel crash trigger
On Thu, 10 Dec 2015, Paul E. McKenney wrote: > On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: > > Hi guys > > > > I am noticing a new warning in linux 3.18 which we did not see before > > in linux 3.4 : > > > > bash-4.1# echo c > /proc/sysrq-trigger > > [ 978.807185] BUG: sleeping function called from invalid context at > > ../arch/x86/mm/fault.c:1187 > > [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > > [ 978.987358] Preemption disabled at:[] printk+0x48/0x4a > > > > > > I have bisected this to the following change : > > > > commit 984d74a72076a12b400339973e8c98fd2fcd90e5 > > Author: Rik van Riel> > Date: Fri Jun 6 14:38:13 2014 -0700 > > > > sysrq: rcu-ify __handle_sysrq > > > > > > the rcu_read_lock() in handle_sysrq() bumps up > > current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it > > calls might_sleep() in x86/mm/fault.c line 1191, > > preempt_count_equals(0) returns false and hence the warning is > > printed. > > > > One way to handle this would be to do something like this: > > > > diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > > index eef44d9..d4dbe22 100644 > > --- a/arch/x86/mm/fault.c > > +++ b/arch/x86/mm/fault.c > > @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned > > long error_code, > > * If we're in an interrupt, have no user context or are running > > * in a region with pagefaults disabled then we must not take the fault > > */ > > - if (unlikely(faulthandler_disabled() || !mm)) { > > + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { > > This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then > rcu_preempt_depth() unconditionally returns zero. And if > CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see > the might_sleep() splat. > > Maybe use SRCU instead of RCU for this purpose? > >From ae232ce3fb167b2ad363bfac7aab69001bc55a50 Mon Sep 17 00:00:00 2001 From: Ani Sinha Date: Fri, 11 Dec 2015 12:07:42 -0800 Subject: [PATCH 1/1] Fix 'sleeping function called from invalid context' warning in sysrq generated crash. Commit 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") replaced spin_lock_irqsave() calls with rcu_read_lock() calls in sysrq. Since rcu_read_lock() does not disable preemption, faulthandler_disabled() in __do_page_fault() in x86/fault.c returns false. When the code later calls might_sleep() in the pagefault handler, we get the following warning: BUG: sleeping function called from invalid context at ../arch/x86/mm/fault.c:1187 in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash Preemption disabled at:[] printk+0x48/0x4a To fix this, replace RCU call in handle_sysrq() to use SRCU. Tested this patch on linux 3.18 by booting off one of our boards. Fixes: 984d74a72076a1 ("sysrq: rcu-ify __handle_sysrq") Signed-off-by: Ani Sinha --- drivers/tty/sysrq.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c index 5381a72..904865f 100644 --- a/drivers/tty/sysrq.c +++ b/drivers/tty/sysrq.c @@ -519,10 +519,12 @@ void __handle_sysrq(int key, bool check_mask) { struct sysrq_key_op *op_p; int orig_log_level; - int i; + int i, idx; + struct srcu_struct sysrq_rcu; + init_srcu_struct(_rcu); rcu_sysrq_start(); - rcu_read_lock(); + idx = srcu_read_lock(_rcu); /* * Raise the apparent loglevel to maximum so that the sysrq header * is shown to provide the user with positive feedback. We do not @@ -564,7 +566,7 @@ void __handle_sysrq(int key, bool check_mask) pr_cont("\n"); console_loglevel = orig_log_level; } - rcu_read_unlock(); + srcu_read_unlock(_rcu, idx); rcu_sysrq_end(); } -- 1.8.1.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
Well I can certainly send a patch but I wonder if simply using SRCU for this one instance in Rik's original patch will not break anything else. Rik, please provide your thoughts. On Thu, Dec 10, 2015 at 9:26 PM, Paul E. McKenneywrote: > On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: >> Hi guys >> >> I am noticing a new warning in linux 3.18 which we did not see before >> in linux 3.4 : >> >> bash-4.1# echo c > /proc/sysrq-trigger >> [ 978.807185] BUG: sleeping function called from invalid context at >> ../arch/x86/mm/fault.c:1187 >> [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash >> [ 978.987358] Preemption disabled at:[] printk+0x48/0x4a >> >> >> I have bisected this to the following change : >> >> commit 984d74a72076a12b400339973e8c98fd2fcd90e5 >> Author: Rik van Riel >> Date: Fri Jun 6 14:38:13 2014 -0700 >> >> sysrq: rcu-ify __handle_sysrq >> >> >> the rcu_read_lock() in handle_sysrq() bumps up >> current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it >> calls might_sleep() in x86/mm/fault.c line 1191, >> preempt_count_equals(0) returns false and hence the warning is >> printed. >> >> One way to handle this would be to do something like this: >> >> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c >> index eef44d9..d4dbe22 100644 >> --- a/arch/x86/mm/fault.c >> +++ b/arch/x86/mm/fault.c >> @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned >> long error_code, >> * If we're in an interrupt, have no user context or are running >> * in a region with pagefaults disabled then we must not take the fault >> */ >> - if (unlikely(faulthandler_disabled() || !mm)) { >> + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { > > This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then > rcu_preempt_depth() unconditionally returns zero. And if > CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see > the might_sleep() splat. > > Maybe use SRCU instead of RCU for this purpose? > > Thanx, Paul > >> bad_area_nosemaphore(regs, error_code, address); >> return; >> } >> >> I am wondering if this would be the right approach. I have tested that >> this patch does indeed suppress the warning. If you guys agree, I will >> send a patch. It's true that this is a trivial issue since we are >> intentionally crashing the kernel but in our case, this additional >> complaint from the kernel is confusing our test scripts and they are >> generating false positives. >> > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: > Hi guys > > I am noticing a new warning in linux 3.18 which we did not see before > in linux 3.4 : > > bash-4.1# echo c > /proc/sysrq-trigger > [ 978.807185] BUG: sleeping function called from invalid context at > ../arch/x86/mm/fault.c:1187 > [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > [ 978.987358] Preemption disabled at:[] printk+0x48/0x4a > > > I have bisected this to the following change : > > commit 984d74a72076a12b400339973e8c98fd2fcd90e5 > Author: Rik van Riel > Date: Fri Jun 6 14:38:13 2014 -0700 > > sysrq: rcu-ify __handle_sysrq > > > the rcu_read_lock() in handle_sysrq() bumps up > current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it > calls might_sleep() in x86/mm/fault.c line 1191, > preempt_count_equals(0) returns false and hence the warning is > printed. > > One way to handle this would be to do something like this: > > diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > index eef44d9..d4dbe22 100644 > --- a/arch/x86/mm/fault.c > +++ b/arch/x86/mm/fault.c > @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned > long error_code, > * If we're in an interrupt, have no user context or are running > * in a region with pagefaults disabled then we must not take the fault > */ > - if (unlikely(faulthandler_disabled() || !mm)) { > + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then rcu_preempt_depth() unconditionally returns zero. And if CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see the might_sleep() splat. Maybe use SRCU instead of RCU for this purpose? Thanx, Paul > bad_area_nosemaphore(regs, error_code, address); > return; > } > > I am wondering if this would be the right approach. I have tested that > this patch does indeed suppress the warning. If you guys agree, I will > send a patch. It's true that this is a trivial issue since we are > intentionally crashing the kernel but in our case, this additional > complaint from the kernel is confusing our test scripts and they are > generating false positives. > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: new warning on sysrq kernel crash trigger
On Thu, Dec 10, 2015 at 03:57:09PM -0800, Ani Sinha wrote: > Hi guys > > I am noticing a new warning in linux 3.18 which we did not see before > in linux 3.4 : > > bash-4.1# echo c > /proc/sysrq-trigger > [ 978.807185] BUG: sleeping function called from invalid context at > ../arch/x86/mm/fault.c:1187 > [ 978.909816] in_atomic(): 0, irqs_disabled(): 0, pid: 4706, name: bash > [ 978.987358] Preemption disabled at:[] printk+0x48/0x4a > > > I have bisected this to the following change : > > commit 984d74a72076a12b400339973e8c98fd2fcd90e5 > Author: Rik van Riel> Date: Fri Jun 6 14:38:13 2014 -0700 > > sysrq: rcu-ify __handle_sysrq > > > the rcu_read_lock() in handle_sysrq() bumps up > current->rcu_read_lock_nesting. Hence, in __do_page_fault() when it > calls might_sleep() in x86/mm/fault.c line 1191, > preempt_count_equals(0) returns false and hence the warning is > printed. > > One way to handle this would be to do something like this: > > diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c > index eef44d9..d4dbe22 100644 > --- a/arch/x86/mm/fault.c > +++ b/arch/x86/mm/fault.c > @@ -1132,7 +1132,7 @@ __do_page_fault(struct pt_regs *regs, unsigned > long error_code, > * If we're in an interrupt, have no user context or are running > * in a region with pagefaults disabled then we must not take the fault > */ > - if (unlikely(faulthandler_disabled() || !mm)) { > + if (unlikely(faulthandler_disabled() || rcu_preempt_depth() || !mm)) { This works if CONFIG_PREEMPT=y, but if CONFIG_PREEMPT=n, then rcu_preempt_depth() unconditionally returns zero. And if CONFIG_PREEMPT_COUNT=y && CONFIG_PREEMPT=n, you would still see the might_sleep() splat. Maybe use SRCU instead of RCU for this purpose? Thanx, Paul > bad_area_nosemaphore(regs, error_code, address); > return; > } > > I am wondering if this would be the right approach. I have tested that > this patch does indeed suppress the warning. If you guys agree, I will > send a patch. It's true that this is a trivial issue since we are > intentionally crashing the kernel but in our case, this additional > complaint from the kernel is confusing our test scripts and they are > generating false positives. > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/