Hi Keith,
While working on a ppc64 issue, I ran into a problem where we hit the
BUG in kmem_cache_create() when RECURSE is set to a positive
value. This looks to be due to _local_bh_enable() not being
called when the RECURSE flag is set.
However, when kdb_init() calls kdb() and the RECURSE flag is set, the
system goes (incorrectly?) into recursive mode since kdb_initial_cpu
is set upon kdb_init() entry. Here is a log illustrating the problem:
--------------------
Calibrating delay loop... 189.44 BogoMIPS
kdb version 4.3 by Keith Owens, Scott Lurndal. Copyright SGI, All Rights Reserved
kdb: Debugger re-entered on cpu 0, new reason = 12
Not executing a kdb command
No longjmp available for recovery
Attempting recursive mode
Entering kdb (current=0xc0000000003ba298, pid 0) due to Recursion @ 0xc00000000012e800
kdb> bt
Stack traceback for pid 0
0xc0000000003ba298 0 0 1 0 R 0xc0000000003ba700 *swapper
SP(esp) PC(eip) Function(args)
0xc00000000032bd80 0xc00000000012e800 .kdb +0x440
0xc00000000032be50 0xc0000000002d9e5c .kdb_init +0x4ec
0xc00000000032bef0 0xc0000000002c6564 .start_kernel +0x16c
0xc00000000032bf90 0xc00000000000c098 .__setup_cpu_power3 +0x0
kmem_cache_create: Early error in slab task_struct
kernel BUG in kmem_cache_create at mm/slab.c:1121!
Entering kdb (current=0xc0000000003ba298, pid 0) due to KDB_ENTER()
kdb> bt
Stack traceback for pid 0
0xc0000000003ba298 0 0 1 0 R 0xc0000000003ba700 *swapper
SP(esp) PC(eip) Function(args)
0xc00000000032bd70 0xc00000000006ca04 .kmem_cache_create +0xcc
0xc00000000032be60 0xc0000000002d4bb0 .fork_init +0x3c
0xc00000000032bef0 0xc0000000002c6584 .start_kernel +0x18c
0xc00000000032bf90 0xc00000000000c098 .__setup_cpu_power3 +0x0
kdb>
-----------------
I verified that the issue exists on x86 too.
I worked around this problem by adding a KDB_STATE_INITIALIZE flag
to handle the recursion during initialization.
Inlined is the patch (against kdb-v4.3 sources). Is there a better
way of doing this? Am I missing something here?
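
To make the intent of the second hunk easier to see, here is a minimal
user-space sketch of the exit test before and after the patch. It is only
a model, not kdb code: model_state, leaving_original() and
leaving_patched() are hypothetical names, the DOING_SS and SSBPT values
are placeholders rather than the real ones from kdb.h, and the macros are
simplified single-cpu stand-ins for the kernel ones.

```c
/* Simplified single-cpu model of the kdb state flags (assumption:
 * the real macros index a per-cpu kdb_state[] array instead). */
#define KDB_STATE_DOING_SS   0x00000001  /* placeholder value, not from kdb.h */
#define KDB_STATE_SSBPT      0x00000002  /* placeholder value, not from kdb.h */
#define KDB_STATE_RECURSE    0x00004000  /* value shown in the patch context */
#define KDB_STATE_INITIALIZE 0x00020000  /* value added by the patch */

static unsigned int model_state;         /* hypothetical stand-in for kdb_state[cpu] */

#define KDB_STATE(flag)       (model_state & KDB_STATE_##flag)
#define KDB_STATE_SET(flag)   ((void)(model_state |= KDB_STATE_##flag))
#define KDB_STATE_CLEAR(flag) ((void)(model_state &= ~KDB_STATE_##flag))

/* Original test: returns 1 when the "really leaving kdb" work would run. */
static int leaving_original(void)
{
    return !(KDB_STATE(DOING_SS) || KDB_STATE(SSBPT) || KDB_STATE(RECURSE));
}

/* Patched test: INITIALIZE forces the exit work even if RECURSE is set. */
static int leaving_patched(void)
{
    return KDB_STATE(INITIALIZE) ||
           !(KDB_STATE(DOING_SS) || KDB_STATE(SSBPT) || KDB_STATE(RECURSE));
}
```

With RECURSE set and INITIALIZE clear, both functions return 0, so the
exit work is skipped. With INITIALIZE also set, as it would be during the
kdb() call from kdb_init(), only leaving_patched() returns 1.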
Thanks,
Ananth
--
Ananth Narayan
Linux Technology Center,
IBM Software Lab, INDIA
--- temp/ameslab/kdb/kdbmain.c 2004-05-04 02:07:51.000000000 -0700
+++ ameslab/kdb/kdbmain.c 2004-05-28 06:34:08.826005680 -0700
@@ -1242,6 +1242,10 @@
}
break;
case KDB_REASON_RECURSE:
+ if (KDB_STATE(INITIALIZE)) {
+ kdb_printf("due to recursion during initialization.. ignoring\n");
+ return 1;
+ }
kdb_printf("due to Recursion @ " kdb_machreg_fmt "\n",
kdba_getpc(regs));
break;
case KDB_REASON_SILENT:
@@ -1969,7 +1973,8 @@
#endif
/* Only do this work if we are really leaving kdb */
- if (!(KDB_STATE(DOING_SS) || KDB_STATE(SSBPT) || KDB_STATE(RECURSE))) {
+ if (KDB_STATE(INITIALIZE) ||
+ !(KDB_STATE(DOING_SS) || KDB_STATE(SSBPT) || KDB_STATE(RECURSE))) {
KDB_DEBUG_STATE("kdb 15", result);
kdb_bp_install_local(regs);
kdba_enable_lbr();
@@ -3498,7 +3503,9 @@
KDB_MAJOR_VERSION, KDB_MINOR_VERSION, KDB_TEST_VERSION);
kdb_cmd_init(); /* Preset commands from kdb_cmds */
+ KDB_STATE_SET(INITIALIZE);
kdb(KDB_REASON_SILENT, 0, 0); /* Activate any preset breakpoints on boot cpu */
+ KDB_STATE_CLEAR(INITIALIZE);
notifier_chain_register(&panic_notifier_list, &kdb_block);
#ifdef KDB_HAVE_LONGJMP
--- temp/ameslab/include/linux/kdb.h 2004-05-04 02:07:51.000000000 -0700
+++ ameslab/include/linux/kdb.h 2004-05-28 06:29:30.912053560 -0700
@@ -143,6 +143,7 @@
#define KDB_STATE_RECURSE 0x00004000 /* Recursive entry to kdb */
#define KDB_STATE_IP_ADJUSTED 0x00008000 /* Restart IP has been adjusted */
#define KDB_STATE_GO1 0x00010000 /* go only releases one cpu */
+#define KDB_STATE_INITIALIZE 0x00020000 /* called from kdb_init */
#define KDB_STATE_ARCH 0xff000000 /* Reserved for arch specific use */
#define KDB_STATE_CPU(flag,cpu) (kdb_state[cpu] & KDB_STATE_##flag)