Re: BUG: spinlock bad magic on CPU#0 on BeagleBone
On 12/19/2012 10:44 PM, Bedia, Vaibhav wrote:
> On Thu, Dec 20, 2012 at 11:55:24, Stephen Boyd wrote:
>> On 12/19/2012 8:48 PM, Bedia, Vaibhav wrote:
>>> I tried out 3 variants of AM335x boards - 2 of these (BeagleBone and
>>> EVM) have DDR2 and 1 has DDR3 (EVM-SK). The BUG is triggered on all of
>>> these at the same point.
>>>
>>> With Stephen's change I don't see this on any of the board variants :)
>>> New bootlog below.
>>
>> Great! Can I have your Tested-by then? I'll wrap it up into a patch. Is
>> this a new regression? From a glance at the code it looks to have
>> existed for quite a while now.
>
> I went back to a branch based off 3.7-rc4 and don't see the issue there.
> Not sure what is triggering this now.
>
> Tested-by: Vaibhav Bedia

Thanks. I was thrown off by the author date of the patch that introduced your problem:

commit 8823c079ba7136dc1948d6f6dcb5f8022bde438e
Author:     Eric W. Biederman
AuthorDate: Sun Mar 7 18:49:36 2010 -0800
Commit:     Eric W. Biederman
CommitDate: Mon Nov 19 05:59:18 2012 -0800

    vfs: Add setns support for the mount namespace

There is a gap of more than two years between the author date and the commit date. Either way, the problem looks to be isolated to the 3.8 merge window but affects quite a few architectures. Patch to follow shortly.

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
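[Archive note: the author-date/commit-date split Stephen describes can be inspected with a plain git format string. This is a generic sketch; running it against the quoted hash assumes a mainline kernel checkout.]

```shell
# %ad = author date, %cd = committer date. A large gap between the two
# usually means the patch sat on a branch long before being merged.
# The hash is the vfs setns commit quoted in the mail above.
git log -1 --format='author: %ad%ncommit: %cd' 8823c079ba7136dc1948d6f6dcb5f8022bde438e
```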
RE: BUG: spinlock bad magic on CPU#0 on BeagleBone
On Thu, Dec 20, 2012 at 11:55:24, Stephen Boyd wrote:
> On 12/19/2012 8:48 PM, Bedia, Vaibhav wrote:
>> I tried out 3 variants of AM335x boards - 2 of these (BeagleBone and
>> EVM) have DDR2 and 1 has DDR3 (EVM-SK). The BUG is triggered on all of
>> these at the same point.
>>
>> With Stephen's change I don't see this on any of the board variants :)
>> New bootlog below.
>
> Great! Can I have your Tested-by then? I'll wrap it up into a patch. Is
> this a new regression? From a glance at the code it looks to have
> existed for quite a while now.

I went back to a branch based off 3.7-rc4 and don't see the issue there. Not sure what is triggering this now.

Tested-by: Vaibhav Bedia
Re: BUG: spinlock bad magic on CPU#0 on BeagleBone
On 12/19/2012 8:48 PM, Bedia, Vaibhav wrote:
> I tried out 3 variants of AM335x boards - 2 of these (BeagleBone and EVM)
> have DDR2 and 1 has DDR3 (EVM-SK). The BUG is triggered on all of these
> at the same point.
>
> With Stephen's change I don't see this on any of the board variants :)
> New bootlog below.

Great! Can I have your Tested-by then? I'll wrap it up into a patch. Is this a new regression? From a glance at the code it looks to have existed for quite a while now.

--
Sent by an employee of the Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.
RE: BUG: spinlock bad magic on CPU#0 on BeagleBone
On Thu, Dec 20, 2012 at 01:53:42, Stephen Boyd wrote:
> On 12/19/12 08:53, Paul Walmsley wrote:
>> On Wed, 19 Dec 2012, Bedia, Vaibhav wrote:
>>> Current mainline on Beaglebone using the omap2plus_defconfig + 3 build
>>> fixes is triggering a BUG()
>> ...
>>> [0.109688] Security Framework initialized
>>> [0.109889] Mount-cache hash table entries: 512
>>> [0.112674] BUG: spinlock bad magic on CPU#0, swapper/0/0
>>> [0.112724] lock: atomic64_lock+0x240/0x400, .magic: , .owner: /-1, .owner_cpu: 0
>>> [0.112782] [] (unwind_backtrace+0x0/0xf0) from [] (do_raw_spin_lock+0x158/0x198)
>>> [0.112813] [] (do_raw_spin_lock+0x158/0x198) from [] (_raw_spin_lock_irqsave+0x4c/0x58)
>>> [0.112844] [] (_raw_spin_lock_irqsave+0x4c/0x58) from [] (atomic64_add_return+0x30/0x5c)
>>> [0.112886] [] (atomic64_add_return+0x30/0x5c) from [] (alloc_mnt_ns.clone.14+0x44/0xac)
>>> [0.112914] [] (alloc_mnt_ns.clone.14+0x44/0xac) from [] (create_mnt_ns+0xc/0x54)
>>> [0.112951] [] (create_mnt_ns+0xc/0x54) from [] (mnt_init+0x120/0x1d4)
>>> [0.112978] [] (mnt_init+0x120/0x1d4) from [] (vfs_caches_init+0xe0/0x10c)
>>> [0.113005] [] (vfs_caches_init+0xe0/0x10c) from [] (start_kernel+0x29c/0x300)
>>> [0.113029] [] (start_kernel+0x29c/0x300) from [<80008078>] (0x80008078)
>>> [0.118290] CPU: Testing write buffer coherency: ok
>>> [0.118968] CPU0: thread -1, cpu 0, socket -1, mpidr 0
>>> [0.119053] Setting up static identity map for 0x804de2c8 - 0x804de338
>>> [0.120698] Brought up 1 CPUs
>> This is probably a memory corruption bug, there's probably some code
>> executing early that's writing outside its own data and trashing some
>> previously-allocated memory.
>
> I'm not so sure. It looks like atomic64s use spinlocks on processors
> that don't have 64-bit atomic instructions (see lib/atomic64.c). And
> those spinlocks are not initialized until a pure initcall runs,
> init_atomic64_lock(). Pure initcalls don't run until after
> vfs_caches_init() and so you get this BUG() warning that the spinlock
> is not initialized.
>
> How about we initialize the locks statically? Does that fix your problem?
>
> >8-
>
> diff --git a/lib/atomic64.c b/lib/atomic64.c
> index 9785378..08a4f06 100644
> --- a/lib/atomic64.c
> +++ b/lib/atomic64.c
> @@ -31,7 +31,11 @@
>  static union {
>  	raw_spinlock_t lock;
>  	char pad[L1_CACHE_BYTES];
> -} atomic64_lock[NR_LOCKS] __cacheline_aligned_in_smp;
> +} atomic64_lock[NR_LOCKS] __cacheline_aligned_in_smp = {
> +	[0 ... (NR_LOCKS - 1)] = {
> +		.lock = __RAW_SPIN_LOCK_UNLOCKED(atomic64_lock.lock),
> +	},
> +};
>
>  static inline raw_spinlock_t *lock_addr(const atomic64_t *v)
>  {
> @@ -173,14 +177,3 @@ int atomic64_add_unless(atomic64_t *v, long long a, long long u)
>  	return ret;
>  }
>  EXPORT_SYMBOL(atomic64_add_unless);
> -
> -static int init_atomic64_lock(void)
> -{
> -	int i;
> -
> -	for (i = 0; i < NR_LOCKS; ++i)
> -		raw_spin_lock_init(&atomic64_lock[i].lock);
> -	return 0;
> -}
> -
> -pure_initcall(init_atomic64_lock);

I tried out 3 variants of AM335x boards - 2 of these (BeagleBone and EVM)
have DDR2 and 1 has DDR3 (EVM-SK). The BUG is triggered on all of these at
the same point.

With Stephen's change I don't see this on any of the board variants :)
New bootlog below.

Thanks,
Vaibhav

---
[0.00] Booting Linux on physical CPU 0x0
[0.00] Linux version 3.7.0-01415-g55bc169-dirty (a0393953@psplinux063) (gcc version 4.5.3 20110311 (prerelease) (GCC) ) #4 SMP Thu Dec 20 09:59:12 IST 2012
[0.00] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c53c7d
[0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[0.00] Machine: Generic AM33XX (Flattened Device Tree), model: TI AM335x BeagleBone
[0.00] Memory policy: ECC disabled, Data cache writeback
[0.00] AM335X ES1.0 (neon )
[0.00] PERCPU: Embedded 9 pages/cpu @c0f1a000 s12992 r8192 d15680 u36864
[0.00] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 64768
[0.00] Kernel command line: console=ttyO0,115200n8 mem=256M root=/dev/ram rw initrd=0x8200,16MB ramdisk_size=65536 earlyprintk=serial
[0.00] PID hash table entries: 1024 (order: 0, 4096 bytes)
[0.00] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
[0.00] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
[0.00] __ex_table already sorted, skipping sort
[0.00] Memory: 255MB = 255MB total
[0.00] Memory: 229012k/229012k available, 33132k reserved, 0K highmem
[0.00] Virtual kernel memory layout:
[0.00]     vector : 0x - 0x1000 (   4 kB)
[0.00]     fixmap : 0xfff0 - 0xfffe ( 896 kB)
[
Re: BUG: spinlock bad magic on CPU#0 on BeagleBone
On 12/19/12 08:53, Paul Walmsley wrote:
> On Wed, 19 Dec 2012, Bedia, Vaibhav wrote:
>> Current mainline on Beaglebone using the omap2plus_defconfig + 3 build
>> fixes is triggering a BUG()
> ...
>> [0.109688] Security Framework initialized
>> [0.109889] Mount-cache hash table entries: 512
>> [0.112674] BUG: spinlock bad magic on CPU#0, swapper/0/0
>> [0.112724] lock: atomic64_lock+0x240/0x400, .magic: , .owner: /-1, .owner_cpu: 0
>> [0.112782] [] (unwind_backtrace+0x0/0xf0) from [] (do_raw_spin_lock+0x158/0x198)
>> [0.112813] [] (do_raw_spin_lock+0x158/0x198) from [] (_raw_spin_lock_irqsave+0x4c/0x58)
>> [0.112844] [] (_raw_spin_lock_irqsave+0x4c/0x58) from [] (atomic64_add_return+0x30/0x5c)
>> [0.112886] [] (atomic64_add_return+0x30/0x5c) from [] (alloc_mnt_ns.clone.14+0x44/0xac)
>> [0.112914] [] (alloc_mnt_ns.clone.14+0x44/0xac) from [] (create_mnt_ns+0xc/0x54)
>> [0.112951] [] (create_mnt_ns+0xc/0x54) from [] (mnt_init+0x120/0x1d4)
>> [0.112978] [] (mnt_init+0x120/0x1d4) from [] (vfs_caches_init+0xe0/0x10c)
>> [0.113005] [] (vfs_caches_init+0xe0/0x10c) from [] (start_kernel+0x29c/0x300)
>> [0.113029] [] (start_kernel+0x29c/0x300) from [<80008078>] (0x80008078)
>> [0.118290] CPU: Testing write buffer coherency: ok
>> [0.118968] CPU0: thread -1, cpu 0, socket -1, mpidr 0
>> [0.119053] Setting up static identity map for 0x804de2c8 - 0x804de338
>> [0.120698] Brought up 1 CPUs
> This is probably a memory corruption bug, there's probably some code
> executing early that's writing outside its own data and trashing some
> previously-allocated memory.

I'm not so sure. It looks like atomic64s use spinlocks on processors that don't have 64-bit atomic instructions (see lib/atomic64.c). And those spinlocks are not initialized until a pure initcall runs, init_atomic64_lock(). Pure initcalls don't run until after vfs_caches_init() and so you get this BUG() warning that the spinlock is not initialized.

How about we initialize the locks statically? Does that fix your problem?

>8-

diff --git a/lib/atomic64.c b/lib/atomic64.c
index 9785378..08a4f06 100644
--- a/lib/atomic64.c
+++ b/lib/atomic64.c
@@ -31,7 +31,11 @@
 static union {
 	raw_spinlock_t lock;
 	char pad[L1_CACHE_BYTES];
-} atomic64_lock[NR_LOCKS] __cacheline_aligned_in_smp;
+} atomic64_lock[NR_LOCKS] __cacheline_aligned_in_smp = {
+	[0 ... (NR_LOCKS - 1)] = {
+		.lock = __RAW_SPIN_LOCK_UNLOCKED(atomic64_lock.lock),
+	},
+};

 static inline raw_spinlock_t *lock_addr(const atomic64_t *v)
 {
@@ -173,14 +177,3 @@ int atomic64_add_unless(atomic64_t *v, long long a, long long u)
 	return ret;
 }
 EXPORT_SYMBOL(atomic64_add_unless);
-
-static int init_atomic64_lock(void)
-{
-	int i;
-
-	for (i = 0; i < NR_LOCKS; ++i)
-		raw_spin_lock_init(&atomic64_lock[i].lock);
-	return 0;
-}
-
-pure_initcall(init_atomic64_lock);

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: BUG: spinlock bad magic on CPU#0 on BeagleBone
On Wed, 19 Dec 2012, Bedia, Vaibhav wrote:
> Current mainline on Beaglebone using the omap2plus_defconfig + 3 build
> fixes is triggering a BUG()
...
> [0.109688] Security Framework initialized
> [0.109889] Mount-cache hash table entries: 512
> [0.112674] BUG: spinlock bad magic on CPU#0, swapper/0/0
> [0.112724] lock: atomic64_lock+0x240/0x400, .magic: , .owner: /-1, .owner_cpu: 0
> [0.112782] [] (unwind_backtrace+0x0/0xf0) from [] (do_raw_spin_lock+0x158/0x198)
> [0.112813] [] (do_raw_spin_lock+0x158/0x198) from [] (_raw_spin_lock_irqsave+0x4c/0x58)
> [0.112844] [] (_raw_spin_lock_irqsave+0x4c/0x58) from [] (atomic64_add_return+0x30/0x5c)
> [0.112886] [] (atomic64_add_return+0x30/0x5c) from [] (alloc_mnt_ns.clone.14+0x44/0xac)
> [0.112914] [] (alloc_mnt_ns.clone.14+0x44/0xac) from [] (create_mnt_ns+0xc/0x54)
> [0.112951] [] (create_mnt_ns+0xc/0x54) from [] (mnt_init+0x120/0x1d4)
> [0.112978] [] (mnt_init+0x120/0x1d4) from [] (vfs_caches_init+0xe0/0x10c)
> [0.113005] [] (vfs_caches_init+0xe0/0x10c) from [] (start_kernel+0x29c/0x300)
> [0.113029] [] (start_kernel+0x29c/0x300) from [<80008078>] (0x80008078)
> [0.118290] CPU: Testing write buffer coherency: ok
> [0.118968] CPU0: thread -1, cpu 0, socket -1, mpidr 0
> [0.119053] Setting up static identity map for 0x804de2c8 - 0x804de338
> [0.120698] Brought up 1 CPUs

This is probably a memory corruption bug, there's probably some code executing early that's writing outside its own data and trashing some previously-allocated memory.

- Paul