Re: Crash (ext3 ) during 2.6.29-rc6 boot
On Thu, 26 Feb 2009, Mark Nelson wrote:
> On Thu, 26 Feb 2009 09:45:41 am Mark Nelson wrote:
> > On Thu, 26 Feb 2009 12:31:20 am Geert Uytterhoeven wrote:
> > > On Wed, 25 Feb 2009, Mark Nelson wrote:
> > > > Does the following patch fix the errors you're seeing? (it applies
> > > > the same fix as the previous patch but this time to
> > > > copy_tofrom_user, which I updated in
> > > > a4e22f02f5b6518c1484faea1f88d81802b9feac)
> > >
> > > Thanks, but I still get crashes in copy_page_range().
> >
> > Hmmm... I'm out of ideas for the moment, but thanks for testing anyway!
>
> If you revert both 25d6e2d7c58ddc4a3b614fc5381591c0cfe66556 and
> a4e22f02f5b6518c1484faea1f88d81802b9feac, does it help?
>
> You could also try to revert 57dda6ef5bd5b9e60410477ad29e654097e2cca1
> just in case I need to keep wearing the brown paper bag for a bit
> longer :)

Still doesn't help.

However, I noticed I never enabled CONFIG_DEBUG_PAGEALLOC before 2.6.29-rc5.
So far I tried 2.6.2[5-8], and they all crash with CONFIG_DEBUG_PAGEALLOC.
I guess it never actually worked on PS3.

With kind regards,

Geert Uytterhoeven
Software Architect

Sony Techsoft Centre Europe
The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium

Phone:    +32 (0)2 700 8453
Fax:      +32 (0)2 700 8622
E-mail:   geert.uytterhoe...@sonycom.com
Internet: http://www.sony-europe.com/

A division of Sony Europe (Belgium) N.V.
VAT BE 0413.825.160 · RPR Brussels
Fortis · BIC GEBABEBB · IBAN BE41293037680010
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev
Re: Crash (ext3 ) during 2.6.29-rc6 boot
On Wed, 25 Feb 2009, Mark Nelson wrote:
> On Tue, 24 Feb 2009 05:38:37 pm Sachin P. Sant wrote:
> > Jan Kara wrote:
> > > Hmm, OK. But then I'm not sure how that can happen. Obviously,
> > > memcpy somehow got beyond end of the page referenced by
> > > bh->b_data. So it means that le16_to_cpu(entry->e_value_offs) +
> > > size > page_size. But ext3_xattr_find_entry() calls
> > > ext3_xattr_check_entry() which in particular checks whether
> > > e_value_offs + e_value_size isn't greater than bh->b_size. So I
> > > see no way how memcpy can get beyond end of the page. Sachin, is
> > > the problem reproducible? If yes, can you send us contents
> > Yes, i am able to recreate this problem easily. As i had
> > mentioned if the earlier kernel is booted with selinux enabled
> > and then 2.6.29-rc6 is booted i get this crash. But if i specify
> > selinux=0 at command line, 2.6.29-rc6 boots without any problem.
>
> Hi Sanchin and Geert,
>
> Does the patch below fix the problems you're seeing? If it does
> I'll send a properly written up and formatted patch to
> linuxppc-dev (as well as another one to fix the same problem in
> copy_tofrom_user()).

Unfortunately not, now it crashes while accessing the memory
pointed to by GPR16, in

NIP: copy_page_range+0x608/0x628
LR: dup_mm+0x2e4/0x428
Trace:
    debug_table+0xcc70/0x1afe0 (unreliable)
    dup_mm+0x2e4/0x428
    copy_process+0x86c/0xf9c
    do_fork+0x188/0x39c
    sys_clone+0x58/0x70
    ppc_clone+0x8/0xc

However, after reverting 25d6e2d7c58ddc4a3b614fc5381591c0cfe66556,
I still see similar problems as above (crash in copy_page_range()).

Which makes me think that
  1. Your new patch fixes the problem introduced by 25d6e2d7,
  2. There's still another issue than the one introduced by 25d6e2d7.

With kind regards,

Geert Uytterhoeven
Re: Crash (ext3 ) during 2.6.29-rc6 boot
On Wed, 25 Feb 2009, Mark Nelson wrote:
> On Wed, 25 Feb 2009 05:01:59 am Geert Uytterhoeven wrote:
> > On Mon, 23 Feb 2009, Paul Mackerras wrote:
> > > Andrew Morton writes:
> > > > It looks like we died in ext3_xattr_block_get():
> > > >
> > > >     memcpy(buffer, bh->b_data + le16_to_cpu(entry->e_value_offs),
> > > >            size);
> > > >
> > > > Perhaps entry->e_value_offs is no good. I wonder if the filesystem
> > > > is corrupted and this snuck through the defenses.
> > > >
> > > > I also wonder if there is enough info in that trace for a ppc
> > > > person to be able to determine whether the faulting address is in
> > > > the source or destination of the memcpy() (please)?
> > >
> > > It appears to have faulted on a load, implicating the source. The
> > > address being referenced (0xc0003f38) doesn't look outlandish. I
> > > wonder if this kernel has CONFIG_DEBUG_PAGEALLOC turned on, and
> > > what page size is selected?
> >
> > I'm seeing a similar thing on PS3, but not in ext3. During early
> > userspace setup (udevd), it crashes accessing a 0xc00* address in:
> >
> > | NIP setup+0x20/0x130
> > | LR copy_user_page+0x18/0x6c
> > | Call trace:
> > | do_wp_page+0x5b4/0x89c
> > | do_page_fault+0x3a8/0x58c
> > | handle_page_fault+0x20/0x5c
> >
> > I have CONFIG_DEBUG_PAGEALLOC=y. If I disable it, the system boots
> > fine. If needed, I can probably bisect this tomorrow. It definitely
> > didn't happen in 2.6.29-rc5.
>
> No need to bisect - it was 25d6e2d7c58ddc4a3b614fc5381591c0cfe66556,
> my commit that optimised 64bit memcpy() for Power6 and Cell. The bug
> was in -rc1, but if your copies were 8-byte aligned with respect to
> the source the problem wouldn't have been seen... Could this have
> been why you didn't see it in -rc5?

Hmm... I just started seeing it on older kernels (-rc5+), too...

With kind regards,

Geert Uytterhoeven
Re: Crash (ext3 ) during 2.6.29-rc6 boot
Mark Nelson wrote:
> Hi Sanchin and Geert,
>
> Does the patch below fix the problems you're seeing? If it does
> I'll send a properly written up and formatted patch to
> linuxppc-dev (as well as another one to fix the same problem in
> copy_tofrom_user()).

This patch fixes the issue at my side. I tried booting the system few
times and every single time it came up clean.

Thanks
-Sachin

--
-
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
-
Re: Crash (ext3 ) during 2.6.29-rc6 boot
On Wed, 25 Feb 2009 08:50:46 pm Geert Uytterhoeven wrote:
> On Wed, 25 Feb 2009, Mark Nelson wrote:
> > On Tue, 24 Feb 2009 05:38:37 pm Sachin P. Sant wrote:
> > > Jan Kara wrote:
> > > > Hmm, OK. But then I'm not sure how that can happen. Obviously,
> > > > memcpy somehow got beyond end of the page referenced by
> > > > bh->b_data. So it means that le16_to_cpu(entry->e_value_offs) +
> > > > size > page_size. But ext3_xattr_find_entry() calls
> > > > ext3_xattr_check_entry() which in particular checks whether
> > > > e_value_offs + e_value_size isn't greater than bh->b_size. So I
> > > > see no way how memcpy can get beyond end of the page. Sachin, is
> > > > the problem reproducible? If yes, can you send us contents
> > > Yes, i am able to recreate this problem easily. As i had
> > > mentioned if the earlier kernel is booted with selinux enabled
> > > and then 2.6.29-rc6 is booted i get this crash. But if i specify
> > > selinux=0 at command line, 2.6.29-rc6 boots without any problem.
> >
> > Hi Sanchin and Geert,
> >
> > Does the patch below fix the problems you're seeing? If it does
> > I'll send a properly written up and formatted patch to
> > linuxppc-dev (as well as another one to fix the same problem in
> > copy_tofrom_user()).
>
> Unfortunately not, now it crashes while accessing the memory
> pointed to by GPR16, in
>
> NIP: copy_page_range+0x608/0x628
> LR: dup_mm+0x2e4/0x428
> Trace:
>     debug_table+0xcc70/0x1afe0 (unreliable)
>     dup_mm+0x2e4/0x428
>     copy_process+0x86c/0xf9c
>     do_fork+0x188/0x39c
>     sys_clone+0x58/0x70
>     ppc_clone+0x8/0xc
>
> However, after reverting 25d6e2d7c58ddc4a3b614fc5381591c0cfe66556,
> I still see similar problems as above (crash in copy_page_range()).
>
> Which makes me think that
>   1. Your new patch fixes the problem introduced by 25d6e2d7,
>   2. There's still another issue than the one introduced by 25d6e2d7.

Does the following patch fix the errors you're seeing? (it applies
the same fix as the previous patch but this time to copy_tofrom_user,
which I updated in a4e22f02f5b6518c1484faea1f88d81802b9feac)

Thanks!

Mark

---
 arch/powerpc/lib/copyuser_64.S |   38 +++---
 1 file changed, 31 insertions(+), 7 deletions(-)

Index: upstream/arch/powerpc/lib/copyuser_64.S
===================================================================
--- upstream.orig/arch/powerpc/lib/copyuser_64.S
+++ upstream/arch/powerpc/lib/copyuser_64.S
@@ -62,18 +62,19 @@ END_FTR_SECTION_IFCLR(CPU_FTR_UNALIGNED_
 72:	std	r8,8(r3)
 	beq+	3f
 	addi	r3,r3,16
-23:	ld	r9,8(r4)
 .Ldo_tail:
 	bf	cr7*4+1,1f
-	rotldi	r9,r9,32
+23:	lwz	r9,8(r4)
+	addi	r4,r4,4
 73:	stw	r9,0(r3)
 	addi	r3,r3,4
 1:	bf	cr7*4+2,2f
-	rotldi	r9,r9,16
+44:	lhz	r9,8(r4)
+	addi	r4,r4,2
 74:	sth	r9,0(r3)
 	addi	r3,r3,2
 2:	bf	cr7*4+3,3f
-	rotldi	r9,r9,8
+45:	lbz	r9,8(r4)
 75:	stb	r9,0(r3)
 3:	li	r3,0
 	blr
@@ -141,11 +142,24 @@ END_FTR_SECTION_IFCLR(CPU_FTR_UNALIGNED_
 6:	cmpwi	cr1,r5,8
 	addi	r3,r3,32
 	sld	r9,r9,r10
-	ble	cr1,.Ldo_tail
+	ble	cr1,7f
 34:	ld	r0,8(r4)
 	srd	r7,r0,r11
 	or	r9,r7,r9
-	b	.Ldo_tail
+7:
+	bf	cr7*4+1,1f
+	rotldi	r9,r9,32
+94:	stw	r9,0(r3)
+	addi	r3,r3,4
+1:	bf	cr7*4+2,2f
+	rotldi	r9,r9,16
+95:	sth	r9,0(r3)
+	addi	r3,r3,2
+2:	bf	cr7*4+3,3f
+	rotldi	r9,r9,8
+96:	stb	r9,0(r3)
+3:	li	r3,0
+	blr
 
 .Ldst_unaligned:
 	PPC_MTOCRF	0x01,r6		/* put #bytes to 8B bdry into cr7 */
@@ -218,7 +232,6 @@ END_FTR_SECTION_IFCLR(CPU_FTR_UNALIGNED_
 121:
 132:
 	addi	r3,r3,8
-123:
 134:
 135:
 138:
@@ -226,6 +239,9 @@ END_FTR_SECTION_IFCLR(CPU_FTR_UNALIGNED_
 140:
 141:
 142:
+123:
+144:
+145:
 
 /*
  * here we have had a fault on a load and r3 points to the first
@@ -309,6 +325,9 @@ END_FTR_SECTION_IFCLR(CPU_FTR_UNALIGNED_
 187:
 188:
 189:
+194:
+195:
+196:
 1:
 	ld	r6,-24(r1)
 	ld	r5,-8(r1)
@@ -329,7 +348,9 @@ END_FTR_SECTION_IFCLR(CPU_FTR_UNALIGNED_
 	.llong	72b,172b
 	.llong	23b,123b
 	.llong	73b,173b
+	.llong	44b,144b
 	.llong	74b,174b
+	.llong	45b,145b
 	.llong	75b,175b
 	.llong	24b,124b
 	.llong	25b,125b
@@ -347,6 +368,9 @@ END_FTR_SECTION_IFCLR(CPU_FTR_UNALIGNED_
 	.llong	79b,179b
 	.llong	80b,180b
 	.llong	34b,134b
+	.llong	94b,194b
+	.llong	95b,195b
+	.llong	96b,196b
 	.llong	35b,135b
 	.llong	81b,181b
 	.llong	36b,136b
Re: Crash (ext3 ) during 2.6.29-rc6 boot
On Wed, 25 Feb 2009 10:08:22 pm Sachin P. Sant wrote:
> Mark Nelson wrote:
> > Hi Sanchin and Geert,
> >
> > Does the patch below fix the problems you're seeing? If it does
> > I'll send a properly written up and formatted patch to
> > linuxppc-dev (as well as another one to fix the same problem in
> > copy_tofrom_user()).
>
> This patch fixes the issue at my side. I tried booting the system few
> times and every single time it came up clean.

Good to hear. Thanks for testing Sanchin!

Mark
Re: Crash (ext3 ) during 2.6.29-rc6 boot
On Wed, 25 Feb 2009, Mark Nelson wrote:
> On Wed, 25 Feb 2009 08:50:46 pm Geert Uytterhoeven wrote:
> > On Wed, 25 Feb 2009, Mark Nelson wrote:
> > > On Tue, 24 Feb 2009 05:38:37 pm Sachin P. Sant wrote:
> > > > Jan Kara wrote:
> > > > > Hmm, OK. But then I'm not sure how that can happen. Obviously,
> > > > > memcpy somehow got beyond end of the page referenced by
> > > > > bh->b_data. So it means that le16_to_cpu(entry->e_value_offs) +
> > > > > size > page_size. But ext3_xattr_find_entry() calls
> > > > > ext3_xattr_check_entry() which in particular checks whether
> > > > > e_value_offs + e_value_size isn't greater than bh->b_size. So I
> > > > > see no way how memcpy can get beyond end of the page. Sachin, is
> > > > > the problem reproducible? If yes, can you send us contents
> > > > Yes, i am able to recreate this problem easily. As i had
> > > > mentioned if the earlier kernel is booted with selinux enabled
> > > > and then 2.6.29-rc6 is booted i get this crash. But if i specify
> > > > selinux=0 at command line, 2.6.29-rc6 boots without any problem.
> > >
> > > Hi Sanchin and Geert,
> > >
> > > Does the patch below fix the problems you're seeing? If it does
> > > I'll send a properly written up and formatted patch to
> > > linuxppc-dev (as well as another one to fix the same problem in
> > > copy_tofrom_user()).
> >
> > Unfortunately not, now it crashes while accessing the memory
> > pointed to by GPR16, in
> >
> > NIP: copy_page_range+0x608/0x628
> > LR: dup_mm+0x2e4/0x428
> > Trace:
> >     debug_table+0xcc70/0x1afe0 (unreliable)
> >     dup_mm+0x2e4/0x428
> >     copy_process+0x86c/0xf9c
> >     do_fork+0x188/0x39c
> >     sys_clone+0x58/0x70
> >     ppc_clone+0x8/0xc
> >
> > However, after reverting 25d6e2d7c58ddc4a3b614fc5381591c0cfe66556,
> > I still see similar problems as above (crash in copy_page_range()).
> >
> > Which makes me think that
> >   1. Your new patch fixes the problem introduced by 25d6e2d7,
> >   2. There's still another issue than the one introduced by 25d6e2d7.
>
> Does the following patch fix the errors you're seeing? (it applies
> the same fix as the previous patch but this time to copy_tofrom_user,
> which I updated in a4e22f02f5b6518c1484faea1f88d81802b9feac)

Thanks, but I still get crashes in copy_page_range().

With kind regards,

Geert Uytterhoeven
Re: Crash (ext3 ) during 2.6.29-rc6 boot
On Thu, 26 Feb 2009 12:31:20 am Geert Uytterhoeven wrote:
> On Wed, 25 Feb 2009, Mark Nelson wrote:
> > On Wed, 25 Feb 2009 08:50:46 pm Geert Uytterhoeven wrote:
> > > On Wed, 25 Feb 2009, Mark Nelson wrote:
> > > > On Tue, 24 Feb 2009 05:38:37 pm Sachin P. Sant wrote:
> > > > > Jan Kara wrote:
> > > > > > Hmm, OK. But then I'm not sure how that can happen. Obviously,
> > > > > > memcpy somehow got beyond end of the page referenced by
> > > > > > bh->b_data. So it means that le16_to_cpu(entry->e_value_offs) +
> > > > > > size > page_size. But ext3_xattr_find_entry() calls
> > > > > > ext3_xattr_check_entry() which in particular checks whether
> > > > > > e_value_offs + e_value_size isn't greater than bh->b_size. So I
> > > > > > see no way how memcpy can get beyond end of the page. Sachin, is
> > > > > > the problem reproducible? If yes, can you send us contents
> > > > > Yes, i am able to recreate this problem easily. As i had
> > > > > mentioned if the earlier kernel is booted with selinux enabled
> > > > > and then 2.6.29-rc6 is booted i get this crash. But if i specify
> > > > > selinux=0 at command line, 2.6.29-rc6 boots without any problem.
> > > >
> > > > Hi Sanchin and Geert,
> > > >
> > > > Does the patch below fix the problems you're seeing? If it does
> > > > I'll send a properly written up and formatted patch to
> > > > linuxppc-dev (as well as another one to fix the same problem in
> > > > copy_tofrom_user()).
> > >
> > > Unfortunately not, now it crashes while accessing the memory
> > > pointed to by GPR16, in
> > >
> > > NIP: copy_page_range+0x608/0x628
> > > LR: dup_mm+0x2e4/0x428
> > > Trace:
> > >     debug_table+0xcc70/0x1afe0 (unreliable)
> > >     dup_mm+0x2e4/0x428
> > >     copy_process+0x86c/0xf9c
> > >     do_fork+0x188/0x39c
> > >     sys_clone+0x58/0x70
> > >     ppc_clone+0x8/0xc
> > >
> > > However, after reverting 25d6e2d7c58ddc4a3b614fc5381591c0cfe66556,
> > > I still see similar problems as above (crash in copy_page_range()).
> > >
> > > Which makes me think that
> > >   1. Your new patch fixes the problem introduced by 25d6e2d7,
> > >   2. There's still another issue than the one introduced by 25d6e2d7.
> >
> > Does the following patch fix the errors you're seeing? (it applies
> > the same fix as the previous patch but this time to copy_tofrom_user,
> > which I updated in a4e22f02f5b6518c1484faea1f88d81802b9feac)
>
> Thanks, but I still get crashes in copy_page_range().

Hmmm... I'm out of ideas for the moment, but thanks for testing anyway!

Mark
Re: Crash (ext3 ) during 2.6.29-rc6 boot
On Thu, 26 Feb 2009 09:45:41 am Mark Nelson wrote:
> On Thu, 26 Feb 2009 12:31:20 am Geert Uytterhoeven wrote:
> > On Wed, 25 Feb 2009, Mark Nelson wrote:
> > > On Wed, 25 Feb 2009 08:50:46 pm Geert Uytterhoeven wrote:
> > > > On Wed, 25 Feb 2009, Mark Nelson wrote:
> > > > > On Tue, 24 Feb 2009 05:38:37 pm Sachin P. Sant wrote:
> > > > > > Jan Kara wrote:
> > > > > > > Hmm, OK. But then I'm not sure how that can happen. Obviously,
> > > > > > > memcpy somehow got beyond end of the page referenced by
> > > > > > > bh->b_data. So it means that le16_to_cpu(entry->e_value_offs) +
> > > > > > > size > page_size. But ext3_xattr_find_entry() calls
> > > > > > > ext3_xattr_check_entry() which in particular checks whether
> > > > > > > e_value_offs + e_value_size isn't greater than bh->b_size. So I
> > > > > > > see no way how memcpy can get beyond end of the page. Sachin, is
> > > > > > > the problem reproducible? If yes, can you send us contents
> > > > > > Yes, i am able to recreate this problem easily. As i had
> > > > > > mentioned if the earlier kernel is booted with selinux enabled
> > > > > > and then 2.6.29-rc6 is booted i get this crash. But if i specify
> > > > > > selinux=0 at command line, 2.6.29-rc6 boots without any problem.
> > > > >
> > > > > Hi Sanchin and Geert,
> > > > >
> > > > > Does the patch below fix the problems you're seeing? If it does
> > > > > I'll send a properly written up and formatted patch to
> > > > > linuxppc-dev (as well as another one to fix the same problem in
> > > > > copy_tofrom_user()).
> > > >
> > > > Unfortunately not, now it crashes while accessing the memory
> > > > pointed to by GPR16, in
> > > >
> > > > NIP: copy_page_range+0x608/0x628
> > > > LR: dup_mm+0x2e4/0x428
> > > > Trace:
> > > >     debug_table+0xcc70/0x1afe0 (unreliable)
> > > >     dup_mm+0x2e4/0x428
> > > >     copy_process+0x86c/0xf9c
> > > >     do_fork+0x188/0x39c
> > > >     sys_clone+0x58/0x70
> > > >     ppc_clone+0x8/0xc
> > > >
> > > > However, after reverting 25d6e2d7c58ddc4a3b614fc5381591c0cfe66556,
> > > > I still see similar problems as above (crash in copy_page_range()).
> > > >
> > > > Which makes me think that
> > > >   1. Your new patch fixes the problem introduced by 25d6e2d7,
> > > >   2. There's still another issue than the one introduced by 25d6e2d7.
> > >
> > > Does the following patch fix the errors you're seeing? (it applies
> > > the same fix as the previous patch but this time to copy_tofrom_user,
> > > which I updated in a4e22f02f5b6518c1484faea1f88d81802b9feac)
> >
> > Thanks, but I still get crashes in copy_page_range().
>
> Hmmm... I'm out of ideas for the moment, but thanks for testing anyway!
>
> Mark

If you revert both 25d6e2d7c58ddc4a3b614fc5381591c0cfe66556 and
a4e22f02f5b6518c1484faea1f88d81802b9feac, does it help?

You could also try to revert 57dda6ef5bd5b9e60410477ad29e654097e2cca1
just in case I need to keep wearing the brown paper bag for a bit
longer :)

Thanks!

Mark
Re: Crash (ext3 ) during 2.6.29-rc6 boot
  Hello,

On Tue 24-02-09 12:08:37, Sachin P. Sant wrote:
> Jan Kara wrote:
> > Hmm, OK. But then I'm not sure how that can happen. Obviously,
> > memcpy somehow got beyond end of the page referenced by
> > bh->b_data. So it means that le16_to_cpu(entry->e_value_offs) +
> > size > page_size. But ext3_xattr_find_entry() calls
> > ext3_xattr_check_entry() which in particular checks whether
> > e_value_offs + e_value_size isn't greater than bh->b_size. So I
> > see no way how memcpy can get beyond end of the page. Sachin, is
> > the problem reproducible? If yes, can you send us contents
> Yes, i am able to recreate this problem easily. As i had
> mentioned if the earlier kernel is booted with selinux enabled
> and then 2.6.29-rc6 is booted i get this crash. But if i specify
> selinux=0 at command line, 2.6.29-rc6 boots without any problem.
>
> > of the page just before the faulting address (i.e., for current
> > fault it would be 0xc0003f37-0xc0003f37). As far as I can
> > remember powerpc monitor could dump it.
> Here is the page dump. This time it crashed while accessing address
> 0xc0002d67.
  Thanks for the dump.

> Unable to handle kernel paging request for data at address 0xc0002d67
> Faulting instruction address: 0xc0039574
> cpu 0x1: Vector: 300 (Data Access) at [c0004288b0b0]
>     pc: c0039574: .memcpy+0x74/0x244
>     lr: c01b497c: .ext3_xattr_get+0x288/0x2f4
>     sp: c0004288b330
>    msr: 80009032
> 1:mon d 0xc0002d66
> ... SNIP ...
> c0002d66efd0 ||
> c0002d66efe0 ||
> c0002d66eff0 ||
> c0002d66f000 02ea0004 0100e200d20a ||
> c0002d66f010 ||
> c0002d66f020 0706e40f 1b00e200d20a ||
> c0002d66f030 73656c696e757800 |selinux.|
> c0002d66f040 ||
> c0002d66f050 ||
> c0002d66f060 ||
> ... SNIP ...
> c0002d66ff60 ||
> c0002d66ff70 ||
> c0002d66ff80 ||
> c0002d66ff90 ||
> c0002d66ffa0 ||
> c0002d66ffb0 ||
> c0002d66ffc0 ||
> c0002d66ffd0 ||
> c0002d66ffe0 73797374 656d5f753a6f626a |system_u:obj|
> c0002d66fff0 6563745f723a7573 725f743a7330 |ect_r:usr_t:s0..|
> c0002d67 ||
> 1:mon r
> R00 = e40f               R16 = 005d
> R01 = c0004288b330       R17 =
> R02 = c09f59b8           R18 = fffbfe9e
> R03 = c00044aa34a0       R19 = 10042638
> R04 = c0002d66fff4       R20 = 10041610
> R05 = 0003               R21 = 00ff
> R06 =                    R22 = 0006
> R07 = 0001               R23 = c07d27c1
> R08 = 723a7573725f743a   R24 = c0002c0cd758
> R09 = 3a6f626a6563745f   R25 = c00044aa3488
> R10 = c017b43c           R26 = c0002c0cd6f0
> R11 = c0002d66f020       R27 = c0002c0cd860
> R12 = d23c14b0           R28 = c0002c0b0840
> R13 = c0a93680           R29 = 001b
> R14 = 41ed               R30 = c09880b0
> R15 = 1004               R31 = ffde
> pc  = c0039574 .memcpy+0x74/0x244
> lr  = c01b497c .ext3_xattr_get+0x288/0x2f4
> msr = 80009032   cr  = 4400044b
> ctr =            xer = 2001   trap = 300
> dar = c0002d67   dsisr = 4000
> 1:mon zr
>
> > BTW, I suppose you use 4KB blocksize on the filesystem, right?
> Yes.
> dumpe2fs /dev/sda3 | grep -i block size
> dumpe2fs 1.39 (29-May-2006)
> Block size: 4096
  OK. The xattr block causing oops is completely correct. To me it seems
more like some problem in powerpc memcpy() (I saw some changes went into
it at the end of December) - we call it to copy 27 bytes from address
0xc0002d66ffe4 (so the copy ends one byte before the end of the page).
Could some of the powerpc guys have a look whether this could be the
case? I'm not quite fluent in the powerpc assembly so it would take me
ages ;).

Honza
--
Jan Kara j...@suse.cz
SUSE Labs, CR
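
The bounds check Jan describes, and the arithmetic behind "27 bytes from address ...ffe4", can be written out in C. This is only a minimal sketch, not the ext3 code: `xattr_value_fits` and `xattr_last_byte` are hypothetical helpers, and the values 0xfe4 and 27 are an assumption read off the dump above (the bytes `e40f` and `1b` in the entry at ...f020), together with the 4096-byte block size reported by dumpe2fs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's ext3_xattr_check_entry()):
 * true if a value of value_size bytes that starts value_offs bytes
 * into the block lies entirely inside a block of block_size bytes.
 */
static bool xattr_value_fits(uint16_t value_offs, uint32_t value_size,
                             size_t block_size)
{
    /* Widen before adding so the sum cannot wrap around. */
    uint64_t end = (uint64_t)value_offs + value_size;

    return end <= block_size;
}

/* Offset, within the block, of the last byte of the value. */
static uint64_t xattr_last_byte(uint16_t value_offs, uint32_t value_size)
{
    assert(value_size > 0);
    return (uint64_t)value_offs + value_size - 1;
}
```

Under those assumptions `xattr_value_fits(0xfe4, 27, 4096)` holds and the last byte of the value sits at offset 0xffe, so the entry is within bounds, which is consistent with Jan's conclusion that the xattr block itself is fine.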
Re: Crash (ext3 ) during 2.6.29-rc6 boot
> Andrew Morton wrote:
> > hm, I wonder what could have caused that - we haven't altered
> > fs/ext3/xattr.c in ages.
> >
> > What is the most recent kernel version you know of which didn't do
> > this? Bear in mind that this crash might be triggered by the current
> > contents of the filesystem, so if possible, please test some other
> > kernel versions on that disk.
> I am trying to boot a vanilla kernel on this machine for the first
> time. Haven't tried any other kernels. Will give it a try.
>
> > It looks like we died in ext3_xattr_block_get():
> >
> >     memcpy(buffer, bh->b_data + le16_to_cpu(entry->e_value_offs),
> >            size);
> >
> > Perhaps entry->e_value_offs is no good. I wonder if the filesystem
> > is corrupted and this snuck through the defenses.
> >
> > I also wonder if there is enough info in that trace for a ppc
> > person to be able to determine whether the faulting address is in
> > the source or destination of the memcpy() (please)?
> Some more information if this could be of any help.
>
> 0:mon di 0xc0039574
> c0039574 e9240008 ld r9,8(r4)
> c0039578 409d0010 ble cr7,c0039588 # .memcpy+0x88/0x244
> c003957c 79290002 rotldi r9,r9,32
> c0039580 9123 stw r9,0(r3)
> c0039584 38630004 addi r3,r3,4
> c0039588 409e0010 bne cr7,c0039598 # .memcpy+0x98/0x244
> c003958c 79298000 rotldi r9,r9,16
> c0039590 b123 sth r9,0(r3)
> c0039594 38630002 addi r3,r3,2
> c0039598 409f000c bns cr7,c00395a4 # .memcpy+0xa4/0x244
> c003959c 79294000 rotldi r9,r9,8
> c00395a0 9923 stb r9,0(r3)
> c00395a4 e8610030 ld r3,48(r1)
> c00395a8 4e800020 blr
> c00395ac 78a6e8c2 rldicl r6,r5,61,3
> c00395b0 38a5fff0 addi r5,r5,-16
> 0:mon r
> R00 = e40f               R16 = 100edbc8
> R01 = c0003e59b3e0       R17 = 100b
> R02 = c09c2110           R18 = 0005
> R03 = c00044bc90e0       R19 = fff0d7a8
> R04 = c00039c4           R20 = fff0d708
> R05 = 0003               R21 = 00ff
> R06 =                    R22 = 0006
> R07 = 0001               R23 = c079ab49
> R08 = 723a7573725f743a   R24 = c000372fe2a8
> R09 = 3a6f626a6563745f   R25 = c00044bc90c8
> R10 = c0003b250968       R26 = c000372fe240
> R11 = c0039500           R27 = c000372fe3b0
> R12 = d244c590           R28 = c000372c5280
> R13 = c0a53480           R29 = 001b
> R14 = 100d               R30 = d24654d0
> R15 =                    R31 = ffde
> pc  = c0039574 .memcpy+0x74/0x244
> lr  = d244916c .ext3_xattr_get+0x288/0x2f4 [ext3]
> msr = 80009032   cr  = 4400844b
> ctr =            xer = 0001   trap = 300
> dar = c00039d0   dsisr = 4000
> 0:mon
  Yes, this makes me even more suspicious that memcpy() on powerpc could
be at fault. The instruction (ld r9,8(r4)) is loading the last 8 bytes to
copy, but in fact it should load only 3 bytes in our case, because the
remaining 5 bytes are not in the range we specified and thus the larger
load can cause a page fault...

Honza
--
Jan Kara j...@suse.cz
SuSE CR Labs
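
Jan's point can be checked with a little address arithmetic. The sketch below is an illustration only: `load_crosses_page` is a hypothetical helper, the 4096-byte page size is an assumption, and the page offset 0xffc used in the note that follows comes from the other register dump Sachin posted in this thread (R04 ending in ...fff4, plus the displacement 8 in `ld r9,8(r4)`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u  /* assumption: 4 KB pages */

/*
 * Hypothetical helper: true if a load of `width` bytes starting at
 * `addr` touches a byte that lies in the page after the one that
 * contains `addr`.
 */
static bool load_crosses_page(uint64_t addr, size_t width)
{
    assert(width > 0);

    uint64_t first_page = addr / PAGE_SIZE_BYTES;
    uint64_t last_page = (addr + width - 1) / PAGE_SIZE_BYTES;

    return first_page != last_page;
}
```

With three bytes left at page offset 0xffc, `load_crosses_page(0xffc, 8)` is true, while a 2-byte load at 0xffc followed by a 1-byte load at 0xffe stays inside the page. If the following page is unmapped, as it can be with CONFIG_DEBUG_PAGEALLOC, only the 8-byte load faults.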
Re: Crash (ext3 ) during 2.6.29-rc6 boot
On Mon, 23 Feb 2009, Paul Mackerras wrote:
> Andrew Morton writes:
> > It looks like we died in ext3_xattr_block_get():
> >
> >     memcpy(buffer, bh->b_data + le16_to_cpu(entry->e_value_offs),
> >            size);
> >
> > Perhaps entry->e_value_offs is no good. I wonder if the filesystem
> > is corrupted and this snuck through the defenses.
> >
> > I also wonder if there is enough info in that trace for a ppc
> > person to be able to determine whether the faulting address is in
> > the source or destination of the memcpy() (please)?
>
> It appears to have faulted on a load, implicating the source. The
> address being referenced (0xc0003f38) doesn't look outlandish. I
> wonder if this kernel has CONFIG_DEBUG_PAGEALLOC turned on, and
> what page size is selected?

I'm seeing a similar thing on PS3, but not in ext3. During early
userspace setup (udevd), it crashes accessing a 0xc00* address in:

| NIP setup+0x20/0x130
| LR copy_user_page+0x18/0x6c
| Call trace:
| do_wp_page+0x5b4/0x89c
| do_page_fault+0x3a8/0x58c
| handle_page_fault+0x20/0x5c

I have CONFIG_DEBUG_PAGEALLOC=y. If I disable it, the system boots
fine. If needed, I can probably bisect this tomorrow. It definitely
didn't happen in 2.6.29-rc5.

With kind regards,

Geert Uytterhoeven
Re: Crash (ext3 ) during 2.6.29-rc6 boot
On Wed, 25 Feb 2009 02:51:20 am Jan Kara wrote:
>   Hello,
>
> On Tue 24-02-09 12:08:37, Sachin P. Sant wrote:
> > Jan Kara wrote:
> > > Hmm, OK. But then I'm not sure how that can happen. Obviously,
> > > memcpy somehow got beyond end of the page referenced by
> > > bh->b_data. So it means that le16_to_cpu(entry->e_value_offs) +
> > > size > page_size. But ext3_xattr_find_entry() calls
> > > ext3_xattr_check_entry() which in particular checks whether
> > > e_value_offs + e_value_size isn't greater than bh->b_size. So I
> > > see no way how memcpy can get beyond end of the page. Sachin, is
> > > the problem reproducible? If yes, can you send us contents
> > Yes, i am able to recreate this problem easily. As i had
> > mentioned if the earlier kernel is booted with selinux enabled
> > and then 2.6.29-rc6 is booted i get this crash. But if i specify
> > selinux=0 at command line, 2.6.29-rc6 boots without any problem.
> >
> > > of the page just before the faulting address (i.e., for current
> > > fault it would be 0xc0003f37-0xc0003f37). As far as I can
> > > remember powerpc monitor could dump it.
> > Here is the page dump. This time it crashed while accessing address
> > 0xc0002d67.
>   Thanks for the dump.
>
> > Unable to handle kernel paging request for data at address 0xc0002d67
> > Faulting instruction address: 0xc0039574
> > cpu 0x1: Vector: 300 (Data Access) at [c0004288b0b0]
> >     pc: c0039574: .memcpy+0x74/0x244
> >     lr: c01b497c: .ext3_xattr_get+0x288/0x2f4
> >     sp: c0004288b330
> >    msr: 80009032
> > 1:mon d 0xc0002d66
> > ... SNIP ...
> > c0002d66efd0 ||
> > c0002d66efe0 ||
> > c0002d66eff0 ||
> > c0002d66f000 02ea0004 0100e200d20a ||
> > c0002d66f010 ||
> > c0002d66f020 0706e40f 1b00e200d20a ||
> > c0002d66f030 73656c696e757800 |selinux.|
> > c0002d66f040 ||
> > c0002d66f050 ||
> > c0002d66f060 ||
> > ... SNIP ...
> > c0002d66ff60 ||
> > c0002d66ff70 ||
> > c0002d66ff80 ||
> > c0002d66ff90 ||
> > c0002d66ffa0 ||
> > c0002d66ffb0 ||
> > c0002d66ffc0 ||
> > c0002d66ffd0 ||
> > c0002d66ffe0 73797374 656d5f753a6f626a |system_u:obj|
> > c0002d66fff0 6563745f723a7573 725f743a7330 |ect_r:usr_t:s0..|
> > c0002d67 ||
> > 1:mon r
> > R00 = e40f               R16 = 005d
> > R01 = c0004288b330       R17 =
> > R02 = c09f59b8           R18 = fffbfe9e
> > R03 = c00044aa34a0       R19 = 10042638
> > R04 = c0002d66fff4       R20 = 10041610
> > R05 = 0003               R21 = 00ff
> > R06 =                    R22 = 0006
> > R07 = 0001               R23 = c07d27c1
> > R08 = 723a7573725f743a   R24 = c0002c0cd758
> > R09 = 3a6f626a6563745f   R25 = c00044aa3488
> > R10 = c017b43c           R26 = c0002c0cd6f0
> > R11 = c0002d66f020       R27 = c0002c0cd860
> > R12 = d23c14b0           R28 = c0002c0b0840
> > R13 = c0a93680           R29 = 001b
> > R14 = 41ed               R30 = c09880b0
> > R15 = 1004               R31 = ffde
> > pc  = c0039574 .memcpy+0x74/0x244
> > lr  = c01b497c .ext3_xattr_get+0x288/0x2f4
> > msr = 80009032   cr  = 4400044b
> > ctr =            xer = 2001   trap = 300
> > dar = c0002d67   dsisr = 4000
> > 1:mon zr
> >
> > > BTW, I suppose you use 4KB blocksize on the filesystem, right?
> > Yes.
> > dumpe2fs /dev/sda3 | grep -i block size
> > dumpe2fs 1.39 (29-May-2006)
> > Block size: 4096
>   OK. The xattr block causing oops is completely correct. To me it seems
> more like some problem in powerpc memcpy() (I saw some changes went into
> it at the end of December) - we call it to copy 27 bytes from address
> 0xc0002d66ffe4 (so the copy ends one byte before the end of the page).
> Could some of the powerpc guys have a look whether this could be the
> case? I'm not quite fluent in the powerpc assembly so it would take me
> ages ;).

You're right - it's a problem with the 64bit
Re: Crash (ext3 ) during 2.6.29-rc6 boot
On Wed, 25 Feb 2009 05:01:59 am Geert Uytterhoeven wrote:
> On Mon, 23 Feb 2009, Paul Mackerras wrote:
> > Andrew Morton writes:
> > > It looks like we died in ext3_xattr_block_get():
> > >
> > >     memcpy(buffer, bh->b_data + le16_to_cpu(entry->e_value_offs),
> > >            size);
> > >
> > > Perhaps entry->e_value_offs is no good. I wonder if the filesystem
> > > is corrupted and this snuck through the defenses.
> > >
> > > I also wonder if there is enough info in that trace for a ppc
> > > person to be able to determine whether the faulting address is in
> > > the source or destination of the memcpy() (please)?
> >
> > It appears to have faulted on a load, implicating the source. The
> > address being referenced (0xc0003f38) doesn't look outlandish. I
> > wonder if this kernel has CONFIG_DEBUG_PAGEALLOC turned on, and
> > what page size is selected?
>
> I'm seeing a similar thing on PS3, but not in ext3. During early
> userspace setup (udevd), it crashes accessing a 0xc00* address in:
>
> | NIP setup+0x20/0x130
> | LR copy_user_page+0x18/0x6c
> | Call trace:
> | do_wp_page+0x5b4/0x89c
> | do_page_fault+0x3a8/0x58c
> | handle_page_fault+0x20/0x5c
>
> I have CONFIG_DEBUG_PAGEALLOC=y. If I disable it, the system boots
> fine. If needed, I can probably bisect this tomorrow. It definitely
> didn't happen in 2.6.29-rc5.

No need to bisect - it was 25d6e2d7c58ddc4a3b614fc5381591c0cfe66556,
my commit that optimised 64bit memcpy() for Power6 and Cell. The bug
was in -rc1, but if your copies were 8-byte aligned with respect to
the source the problem wouldn't have been seen... Could this have
been why you didn't see it in -rc5?

I'll work on a fix now.

Thanks!

Mark
Re: Crash (ext3 ) during 2.6.29-rc6 boot
On Tue, 24 Feb 2009 05:38:37 pm Sachin P. Sant wrote: Jan Kara wrote: Hmm, OK. But then I'm not sure how that can happen. Obviously, memcpy somehow got beyond end of the page referenced by bh-b_data. So it means that le16_to_cpu(entry-e_value_offs) + size page_size. But ext3_xattr_find_entry() calls ext3_xattr_check_entry() which in particular checks whether e_value_offs + e_value_size isn't greater than bh-b_size. So I see no way how memcpy can get beyond end of the page. Sachin, is the problem reproducible? If yes, can you send us contents Yes, i am able to recreate this problem easily. As i had mentioned if the earlier kernel is booted with selinux enabled and then 2.6.29-rc6 is booted i get this crash. But if i specify selinux=0 at command line, 2.6.29-rc6 boots without any problem. Hi Sanchin and Geert, Does the patch below fix the problems you're seeing? If it does I'll send a properly written up and formatted patch to linuxppc-dev (as well as another one to fix the same problem in copy_tofrom_user()). Thanks and sorry again! 
Mark

---
 arch/powerpc/lib/memcpy_64.S |   26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

Index: upstream/arch/powerpc/lib/memcpy_64.S
===================================================================
--- upstream.orig/arch/powerpc/lib/memcpy_64.S
+++ upstream/arch/powerpc/lib/memcpy_64.S
@@ -53,18 +53,19 @@ END_FTR_SECTION_IFCLR(CPU_FTR_UNALIGNED_
 3:	std	r8,8(r3)
 	beq	3f
 	addi	r3,r3,16
-	ld	r9,8(r4)
 .Ldo_tail:
 	bf	cr7*4+1,1f
-	rotldi	r9,r9,32
+	lwz	r9,8(r4)
+	addi	r4,r4,4
 	stw	r9,0(r3)
 	addi	r3,r3,4
 1:	bf	cr7*4+2,2f
-	rotldi	r9,r9,16
+	lhz	r9,8(r4)
+	addi	r4,r4,2
 	sth	r9,0(r3)
 	addi	r3,r3,2
 2:	bf	cr7*4+3,3f
-	rotldi	r9,r9,8
+	lbz	r9,8(r4)
 	stb	r9,0(r3)
 3:	ld	r3,48(r1)	/* return dest pointer */
 	blr
@@ -133,11 +134,24 @@ END_FTR_SECTION_IFCLR(CPU_FTR_UNALIGNED_
 	cmpwi	cr1,r5,8
 	addi	r3,r3,32
 	sld	r9,r9,r10
-	ble	cr1,.Ldo_tail
+	ble	cr1,6f
 	ld	r0,8(r4)
 	srd	r7,r0,r11
 	or	r9,r7,r9
-	b	.Ldo_tail
+6:
+	bf	cr7*4+1,1f
+	rotldi	r9,r9,32
+	stw	r9,0(r3)
+	addi	r3,r3,4
+1:	bf	cr7*4+2,2f
+	rotldi	r9,r9,16
+	sth	r9,0(r3)
+	addi	r3,r3,2
+2:	bf	cr7*4+3,3f
+	rotldi	r9,r9,8
+	stb	r9,0(r3)
+3:	ld	r3,48(r1)	/* return dest pointer */
+	blr
 
 .Ldst_unaligned:
 	PPC_MTOCRF	0x01,r6		# put #bytes to 8B bdry into cr7
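To illustrate what the first hunk changes, here is a rough Python model of the two tail strategies. It is only a sketch, not the kernel code: GuardedPage, tail_old and tail_new are made-up names, the "page" is a byte string followed by an imaginary unmapped page (roughly what CONFIG_DEBUG_PAGEALLOC can produce), and big-endian byte order is assumed so that the leftmost bytes of the loaded doubleword are the ones stored first.

```python
PAGE_SIZE = 0x10000  # 64K pages, as in the reported configuration


class GuardedPage:
    """One mapped page followed by an unmapped one (hypothetical model)."""

    def __init__(self, data: bytes):
        assert len(data) == PAGE_SIZE
        self.data = data

    def load(self, offset: int, width: int) -> bytes:
        # Any access touching a byte at or past PAGE_SIZE "faults".
        if offset < 0 or offset + width > PAGE_SIZE:
            raise MemoryError(f"fault: {width}-byte load at offset {offset:#x}")
        return self.data[offset:offset + width]


def tail_old(page: GuardedPage, src: int, remaining: int) -> bytes:
    """Old tail: always load 8 bytes, then rotate the wanted ones out."""
    assert 0 < remaining < 8
    dword = page.load(src, 8)      # models the unconditional 'ld r9,8(r4)'
    return dword[:remaining]       # models rotldi + stw/sth/stb


def tail_new(page: GuardedPage, src: int, remaining: int) -> bytes:
    """Patched tail: 4-, 2- and 1-byte loads, only for the bytes needed."""
    assert 0 < remaining < 8
    out = b""
    for width in (4, 2, 1):        # models lwz / lhz / lbz
        if remaining & width:
            out += page.load(src, width)
            src += width
    return out
```

With three bytes left at the very end of the page, tail_old() raises in this model while tail_new() returns the three bytes. Away from the page end both return the same data, which would fit with the problem only showing up when the neighbouring page is unmapped.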
Re: Crash (ext3 ) during 2.6.29-rc6 boot
On Mon, 23 Feb 2009 15:16:05 +0530 Sachin P. Sant sach...@in.ibm.com wrote:

> 2.6.29-rc6 bootup on a powerpc box failed with
>
> Unable to handle kernel paging request for data at address 0xc0003f38
> Faulting instruction address: 0xc0039574
> cpu 0x1: Vector: 300 (Data Access) at [c0003baf3020]
>     pc: c0039574: .memcpy+0x74/0x244
>     lr: d244916c: .ext3_xattr_get+0x288/0x2f4 [ext3]
>     sp: c0003baf32a0
>    msr: 80009032
>    dar: c0003f38
>  dsisr: 4000
>   current = 0xc0003e54b010
>   paca    = 0xc0a53680
>     pid   = 1840, comm = readahead
> enter ? for help
> [link register   ] d244916c .ext3_xattr_get+0x288/0x2f4 [ext3]
> [c0003baf32a0] d2449104 .ext3_xattr_get+0x220/0x2f4 [ext3] (unreliable)
> [c0003baf3390] d244a6e8 .ext3_xattr_security_get+0x40/0x5c [ext3]
> [c0003baf3400] c0148154 .generic_getxattr+0x74/0x9c
> [c0003baf34a0] c0333400 .inode_doinit_with_dentry+0x1c4/0x678
> [c0003baf3560] c032c6b0 .security_d_instantiate+0x50/0x68
> [c0003baf35e0] c013c818 .d_instantiate+0x78/0x9c
> [c0003baf3680] c013ced0 .d_splice_alias+0xf0/0x120
> [c0003baf3720] d243e05c .ext3_lookup+0xec/0x134 [ext3]
> [c0003baf37c0] c0131e74 .do_lookup+0x110/0x260
> [c0003baf3880] c0134ed0 .__link_path_walk+0xa98/0x1010
> [c0003baf3970] c01354a0 .path_walk+0x58/0xc4
> [c0003baf3a20] c0135720 .do_path_lookup+0x138/0x1e4
> [c0003baf3ad0] c013645c .path_lookup_open+0x6c/0xc8
> [c0003baf3b70] c0136780 .do_filp_open+0xcc/0x874
> [c0003baf3d10] c01251e0 .do_sys_open+0x80/0x140
> [c0003baf3dc0] c016aaec .compat_sys_open+0x24/0x38
> [c0003baf3e30] c000855c syscall_exit+0x0/0x40
> --- Exception: c01 (System Call) at 0ff0ef18
> SP (ffc6f4b0) is in userspace
> 1:mon>
>
> Following EXT3 related options were enabled in the config.
>
> CONFIG_EXT3_FS=m
> CONFIG_EXT3_FS_XATTR=y
> CONFIG_EXT3_FS_POSIX_ACL=y
> CONFIG_EXT3_FS_SECURITY=y

hm, I wonder what could have caused that - we haven't altered
fs/ext3/xattr.c in ages.

What is the most recent kernel version you know of which didn't do this?
Bear in mind that this crash might be triggered by the current contents of
the filesystem, so if possible, please test some other kernel versions on
that disk.

It looks like we died in ext3_xattr_block_get():

    memcpy(buffer, bh->b_data + le16_to_cpu(entry->e_value_offs), size);

Perhaps entry->e_value_offs is no good. I wonder if the filesystem is
corrupted and this snuck through the defenses.

I also wonder if there is enough info in that trace for a ppc person to be
able to determine whether the faulting address is in the source or
destination of the memcpy() (please)?
Re: Crash (ext3 ) during 2.6.29-rc6 boot
Andrew Morton writes:

> It looks like we died in ext3_xattr_block_get():
>
>     memcpy(buffer, bh->b_data + le16_to_cpu(entry->e_value_offs), size);
>
> Perhaps entry->e_value_offs is no good. I wonder if the filesystem is
> corrupted and this snuck through the defenses.
>
> I also wonder if there is enough info in that trace for a ppc person to be
> able to determine whether the faulting address is in the source or
> destination of the memcpy() (please)?

It appears to have faulted on a load, implicating the source. The
address being referenced (0xc0003f38) doesn't look outlandish.

I wonder if this kernel has CONFIG_DEBUG_PAGEALLOC turned on, and what
page size is selected?

Paul.
Re: Crash (ext3 ) during 2.6.29-rc6 boot
Andrew Morton wrote:
> hm, I wonder what could have caused that - we haven't altered
> fs/ext3/xattr.c in ages.
>
> What is the most recent kernel version you know of which didn't do this?
> Bear in mind that this crash might be triggered by the current contents of
> the filesystem, so if possible, please test some other kernel versions on
> that disk.
I am trying to boot a vanilla kernel on this machine for the first time.
Haven't tried any other kernels. Will give it a try.

> It looks like we died in ext3_xattr_block_get():
>
>     memcpy(buffer, bh->b_data + le16_to_cpu(entry->e_value_offs), size);
>
> Perhaps entry->e_value_offs is no good. I wonder if the filesystem is
> corrupted and this snuck through the defenses.
>
> I also wonder if there is enough info in that trace for a ppc person to be
> able to determine whether the faulting address is in the source or
> destination of the memcpy() (please)?
Some more information if this could be of any help.

0:mon> di 0xc0039574
c0039574  e9240008  ld      r9,8(r4)
c0039578  409d0010  ble     cr7,c0039588    # .memcpy+0x88/0x244
c003957c  79290002  rotldi  r9,r9,32
c0039580  9123      stw     r9,0(r3)
c0039584  38630004  addi    r3,r3,4
c0039588  409e0010  bne     cr7,c0039598    # .memcpy+0x98/0x244
c003958c  79298000  rotldi  r9,r9,16
c0039590  b123      sth     r9,0(r3)
c0039594  38630002  addi    r3,r3,2
c0039598  409f000c  bns     cr7,c00395a4    # .memcpy+0xa4/0x244
c003959c  79294000  rotldi  r9,r9,8
c00395a0  9923      stb     r9,0(r3)
c00395a4  e8610030  ld      r3,48(r1)
c00395a8  4e800020  blr
c00395ac  78a6e8c2  rldicl  r6,r5,61,3
c00395b0  38a5fff0  addi    r5,r5,-16
0:mon> r
R00 = e40f               R16 = 100edbc8
R01 = c0003e59b3e0       R17 = 100b
R02 = c09c2110           R18 = 0005
R03 = c00044bc90e0       R19 = fff0d7a8
R04 = c00039c4           R20 = fff0d708
R05 = 0003               R21 = 00ff
R06 =                    R22 = 0006
R07 = 0001               R23 = c079ab49
R08 = 723a7573725f743a   R24 = c000372fe2a8
R09 = 3a6f626a6563745f   R25 = c00044bc90c8
R10 = c0003b250968       R26 = c000372fe240
R11 = c0039500           R27 = c000372fe3b0
R12 = d244c590           R28 = c000372c5280
R13 = c0a53480           R29 = 001b
R14 = 100d               R30 = d24654d0
R15 =                    R31 = ffde
pc  = c0039574 .memcpy+0x74/0x244
lr  = d244916c .ext3_xattr_get+0x288/0x2f4 [ext3]
msr = 80009032   cr  = 4400844b
ctr =            xer = 0001   trap = 300
dar = c00039d0   dsisr = 4000
0:mon>

So the other thing i noticed was that this machine was running a kernel
with selinux enabled. I turned off selinux and there were no issues during
bootup. It was a clean boot.

Thanks
-Sachin

--
---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------
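As a cross-check on the first line of the disassembly, the instruction word e9240008 can be split into its fields by hand. The sketch below does that in Python; decode_ds_form and the small mnemonic table are illustrative helpers, not part of xmon or the kernel, and they cover only primary opcode 58 (the DS-form doubleword loads).

```python
def decode_ds_form(word: int) -> dict:
    """Split a 32-bit PowerPC DS-form instruction word into its fields."""
    disp = word & 0xFFFC            # displacement, low two bits are the XO
    if disp & 0x8000:               # sign-extend the 16-bit displacement
        disp -= 0x10000
    return {
        "opcode": (word >> 26) & 0x3F,
        "rt": (word >> 21) & 0x1F,
        "ra": (word >> 16) & 0x1F,
        "disp": disp,
        "xo": word & 0x3,
    }


# Extended opcodes under primary opcode 58.
OPCODE_58 = {0: "ld", 1: "ldu", 2: "lwa"}


def mnemonic(fields: dict) -> str:
    """Name the instruction, or return 'unknown' outside opcode 58."""
    if fields["opcode"] != 58:
        return "unknown"
    return OPCODE_58.get(fields["xo"], "unknown")


fields = decode_ds_form(0xE9240008)
print(mnemonic(fields), fields)
```

The fields come out as opcode 58, rt 9, ra 4, displacement 8, which matches the "ld r9,8(r4)" that xmon printed: an 8-byte load from the source pointer in r4, so the fault would be on the read side of the copy.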
Re: Crash (ext3 ) during 2.6.29-rc6 boot
Paul Mackerras wrote:
> It appears to have faulted on a load, implicating the source. The
> address being referenced (0xc0003f38) doesn't look outlandish.
>
> I wonder if this kernel has CONFIG_DEBUG_PAGEALLOC turned on, and what
> page size is selected?
Yes CONFIG_DEBUG_PAGEALLOC is enabled and the page size is 64K.

CONFIG_DEBUG_PAGEALLOC=y
CONFIG_PPC_64K_PAGES=y

Thanks
-Sachin

--
---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------
Re: Crash (ext3 ) during 2.6.29-rc6 boot
> Andrew Morton writes:
> > It looks like we died in ext3_xattr_block_get():
> >
> >     memcpy(buffer, bh->b_data + le16_to_cpu(entry->e_value_offs), size);
> >
> > Perhaps entry->e_value_offs is no good. I wonder if the filesystem is
> > corrupted and this snuck through the defenses.
> >
> > I also wonder if there is enough info in that trace for a ppc person to
> > be able to determine whether the faulting address is in the source or
> > destination of the memcpy() (please)?
>
> It appears to have faulted on a load, implicating the source. The
> address being referenced (0xc0003f38) doesn't look outlandish.
>
> I wonder if this kernel has CONFIG_DEBUG_PAGEALLOC turned on, and what
> page size is selected?
  Hmm, OK. But then I'm not sure how that can happen. Obviously, memcpy
somehow got beyond end of the page referenced by bh->b_data. So it means
that le16_to_cpu(entry->e_value_offs) + size > page_size. But
ext3_xattr_find_entry() calls ext3_xattr_check_entry() which in
particular checks whether e_value_offs + e_value_size isn't greater than
bh->b_size. So I see no way how memcpy can get beyond end of the page.
  Sachin, is the problem reproducible? If yes, can you send us contents
of the page just before the faulting address (i.e., for current fault it
would be 0xc0003f37-0xc0003f37). As far as I can remember powerpc
monitor could dump it.
  BTW, I suppose you use 4KB blocksize on the filesystem, right?

								Honza
--
Jan Kara j...@suse.cz
SuSE CR Labs
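The bounds check Jan refers to can be written out as a small Python sketch. This is a simplified rendering of his description, not the kernel's ext3_xattr_check_entry(); the function name and the example offsets are made up for illustration, and the 4096-byte block size is the one he supposes in his message.

```python
def check_xattr_entry(value_offs: int, value_size: int, block_size: int) -> None:
    """Reject an xattr entry whose value would run past the end of its block.

    Simplified from the check described in the thread: the value must start
    and end inside the block that holds it.
    """
    if value_size > block_size or value_offs + value_size > block_size:
        raise ValueError("xattr value extends past end of block")


BLOCK_SIZE = 4096                            # 4KB filesystem block

# A 27-byte value ending one byte short of the block end: accepted.
check_xattr_entry(0xFE4, 27, BLOCK_SIZE)

# The same value starting 12 bytes later would cross the block end: rejected.
try:
    check_xattr_entry(0xFF0, 27, BLOCK_SIZE)
except ValueError as err:
    print("rejected:", err)
```

Note that a check of this kind only bounds the bytes the caller asks memcpy() to copy. It says nothing about any extra bytes a particular memcpy() implementation might read while doing so.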
Re: Crash (ext3 ) during 2.6.29-rc6 boot
Jan Kara wrote:
> Hmm, OK. But then I'm not sure how that can happen. Obviously, memcpy
> somehow got beyond end of the page referenced by bh->b_data. So it means
> that le16_to_cpu(entry->e_value_offs) + size > page_size. But
> ext3_xattr_find_entry() calls ext3_xattr_check_entry() which in
> particular checks whether e_value_offs + e_value_size isn't greater than
> bh->b_size. So I see no way how memcpy can get beyond end of the page.
> Sachin, is the problem reproducible? If yes, can you send us contents
Yes, i am able to recreate this problem easily. As i had mentioned if the
earlier kernel is booted with selinux enabled and then 2.6.29-rc6 is booted
i get this crash. But if i specify selinux=0 at command line, 2.6.29-rc6
boots without any problem.

> of the page just before the faulting address (i.e., for current fault it
> would be 0xc0003f37-0xc0003f37). As far as I can remember powerpc
> monitor could dump it.
Here is the page dump. This time it crashed while accessing address
0xc0002d67.

Unable to handle kernel paging request for data at address 0xc0002d67
Faulting instruction address: 0xc0039574
cpu 0x1: Vector: 300 (Data Access) at [c0004288b0b0]
    pc: c0039574: .memcpy+0x74/0x244
    lr: c01b497c: .ext3_xattr_get+0x288/0x2f4
    sp: c0004288b330
   msr: 80009032
1:mon> d 0xc0002d66
... SNIP ...
c0002d66efd0                                      ||
c0002d66efe0                                      ||
c0002d66eff0                                      ||
c0002d66f000 02ea0004 0100e200d20a                ||
c0002d66f010                                      ||
c0002d66f020 0706e40f 1b00e200d20a                ||
c0002d66f030 73656c696e757800                     |selinux.|
c0002d66f040                                      ||
c0002d66f050                                      ||
c0002d66f060                                      ||
... SNIP ...
c0002d66ff60                                      ||
c0002d66ff70                                      ||
c0002d66ff80                                      ||
c0002d66ff90                                      ||
c0002d66ffa0                                      ||
c0002d66ffb0                                      ||
c0002d66ffc0                                      ||
c0002d66ffd0                                      ||
c0002d66ffe0 73797374 656d5f753a6f626a            |system_u:obj|
c0002d66fff0 6563745f723a7573 725f743a7330        |ect_r:usr_t:s0..|
c0002d67                                          ||
1:mon> r
R00 = e40f               R16 = 005d
R01 = c0004288b330       R17 =
R02 = c09f59b8           R18 = fffbfe9e
R03 = c00044aa34a0       R19 = 10042638
R04 = c0002d66fff4       R20 = 10041610
R05 = 0003               R21 = 00ff
R06 =                    R22 = 0006
R07 = 0001               R23 = c07d27c1
R08 = 723a7573725f743a   R24 = c0002c0cd758
R09 = 3a6f626a6563745f   R25 = c00044aa3488
R10 = c017b43c           R26 = c0002c0cd6f0
R11 = c0002d66f020       R27 = c0002c0cd860
R12 = d23c14b0           R28 = c0002c0b0840
R13 = c0a93680           R29 = 001b
R14 = 41ed               R30 = c09880b0
R15 = 1004               R31 = ffde
pc  = c0039574 .memcpy+0x74/0x244
lr  = c01b497c .ext3_xattr_get+0x288/0x2f4
msr = 80009032   cr  = 4400044b
ctr =            xer = 2001   trap = 300
dar = c0002d67   dsisr = 4000
1:mon> zr

> BTW, I suppose you use 4KB blocksize on the filesystem, right?
Yes.

dumpe2fs /dev/sda3 | grep -i "block size"
dumpe2fs 1.39 (29-May-2006)
Block size:               4096

Thanks
-Sachin

--
---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------
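The register values in this dump allow a quick back-of-the-envelope check of where the faulting load lands. The sketch below works only with offsets inside the 64K page and rests on a few assumptions that the thread does not state outright: that the low 16 bits of the printed R04 (fff4) are the page offset, that R05 (3) is the number of bytes still to be copied, and that the faulting instruction is the 8-byte "ld r9,8(r4)" at memcpy+0x74. The function name is made up.

```python
PAGE_SIZE = 64 * 1024      # CONFIG_PPC_64K_PAGES=y


def tail_load_span(src_offset: int, disp: int, width: int) -> tuple:
    """Return (start, end) page offsets touched by a load at disp(src)."""
    start = src_offset + disp
    return start, start + width


r4_offset = 0xFFF4         # assumed: low 16 bits of R04
remaining = 0x3            # assumed: R05, bytes still to copy

start, end = tail_load_span(r4_offset, disp=8, width=8)
needed_end = start + remaining

print(f"load covers    {start:#x}..{end - 1:#x}")
print(f"bytes needed   {start:#x}..{needed_end - 1:#x}")
print(f"bytes past the page end: {max(0, end - PAGE_SIZE)}")
```

Under those assumptions the three bytes actually wanted sit at offsets 0xfffc to 0xfffe, still inside the page and matching the end of the "system_u:object_r:usr_t:s0" value in the dump (R29 = 0x1b = 27 bytes), while the 8-byte load runs four bytes into the next page. That would be consistent with the reported fault address at the start of the following page.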