I'm trying to boot Linux 2.6.22.9 on an mpc860c rev d4. When init tries to spawn sh, the kernel oopses during the exec, as seen below:

## Starting application at 0x00400000 ...

loaded at: 00400000 004EF15C
board data at: 03F9FBC0 03F9FBFC
relocated to: 00404044 00404080
zimage at: 00404E74 004EC662
avail ram: 004F0000 04000000

Linux/PPC load: console=ttyCPM,38400
Uncompressing Linux...done.
Now booting the kernel
Linux version 2.6.22.9 ([EMAIL PROTECTED]) (gcc version 4.2.1) #113 Wed Nov 21 10:49:36 PST 2007
Zone PFN ranges:
  DMA 0 -> 16384
  Normal 16384 -> 16384
early_node_map[1] active PFN ranges
  0: 0 -> 16384
Built 1 zonelists. Total pages: 16256
Kernel command line: console=ttyCPM,38400
PID hash table entries: 256 (order: 8, 1024 bytes)
Decrementer Frequency = 183750000/60
Console: colour dummy device 80x25
cpm_uart: console: compat mode
Dentry cache hash table entries: 8192 (order: 3, 32768 bytes)
Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
Memory: 63244k available (880k kernel code, 268k data, 444k init, 0k highmem)
Mount-cache hash table entries: 512
ADDSI: Init
io scheduler noop registered (default)
Serial: CPM driver $Revision: 0.02 $
ttyCPM0 at MMIO 0xc5000a80 (irq = 20) is a CPM UART
mice: PS/2 mouse device common for all mice
Freeing unused kernel memory: 444k init
init started: BusyBox v1.8.0 (2007-11-16 14:24:51 PST)
starting pid 103, tty '': '/bin/sh'
Oops: kernel access of bad area, sig: 11 [#1]
NIP: c0044ed0 LR: c0044ff0 CTR: 00000001
REGS: c3c0bd00 TRAP: 0300 Not tainted (2.6.22.9)
MSR: 00009032 <EE,ME,IR,DR> CR: 30099099 XER: a0008c7f
DAR: ff80103f, DSISR: c0000000
TASK = c0288070[103] 'init' THREAD: c3c0a000
GPR00: c0044ff0 c3c0bdb0 c0288070 ff800fff 00000000 7faf8000 00000000 00000000
GPR08: c01a8f58 c017d91c 00000002 c0179cd0 30099093 1007687c 00000002 c00f8744
GPR16: 00000000 c00f0a64 c011d1ac c00f0aa4 c00f0a90 c0120000 00000001 00000003
GPR24: c3c1ce00 00000000 c0180000 c0247550 00000000 c3c0bdc8 c0179cd0 ff800fff
NIP [c0044ed0] remove_vma+0x14/0x70
LR [c0044ff0] exit_mmap+0xc4/0xf0
Call Trace:
[c3c0bdb0] [c3c0bdc8] 0xc3c0bdc8 (unreliable)
[c3c0bdc0] [c0044ff0] exit_mmap+0xc4/0xf0
[c3c0bdf0] [c000f74c] mmput+0x50/0xd4
[c3c0be00] [c00591f4] flush_old_exec+0x3b8/0x7a8
[c3c0be50] [c0086cc0] load_elf_binary+0x2e8/0x1454
[c3c0bee0] [c005892c] search_binary_handler+0x58/0x12c
[c3c0bf00] [c0059d64] do_execve+0x13c/0x1f0
[c3c0bf20] [c00089b4] sys_execve+0x50/0x90
[c3c0bf40] [c0002a40] ret_from_syscall+0x0/0x38
Instruction dump:
7d808120 38210040 4e800020 83c30000 4bffff18 38a00000 4bffff9c 7c0802a6
9421fff0 bfc10008 90010014 7c7f1b78 <81230040> 83c3000c 2f890000 419e0018

The interesting thing is that r3 holds a bogus pointer (ff800fff). While tracing this problem down, I replaced the remove_vma function with the following:

/*
 * Close a vm structure and free it, returning the next.
 */
static struct vm_area_struct * __attribute__((__noinline__))
__remove_vma(struct vm_area_struct *vma)
{
	struct vm_area_struct *next = vma->vm_next;

	might_sleep();
	if (vma->vm_ops && vma->vm_ops->close)
		vma->vm_ops->close(vma);
	if (vma->vm_file)
		fput(vma->vm_file);
	mpol_free(vma_policy(vma));
	kmem_cache_free(vm_area_cachep, vma);
	return next;
}

static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
{
	asm volatile (
		"lis 4,-128\n"		/* r4 = 0xff800000 */
		"ori 4,4,4095\n"	/* r4 = 0xff800fff, the bad value */
		"tweq 3,4\n"		/* trap if r3 is already bad on entry */
		"lwz 5,0(1)\n"		/* load through the stack pointer (r1) */
		"tweq 3,4\n"		/* trap if r3 went bad during the load */
	);
	return __remove_vma( vma );
}

With this code, the kernel oopses on the *second* tweq instruction:

Kernel BUG at c0045fd4 [verbose debug info unavailable]
Oops: Exception in kernel mode, sig: 5 [#1]
NIP: c0045fd4 LR: c00460a0 CTR: 00000001
REGS: c3c0bd10 TRAP: 0700 Not tainted (2.6.22.9)
MSR: 00029032 <EE,ME,IR,DR> CR: 30099099 XER: a0008c7f
TASK = c0292b40[103] 'init' THREAD: c3c0a000
GPR00: 00000001 c3c0bdc0 c0292b40 ff800fff ff800fff c3c0bdf0 00000000 00000000
GPR08: c0219398 c017d91c 00000002 c0179cd0 30099093 1007687c 00000002 c00f8744
GPR16: 00000000 c00f0a64 c011d1ac c00f0aa4 c00f0a90 c0120000 00000001 00000003
GPR24: c3c32e00 00000000 c0180000 c0247080 00000000 c3c0bdc8 c0179cd0 c017641c
NIP [c0045fd4] remove_vma+0x10/0x18
LR [c00460a0] exit_mmap+0xc4/0xf0
Call Trace:
[c3c0bdc0] [c0046074] exit_mmap+0x98/0xf0 (unreliable)
[c3c0bdf0] [c000f74c] mmput+0x50/0xd4
[c3c0be00] [c005920c] flush_old_exec+0x3b8/0x7a8
[c3c0be50] [c0086cd8] load_elf_binary+0x2e8/0x1454
[c3c0bee0] [c0058944] search_binary_handler+0x58/0x12c
[c3c0bf00] [c0059d7c] do_execve+0x13c/0x1f0
[c3c0bf20] [c00089b4] sys_execve+0x50/0x90
[c3c0bf40] [c0002a40] ret_from_syscall+0x0/0x38
Instruction dump:
7fe4fb78 4800a0ed 80010014 7fc3f378 7c0803a6 bbc10008 38210010 4e800020
3c80ff80 60840fff 7c832008 80a10000 <7c832008> 4bffff7c 7c0802a6 9421ffd0

The memory access through r1 seems to corrupt r3, and always with the same value (ff800fff).

The problem isn't necessarily here, though. If I modify my remove_vma function to cause and correct the problem (by saving r3 prior to the memory access and restoring it afterwards), I just get the same problem in some other part of the code. The oops is always caused by the base register of some memory access being set to ff800fff.

I applied a recent patch I found that corrects the address returned by cpm_dpram_addr and its use in cpm_uart_cpm1.h, and I've created my own platform setup file by copying enough of the mpc866ads setup to get the console UART (SMC1) to work.

If there is any other information I can or need to provide, let me know. Any help would be greatly appreciated.

Thanks,
John
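
For reference, the constant that the lis/ori pair in the debug remove_vma loads into r4 can be checked on any host with a short C sketch. The helper name lis_ori is hypothetical and only models the arithmetic of the two instructions on a 32-bit register; it is not kernel code and says nothing about where the bad value comes from.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): models what "lis rD,hi" followed
 * by "ori rD,rD,lo" leaves in a 32-bit PowerPC register.
 */
static uint32_t lis_ori(int16_t hi, uint16_t lo)
{
	/* lis: put the 16-bit immediate in the upper halfword, clear the lower */
	uint32_t reg = (uint32_t)(uint16_t)hi << 16;

	/* ori: OR in the zero-extended 16-bit immediate */
	return reg | lo;
}
```

With the operands used in the asm block, lis_ori(-128, 4095) evaluates to 0xff800fff. That matches r4 in the second register dump and the words 3c80ff80 60840fff in its instruction dump, so the tweq really is comparing r3 against the same value seen in the first oops.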