Re: system without RAM on node0 boot fail
On Feb 1, 2008 10:11 AM, dean gaudet <[EMAIL PROTECTED]> wrote:
> actually yeah i've seen this... in a bizarre failure situation in a system
> which physically had RAM in the boot node but it was never enumerated for
> the kernel (other nodes had RAM which was enumerated).
>
> so technically there was boot node RAM but the kernel never saw it.

The BIOS sometimes disables DIMMs on a node when it thinks a DIMM is bad and caused an MCE error during the last boot.

YH
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: system without RAM on node0 boot fail
actually yeah i've seen this... in a bizarre failure situation in a system
which physically had RAM in the boot node but it was never enumerated for
the kernel (other nodes had RAM which was enumerated).

so technically there was boot node RAM but the kernel never saw it.

-dean

On Wed, 30 Jan 2008, Christoph Lameter wrote:

> x86 supports booting from a node without RAM?
Re: system without RAM on node0 boot fail
On Thursday 31 January 2008 08:22:15 H. Peter Anvin wrote:
> Yinghai Lu wrote:
> > On Jan 30, 2008 10:09 PM, H. Peter Anvin <[EMAIL PROTECTED]> wrote:
> >> Christoph Lameter wrote:
> >>> x86 supports booting from a node without RAM?
> >
> > it is a two sockets system. only 4G RAM installed on node1.
> >
>
> "Node 1" is the boot CPU, though, right?
>
> I don't know if the spec requires node 0 to be the boot node. Probably not.

There is no spec I know of that completely defines "nodes" on x86. Actually I think only Linux calls them that. There is the ACPI 3.0 SRAT spec that defines memory affinity, but I don't think it has any requirements about where the memory must be.

Even if there were a spec, people who actually put in DIMMs tend to violate it. It seems to be not totally uncommon to just stuff them all into the same corner of the motherboard to give a "tidy appearance" (for non physicists :-) and that usually results in memoryless nodes.

Anyway, this area is something that regresses regularly. I had fixed it several times and tested all cases on SimNow, but after some time it tends to bit rot again, unfortunately. The people who usually test kernels probably know where to put the DIMMs in.

Probably just happened again.

-Andi
Re: system without RAM on node0 boot fail
Yinghai Lu wrote:
> On Jan 30, 2008 10:09 PM, H. Peter Anvin <[EMAIL PROTECTED]> wrote:
>> Christoph Lameter wrote:
>>> x86 supports booting from a node without RAM?
>
> it is a two sockets system. only 4G RAM installed on node1.
>

"Node 1" is the boot CPU, though, right?

I don't know if the spec requires node 0 to be the boot node. Probably not.

-hpa
Re: system without RAM on node0 boot fail
On Jan 30, 2008 10:09 PM, H. Peter Anvin <[EMAIL PROTECTED]> wrote:
> Christoph Lameter wrote:
> > x86 supports booting from a node without RAM?

It is a two-socket system; only 4G of RAM is installed, on node 1.

> From the looks of it I would say he probably has the boot node numbered 1.
>
> The e820 map is also "interesting" - doesn't list the first 256 bytes,
> which corresponds to the first quarter(!) of the real-mode exception table.

I kexec'ed that kernel from 2.6.24 (with discontinuous and slab), so that e820 map was passed by kexec from the first kernel. A normal pxeboot will have the map start at 0.

YH
Re: system without RAM on node0 boot fail
Christoph Lameter wrote:
> x86 supports booting from a node without RAM?

From the looks of it I would say he probably has the boot node numbered 1.

The e820 map is also "interesting" - doesn't list the first 256 bytes, which corresponds to the first quarter(!) of the real-mode exception table.

-hpa
Re: system without RAM on node0 boot fail
x86 supports booting from a node without RAM?
Re: system without RAM on node0 boot fail
With current x86.git:

Command line: apic=debug acpi.debug_level=0x000F debug initcall_debug pci=routeirq ramdisk_size=131072 root=/dev/ram0 rw ip=dhcp console=uart8250,io,0x3f8,115200n8
BIOS-provided physical RAM map:
 BIOS-e820: 0100 - 0009bc00 (usable)
 BIOS-e820: 0009bc00 - 000a (reserved)
 BIOS-e820: 000e - 0010 (reserved)
 BIOS-e820: 0010 - dcff (usable)
 BIOS-e820: dcff - dcffe000 (ACPI data)
 BIOS-e820: dcffe000 - dd00 (ACPI NVS)
 BIOS-e820: e000 - f000 (reserved)
 BIOS-e820: fec0 - fec01000 (reserved)
 BIOS-e820: ff70 - 0001 (reserved)
 BIOS-e820: 0001 - 00012300 (usable)
Early serial console at I/O port 0x3f8 (options '115200n8')
console [uart0] enabled
end_pfn_map = 1191936
DMI present.
...
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
SRAT: Node 1 PXM 1 0-a
SRAT: Node 1 PXM 1 0-dd00
SRAT: Node 1 PXM 1 0-12300
ACPI: SLIT: nodes = 2
 10 13
 13 10
mapped APIC to ff5fb000 (fee0)
Bootmem setup node 1 -00012300
  NODE_DATA [e000 - 00014fff]
  bootmap [00015000 - 000395ff] pages 25
early res: 0 [0-fff] BIOS data page
early res: 1 [6000-7fff] SMP_TRAMPOLINE
early res: 2 [20-d9c273] TEXT DATA BSS
early res: 3 [7e6f4000-7fff3a25] RAMDISK
early res: 4 [9bc00-9dbff] EBDA
early res: 5 [8000-dfff] PGTABLE
Could not find start_pfn for node 0
Pid: 0, comm: swapper Not tainted 2.6.24-smp-04921-gbce08dc-dirty #43

Call Trace:
 [] free_area_init_node+0x22/0x381
 [] generic_swap+0x0/0x17
 [] find_zone_movable_pfns_for_nodes+0x54/0x271
 [] free_area_init_nodes+0x239/0x287
 [] paging_init+0x46/0x4c
 [] setup_arch+0x3c3/0x44e
 [] start_kernel+0x6f/0x2c7
 [] _sinittext+0x1e1/0x1e8

RIP 0x10

2.6.24 with discontinuous and slab works well.
2.6.24 with sparse and slub gets this oops:

ehci_hcd :00:02.1: EHCI Host Controller
Unable to handle kernel paging request at 3078 RIP:
 [] __alloc_pages+0x7d/0x33a
PGD 0
Oops: [1] SMP
CPU 3
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.24-smp #1
RIP: 0010:[] [] __alloc_pages+0x7d/0x33a
RSP: 0018:810122a55bc0 EFLAGS: 00010246
RAX: RBX: RCX: 0002
RDX: 3070 RSI: RDI: 00d0
RBP: 00d0 R08: 810001025be0 R09: 810122a55b00
R10: 8085e2a0 R11: 00a0 R12: 3070
R13: R14: 00d0 R15: 810122a52000
FS: () GS:810122c02f00() knlGS:
CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b
CR2: 3078 CR3: 00201000 CR4: 06e0
DR0: DR1: DR2:
DR3: DR6: 0ff0 DR7: 0400
Process swapper (pid: 1, threadinfo 810122a54000, task 810122a52000)
Stack: 810122a55d20 80455b55 0080 3078 22a55d10 001200d0 81011f5ea000 0020 fff2
Call Trace:
 [] new_slab+0xdd/0x236
 [] __slab_alloc+0x1a7/0x397
 [] dma_pool_create+0x86/0x147
 [] kmem_cache_alloc_node+0x3e/0x6d
 [] dma_pool_create+0x86/0x147
 [] hcd_buffer_create+0x57/0x89
 [] compat_blkdev_ioctl+0xd72/0x11f4
 [] usb_add_hcd+0x72/0x59f
 [] usb_hcd_pci_probe+0x1e4/0x28b
 [] pci_device_probe+0xd1/0x136
 [] driver_probe_device+0xd3/0x150
 [] __driver_attach+0x0/0x93
 [] __driver_attach+0x5a/0x93
 [] bus_for_each_dev+0x43/0x6e
 [] bus_add_driver+0x79/0x1bd
 [] __pci_register_driver+0x5b/0x8d
 [] kernel_init+0x175/0x2e1
 [] child_rip+0xa/0x12
 [] kernel_init+0x0/0x2e1
 [] child_rip+0x0/0x12

Code: 49 83 7c 24 08 00 75 0e 48 c7 44 24 38 00 00 00 00 e9 93 02
RIP [] __alloc_pages+0x7d/0x33a
 RSP
CR2: 3078
---[ end trace c08baa60a7f2ad32 ]---
Kernel panic - not syncing: Attempted to kill init!
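To make the "Could not find start_pfn for node 0" line in this report easier to follow, here is a small Python model of a per-node start-pfn lookup. It is only a sketch, not the kernel's code: the names `EARLY_NODE_MAP`, `min_pfn_for_node` and `nodes_with_memory` are made up for illustration, and the ranges are hypothetical except for the final end pfn 1191936, which is the `end_pfn_map` value from the log. The point is that when every range belongs to node 1, a lookup for node 0 has nothing to return, so callers have to treat that case explicitly.

```python
from typing import List, Optional, Tuple

# (node id, start_pfn, end_pfn). Hypothetical layout for illustration: every
# range belongs to node 1. Only the final end pfn, 1191936 (0x123000), is
# taken from the "end_pfn_map" line in the log.
NodeRange = Tuple[int, int, int]

EARLY_NODE_MAP: List[NodeRange] = [
    (1, 0x1, 0x9B),
    (1, 0x100, 0xDCFF0),
    (1, 0x100000, 0x123000),
]


def min_pfn_for_node(node_map: List[NodeRange], nid: int) -> Optional[int]:
    """Return the lowest start pfn registered for nid, or None if it has no memory."""
    starts = [start for node, start, end in node_map if node == nid and end > start]
    return min(starts) if starts else None


def nodes_with_memory(node_map: List[NodeRange], nids: List[int]) -> List[int]:
    """Keep only the nodes for which a start pfn can be found."""
    return [nid for nid in nids if min_pfn_for_node(node_map, nid) is not None]
```

With this map, `min_pfn_for_node(EARLY_NODE_MAP, 0)` returns `None`, which is the situation the warning reports, while node 1 resolves normally.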
system without RAM on node0 boot fail
A system with no RAM installed on node 0 gets the following at boot:

SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
SRAT: Node 1 PXM 1 0-a
SRAT: Node 1 PXM 1 0-dd00
SRAT: Node 1 PXM 1 0-12300
Bootmem setup node 1 -00012300
  NODE_DATA [e000 - 00014fff]
  bootmap [00015000 - 000395ff] pages 25
early res: 0 [0-fff] BIOS data page
early res: 1 [6000-7fff] SMP_TRAMPOLINE
early res: 2 [20-d9c273] TEXT DATA BSS
early res: 3 [7e70-7a25] RAMDISK
early res: 4 [9bc00-9dbff] EBDA
early res: 5 [8000-dfff] PGTABLE
Could not find start_pfn for node 0
Pid: 0, comm: swapper Not tainted 2.6.24-smp-04921-gbce08dc-dirty #42

Call Trace:
 [] free_area_init_node+0x22/0x381
 [] generic_swap+0x0/0x17
 [] find_zone_movable_pfns_for_nodes+0x54/0x271
 [] free_area_init_nodes+0x239/0x287
 [] paging_init+0x46/0x4c
 [] setup_arch+0x3c3/0x44e
 [] start_kernel+0x6f/0x2c7
 [] _sinittext+0x1e1/0x1e8

RIP 0x10
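For anyone triaging a similar boot log, the SRAT lines above are enough to spot the problem: CPUs are assigned to nodes 0 and 1, but every memory range is assigned to node 1. A rough Python sketch that pulls this out of a log follows. The helper name `memoryless_nodes` and the two regular expressions are assumptions written against the line format shown in this report, not a general dmesg parser.

```python
import re
from typing import List

# Patterns written against the two SRAT line shapes in this report:
#   "SRAT: PXM 0 -> APIC 0 -> Node 0"   (CPU affinity)
#   "SRAT: Node 1 PXM 1 0-a"            (memory affinity)
CPU_RE = re.compile(r"SRAT: PXM (\d+) -> APIC (\d+) -> Node (\d+)")
MEM_RE = re.compile(r"SRAT: Node (\d+) PXM (\d+) (\S+)")

SAMPLE_LOG = """\
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 0 -> APIC 1 -> Node 0
SRAT: PXM 1 -> APIC 2 -> Node 1
SRAT: PXM 1 -> APIC 3 -> Node 1
SRAT: Node 1 PXM 1 0-a
SRAT: Node 1 PXM 1 0-dd00
SRAT: Node 1 PXM 1 0-12300
"""


def memoryless_nodes(log: str) -> List[int]:
    """Return nodes that have CPUs assigned but no memory range in the log."""
    cpu_nodes = {int(m.group(3)) for m in CPU_RE.finditer(log)}
    mem_nodes = {int(m.group(1)) for m in MEM_RE.finditer(log)}
    return sorted(cpu_nodes - mem_nodes)


if __name__ == "__main__":
    print(memoryless_nodes(SAMPLE_LOG))  # prints [0]
```

Run against the lines from this report, it flags node 0 as the node with CPUs but no memory.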