To analyze this, one needs the logs. And a bugzilla is a good placeholder for the logs.
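Before attaching the full logs, it can help to preview which lines are cluster-related. The following is only a minimal sketch, not an official OCFS2 tool: the function name `cluster_lines` and the tag list are assumptions chosen for illustration, and the full, unfiltered /var/log/messages from every node is still what belongs in the bugzilla.

```python
import re

# Hypothetical tag list: subsystem names that OCFS2-related kernel messages
# often contain. This is an assumption; adjust it for your own setup.
CLUSTER_TAGS = ("ocfs2", "o2net", "o2dlm", "o2hb", "o2cb")


def cluster_lines(log_text, tags=CLUSTER_TAGS):
    """Return the lines of log_text that mention any tag (case-insensitive)."""
    pattern = re.compile("|".join(re.escape(tag) for tag in tags), re.IGNORECASE)
    return [line for line in log_text.splitlines() if pattern.search(line)]


# Example use on one node's log (the path is whatever your syslog writes to):
#     with open("/var/log/messages", errors="replace") as fh:
#         for line in cluster_lines(fh.read()):
#             print(line)
```

The filter only narrows what to read first; it does not replace the complete logs, since the relevant context may sit in lines that carry none of these tags.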
On Dec 1, 2011, at 6:05 PM, Tony Rios <t...@tonyrios.com> wrote:
> Sunil,
> Is submitting a bug report the only answer?
> I'm happy to send in this information, but can I take the cluster down
> entirely and sort of reset it so we can get these servers back online and
> talking again in the meanwhile?
> Tony
>
> On Dec 1, 2011, at 5:05 PM, Sunil Mushran wrote:
>
>> Node 3 is joining the domain. It is having problems getting the superblock
>> cluster lock.
>> Create a bugzilla on oss.oracle.com and attach the /var/log/messages from
>> all nodes.
>> If you have netconsole set up, attach those logs too.
>>
>> On 12/01/2011 04:55 PM, Tony Rios wrote:
>>> I'm having an issue today where I just can't seem to keep all the servers
>>> in the cluster online.
>>> They aren't losing network connectivity and I can ping the iSCSI host just
>>> fine and the host is logged in.
>>>
>>> These are the errors from the dmesg when I try to mount the filesystem:
>>>
>>> root@pedge36:~# dmesg
>>> [ 0.000000] Initializing cgroup subsys cpuset
>>> [ 0.000000] Initializing cgroup subsys cpu
>>> [ 0.000000] Linux version 2.6.38-10-generic (buildd@yellow) (gcc version
>>> 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) ) #46-Ubuntu SMP Tue Jun 28 15:07:17
>>> UTC 2011 (Ubuntu 2.6.38-10.46-generic 2.6.38.7)
>>> [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-2.6.38-10-generic
>>> root=UUID=3cd859b8-2605-4a38-8767-a6d1f99d53bd ro debug ignore_loglevel
>>> [ 0.000000] BIOS-provided physical RAM map:
>>> [ 0.000000] BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
>>> [ 0.000000] BIOS-e820: 0000000000100000 - 00000000effc0000 (usable)
>>> [ 0.000000] BIOS-e820: 00000000effc0000 - 00000000effcfc00 (ACPI data)
>>> [ 0.000000] BIOS-e820: 00000000effcfc00 - 00000000effff000 (reserved)
>>> [ 0.000000] BIOS-e820: 00000000f0000000 - 00000000f4000000 (reserved)
>>> [ 0.000000] BIOS-e820: 00000000fec00000 - 00000000fed00400 (reserved)
>>> [ 0.000000] BIOS-e820: 00000000fed13000 - 00000000feda0000 (reserved)
>>> [ 0.000000] BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
>>> [ 0.000000] BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
>>> [ 0.000000] BIOS-e820: 0000000100000000 - 00000001ffffe000 (usable)
>>> [ 0.000000] BIOS-e820: 00000001ffffe000 - 0000000200000000 (reserved)
>>> [ 0.000000] BIOS-e820: 0000000200000000 - 0000000210000000 (usable)
>>> [ 0.000000] debug: ignoring loglevel setting.
>>> [ 0.000000] NX (Execute Disable) protection: active
>>> [ 0.000000] DMI 2.3 present.
>>> [ 0.000000] DMI: Dell Computer Corporation PowerEdge 850/0Y8628, BIOS
>>> A04 08/22/2006
>>> [ 0.000000] e820 update range: 0000000000000000 - 0000000000010000
>>> (usable) ==> (reserved)
>>> [ 0.000000] e820 remove range: 00000000000a0000 - 0000000000100000
>>> (usable)
>>> [ 0.000000] No AGP bridge found
>>> [ 0.000000] last_pfn = 0x210000 max_arch_pfn = 0x400000000
>>> [ 0.000000] MTRR default type: uncachable
>>> [ 0.000000] MTRR fixed ranges enabled:
>>> [ 0.000000] 00000-9FFFF write-back
>>> [ 0.000000] A0000-BFFFF uncachable
>>> [ 0.000000] C0000-CBFFF write-protect
>>> [ 0.000000] CC000-EBFFF uncachable
>>> [ 0.000000] EC000-FFFFF write-protect
>>> [ 0.000000] MTRR variable ranges enabled:
>>> [ 0.000000] 0 base 000000000 mask E00000000 write-back
>>> [ 0.000000] 1 base 200000000 mask FF0000000 write-back
>>> [ 0.000000] 2 base 0F0000000 mask FF0000000 uncachable
>>> [ 0.000000] 3 disabled
>>> [ 0.000000] 4 disabled
>>> [ 0.000000] 5 disabled
>>> [ 0.000000] 6 disabled
>>> [ 0.000000] 7 disabled
>>> [ 0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new
>>> 0x7010600070106
>>> [ 0.000000] e820 update range: 00000000f0000000 - 0000000100000000
>>> (usable) ==> (reserved)
>>> [ 0.000000] last_pfn = 0xeffc0 max_arch_pfn = 0x400000000
>>> [ 0.000000] found SMP MP-table at [ffff8800000fe710] fe710
>>> [ 0.000000] initial memory mapped : 0 - 20000000
>>> [ 0.000000] init_memory_mapping: 0000000000000000-00000000effc0000
>>> [ 0.000000] 0000000000 - 00efe00000 page 2M
>>> [ 0.000000] 00efe00000 - 00effc0000 page 4k
>>> [ 0.000000] kernel direct mapping tables up to effc0000 @
>>> 1fffa000-20000000
>>> [ 0.000000] init_memory_mapping: 0000000100000000-0000000210000000
>>> [ 0.000000] 0100000000 - 0210000000 page 2M
>>> [ 0.000000] kernel direct mapping tables up to 210000000 @
>>> effb6000-effc0000
>>> [ 0.000000] RAMDISK: 366d0000 - 37360000
>>> [ 0.000000] ACPI: RSDP 00000000000fd160 00014 (v00 DELL )
>>> [ 0.000000] ACPI: RSDT 00000000000fd174 00038 (v01 DELL PE850
>>> 00000001 MSFT 0100000A)
>>> [ 0.000000] ACPI: FACP 00000000000fd1b8 00074 (v01 DELL PE850
>>> 00000001 MSFT 0100000A)
>>> [ 0.000000] ACPI: DSDT 00000000effc0000 01C19 (v01 DELL PE830
>>> 00000001 MSFT 0100000E)
>>> [ 0.000000] ACPI: FACS 00000000effcfc00 00040
>>> [ 0.000000] ACPI: APIC 00000000000fd22c 00074 (v01 DELL PE850
>>> 00000001 MSFT 0100000A)
>>> [ 0.000000] ACPI: SPCR 00000000000fd2a0 00050 (v01 DELL PE850
>>> 00000001 MSFT 0100000A)
>>> [ 0.000000] ACPI: HPET 00000000000fd2f0 00038 (v01 DELL PE830
>>> 00000001 MSFT 0100000A)
>>> [ 0.000000] ACPI: MCFG 00000000000fd328 0003C (v01 DELL PE830
>>> 00000001 MSFT 0100000A)
>>> [ 0.000000] ACPI: Local APIC address 0xfee00000
>>> [ 0.000000] No NUMA configuration found
>>> [ 0.000000] Faking a node at 0000000000000000-0000000210000000
>>> [ 0.000000] Initmem setup node 0 0000000000000000-0000000210000000
>>> [ 0.000000] NODE_DATA [00000001ffff9000 - 00000001ffffdfff]
>>> [ 0.000000] [ffffea0000000000-ffffea00073fffff] PMD ->
>>> [ffff8801f7e00000-ffff8801feffffff] on node 0
>>> [ 0.000000] Zone PFN ranges:
>>> [ 0.000000] DMA 0x00000010 -> 0x00001000
>>> [ 0.000000] DMA32 0x00001000 -> 0x00100000
>>> [ 0.000000] Normal 0x00100000 -> 0x00210000
>>> [ 0.000000] Movable zone start PFN for each node
>>> [ 0.000000] early_node_map[4] active PFN ranges
>>> [ 0.000000] 0: 0x00000010 -> 0x000000a0
>>> [ 0.000000] 0: 0x00000100 -> 0x000effc0
>>> [ 0.000000] 0: 0x00100000 -> 0x001ffffe
>>> [ 0.000000] 0: 0x00200000 -> 0x00210000
>>> [ 0.000000] On node 0 totalpages: 2096974
>>> [ 0.000000] DMA zone: 56 pages used for memmap
>>> [ 0.000000] DMA zone: 7 pages reserved
>>> [ 0.000000] DMA zone: 3921 pages, LIFO batch:0
>>> [ 0.000000] DMA32 zone: 14280 pages used for memmap
>>> [ 0.000000] DMA32 zone: 964600 pages, LIFO batch:31
>>> [ 0.000000] Normal zone: 15232 pages used for memmap
>>> [ 0.000000] Normal zone: 1098878 pages, LIFO batch:31
>>> [ 0.000000] ACPI: PM-Timer IO Port: 0x808
>>> [ 0.000000] ACPI: Local APIC address 0xfee00000
>>> [ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
>>> [ 0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
>>> [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
>>> [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
>>> [ 0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
>>> [ 0.000000] IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI
>>> 0-23
>>> [ 0.000000] ACPI: IOAPIC (id[0x03] address[0xfec10000] gsi_base[32])
>>> [ 0.000000] IOAPIC[1]: apic_id 3, version 32, address 0xfec10000, GSI
>>> 32-55
>>> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
>>> [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
>>> [ 0.000000] ACPI: IRQ0 used by override.
>>> [ 0.000000] ACPI: IRQ2 used by override.
>>> [ 0.000000] ACPI: IRQ9 used by override.
>>> [ 0.000000] Using ACPI (MADT) for SMP configuration information
>>> [ 0.000000] ACPI: HPET id: 0xffffffff base: 0xfed00000
>>> [ 0.000000] SMP: Allowing 2 CPUs, 0 hotplug CPUs
>>> [ 0.000000] nr_irqs_gsi: 72
>>> [ 0.000000] PM: Registered nosave memory: 00000000000a0000 -
>>> 0000000000100000
>>> [ 0.000000] PM: Registered nosave memory: 00000000effc0000 -
>>> 00000000effcf000
>>> [ 0.000000] PM: Registered nosave memory: 00000000effcf000 -
>>> 00000000effd0000
>>> [ 0.000000] PM: Registered nosave memory: 00000000effd0000 -
>>> 00000000effff000
>>> [ 0.000000] PM: Registered nosave memory: 00000000effff000 -
>>> 00000000f0000000
>>> [ 0.000000] PM: Registered nosave memory: 00000000f0000000 -
>>> 00000000f4000000
>>> [ 0.000000] PM: Registered nosave memory: 00000000f4000000 -
>>> 00000000fec00000
>>> [ 0.000000] PM: Registered nosave memory: 00000000fec00000 -
>>> 00000000fed00000
>>> [ 0.000000] PM: Registered nosave memory: 00000000fed00000 -
>>> 00000000fed13000
>>> [ 0.000000] PM: Registered nosave memory: 00000000fed13000 -
>>> 00000000feda0000
>>> [ 0.000000] PM: Registered nosave memory: 00000000feda0000 -
>>> 00000000fee00000
>>> [ 0.000000] PM: Registered nosave memory: 00000000fee00000 -
>>> 00000000fee10000
>>> [ 0.000000] PM: Registered nosave memory: 00000000fee10000 -
>>> 00000000ffb00000
>>> [ 0.000000] PM: Registered nosave memory: 00000000ffb00000 -
>>> 0000000100000000
>>> [ 0.000000] PM: Registered nosave memory: 00000001ffffe000 -
>>> 0000000200000000
>>> [ 0.000000] Allocating PCI resources starting at f4000000 (gap:
>>> f4000000:ac00000)
>>> [ 0.000000] Booting paravirtualized kernel on bare hardware
>>> [ 0.000000] setup_percpu: NR_CPUS:256 nr_cpumask_bits:256 nr_cpu_ids:2
>>> nr_node_ids:1
>>> [ 0.000000] PERCPU: Embedded 28 pages/cpu @ffff8800efc00000 s84416 r8192
>>> d22080 u1048576
>>> [ 0.000000] pcpu-alloc: s84416 r8192 d22080 u1048576 alloc=1*2097152
>>> [ 0.000000] pcpu-alloc: [0] 0 1
>>> [ 0.000000] Built 1 zonelists in Zone order, mobility grouping on.
>>> Total pages: 2067399
>>> [ 0.000000] Policy zone: Normal
>>> [ 0.000000] Kernel command line:
>>> BOOT_IMAGE=/boot/vmlinuz-2.6.38-10-generic
>>> root=UUID=3cd859b8-2605-4a38-8767-a6d1f99d53bd ro debug ignore_loglevel
>>> [ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
>>> [ 0.000000] Checking aperture...
>>> [ 0.000000] No AGP bridge found
>>> [ 0.000000] Calgary: detecting Calgary via BIOS EBDA area
>>> [ 0.000000] Calgary: Unable to locate Rio Grande table in EBDA - bailing!
>>> [ 0.000000] Memory: 8178472k/8650752k available (5941k kernel code,
>>> 262856k absent, 209424k reserved, 5016k data, 956k init)
>>> [ 0.000000] SLUB: Genslabs=15, HWalign=64, Order=0-3, MinObjects=0,
>>> CPUs=2, Nodes=1
>>> [ 0.000000] Hierarchical RCU implementation.
>>> [ 0.000000] RCU dyntick-idle grace-period acceleration is enabled.
>>> [ 0.000000] RCU-based detection of stalled CPUs is disabled.
>>> [ 0.000000] NR_IRQS:16640 nr_irqs:512 16
>>> [ 0.000000] Console: colour dummy device 80x25
>>> [ 0.000000] console [tty0] enabled
>>> [ 0.000000] allocated 83886080 bytes of page_cgroup
>>> [ 0.000000] please try 'cgroup_disable=memory' option if you don't want
>>> memory cgroups
>>> [ 0.000000] hpet clockevent registered
>>> [ 0.000000] Fast TSC calibration using PIT
>>> [ 0.000000] Detected 3000.094 MHz processor.
>>> [ 0.010004] Calibrating delay loop (skipped), value calculated using
>>> timer frequency.. 6000.18 BogoMIPS (lpj=30000940)
>>> [ 0.010017] pid_max: default: 32768 minimum: 301
>>> [ 0.010056] Security Framework initialized
>>> [ 0.010082] AppArmor: AppArmor initialized
>>> [ 0.010088] Yama: becoming mindful.
>>> [ 0.012092] Dentry cache hash table entries: 1048576 (order: 11, 8388608
>>> bytes)
>>> [ 0.022482] Inode-cache hash table entries: 524288 (order: 10, 4194304
>>> bytes)
>>> [ 0.024244] Mount-cache hash table entries: 256
>>> [ 0.024453] Initializing cgroup subsys ns
>>> [ 0.024463] ns_cgroup deprecated: consider using the 'clone_children'
>>> flag without the ns_cgroup.
>>> [ 0.024472] Initializing cgroup subsys cpuacct
>>> [ 0.024481] Initializing cgroup subsys memory
>>> [ 0.024495] Initializing cgroup subsys devices
>>> [ 0.024501] Initializing cgroup subsys freezer
>>> [ 0.024507] Initializing cgroup subsys net_cls
>>> [ 0.024512] Initializing cgroup subsys blkio
>>> [ 0.024574] CPU: Physical Processor ID: 0
>>> [ 0.024580] CPU: Processor Core ID: 0
>>> [ 0.024586] mce: CPU supports 4 MCE banks
>>> [ 0.024603] CPU0: Thermal monitoring enabled (TM1)
>>> [ 0.024612] using mwait in idle threads.
>>> [ 0.027748] ACPI: Core revision 20110112
>>> [ 0.029308] ftrace: allocating 24323 entries in 96 pages
>>> [ 0.030085] Setting APIC routing to flat
>>> [ 0.030516] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
>>> [ 0.136419] CPU0: Intel(R) Pentium(R) D CPU 3.00GHz stepping 04
>>> [ 0.140000] Performance Events: Netburst events, Netburst P4/Xeon PMU
>>> driver.
>>> [ 0.140000] ... version: 0
>>> [ 0.140000] ... bit width: 40
>>> [ 0.140000] ... generic registers: 18
>>> [ 0.140000] ... value mask: 000000ffffffffff
>>> [ 0.140000] ... max period: 0000007fffffffff
>>> [ 0.140000] ... fixed-purpose events: 0
>>> [ 0.140000] ... event mask: 000000000003ffff
>>> [ 0.140000] Booting Node 0, Processors #1 Ok.
>>> [ 0.300021] Brought up 2 CPUs
>>> [ 0.300030] Total of 2 processors activated (12000.49 BogoMIPS).
>>> [ 0.300847] devtmpfs: initialized
>>> [ 0.302451] print_constraints: dummy:
>>> [ 0.302485] Time: 0:41:31 Date: 12/02/11
>>> [ 0.302546] NET: Registered protocol family 16
>>> [ 0.302672] Trying to unpack rootfs image as initramfs...
>>> [ 0.310474] ACPI: bus type pci registered
>>> [ 0.310570] PCI: MMCONFIG for domain 0000 [bus 00-3f] at [mem
>>> 0xf0000000-0xf3ffffff] (base 0xf0000000)
>>> [ 0.310580] PCI: MMCONFIG at [mem 0xf0000000-0xf3ffffff] reserved in E820
>>> [ 0.340577] PCI: Using configuration type 1 for base access
>>> [ 0.342112] bio: create slab<bio-0> at 0
>>> [ 0.342934] ACPI: EC: Look up EC in DSDT
>>> [ 0.345243] ACPI: Interpreter enabled
>>> [ 0.345252] ACPI: (supports S0 S4 S5)
>>> [ 0.345278] ACPI: Using IOAPIC for interrupt routing
>>> [ 0.349231] ACPI: No dock devices found.
>>> [ 0.349239] HEST: Table not found.
>>> [ 0.349246] PCI: Ignoring host bridge windows from ACPI; if necessary,
>>> use "pci=use_crs" and report a bug
>>> [ 0.349794] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
>>> [ 0.350838] pci_root PNP0A03:00: host bridge window [io 0x0000-0x0cf7]
>>> (ignored)
>>> [ 0.350848] pci_root PNP0A03:00: host bridge window [io 0x0d00-0xffff]
>>> (ignored)
>>> [ 0.350856] pci_root PNP0A03:00: host bridge window [mem
>>> 0x000a0000-0x000bffff] (ignored)
>>> [ 0.350864] pci_root PNP0A03:00: host bridge window [mem
>>> 0xf0000000-0xfebfffff] (ignored)
>>> [ 0.350884] pci 0000:00:00.0: [8086:2778] type 0 class 0x000600
>>> [ 0.350946] pci 0000:00:01.0: [8086:2779] type 1 class 0x000604
>>> [ 0.350996] pci 0000:00:01.0: PME# supported from D0 D3hot D3cold
>>> [ 0.351005] pci 0000:00:01.0: PME# disabled
>>> [ 0.351066] pci 0000:00:1c.0: [8086:27d0] type 1 class 0x000604
>>> [ 0.351137] pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
>>> [ 0.351145] pci 0000:00:1c.0: PME# disabled
>>> [ 0.351178] pci 0000:00:1c.4: [8086:27e0] type 1 class 0x000604
>>> [ 0.351248] pci 0000:00:1c.4: PME# supported from D0 D3hot D3cold
>>> [ 0.351256] pci 0000:00:1c.4: PME# disabled
>>> [ 0.351285] pci 0000:00:1c.5: [8086:27e2] type 1 class 0x000604
>>> [ 0.351355] pci 0000:00:1c.5: PME# supported from D0 D3hot D3cold
>>> [ 0.351363] pci 0000:00:1c.5: PME# disabled
>>> [ 0.351391] pci 0000:00:1d.0: [8086:27c8] type 0 class 0x000c03
>>> [ 0.351443] pci 0000:00:1d.0: reg 20: [io 0xbce0-0xbcff]
>>> [ 0.351484] pci 0000:00:1d.1: [8086:27c9] type 0 class 0x000c03
>>> [ 0.351537] pci 0000:00:1d.1: reg 20: [io 0xbcc0-0xbcdf]
>>> [ 0.351577] pci 0000:00:1d.2: [8086:27ca] type 0 class 0x000c03
>>> [ 0.351629] pci 0000:00:1d.2: reg 20: [io 0xbca0-0xbcbf]
>>> [ 0.351680] pci 0000:00:1d.7: [
_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users