hast, zfs, unable to flush disk cache
I've set up a 2-node cluster using HAST. I previously had this running on 9.0 and then rebuilt the nodes using 9.1. Under 9.0 I used vfs.zfs.vdev.bio_flush_disable=1, but that setting does not appear to work in 9.1. The other difference: the previous build had the root disk on UFS, while this build has only ZFS-based file systems.

FreeBSD node1.san 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243826: Tue Dec  4 06:55:39 UTC 2012     r...@obrian.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  i386

Mar  1 17:07:25 node1 hastd[1446]: [disk5] (primary) Unable to flush disk cache on activemap update: Operation not supported by device.
Mar  1 17:07:28 node1 hastd[1440]: [disk3] (primary) Unable to flush disk cache on activemap update: Operation not supported by device.
Mar  1 17:07:28 node1 hastd[1437]: [disk2] (primary) Unable to flush disk cache on activemap update: Operation not supported by device.
Mar  1 17:07:28 node1 hastd[1434]: [disk1] (primary) Unable to flush disk cache on activemap update: Operation not supported by device.
Mar  1 17:07:28 node1 hastd[1446]: [disk5] (primary) Unable to flush disk cache on activemap update: Operation not supported by device.
Mar  1 17:07:28 node1 hastd[1443]: [disk4] (primary) Unable to flush disk cache on activemap update: Operation not supported by device.

I tried setting "zfs set zfs:zfs_nocacheflush=1 pool" but that didn't appear to work either. I get a lot of lines in my log file because of this. I have also tried "zfs set sync=disabled pool" but HAST still outputs those lines in the log file.

Any suggestions?

Thank you,
Chad

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
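[Editorial note: vfs.zfs.vdev.bio_flush_disable is a loader tunable, not a per-dataset ZFS property, which is presumably how the poster had it configured under 9.0 — it is set in /boot/loader.conf and takes effect at boot:]

```
# /boot/loader.conf
vfs.zfs.vdev.bio_flush_disable="1"
```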
Re: hast, zfs, unable to flush disk cache
On Fri, Mar 01, 2013 at 11:39:23AM -0600, Chad M Stewart wrote:

> I've set up a 2-node cluster using HAST. I previously had this running
> on 9.0 and then rebuilt the nodes using 9.1. Under 9.0 I used
> vfs.zfs.vdev.bio_flush_disable=1, but that setting does not appear to
> work in 9.1. The other difference: the previous build had the root disk
> on UFS, while this build has only ZFS-based file systems.
>
> Mar  1 17:07:25 node1 hastd[1446]: [disk5] (primary) Unable to flush disk cache on activemap update: Operation not supported by device.
> [...]
>
> I tried setting "zfs set zfs:zfs_nocacheflush=1 pool" but that didn't
> appear to work either. I get a lot of lines in my log file because of
> this. I have also tried "zfs set sync=disabled pool" but HAST still
> outputs those lines in the log file.

These flushes were generated by HAST itself when it tried to flush
activemap updates; that is why disabling BIO_FLUSH for ZFS did not help
much. Setting "metaflush off" in hast.conf should help, though.

BTW, hastd tries to detect devices that do not support BIO_FLUSH by
checking the returned errno, and automatically disables flushes if the
errno is EOPNOTSUPP ("Operation not supported"). Unfortunately, your
device returned ENODEV ("Operation not supported by device"). What
device do you have?

Pawel, do you think it would be a good idea to automatically disable
activemap flushes for the ENODEV case too?

-- 
Mikolaj Golub
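[Editorial note: a minimal hast.conf sketch with metaflush disabled. The resource and node names follow the log excerpts in this thread; the /dev/da1 provider path is an assumption — substitute your own layout. See hast.conf(5) for the full keyword list.]

```
resource disk1 {
	# Do not send BIO_FLUSH to the local provider after
	# activemap metadata updates.
	metaflush off
	on node1 {
		local /dev/da1    # assumed provider path
		remote node2
	}
	on node2 {
		local /dev/da1    # assumed provider path
		remote node1
	}
}
```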
Re: hast, zfs, unable to flush disk cache
On Mar 1, 2013, at 2:25 PM, Mikolaj Golub wrote:

> What device do you have?

It's an older HP GL380 server, I think. dmesg below; if you need output from something else, I'm willing to provide it.

Copyright (c) 1992-2012 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 9.1-RELEASE #0 r243826: Tue Dec  4 06:55:39 UTC 2012
    r...@obrian.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386
CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2784.59-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0xf27  Family = f  Model = 2  Stepping = 7
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x4400<CNXT-ID,xTPR>
real memory  = 3221225472 (3072 MB)
avail memory = 3136851968 (2991 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: <COMPAQ 0083>
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 2 package(s) x 1 core(s) x 2 HTT threads
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP/HT): APIC ID:  1
 cpu2 (AP): APIC ID:  6
 cpu3 (AP/HT): APIC ID:  7
ACPI Warning: Invalid length for Pm1aControlBlock: 32, using default 16 (20110527/tbfadt-638)
ACPI Warning: Invalid length for Pm1bControlBlock: 32, using default 16 (20110527/tbfadt-638)
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 <Version 1.1> irqs 0-15 on motherboard
ioapic1 <Version 1.1> irqs 16-31 on motherboard
ioapic2 <Version 1.1> irqs 32-47 on motherboard
ioapic3 <Version 1.1> irqs 48-63 on motherboard
kbd1 at kbdmux0
acpi0: <COMPAQ P29> on motherboard
acpi0: Power Button (fixed)
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
cpu2: <ACPI CPU> on acpi0
cpu3: <ACPI CPU> on acpi0
attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x920-0x923 on acpi0
pcib0: <ACPI Host-PCI bridge> on acpi0
pcib0: Length mismatch for 4 range: 2900 vs 28ff
pci0: <ACPI PCI bus> on pcib0
vgapci0: <VGA-compatible display> port 0x2400-0x24ff mem 0xf100-0xf1ff,0xf0ff-0xf0ff0fff at device 3.0 on pci0
pci0: <base peripheral> at device 4.0 (no driver attached)
pci0: <base peripheral> at device 4.2 (no driver attached)
isab0: <PCI-ISA bridge> at device 15.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <ServerWorks CSB5 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x2000-0x200f at device 15.1 on pci0
ata0: <ATA channel> at channel 0 on atapci0
ata1: <ATA channel> at channel 1 on atapci0
ohci0: <OHCI (generic) USB controller> mem 0xf0ef-0xf0ef0fff irq 7 at device 15.2 on pci0
usbus0 on ohci0
pcib1: <ACPI Host-PCI bridge> on acpi0
pcib1: Length mismatch for 4 range: 100 vs ff
pci1: <ACPI PCI bus> on pcib1
ciss0: <Compaq Smart Array 5i> port 0x3000-0x30ff mem 0xf2cc-0xf2cf,0xf2bf-0xf2bf3fff irq 30 at device 3.0 on pci1
ciss0: PERFORMANT Transport
pcib2: <ACPI Host-PCI bridge> on acpi0
pci2: <ACPI PCI bus> on pcib2
bge0: <Compaq NC7781 Gigabit Server Adapter, ASIC rev. 0x001002> mem 0xf2df-0xf2df irq 29 at device 1.0 on pci2
bge0: CHIP ID 0x1002; ASIC REV 0x01; CHIP REV 0x10; PCI-X 100 MHz
miibus0: <MII bus> on bge0
brgphy0: <BCM5703 1000BASE-T media interface> PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Ethernet address: 00:0b:cd:3c:fe:7f
bge1: <Compaq NC7781 Gigabit Server Adapter, ASIC rev. 0x001002> mem 0xf2de-0xf2de irq 31 at device 2.0 on pci2
bge1: CHIP ID 0x1002; ASIC REV 0x01; CHIP REV 0x10; PCI-X 100 MHz
miibus1: <MII bus> on bge1
brgphy1: <BCM5703 1000BASE-T media interface> PHY 1 on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge1: Ethernet address: 00:0b:cd:3c:fe:7e
pcib3: <ACPI Host-PCI bridge> on acpi0
pci3: <ACPI PCI bus> on pcib3
pcib4: <ACPI Host-PCI bridge> on acpi0
pcib4: Length mismatch for 4 range: 2000 vs 1fff
pci6: <ACPI PCI bus> on pcib4
bge2: <Compaq NC7770 Gigabit Server Adapter, ASIC rev. 0x000105> mem 0xf7ff-0xf7ff irq 26 at device 2.0 on pci6
bge2: CHIP ID 0x0105; ASIC REV 0x00; CHIP REV 0x01; PCI-X 33 MHz
miibus2: <MII bus> on bge2
brgphy2: <BCM5701 1000BASE-T media interface> PHY 1 on miibus2
brgphy2:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge2: Ethernet address: 00:0b:cd:52:66:f1
pci6: <base peripheral, PCI hot-plug controller> at device 30.0 (no driver attached)
acpi_tz0: <Thermal Zone> on acpi0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
uart0: 16550 or
Re: hast, zfs, unable to flush disk cache
On Fri, Mar 01, 2013 at 10:25:50PM +0200, Mikolaj Golub wrote:

> On Fri, Mar 01, 2013 at 11:39:23AM -0600, Chad M Stewart wrote:
>
>> I've set up a 2-node cluster using HAST. I previously had this running
>> on 9.0 and then rebuilt the nodes using 9.1. Under 9.0 I used
>> vfs.zfs.vdev.bio_flush_disable=1, but that setting does not appear to
>> work in 9.1.
>> [...]
>> Mar  1 17:07:25 node1 hastd[1446]: [disk5] (primary) Unable to flush disk cache on activemap update: Operation not supported by device.
>> [...]
>
> These flushes were generated by HAST itself when it tried to flush
> activemap updates; that is why disabling BIO_FLUSH for ZFS did not help
> much. Setting "metaflush off" in hast.conf should help, though.
>
> BTW, hastd tries to detect devices that do not support BIO_FLUSH by
> checking the returned errno, and automatically disables flushes if the
> errno is EOPNOTSUPP ("Operation not supported"). Unfortunately, your
> device returned ENODEV ("Operation not supported by device"). What
> device do you have?
>
> Pawel, do you think it would be a good idea to automatically disable
> activemap flushes for the ENODEV case too?

It would be better to find the driver that returns ENODEV and fix it, IMHO.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://tupytaj.pl