hast, zfs, unable to flush disk cache

2013-03-01 Thread Chad M Stewart

I've set up a 2-node cluster using HAST.  I'd previously had this running on
9.0 and then rebuilt the nodes using 9.1.  Under 9.0 I used
vfs.zfs.vdev.bio_flush_disable=1, but that setting does not appear to work in
9.1.  The other difference: the previous build had the root disk on UFS, while
this build has only ZFS-based file systems.


FreeBSD node1.san 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243826: Tue Dec  4 
06:55:39 UTC 2012 r...@obrian.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  
i386


Mar  1 17:07:25 node1 hastd[1446]: [disk5] (primary) Unable to flush disk cache 
on activemap update: Operation not supported by device.
Mar  1 17:07:28 node1 hastd[1440]: [disk3] (primary) Unable to flush disk cache 
on activemap update: Operation not supported by device.
Mar  1 17:07:28 node1 hastd[1437]: [disk2] (primary) Unable to flush disk cache 
on activemap update: Operation not supported by device.
Mar  1 17:07:28 node1 hastd[1434]: [disk1] (primary) Unable to flush disk cache 
on activemap update: Operation not supported by device.
Mar  1 17:07:28 node1 hastd[1446]: [disk5] (primary) Unable to flush disk cache 
on activemap update: Operation not supported by device.
Mar  1 17:07:28 node1 hastd[1443]: [disk4] (primary) Unable to flush disk cache 
on activemap update: Operation not supported by device.


I tried setting zfs set zfs:zfs_nocacheflush=1 pool but that didn't appear
to work either.  I get a lot of lines in my log file because of this.  I have
also tried zfs set sync=disabled pool but HAST still outputs those lines in
the log file.
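
For reference, a sketch of how the 9.0-era tunable above is applied (it also
exists as a runtime sysctl on 9.x; whether 9.1 still honors it is exactly the
question here):

```
# /boot/loader.conf
vfs.zfs.vdev.bio_flush_disable=1
```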

Any suggestions?


Thank you,
Chad

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org


Re: hast, zfs, unable to flush disk cache

2013-03-01 Thread Mikolaj Golub
On Fri, Mar 01, 2013 at 11:39:23AM -0600, Chad M Stewart wrote:

 I've set up a 2 node cluster using HAST.  I'd previously had this
 running 9.0 and then rebuilt the nodes using 9.1.  Under 9.0 I used
 vfs.zfs.vdev.bio_flush_disable=1 but that setting does not appear to
 work in 9.1.  The other difference, previous build had root disk on
 UFS, while this build has only ZFS based file systems.
 
 FreeBSD node1.san 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243826: Tue Dec  4 
 06:55:39 UTC 2012 
 r...@obrian.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  i386
 
 
 Mar  1 17:07:25 node1 hastd[1446]: [disk5] (primary) Unable to flush disk 
 cache on activemap update: Operation not supported by device.
 Mar  1 17:07:28 node1 hastd[1440]: [disk3] (primary) Unable to flush disk 
 cache on activemap update: Operation not supported by device.
 Mar  1 17:07:28 node1 hastd[1437]: [disk2] (primary) Unable to flush disk 
 cache on activemap update: Operation not supported by device.
 Mar  1 17:07:28 node1 hastd[1434]: [disk1] (primary) Unable to flush disk 
 cache on activemap update: Operation not supported by device.
 Mar  1 17:07:28 node1 hastd[1446]: [disk5] (primary) Unable to flush disk 
 cache on activemap update: Operation not supported by device.
 Mar  1 17:07:28 node1 hastd[1443]: [disk4] (primary) Unable to flush disk 
 cache on activemap update: Operation not supported by device.
 
 I tried setting zfs set zfs:zfs_nocacheflush=1 pool but that
 didn't appear to work either.  I get a lot of lines in my log file
 because of this.  I have also tried zfs set sync=disabled pool but
 HAST still outputs those lines in the log file.

These flushes were generated by HAST itself when it tried to flush
activemap updates, which is why disabling BIO_FLUSH for ZFS did not
help much.

Setting metaflush off in hast.conf should help, though.
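
A minimal sketch of an /etc/hast.conf resource with metaflush disabled
(the resource and node names follow the thread; the device paths and peer
addresses are hypothetical):

```
resource disk1 {
        # Skip BIO_FLUSH after activemap metadata updates.
        metaflush off

        on node1 {
                local /dev/da1          # hypothetical local provider
                remote 10.0.0.2         # hypothetical peer address
        }
        on node2 {
                local /dev/da1
                remote 10.0.0.1
        }
}
```

Per hast.conf(5), metaflush may also be set globally or per node.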

BTW, hastd tries to detect devices that do not support BIO_FLUSH by
checking the returned errno, and automatically disables flushes if the
errno is EOPNOTSUPP (Operation not supported).  Unfortunately, your
device returned ENODEV (Operation not supported by device).

What device do you have?

Pawel, do you think it would be a good idea to automatically disable
activemap flush for ENODEV case too?

-- 
Mikolaj Golub


Re: hast, zfs, unable to flush disk cache

2013-03-01 Thread Chad M Stewart

On Mar 1, 2013, at 2:25 PM, Mikolaj Golub wrote:

 What device do you have?

It's an older HP DL380 server, I think.  dmesg below; if you need output
from something else, I'm willing to provide it.

Copyright (c) 1992-2012 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 9.1-RELEASE #0 r243826: Tue Dec  4 06:55:39 UTC 2012
r...@obrian.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386
CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2784.59-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0xf27  Family = f  Model = 2  Stepping = 7
  
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x4400<CNXT-ID,xTPR>
real memory  = 3221225472 (3072 MB)
avail memory = 3136851968 (2991 MB)
Event timer LAPIC quality 400
ACPI APIC Table: COMPAQ 0083
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 2 package(s) x 1 core(s) x 2 HTT threads
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP/HT): APIC ID:  1
 cpu2 (AP): APIC ID:  6
 cpu3 (AP/HT): APIC ID:  7
ACPI Warning: Invalid length for Pm1aControlBlock: 32, using default 16 
(20110527/tbfadt-638)
ACPI Warning: Invalid length for Pm1bControlBlock: 32, using default 16 
(20110527/tbfadt-638)
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 Version 1.1 irqs 0-15 on motherboard
ioapic1 Version 1.1 irqs 16-31 on motherboard
ioapic2 Version 1.1 irqs 32-47 on motherboard
ioapic3 Version 1.1 irqs 48-63 on motherboard
kbd1 at kbdmux0
acpi0: COMPAQ P29 on motherboard
acpi0: Power Button (fixed)
cpu0: ACPI CPU on acpi0
cpu1: ACPI CPU on acpi0
cpu2: ACPI CPU on acpi0
cpu3: ACPI CPU on acpi0
attimer0: AT timer port 0x40-0x43 irq 0 on acpi0
Timecounter i8254 frequency 1193182 Hz quality 0
Event timer i8254 frequency 1193182 Hz quality 100
Timecounter ACPI-fast frequency 3579545 Hz quality 900
acpi_timer0: 32-bit timer at 3.579545MHz port 0x920-0x923 on acpi0
pcib0: ACPI Host-PCI bridge on acpi0
pcib0: Length mismatch for 4 range: 2900 vs 28ff
pci0: ACPI PCI bus on pcib0
vgapci0: VGA-compatible display port 0x2400-0x24ff mem 
0xf100-0xf1ff,0xf0ff-0xf0ff0fff at device 3.0 on pci0
pci0: base peripheral at device 4.0 (no driver attached)
pci0: base peripheral at device 4.2 (no driver attached)
isab0: PCI-ISA bridge at device 15.0 on pci0
isa0: ISA bus on isab0
atapci0: ServerWorks CSB5 UDMA100 controller port 
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x2000-0x200f at device 15.1 on pci0
ata0: ATA channel at channel 0 on atapci0
ata1: ATA channel at channel 1 on atapci0
ohci0: OHCI (generic) USB controller mem 0xf0ef-0xf0ef0fff irq 7 at 
device 15.2 on pci0
usbus0 on ohci0
pcib1: ACPI Host-PCI bridge on acpi0
pcib1: Length mismatch for 4 range: 100 vs ff
pci1: ACPI PCI bus on pcib1
ciss0: Compaq Smart Array 5i port 0x3000-0x30ff mem 
0xf2cc-0xf2cf,0xf2bf-0xf2bf3fff irq 30 at device 3.0 on pci1
ciss0: PERFORMANT Transport
pcib2: ACPI Host-PCI bridge on acpi0
pci2: ACPI PCI bus on pcib2
bge0: Compaq NC7781 Gigabit Server Adapter, ASIC rev. 0x001002 mem 
0xf2df-0xf2df irq 29 at device 1.0 on pci2
bge0: CHIP ID 0x1002; ASIC REV 0x01; CHIP REV 0x10; PCI-X 100 MHz
miibus0: MII bus on bge0
brgphy0: BCM5703 1000BASE-T media interface PHY 1 on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Ethernet address: 00:0b:cd:3c:fe:7f
bge1: Compaq NC7781 Gigabit Server Adapter, ASIC rev. 0x001002 mem 
0xf2de-0xf2de irq 31 at device 2.0 on pci2
bge1: CHIP ID 0x1002; ASIC REV 0x01; CHIP REV 0x10; PCI-X 100 MHz
miibus1: MII bus on bge1
brgphy1: BCM5703 1000BASE-T media interface PHY 1 on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge1: Ethernet address: 00:0b:cd:3c:fe:7e
pcib3: ACPI Host-PCI bridge on acpi0
pci3: ACPI PCI bus on pcib3
pcib4: ACPI Host-PCI bridge on acpi0
pcib4: Length mismatch for 4 range: 2000 vs 1fff
pci6: ACPI PCI bus on pcib4
bge2: Compaq NC7770 Gigabit Server Adapter, ASIC rev. 0x000105 mem 
0xf7ff-0xf7ff irq 26 at device 2.0 on pci6
bge2: CHIP ID 0x0105; ASIC REV 0x00; CHIP REV 0x01; PCI-X 33 MHz
miibus2: MII bus on bge2
brgphy2: BCM5701 1000BASE-T media interface PHY 1 on miibus2
brgphy2:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge2: Ethernet address: 00:0b:cd:52:66:f1
pci6: base peripheral, PCI hot-plug controller at device 30.0 (no driver 
attached)
acpi_tz0: Thermal Zone on acpi0
atkbdc0: Keyboard controller (i8042) port 0x60,0x64 irq 1 on acpi0
atkbd0: AT Keyboard irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
uart0: 16550 or 

Re: hast, zfs, unable to flush disk cache

2013-03-01 Thread Pawel Jakub Dawidek
On Fri, Mar 01, 2013 at 10:25:50PM +0200, Mikolaj Golub wrote:
 On Fri, Mar 01, 2013 at 11:39:23AM -0600, Chad M Stewart wrote:
 
  I've set up a 2 node cluster using HAST.  I'd previously had this
  running 9.0 and then rebuilt the nodes using 9.1.  Under 9.0 I used
  vfs.zfs.vdev.bio_flush_disable=1 but that setting does not appear to
  work in 9.1.  The other difference, previous build had root disk on
  UFS, while this build has only ZFS based file systems.
  
  FreeBSD node1.san 9.1-RELEASE FreeBSD 9.1-RELEASE #0 r243826: Tue Dec  4 
  06:55:39 UTC 2012 
  r...@obrian.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC  i386
  
  
  Mar  1 17:07:25 node1 hastd[1446]: [disk5] (primary) Unable to flush disk 
  cache on activemap update: Operation not supported by device.
  Mar  1 17:07:28 node1 hastd[1440]: [disk3] (primary) Unable to flush disk 
  cache on activemap update: Operation not supported by device.
  Mar  1 17:07:28 node1 hastd[1437]: [disk2] (primary) Unable to flush disk 
  cache on activemap update: Operation not supported by device.
  Mar  1 17:07:28 node1 hastd[1434]: [disk1] (primary) Unable to flush disk 
  cache on activemap update: Operation not supported by device.
  Mar  1 17:07:28 node1 hastd[1446]: [disk5] (primary) Unable to flush disk 
  cache on activemap update: Operation not supported by device.
  Mar  1 17:07:28 node1 hastd[1443]: [disk4] (primary) Unable to flush disk 
  cache on activemap update: Operation not supported by device.
  
  I tried setting zfs set zfs:zfs_nocacheflush=1 pool but that
  didn't appear to work either.  I get a lot of lines in my log file
  because of this.  I have also tried zfs set sync=disabled pool but
  HAST still outputs those lines in the log file.
 
 These flushes were generated by HAST itself when it tried to flush
 activemap updates, which is why disabling BIO_FLUSH for ZFS did not
 help much.
 
 Setting metaflush off in hast.conf should help though.
 
 BTW, hastd tries to detect devices that do not support BIO_FLUSH by
 checking the returned errno, and automatically disables flushes if
 the errno is EOPNOTSUPP (Operation not supported).  Unfortunately,
 your device returned ENODEV (Operation not supported by device).
 
 What device do you have?
 
 Pawel, do you think it would be a good idea to automatically disable
 activemap flush for ENODEV case too?

It would be better to find the driver that returns ENODEV and fix it, IMHO.

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://tupytaj.pl

