Re: FS of choice for max random iops ( Maildir )
Volodymyr Kostyrko c.kw...@gmail.com, 2011-09-17 14:33 (+0200): You really like to wait for hours before fsck will finish checking your volume?

While it's true that fsck on large filesystems takes ages, soft updates and background fsck make it a lot less bothersome than it used to be. -- http://hack.org/mc/ Use plain text e-mail, please. HTML messages silently dropped. OpenPGP welcome, 0xE4C92FA5. ___ freebsd-questions@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-questions To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
Re: FS of choice for max random iops ( Maildir )
free...@top-consulting.net wrote: C. TEST3 ( sequential writing ): bonnie++ -d /data -c 10 -s 8088 -n 0 -u 0 1. UFS + gjournal crashed the box

This _might_ have been caused by a too-small journal provider.
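If the journal provider really was too small, one option is to label the journal with an explicit size rather than the default. A sketch only - device name, mount point, and size are illustrative, and this needs root on FreeBSD (check gjournal(8) on your release for the accepted size forms):

```shell
# Load the journaling GEOM class and label the provider with an
# explicitly sized journal instead of the default.
kldload geom_journal
gjournal label -s 4294967296 /dev/da1    # ~4 GB journal, size in bytes
newfs -J /dev/da1.journal                # UFS with gjournal support enabled
mount -o async /dev/da1.journal /data    # async is safe under gjournal
```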
Re: FS of choice for max random iops ( Maildir )
17.09.2011 00:39, free...@top-consulting.net wrote: I even went as far as disabling the cache flush option of ZFS through this variable: vfs.zfs.cache_flush_disable: 1, since I already have the write cache of the controller. I've also set some other variables as per the Tuning guide but according to several benchmarks ( iozone, bonnie++, dd ) ZFS still comes in slower than UFS at pretty much everything.

Oh, so you are building this setup to run iozone, bonnie++ and dd continuously? You really like to wait for hours before fsck will finish checking your volume? Listen to the others, you need a real-world benchmark, not some stress-tests. -- Sphinx of black quartz judge my vow.
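A benchmark closer to real Maildir delivery than dd or bonnie++ is many small file creates, each forced to stable storage. A minimal sketch (directory, file count, and message size are arbitrary; conv=fsync is in GNU dd and recent FreeBSD dd - substitute an explicit sync if your dd lacks it):

```shell
# Maildir-delivery-style write test: N small files, each pushed to
# stable storage, ending with a files-per-second figure.
DIR=${DIR:-$(mktemp -d)}
N=${N:-200}
start=$(date +%s)
i=0
while [ "$i" -lt "$N" ]; do
    # One 4k "message" per file, fsync'd before dd exits.
    dd if=/dev/zero of="$DIR/msg.$i" bs=4k count=1 conv=fsync 2>/dev/null
    i=$((i + 1))
done
elapsed=$(( $(date +%s) - start ))
[ "$elapsed" -gt 0 ] || elapsed=1
echo "wrote $N messages in ${elapsed}s ($((N / elapsed))/s)"
```

Run it once per filesystem under test; the files/second figure tracks random small-write IOPS far better than a single large sequential dd does.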
Re: FS of choice for max random iops ( Maildir )
16.09.2011 16:35, Terje Elde wrote: Note: you might be in trouble if you lose your ZIL, thus the doubling up. I *think* you can lose an SSD cache without risking data loss, but don't take my word for it.

Let me summarize this. ZFS will work even without a ZIL or cache device. Losing the ZIL will make you LOSE DATA - at most the last 0 to 30 seconds of work. Losing the cache costs you nothing. I haven't tested it, but I think ZFS will panic when losing a log device. -- Sphinx of black quartz judge my vow.
Re: FS of choice for max random iops ( Maildir )
free...@top-consulting.net schreef: I have a new server that I would like to use as a back-end Maildir storage shared through NFS. The specs are:

FreeBSD 9.0 Beta 2
Xeon X3470 @ 2.93 GHz quad-core CPU
4 GB RAM @ 1333 MHz ( upgrading to 12 GB tomorrow )
3WARE 9650SE-16LP card with write cache enabled ( battery is installed )
16 x WD RE3 1TB drives
RAID 10 setup

Right now I defined an entire array of 8TB ( all 16 disks ) separated in two pieces: 50 GB for FreeBSD to boot and the rest available to configure as storage. I've tried three options for the storage file system but I'm not sure which one is the best option since I can't really reproduce production conditions. I only ran tests with dd and bonnie and here's what I found:

A. TEST1: dd bs=1024 if=/dev/zero of=/data/t1 count=1M
1. ZFS performed the worst, averaging 67MB/sec
2. UFS + gjournal did around 130MB/sec
3. UFS did around 190MB/sec

B. TEST2 ( random file creation ): bonnie++ -d /data -c 10 -s 0 -n 50 -u 0
1. UFS + gjournal performed the worst
2. ZFS performed somewhat better
3. UFS performed the best again ( about 50% better )

C. TEST3 ( sequential writing ): bonnie++ -d /data -c 10 -s 8088 -n 0 -u 0
1. UFS + gjournal crashed the box
2. ZFS performed average
3. UFS performed better than ZFS ( about 50% better )

I really like the concepts behind ZFS and UFS + journaling but the performance hit is quite drastic when compared to UFS. What I'm looking for here is max IOPS when doing random read/writes. Is UFS the best choice for this? Do my results make sense?

Did you use raidz1, 2, or 3, or a mirror for the ZFS pool? I believe that a ZFS mirror gives you the best performance, but the least actual space. If you did make a raidz[1,2,3], try it with a mirror pool.
Also, do not use the RAID function of your RAID controller if you use ZFS; that way you lose the goodies of ZFS. If you set up ZFS, use JBOD on the RAID controller. Gr Johan
Re: FS of choice for max random iops ( Maildir )
Quoting Johan Hendriks joh.hendr...@gmail.com: [full quote of the original benchmark post snipped] Did you use raidz1, 2, or 3, or a mirror for the ZFS pool? I believe that a ZFS mirror gives you the best performance, but the least actual space. If you did make a raidz[1,2,3], try it with a mirror pool.
Also, do not use the RAID function of your RAID controller if you use ZFS; that way you lose the goodies of ZFS. If you set up ZFS, use JBOD on the RAID controller. Gr Johan

I simply did a: zpool create data da1 and no zfs-level raid. I also created a dataset - tried both with lzjb compression and without - but the results were similar, aka bad. Is zfs supposed to be faster if you let it manage the disks directly?
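For reference, the mirror-pool alternative Johan is describing might look like this, with the controller in JBOD mode so each disk shows up individually. A sketch only - the da0..da15 device names are hypothetical, and this destroys any data on the disks:

```shell
# 16 disks as 8 two-way ZFS mirrors, striped: the ZFS analogue of the
# controller's RAID 10, but with ZFS handling redundancy and repair.
zpool create data \
    mirror da0 da1   mirror da2 da3   mirror da4 da5   mirror da6 da7 \
    mirror da8 da9   mirror da10 da11 mirror da12 da13 mirror da14 da15
zpool status data
```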
Re: FS of choice for max random iops ( Maildir )
On 16/09/2011 12:31, free...@top-consulting.net wrote: [dd and bonnie++ benchmark results quoted in full; snipped]

Your tests do look a bit odd - ZFS usually does better on sequential and UFS on random IO (rw mix). For random IO I'd go with UFS. Try comparing with blogbench.
Re: FS of choice for max random iops ( Maildir )
On 16/09/2011 13:30, free...@top-consulting.net wrote: Is zfs supposed to be faster if you let it manage the disks directly? Not necessarily faster (in fact, RAID-Z variants have known limitations which are not so pronounced in RAID5/6), but definitely more convenient and in some respects safer. I would test very carefully if you need speed and stability from ZFS. For one thing, you will probably want to reduce the block size in ZFS to 8K or such.
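The block-size reduction suggested above is a single dataset property; the 8K value is illustrative and should be matched to the typical message/IO size (the data/maildomains dataset name is taken from later in this thread):

```shell
# Smaller records reduce read-modify-write overhead for small random IO.
# Only affects files written after the change.
zfs set recordsize=8K data/maildomains
zfs get recordsize data/maildomains
```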
Re: FS of choice for max random iops ( Maildir )
On 16. sep. 2011, at 12:31, free...@top-consulting.net wrote: Right now I defined an entire array of 8TB ( all 16 disks ) separated in two pieces. 50 GB for FreeBSD to boot and the rest available to configure as storage.

ZFS will want to write to its ZIL (ZFS intent log) before writing to the final location of the data. Even if you're not waiting for the ZIL write to disk (because of the controller RAM), those writes will probably make it through to disk. That gives you twice as many writes to disk, and a lot more seeking. If you want to take ZFS for a proper spin, I'd like to suggest adding two small SSDs to the setup, mirrored by ZFS. You can use those both for the ZIL and as cache for the array. That's a fairly small investment these days, and I would be surprised if it didn't significantly improve performance, both for your benchmark and real load. Note: you might be in trouble if you lose your ZIL, thus the doubling up. I *think* you can lose an SSD cache without risking data loss, but don't take my word for it. Terje
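The mirrored-SSD setup suggested above might be added to an existing pool like this (the data pool name matches the thread; the ada1-ada3 device names are hypothetical):

```shell
# Mirrored SSD pair as a dedicated ZIL (slog): losing the slog can cost
# the last few seconds of synchronous writes, hence the mirror.
zpool add data log mirror ada1 ada2
# An L2ARC cache device only holds re-readable copies of pool data, so a
# single unmirrored SSD is fine here.
zpool add data cache ada3
```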
Re: FS of choice for max random iops ( Maildir )
Quoting Terje Elde te...@elde.net: [SSD ZIL/cache suggestion quoted in full; snipped]

I know it's usually a big no-no but since I have the battery-backed write cache from the raid card, can't I just disable the ZIL entirely?
Re: FS of choice for max random iops ( Maildir )
On Fri, 16 Sep 2011 08:57:45 -0500, free...@top-consulting.net wrote: I know it's usually a big no-no but since I have the battery-backed write cache from the raid card, can't I just disable the ZIL entirely? No. ZFS doesn't work the way traditional filesystems do.
Re: FS of choice for max random iops ( Maildir )
Quoting Terje Elde te...@elde.net: [SSD ZIL/cache suggestion quoted in full; snipped]

Well, I tried disabling the ZIL on a new dataset. These are the commands that I ran:

zpool create data da1
zfs create data/maildomains
zfs set sync=disabled data/maildomains

dd bs=1024 if=/dev/zero of=/data/maildomains/t1 count=1M
1048576+0 records in
1048576+0 records out
1073741824 bytes transferred in 14.537711 secs (73859071 bytes/sec)

Got a measly 74MB/sec. On the UFS partition however...

dd bs=1024 if=/dev/zero of=/usr/t1 count=1M
1048576+0 records in
1048576+0 records out
1073741824 bytes transferred in 5.828395 secs (184225983 bytes/sec)

184MB/sec! And this is sequential writing, not random! So what is ZFS good for, finally? Sequential writing or small random iops?
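As a sanity check, the MB/s figures quoted above follow directly from dd's own output (bytes written over wall-clock seconds, decimal megabytes):

```shell
# dd wrote 1 GiB (1073741824 bytes) in each run; divide by elapsed
# seconds and by 1e6 to get the decimal-MB/s figures cited in the thread.
awk 'BEGIN {
    printf "zfs: %.0f MB/s\n", 1073741824 / 14.537711 / 1e6
    printf "ufs: %.0f MB/s\n", 1073741824 / 5.828395 / 1e6
}'
```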
By the way, this is how the array is configured with 3ware:

Unit   UnitType  Status  %RCmpl  %V/I/M  Stripe  Size(GB)  Cache  AVrfy
u0     RAID-10   OK      -       -       64K     7450.5    ON     ON

VPort  Status  Unit  Size       Type  Phy  Encl-Slot  Model
p0     OK      u0    931.51 GB  SATA  0    -          WDC WD1002FBYS-01A6
p1     OK      u0    931.51 GB  SATA  1    -          WDC WD1002FBYS-01A6
p2     OK      u0    931.51 GB  SATA  2    -          WDC WD1002FBYS-01A6
p3     OK      u0    931.51 GB  SATA  3    -          WDC WD1002FBYS-01A6
p4     OK      u0    931.51 GB  SATA  4    -          WDC WD1002FBYS-01A6
p5     OK      u0    931.51 GB  SATA  5    -          WDC WD1002FBYS-01A6
p6     OK      u0    931.51 GB  SATA  6    -          WDC WD1002FBYS-01A6
p7     OK      u0    931.51 GB  SATA  7    -          WDC WD1002FBYS-01A6
p8     OK      u0    931.51 GB  SATA  8    -          WDC WD1002FBYS-01A6
p9     OK      u0    931.51 GB  SATA  9    -          WDC WD1002FBYS-01A6
p10    OK      u0    931.51 GB  SATA  10   -          WDC WD1002FBYS-01A6
p11    OK      u0    931.51 GB  SATA  11   -          WDC WD1002FBYS-01A6
p12    OK      u0    931.51 GB  SATA  12   -          WDC WD1002FBYS-01A6
p13    OK      u0    931.51 GB  SATA  13   -          WDC WD1002FBYS-01A6
p14    OK      u0    931.51 GB  SATA  14   -          WDC WD1002FBYS-01A6
p15    OK      u0    931.51 GB  SATA  15   -          WDC WD1002FBYS-01A6
Re: FS of choice for max random iops ( Maildir )
On 16.09.2011 15:57, free...@top-consulting.net wrote: [SSD ZIL/cache discussion quoted in full; snipped] I know it's usually a big no-no but since I have the battery-backed write cache from the raid card, can't I just disable the ZIL entirely? No. However, you could allow the ZIL to be written to a logical disk with the battery-backed cache. //Svein
Re: FS of choice for max random iops ( Maildir )
On 16. sep. 2011, at 16:18, free...@top-consulting.net wrote: Got a measly 74MB/sec.

You can't ask for advice, get it, do something completely different, and then complain that it didn't work. Neither can you ask people to donate their time if you won't spend yours. In other words: if you won't listen, there's no point in us talking. However: Don't disable the ZIL. Just don't. It's not the way to go. If you want to know why, Google will help. Also, you're making some assumptions, such as the ZIL being bad for performance. That's not always the case. ZIL writes are a rather nice load for spinning-metal storage. Even if you write through cache, that can give you a boost on your real-world workload. Which brings us to the third bit. You're benchmarking, not trying real-world loads. That's the load you'll have to worry about, and it's the load ZFS shines at. Thanks to the ZIL (the thing you're trying to kill, remember?) you can convert seek-heavy writes to sequential ZIL writes, freeing up disk bandwidth for concurrent reads. If you want to test before spending money, try what Svein said. Set up a small logical volume (preferably smaller than your controller cache, if it's large enough), then try that as a dedicated ZIL device. Never tried that, but worth a shot. Terje
Re: FS of choice for max random iops ( Maildir )
Quoting Terje Elde te...@elde.net: [reply on the ZIL, real-world loads, and a logical-volume slog quoted in full; snipped]

It's not about spending money or not. I really want to use ZFS for some of its features ( journaling, snapshots, etc ) but it has to be a good fit for me. I'm not ignoring the advice I am given, just taking it with a grain of salt; disabling the ZIL is recommended - sometimes - for NFS. As per hundreds of messages I've read from the Archive along with this page, http://wiki.freebsd.org/ZFSTuningGuide, it does appear that disabling the ZIL is a solution for NFS. Yes, they still recommend SSD drives and I fully understand that. My point was the following: Why is a sequential write test like dd slower on ZFS than on UFS?
The writes are already serialized, so enabling/disabling the ZIL should have very little impact - which is indeed the case. I even went as far as disabling the cache flush option of ZFS through this variable: vfs.zfs.cache_flush_disable: 1, since I already have the write cache of the controller. I've also set some other variables as per the Tuning guide but according to several benchmarks ( iozone, bonnie++, dd ) ZFS still comes in slower than UFS at pretty much everything. Either I am missing something or there is something wrong with my setup.
Re: FS of choice for max random iops ( Maildir )
On 16. sep. 2011, at 16:18, free...@top-consulting.net wrote: zpool create data da1 zfs create data/maildomains zfs set sync=disabled data/maildomains

Just for the archives... sync=disabled won't disable the ZIL, it'll disable waiting for a disk flush on fsync etc. With a battery-backed controller cache, those flushes should go to cache and be pretty much free. You end up tossing away something for nothing. You're getting about half the performance on a sequential write to ZFS as you get with raw UFS. That makes perfect sense, doesn't it? UFS writes raw; ZFS writes to the ZIL, then to the data's final resting place. Account for the seeks in between, and you're seeing what you should. Move the ZIL if you don't want both those sets of writes on the same array, or do what Svein said, and get funk^w logical. (a tad simplified, but I think the logic will hold. (yes, pun intended)) Terje