Re: Best filesystem options for large drive
Hello Nick,

Thursday, February 12, 2015, 9:26:01 AM, you wrote:

NH> On 02/12/15 10:10, Boris Goldberg wrote:
>> Hello Nick,
>> ... I was entertaining the idea of making a 100 TB OpenBSD based
>> archive storage, even asked the list. The only answer pointed to that
>> FAQ page, and it stopped me from pursuing that idea. Servers with
>> 128 GB of RAM aren't uncommon, but expensive (compared to 64/32 GB
>> ones).

NH> I don't care what OS you are using, a 100 TB single-volume archive
NH> is doing it wrong.

NH> Chunk your data; you will thank me. When it comes time to upgrade
NH> and migrate your hardware, you will be kissing my feet.

NH> The numbers have changed a bit (gotten bigger), but the idea is as
NH> valid today as it was eight years ago:

NH> http://archives.neohapsis.com/archives/openbsd/2007-04/1572.html

Thanks. The facts aren't new, but they are well put together. I'll try
not to plan storage needs more than a (half) year ahead. Too bad we
don't have 10 TB disks yet. ;)

-- 
Best regards,
 Boris                            mailto:bo...@twopoint.com
Re: Best filesystem options for large drive
Hello Nick,

Wednesday, February 11, 2015, 1:05:20 PM, you wrote:

NH> On 02/11/15 11:58, Jan Stary wrote:
>> On Feb 10 17:48:22, na...@mips.inka.de wrote:
>>> On 2015-02-10, yary <not@gmail.com> wrote:
>>>> I know FFS2 can handle that size easily, but I'm worried about fsck
>>>> taking forever. This machine will have 1.5GB RAM; from what I've
>>>> read, that's not enough memory to fsck a 4TB volume without painful
>>>> swapping.
>>> It vastly depends on the number of files you have on there.
>>> Here's an almost full 4TB drive...
>> FAQ4 still says:
>>   If you make very large partitions, keep in mind that performing
>>   filesystem checks using fsck(8) requires about 1M of RAM per
>>   gigabyte of filesystem size
>> Does that still apply?
>> 	Jan

NH> It is probably far less than that currently, but lacking a more
NH> precise number, I don't think this is a bad rule of thumb. If you
NH> wish to disregard it, I suspect you should either read and really
NH> understand the code or do some real-world testing on YOUR hardware
NH> and file systems. The penalties for too much RAM are minimal; the
NH> penalties for too little are ... substantial.

NH> Note that you don't have to leave file systems mounted RW all the
NH> time, especially on a backup server. Mount RW when you need it;
NH> dismount or remount RO when you don't. Tripping over the power cords
NH> won't (shouldn't?) corrupt a file system that is mounted RO. You
NH> don't get to ignore the issues, but you can reduce their occurrence.

I was entertaining the idea of making a 100 TB OpenBSD based archive
storage, even asked the list. The only answer pointed to that FAQ page,
and it stopped me from pursuing that idea. Servers with 128 GB of RAM
aren't uncommon, but expensive (compared to 64/32 GB ones).

-- 
Best regards,
 Boris                            mailto:bo...@twopoint.com
Re: Best filesystem options for large drive
On 02/12/15 10:10, Boris Goldberg wrote:
> Hello Nick,
> ... I was entertaining the idea of making a 100 TB OpenBSD based
> archive storage, even asked the list. The only answer pointed to that
> FAQ page, and it stopped me from pursuing that idea. Servers with
> 128 GB of RAM aren't uncommon, but expensive (compared to 64/32 GB
> ones).

I don't care what OS you are using, a 100 TB single-volume archive is
doing it wrong.

Chunk your data; you will thank me. When it comes time to upgrade and
migrate your hardware, you will be kissing my feet.

The numbers have changed a bit (gotten bigger), but the idea is as valid
today as it was eight years ago:

http://archives.neohapsis.com/archives/openbsd/2007-04/1572.html

Nick.
Re: Best filesystem options for large drive
On Feb 10 17:48:22, na...@mips.inka.de wrote:
> On 2015-02-10, yary <not@gmail.com> wrote:
>> I know FFS2 can handle that size easily, but I'm worried about fsck
>> taking forever. This machine will have 1.5GB RAM; from what I've
>> read, that's not enough memory to fsck a 4TB volume without painful
>> swapping.
> It vastly depends on the number of files you have on there.
> Here's an almost full 4TB drive...

FAQ4 still says:

  If you make very large partitions, keep in mind that performing
  filesystem checks using fsck(8) requires about 1M of RAM per gigabyte
  of filesystem size

Does that still apply?

	Jan
Re: Best filesystem options for large drive
Thanks, all, for the tuning flags and the example. I'll take a look at
the man pages and the file set. It doesn't look like a 4TB FFS2 volume
will be a problem on this machine after all.
Re: Best filesystem options for large drive
On 02/11/15 11:58, Jan Stary wrote:
> On Feb 10 17:48:22, na...@mips.inka.de wrote:
>> On 2015-02-10, yary <not@gmail.com> wrote:
>>> I know FFS2 can handle that size easily, but I'm worried about fsck
>>> taking forever. This machine will have 1.5GB RAM; from what I've
>>> read, that's not enough memory to fsck a 4TB volume without painful
>>> swapping.
>> It vastly depends on the number of files you have on there.
>> Here's an almost full 4TB drive...
> FAQ4 still says:
>   If you make very large partitions, keep in mind that performing
>   filesystem checks using fsck(8) requires about 1M of RAM per
>   gigabyte of filesystem size
> Does that still apply?

It is probably far less than that currently, but lacking a more precise
number, I don't think this is a bad rule of thumb. If you wish to
disregard it, I suspect you should either read and really understand
the code or do some real-world testing on YOUR hardware and file
systems. The penalties for too much RAM are minimal; the penalties for
too little are ... substantial.

Note that you don't have to leave file systems mounted RW all the time,
especially on a backup server. Mount RW when you need it; dismount or
remount RO when you don't. Tripping over the power cords won't
(shouldn't?) corrupt a file system that is mounted RO. You don't get to
ignore the issues, but you can reduce their occurrence.

Nick.
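[Editor's note: the RO/RW dance Nick describes might look like this on
OpenBSD. This is a sketch; /backup is a placeholder mount point, and the
commands are shown commented out since they need root and a real
filesystem. See mount(8) for the -u update flag.]

```shell
# mount -u -o rw /backup   # flip to read-write for the backup window
# ...run the backup job...
# mount -u -o ro /backup   # back to read-only; an unclean shutdown now
#                          # shouldn't leave this filesystem dirty
```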
Re: Best filesystem options for large drive
On Wed, Feb 11, 2015 at 6:43 AM, Janne Johansson <icepic...@gmail.com> wrote:
> You can invent how many journals and whatevers you like to hope to
> prevent the state from being inconsistent, but broken or breaking
> sectors will sooner or later force you to run over all files and
> read/check them, and in that case you will need lots of ram anyhow.

The data in this thread seems to show that this is not true:

  4TB fs with     1,642 files =  83MB of RAM, ~60 seconds
  4TB fs with 3,900,811 files = 137MB of RAM, ~33 minutes

(Sure, on some platforms 137MB is a lot of RAM, but not on the class of
machine we're talking about here.)

Granted, it's only two data points, but when the number of files went up
by ~2375x, the time to fsck went up by only ~33x, and RAM usage went up
by just ~1.6x. It seems an increase in the number of files requires only
a modest increase in RAM. (Small disclaimer: we don't know the hardware
platforms involved.)

On Wed, Feb 11, 2015 at 8:58 AM, Jan Stary <h...@stare.cz> wrote:
> FAQ4 still says:
>   If you make very large partitions, keep in mind that performing
>   filesystem checks using fsck(8) requires about 1M of RAM per
>   gigabyte of filesystem size
> Does that still apply?

Apparently not: by that rule a 4TB filesystem would need 4GB of RAM, and
neither fsck in the examples above came anywhere close to that.

-- 
andrew fabbro
and...@fabbro.org
blog: https://raindog308.com
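[Editor's note: the ratios above can be checked straight from the two
`\time -l` outputs quoted later in this thread (60.37s real / 83688KB
max RSS vs 1966.70s real / 137096KB max RSS):]

```shell
# Ratios between naddy's and ken's fsck runs (numbers from this thread)
awk 'BEGIN {
    printf "files: %.0fx\n", 3900811 / 1642      # ~2376x more files
    printf "time:  %.0fx\n", 1966.70 / 60.37     # ~33x longer wall clock
    printf "rss:   %.1fx\n", 137096 / 83688      # ~1.6x more memory
}'
```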
Re: Best filesystem options for large drive
2015-02-10 17:44 GMT+01:00 yary <not@gmail.com>:
> I know FFS2 can handle that size easily, but I'm worried about fsck
> taking forever. This machine will have 1.5GB RAM; from what I've read,
> that's not enough memory to fsck a 4TB volume without painful
> swapping.
> Is there some filesystem-plus-options for recent OpenBSD that
> guarantees the disk will always be in a consistent state, even after a
> crash, so that fsck won't be necessary?

You can invent as many journals and whatevers as you like in the hope
of preventing the state from becoming inconsistent, but broken or
breaking sectors will sooner or later force you to run over all the
files and read/check them, and in that case you will need lots of RAM
anyhow.

It's nice to have a journal (or intent log or whatever) so that after a
random power-off you can see that no writes were in flight and skip
some (all?) of the checks. But when your super-critical data is on a
disk that is failing, you will need to check it regardless. Fsck is
mostly for unexpected shutdowns and crashes, but it is also for
degrading disks, to salvage what can be gotten off them. When the worst
happens, shopping for RAM sticks in order to get at least 50% of the
data out isn't much fun.

-- 
May the most significant bit of your life be positive.
Re: Best filesystem options for large drive
On Tue, Feb 10, 2015 at 11:35:32PM +0100, Jan Stary wrote:
> On Feb 10 17:48:22, na...@mips.inka.de wrote:
>> On 2015-02-10, yary <not@gmail.com> wrote:
>>> I know FFS2 can handle that size easily, but I'm worried about fsck
>>> taking forever. This machine will have 1.5GB RAM; from what I've
>>> read, that's not enough memory to fsck a 4TB volume without painful
>>> swapping.
>> It vastly depends on the number of files you have on there.
> And if you know in advance that the files will be large (video
> editing?) and there will not be many of them, you might benefit from
> 'newfs -i' (and other options) when creating the file system.

For that application, I recommend using the largest block and fragment
sizes as well (this also has the effect of creating fewer inodes). Less
metadata means faster checking.

	-Otto
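[Editor's note: a hypothetical sketch of Jan's and Otto's advice, i.e.
maximum block/fragment sizes plus a large bytes-per-inode value. The
device name rsd2a and the exact -i value are assumptions; check newfs(8)
for your release before using any of this.]

```shell
# FFS2, 64K blocks, 8K fragments, one inode per MB of data space:
# newfs -O 2 -b 65536 -f 8192 -i 1048576 /dev/rsd2a
# Rough inode budget such a -i value implies for a 4 TB filesystem:
awk 'BEGIN { printf "%d inodes\n", 4 * 2^40 / 2^20 }'
```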
Re: Best filesystem options for large drive
Here's an example of fsck on a largish volume with a lot of files:

# df -hi /nfs/archive
Filesystem     Size    Used   Avail Capacity   iused     ifree  %iused  Mounted on
/dev/sd0e      3.6T    2.3T    1.2T    67%   3900811 119021683     3%   /nfs/archive
# umount /nfs/archive
# \time -l fsck -f /dev/sd0e
** /dev/rsd0e
** File system is already clean
** Last Mounted on /nfs/archive
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
3900811 files, 307622602 used, 179239875 free (49355 frags, 22398815 blocks, 0.0% fragmentation)
     1966.70 real        14.68 user        36.78 sys
    137096  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
   3561095  minor page faults
         4  major page faults
         0  swaps
         0  block input operations
         5  block output operations
         0  messages sent
         0  messages received
         0  signals received
    526407  voluntary context switches
        30  involuntary context switches
#

Note that with nearly 4 million files, the amount of time required by
fsck increased dramatically (over 30 minutes), but memory usage
increased much less (only 137MB). This particular system has 12GB RAM
but doesn't appear to ever use much of it.

The sd0 device is a 6TB RAID10 array (4x 3TB drives) on an Areca
ARC1110 PCI-X controller (in a 64-bit 133MHz PCI-X slot), partitioned
with 1/3 of the space on sd0d and the remaining 2/3 on sd0e. /dev/sd0d
was mostly idle (although still mounted) while fsck was running.

-ken

On Tue, Feb 10, 2015 at 5:35 PM, Jan Stary <h...@stare.cz> wrote:
> On Feb 10 17:48:22, na...@mips.inka.de wrote:
>> On 2015-02-10, yary <not@gmail.com> wrote:
>>> I know FFS2 can handle that size easily, but I'm worried about fsck
>>> taking forever. This machine will have 1.5GB RAM; from what I've
>>> read, that's not enough memory to fsck a 4TB volume without painful
>>> swapping.
>> It vastly depends on the number of files you have on there.
> And if you know in advance that the files will be large (video
> editing?) and there will not be many of them, you might benefit from
> 'newfs -i' (and other options) when creating the file system.
Re: Best filesystem options for large drive
On 2015-02-10, yary <not@gmail.com> wrote:
> I know FFS2 can handle that size easily, but I'm worried about fsck
> taking forever. This machine will have 1.5GB RAM; from what I've read,
> that's not enough memory to fsck a 4TB volume without painful
> swapping.

It vastly depends on the number of files you have on there. Here's an
almost full 4TB drive...

# df -hi /export
Filesystem     Size    Used   Avail Capacity   iused     ifree  %iused  Mounted on
/dev/sd0d      3.6T    3.3T    124G    96%      1642 122292628     0%   /export

... but it has only 1642 files and directories. Checking this takes all
of 60 seconds and 83M of memory:

# umount /export
# \time -l fsck -f /dev/sd0d
** /dev/rsd0d
** File system is already clean
** Last Mounted on /export
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
1642 files, 444033559 used, 40503354 free (306 frags, 5062881 blocks, 0.0% fragmentation)
       60.37 real        14.21 user         4.29 sys
     83688  maximum resident set size
         0  average shared memory size
         0  average unshared data size
         0  average unshared stack size
    513846  minor page faults
         3  major page faults
         0  swaps
         0  block input operations
         9  block output operations
         0  messages sent
         0  messages received
         0  signals received
     30962  voluntary context switches
        19  involuntary context switches

-- 
Christian "naddy" Weisgerber                          na...@mips.inka.de
Best filesystem options for large drive
I'm setting up an OpenBSD 5.6 box with a 4TB RAID to back up a video
editing cluster. I'll be using BackupPC, which likes to have a single
large volume so it can de-duplicate files using hard links; thus the
main volume will be the whole 4TB.

I know FFS2 can handle that size easily, but I'm worried about fsck
taking forever. This machine will have 1.5GB RAM; from what I've read,
that's not enough memory to fsck a 4TB volume without painful swapping.

Is there some filesystem-plus-options for recent OpenBSD that
guarantees the disk will always be in a consistent state, even after a
crash, so that fsck won't be necessary?

(Yes, I'll be testing the RAID before deploying, but I'd like to start
with the right filesystem on it from the outset. And yes, it will be on
a UPS with graceful shutdown. I don't trust myself to get it all right,
and bad things still happen...)

-y
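[Editor's note: the closest FFS option to the consistency guarantee
asked about here is soft updates (the softdep mount flag), which orders
metadata writes so a crash can at worst leak blocks rather than corrupt
structure; fsck is still required after an unclean shutdown, it just
tends to find less to repair. A hypothetical /etc/fstab entry for the
backup volume, where the disklabel DUID and the mount point are
placeholders:]

```
1234567890abcdef.d /backup ffs rw,softdep,noatime,nodev,nosuid 1 2
```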
Re: Best filesystem options for large drive
On Feb 10 17:48:22, na...@mips.inka.de wrote:
> On 2015-02-10, yary <not@gmail.com> wrote:
>> I know FFS2 can handle that size easily, but I'm worried about fsck
>> taking forever. This machine will have 1.5GB RAM; from what I've
>> read, that's not enough memory to fsck a 4TB volume without painful
>> swapping.
> It vastly depends on the number of files you have on there.

And if you know in advance that the files will be large (video
editing?) and there will not be many of them, you might benefit from
'newfs -i' (and other options) when creating the file system.