Re: UFS2 optimization for many small files
Chuck Swiger wrote:
> On Mar 12, 2008, at 12:23 PM, Angelo Turetta wrote:
>> Chuck Swiger wrote:
>>> On Mar 12, 2008, at 11:44 AM, Angelo Turetta wrote:
>>>> I tried understanding where the difference was, but I cannot work out any cause in the file systems:
>>> I believe Cyrus will create hard links if the same email message is kept in multiple folders.
>> Do you know if this includes hard-linking multiple copies of the same message received by different users? If it's only for messages in the same user's mailbox, there's no way the incidence can reach 20% in my case.
> That's a good question. I don't see any reason why this couldn't include the same message received by different users, too.

It does. You may want to try something like rsync -aH when copying.

Regards,
Lars

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: UFS2 optimization for many small files
On Mar 12, 2008, at 12:23 PM, Angelo Turetta wrote:
> Chuck Swiger wrote:
>> On Mar 12, 2008, at 11:44 AM, Angelo Turetta wrote:
>>> I tried understanding where the difference was, but I cannot work out any cause in the file systems:
>> I believe Cyrus will create hard links if the same email message is kept in multiple folders.
> Do you know if this includes hard-linking multiple copies of the same message received by different users? If it's only for messages in the same user's mailbox, there's no way the incidence can reach 20% in my case.

That's a good question. I don't see any reason why this couldn't include the same message received by different users, too.

--
-Chuck
Re: UFS2 optimization for many small files
Chuck Swiger wrote:
> On Mar 12, 2008, at 11:44 AM, Angelo Turetta wrote:
>> I tried understanding where the difference was, but I cannot work out any cause in the file systems:
> I believe Cyrus will create hard links if the same email message is kept in multiple folders.

Do you know if this includes hard-linking multiple copies of the same message received by different users? If it's only for messages in the same user's mailbox, there's no way the incidence can reach 20% in my case.

Angelo.
Re: UFS2 optimization for many small files
On Mar 12, 2008, at 11:44 AM, Angelo Turetta wrote:
> I then proceeded to copy my cyrus-imapd partition from /usr/local/mail (on /dev/da0s1f) to the new 76GB /mail (/dev/da2s1d). During this copy I noticed the disk usage of the mailboxes (as reported by du(8)) growing about 20% larger in the process. Please note that cyrus stores mailboxes with 1 file per message, 1 directory per IMAP-folder, and the moved files number in the hundreds of thousands, with half of them smaller than 8 KB.
> I tried understanding where the difference was, but I cannot work out any cause in the file systems:

I believe Cyrus will create hard links if the same email message is kept in multiple folders. If your copy did not preserve hard links and symlinks, the extra space growth might be a result; consider retrying the copy using the options to rsync/tar/whatever to preserve the links.

Regards,
--
-Chuck
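For reference, a single tar pipeline archives both names of a hard-linked file in one pass, so the link survives the copy. This is only a sketch; the src/dst directories are hypothetical stand-ins for the real partitions:

```shell
# Demonstrate that a tar pipeline preserves hard links.
# 'src' and 'dst' are hypothetical stand-ins for the real mail partitions.
mkdir -p src dst
echo "message body" > src/msg1
ln src/msg1 src/msg2                   # second name for the same inode
tar -cf - -C src . | tar -xf - -C dst  # copy the whole tree in one pass
# Both names in dst should still share one inode (link count > 1):
find dst -type f -links +1 | wc -l
```

A per-file copy loop, by contrast, would store msg1 and msg2 as two separate files and inflate the du(8) numbers, which would match the ~20% growth described above.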
UFS2 optimization for many small files
I recently upgraded the disk of my mail server. The server was initially installed with a single 36GB RAID1 volume with FreeBSD 5 (summer 2004). Over the years I upgraded to FreeBSD 6, and some months ago I added another 36GB RAID1 volume and one 72GB RAID1 volume.

I then proceeded to copy my cyrus-imapd partition from /usr/local/mail (on /dev/da0s1f) to the new 76GB /mail (/dev/da2s1d). During this copy I noticed the disk usage of the mailboxes (as reported by du(8)) growing about 20% larger in the process. Please note that cyrus stores mailboxes with 1 file per message, 1 directory per IMAP-folder, and the moved files number in the hundreds of thousands, with half of them smaller than 8 KB.

I tried understanding where the difference was, but I cannot work out any cause in the file systems:

[EMAIL PROTECTED] /data]# disklabel da0s1
# /dev/da0s1:
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a:   524288        0    4.2BSD     2048 16384 32776
  b:  4142832   524288      swap
  c: 71119692        0    unused        0     0  # "raw" part, don't edit
  d:  4194304  4667120    4.2BSD     2048 16384 28552
  e:  1048576  8861424    4.2BSD     2048 16384     8
  f: 61209692  9910000    4.2BSD     2048 16384 28552

[EMAIL PROTECTED] /data]# disklabel da2s1
# /dev/da2s1:
8 partitions:
#         size   offset    fstype   [fsize bsize bps/cpg]
  c: 142253248        0    unused        0     0  # "raw" part, don't edit
  d: 142253248        0    4.2BSD     2048 16384 28552

What can I look at now? Should I decide to reformat my disk, which newfs parameters would you advise for my case?

Thanks,
Angelo.

PS: here follow the disk definitions: why does the disk formatted during the initial FreeBSD 5 setup (da0) have a different geometry than the one formatted later with FreeBSD 6 (da1, hardware identical to da0)? Maybe this is influencing the block occupation?
---
[EMAIL PROTECTED] /data]# fdisk /dev/da0
*** Working on device /dev/da0 ***
parameters extracted from in-core disklabel are:
cylinders=4427 heads=255 sectors/track=63 (16065 blks/cyl)
Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=4427 heads=255 sectors/track=63 (16065 blks/cyl)
Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
    start 63, size 71119692 (34726 Meg), flag 80 (active)
        beg: cyl 0/ head 1/ sector 1;
        end: cyl 1023/ head 254/ sector 63
The data for partition 2 is:
The data for partition 3 is:
The data for partition 4 is:
---
[EMAIL PROTECTED] /data]# fdisk /dev/da1
*** Working on device /dev/da1 ***
parameters extracted from in-core disklabel are:
cylinders=8716 heads=255 sectors/track=32 (8160 blks/cyl)
Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=8716 heads=255 sectors/track=32 (8160 blks/cyl)
Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
    start 32, size 71122528 (34727 Meg), flag 80 (active)
        beg: cyl 0/ head 1/ sector 1;
        end: cyl 1023/ head 254/ sector 32
The data for partition 2 is:
The data for partition 3 is:
The data for partition 4 is:
---
[EMAIL PROTECTED] /data]# fdisk /dev/da2
*** Working on device /dev/da2 ***
parameters extracted from in-core disklabel are:
cylinders=17433 heads=255 sectors/track=32 (8160 blks/cyl)
Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=17433 heads=255 sectors/track=32 (8160 blks/cyl)
Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
    start 32, size 142253248 (69459 Meg), flag 80 (active)
        beg: cyl 0/ head 1/ sector 1;
        end: cyl 1023/ head 254/ sector 32
The data for partition 2 is:
The data for partition 3 is:
The data for partition 4 is:
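Given the replies above pointing at Cyrus hard links, the incidence can be audited directly with find(1)'s -links test. This sketch builds a tiny hypothetical spool to show the idea; running the same find over the real /usr/local/mail tree would show whether 20% of messages are multiply linked:

```shell
# Sketch: estimate how much of a tree is hard-linked, the effect the
# replies above attribute to Cyrus. The demo tree is hypothetical.
mkdir -p spool/user1 spool/user2
echo "shared message" > spool/user1/msg
ln spool/user1/msg spool/user2/msg     # Cyrus-style single-instance store
echo "unique message" > spool/user1/other
linked=$(find spool -type f -links +1 | wc -l)
total=$(find spool -type f | wc -l)
echo "hard-linked: $linked of $total files"
```

du(8) counts each multiply-linked inode only once, so a copy that breaks the links will always report larger usage than the original.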
Re: UFS2 optimization for many small files
On 7/2/07, Nikolay Pavlov <[EMAIL PROTECTED]> wrote:
> On Wednesday, 27 June 2007 at 14:11:19 +0400, Nguyen Tam Chinh wrote:
>> Greetings,
>>
>> We're going to build a server with some 1TB of over 500 million small
>> files with sizes from 0.5k to 4k. I wonder if ufs2 can handle
>> this kind of system well. From newfs(8) the min block size is 4k. This
>> is not optimal in our case; a 1k or 0.5k block is more effective IMHO.
>> I'd be happy if anyone can suggest what the fragment (block/8) in
>> ufs2 means and how this parameter works. I know it's better to read the
>> full ufs2 specification, but hope that someone here can give a hint.
>> Please advise on optimizations or tricks.
>> Thank you very much.
>>
>> --
>> With best regards,    | The Power to Serve
>> Nguyen Tam Chinh      | http://www.FreeBSD.org
>
> I am not aware of any ZFS results on such tasks; maybe you will be the
> one who shares them ;) However, ReiserFS would be the best choice for
> such a specific case. It's not available on FreeBSD currently. I don't
> think UFS can handle a huge number of small files effectively. Of
> course gjournal could be an option for the fsck problems, but how do
> you plan to back up or sync this storage?

I'm aware of the fsck/backup problems. In our case there's no need for backup, so I went with ufs2. The current configuration is 4x250GB disks with a block/frag ratio of 4k/512b. We're generating files with an average size of 6k (because the compression procedure does not work as well as we estimated). After a week I think we can collect some statistics from production. Anyway, in this case 8k/1k would be more effective for us. I hope I can test this on the next server.
--
With best regards,    | The Power to Serve
Nguyen Tam Chinh      | http://www.FreeBSD.org
Re: UFS2 optimization for many small files
On Wednesday, 27 June 2007 at 14:11:19 +0400, Nguyen Tam Chinh wrote:
> Greetings,
>
> We're going to build a server with some 1TB of over 500 million small
> files with sizes from 0.5k to 4k. I wonder if ufs2 can handle
> this kind of system well. From newfs(8) the min block size is 4k. This
> is not optimal in our case; a 1k or 0.5k block is more effective IMHO.
> I'd be happy if anyone can suggest what the fragment (block/8) in
> ufs2 means and how this parameter works. I know it's better to read the
> full ufs2 specification, but hope that someone here can give a hint.
> Please advise on optimizations or tricks.
> Thank you very much.
>
> --
> With best regards,    | The Power to Serve
> Nguyen Tam Chinh      | http://www.FreeBSD.org

I am not aware of any ZFS results on such tasks; maybe you will be the one who shares them ;) However, ReiserFS would be the best choice for such a specific case. It's not available on FreeBSD currently. I don't think UFS can handle a huge number of small files effectively. Of course gjournal could be an option for the fsck problems, but how do you plan to back up or sync this storage?

--
==
- Best regards, Nikolay Pavlov. <<<---
==
Re: UFS2 optimization for many small files
>> snapshot of a partition, in order to perform a background fsck, and
>> thus our website was down. So ufs2 does not scale well.
>
> Reasons not related to the nfs-server itself. FreeBSD itself was
> rock-solid. It was firmware-related on the storage side.

I always use software mirror, concat, or both in FreeBSD. It always works, is 10 times cheaper, fully portable, and (yes, it's true) comparable in speed; in some cases faster.
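The software mirror + concat setup mentioned above can be sketched with FreeBSD's gmirror(8) and gconcat(8). This is only an illustration: the device names and the two-pairs layout are hypothetical, and the commands must run as root on the FreeBSD host:

```shell
# Hypothetical layout: two RAID1 pairs joined into one large volume.
# gmirror(8) builds each software mirror; gconcat(8) concatenates them.
gmirror label -v gm0 /dev/da0 /dev/da1       # first mirror
gmirror label -v gm1 /dev/da2 /dev/da3       # second mirror
gconcat label -v data /dev/mirror/gm0 /dev/mirror/gm1
newfs -U /dev/concat/data                    # UFS2 with soft updates
mount /dev/concat/data /data
```

Both GEOM classes live entirely in the kernel, so the resulting volume is portable between machines without any controller-specific metadata.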
Re: UFS2 optimization for many small files
>> Try zfs on amd64 unless your app doesn't work well with zfs or your
>
> does zfs have RELIABLE and USABLE software allowing to efficiently
> back up large filesystems to other media? (DVDs, tapes, other hard discs)

ZFS has send/receive, where you can take snapshots and send them to a different host. This could be your backup host. I'm considering this solution myself, where FreeBSD with zfs is my primary host and my nightly backups will be sent to my Solaris host. Solaris has the required LTO-3 drivers.

--
regards
Claus

When lenity and cruelty play for a kingdom, the gentlest gamester is the soonest winner.
Shakespeare
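The send/receive flow described above can be sketched as follows. Pool, dataset, and host names are hypothetical, and the incremental form assumes the first snapshot already exists on the receiving side:

```shell
# Nightly ZFS backup sketch: snapshot locally, stream to a backup host.
zfs snapshot tank/data@nightly-2007-07-02
zfs send tank/data@nightly-2007-07-02 | \
    ssh backuphost zfs receive backup/data
# Subsequent nights send only the delta between two snapshots:
zfs snapshot tank/data@nightly-2007-07-03
zfs send -i tank/data@nightly-2007-07-02 tank/data@nightly-2007-07-03 | \
    ssh backuphost zfs receive backup/data
```

Because the stream is replicated at the block level, per-file traversal cost does not apply, which matters a great deal with hundreds of millions of small files.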
Re: UFS2 optimization for many small files
>> approx. 15 partitions ranging from 400 GB to 2 TB in size. If the
>> server for some reason had crashed the webservers were unable to
>
> the question is about the reason it crashed...
>
>> access the nfs-mounted partitions during the period the server did a
>> snapshot of a partition, in order to perform a background fsck, and
>> thus our website was down. So ufs2 does not scale well.

Reasons not related to the nfs-server itself. FreeBSD itself was rock-solid. It was firmware-related on the storage side.

--
regards
Claus

When lenity and cruelty play for a kingdom, the gentlest gamester is the soonest winner.
Shakespeare
Re: UFS2 optimization for many small files
>> Thank you very much.
>
> Try zfs on amd64 unless your app doesn't work well with zfs or your

does zfs have RELIABLE and USABLE software allowing one to efficiently back up large filesystems to other media? (DVDs, tapes, other hard discs)
Re: UFS2 optimization for many small files
> I have tried using a 4K/0.5K UFS1 filesystem in the past and found the
> performance was very poor. UFS2 was based on 16K/2K and I would expect
> it to perform even worse with 4K/0.5K. I would suggest you try 8K/1K.

Not for small files. You are right about large files, but it's not THAT bad as you say. I regularly use 4K/0.5K UFS, just not for everything, not when I require good, fast speed for big files. For really big files I make a 32K/4K filesystem with very few inodes.
Re: UFS2 optimization for many small files
> approx. 15 partitions ranging from 400 GB to 2 TB in size. If the
> server for some reason had crashed the webservers were unable to

the question is about the reason it crashed...

> access the nfs-mounted partitions during the period the server did a
> snapshot of a partition, in order to perform a background fsck and
> thus our website was down. So ufs2 does not scale well.
Re: UFS2 optimization for many small files
> We're going to build a server with some 1TB of over 500 million small
> files with sizes from 0.5k to 4k. I wonder if ufs2 can handle
> this kind of system well. From newfs(8) the min block size is 4k. This
> is not optimal in our case; a 1k or 0.5k block is more effective IMHO.
> I'd be happy if anyone can suggest what the fragment (block/8) in
> ufs2 means and how this parameter works. I know it's better to read the

exactly like a block/cluster in Windows: the fragment is the smallest allocation unit, and a "block" is a group of 8 fragments, to make allocation faster and smarter.

> full ufs2 specification, but hope that someone here can give a hint.
> Please advise on optimizations or tricks.

please DO NOT make a single partition like that. try to divide it into 3-4 partitions. it will work on a single one, but waiting for fsck will kill you ;) AFAIK fsck time grows nonlinearly with fs size to some extent.

the options for newfs will be something like:

newfs -m A -i B -b 4096 -f 512 -U /dev/partition

where A is the free space reserve (percent); with mostly small files and a huge partition, don't worry about setting it to 1 or even 0. B is the size of the disk (in bytes) divided by the number of inodes; the default is probably 2048, but you may use 1024 or 4096. for your case, make a rough estimate of how many files you will have (you said between 4k and 0.5k, but what is the average?). making too many inodes = wasted space (256 bytes per inode in UFS2), making too few = a big problem :)

another question - HOW do you plan to make backups of such data? with dump, rsync, tar etc. it's clearly "mission impossible". feel free to mail me - i have had such cases, not 5E8 but over 1E8 files :)
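As a sanity check on the -i choice discussed above, the arithmetic can be done in plain shell. The 1 TiB capacity and 500 million files are the numbers from the original question; rounding the density down to 2048 is an assumption:

```shell
# Back-of-envelope sizing for newfs -i (bytes of disk per inode).
bytes=$((1024 * 1024 * 1024 * 1024))    # 1 TiB filesystem
files=500000000                         # expected file count
per_file=$((bytes / files))             # disk bytes available per file
echo "$per_file"                        # -> 2199, so -i 2048 fits
# Inode-table overhead at that density, at 256 bytes per UFS2 inode:
inodes=$((bytes / 2048))
echo $((inodes * 256 / 1024 / 1024 / 1024))   # -> 128 (GiB of inode tables)
```

At -i 4096 the inode tables would shrink to 64 GiB but leave only about 268 million inodes, too few for 500 million files; that asymmetry is why over-provisioning inodes is the safer error.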
Re: UFS2 optimization for many small files
> We're going to build a server with some 1TB of over 500 million small
> files with sizes from 0.5k to 4k. I wonder if ufs2 can handle
> this kind of system well. From newfs(8) the min block size is 4k. This
> is not optimal in our case; a 1k or 0.5k block is more effective IMHO.
> I'd be happy if anyone can suggest what the fragment (block/8) in
> ufs2 means and how this parameter works. I know it's better to read the
> full ufs2 specification, but hope that someone here can give a hint.
> Please advise on optimizations or tricks.
> Thank you very much.

Try zfs on amd64, unless your app doesn't work well with zfs or your organization doesn't allow current. Current is remarkably stable, taking into account that zfs is fairly new, ported from Solaris, and running on current. I'm using it on an 8.2 TB Nexsan storage array, with no crashes during testing and a limited time in production.

Some years ago I used FreeBSD (5.2) as an nfs-server (using ufs2) on approx. 15 partitions ranging from 400 GB to 2 TB in size. If the server for some reason had crashed, the webservers were unable to access the nfs-mounted partitions during the period the server did a snapshot of a partition, in order to perform a background fsck, and thus our website was down. So ufs2 does not scale well.

--
regards
Claus

When lenity and cruelty play for a kingdom, the gentlest gamester is the soonest winner.
Shakespeare
Re: UFS2 optimization for many small files
On 2007-Jun-27 14:11:19 +0400, Nguyen Tam Chinh <[EMAIL PROTECTED]> wrote:
> We're going to build a server with some 1TB of over 500 million small
> files with sizes from 0.5k to 4k. I wonder if ufs2 can handle
> this kind of system well.

Short answer: No.

Longer answer: FreeBSD and UFS2 have been tweaked to support large numbers of files in larger filesystems, and there are no hard limits that you will exceed by having 500,000,000 files in a >1TB FS. However, you will not be able to fsck the FS on an i386 system, and you will need a lot of RAM+swap on amd64 or sparc64. fsck will also take a _long_ time (hours) to run. Depending on how the files are organised, you may run into severe performance problems with directory searching.

> From newfs(8) the min block size is 4k. This
> is not optimal in our case; a 1k or 0.5k block is more effective IMHO.
> I'd be happy if anyone can suggest what the fragment (block/8) in
> ufs2 means and how this parameter works.

I suggest you read /usr/share/doc/smm/05.fastfs/paper.ascii.gz - whilst this paper discusses UFS1, the basics remain the same. I have tried using a 4K/0.5K UFS1 filesystem in the past and found the performance was very poor. UFS2 was based on 16K/2K and I would expect it to perform even worse with 4K/0.5K. I would suggest you try 8K/1K.

BTW, in sizing your system, you will need to allow both for the space lost when file sizes are rounded up to a multiple of the fragment size, and for the inode size (256 bytes). If you have 1TB of data, it's likely that you will have another 0.5-1TB of overheads. Overall, I suggest you look at an alternative way to store the data.

--
Peter Jeremy
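The rounding-plus-inode overhead just described can be put in numbers. The 1.5 KB sample file size is a hypothetical value in the middle of the stated 0.5k-4k range; 256 bytes is the UFS2 inode size mentioned above:

```shell
# On-disk cost of one small file = file size rounded up to the
# fragment size, plus a 256-byte UFS2 inode.
size=1500                        # hypothetical 1.5 KB file
for frag in 512 1024 2048; do
    ondisk=$(( (size + frag - 1) / frag * frag + 256 ))
    echo "frag=$frag on-disk=$ondisk"
done
```

For this file the 512-byte fragment costs 1792 bytes on disk versus 2304 for a 2048-byte fragment; scaled to 500 million such files, that 512-byte difference is roughly 250 GB, consistent with the 0.5-1TB overhead estimate above.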
Re: UFS2 optimization for many small files
Nguyen Tam Chinh wrote:
[snipped]
> Please advise on optimizations or tricks.
[...]

Did you already look at 'man 7 tuning'?

HTH,
Philipp
--
www.familie-ost.info/~pj
Re: UFS2 optimization for many small files
In response to "Nguyen Tam Chinh" <[EMAIL PROTECTED]>:
> We're going to build a server with some 1TB of over 500 million small
> files with sizes from 0.5k to 4k. I wonder if ufs2 can handle
> this kind of system well. From newfs(8) the min block size is 4k. This
> is not optimal in our case; a 1k or 0.5k block is more effective IMHO.
> I'd be happy if anyone can suggest what the fragment (block/8) in
> ufs2 means and how this parameter works. I know it's better to read the
> full ufs2 specification, but hope that someone here can give a hint.
> Please advise on optimizations or tricks.
> Thank you very much.

Read the newfs man page. Based on your assessment of your files, I'd go with a block size of 4K and a frag size of 512 bytes. Blocks are broken into frags when a file doesn't fill an entire block. Make sure to set -i to about 250 or so: an inode is needed for each file or directory on the filesystem, so you're liable to run out of inodes with the default values.

Make sure your files are organized in a directory hierarchy. No filesystem that I know of performs well with huge numbers of files in a single directory.

Please don't cross-post. I see no reason to copy stable@ with this message.

--
Bill Moran
http://www.potentialtech.com
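The directory-hierarchy advice can be sketched like this. The two-level 256x256 bucket scheme and the cksum-based hash are assumptions for illustration, not something Cyrus or UFS2 prescribes:

```shell
# Map each file name to a two-level bucket directory so that no single
# directory accumulates millions of entries.
bucket_path() {
    # cksum(1) is POSIX and deterministic; use its CRC as a cheap hash.
    sum=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
    printf '%03d/%03d/%s\n' "$((sum % 256))" "$((sum / 256 % 256))" "$1"
}
bucket_path "message-000123"
```

With 500 million files spread over the 65,536 buckets this produces, each directory holds on the order of 7,600 entries, far more manageable for UFS directory lookups than one flat directory.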
UFS2 optimization for many small files
Greetings,

We're going to build a server with some 1TB of over 500 million small files with sizes from 0.5k to 4k. I wonder if ufs2 can handle this kind of system well. From newfs(8) the min block size is 4k. This is not optimal in our case; a 1k or 0.5k block is more effective IMHO. I'd be happy if anyone can suggest what the fragment (block/8) in ufs2 means and how this parameter works. I know it's better to read the full ufs2 specification, but I hope that someone here can give a hint. Please advise on optimizations or tricks. Thank you very much.

--
With best regards,    | The Power to Serve
Nguyen Tam Chinh      | http://www.FreeBSD.org