There are many free open-source and enterprise solutions as well; in any case you have to hire a professional Linux consultant and sysadmin to handle that for you.
Speaking of GPFS: you would have to call IBM to get a quote and the storage equipment; it is priced case by case. Their support is very good; they can even log in to your servers to resolve a problem. GPFS, like any other high-performance distributed file system, uses separate storage for metadata, the log, and the actual data. As I recall, we were writing 10k small and medium files every 2-3 hours without any problems, but for that many tiny files you have to make the metadata storage really big.

Sincerely,
Alexandr Normuradov
425-522-3703

On 5 April 2011 21:00, Derek Simkowiak <[email protected]> wrote:
> /Database servers write lots (maybe 10k/day?) of little PDF's to shared filesystem; apps servers read'em, print'em, etc./
>
> 10k/day averages to one PDF file every ~9 seconds. Even if your server load peaks to 900% of average during business hours, that's still only one PDF file every second or so. Any modern Linux box with a new hard drive and GigE could handle that... if you had a dedicated shared file server running NFS or Samba, I think you'd be fine.
>
> For the mirroring -- assuming regular backups won't do -- consider using DRBD:
>
> http://blog.mydream.com.hk/howto/howto-create-gfs-on-drbd-network-disk-mirroring
> http://www.redhat.com/archives/linux-cluster/2007-December/msg00083.html
> http://roaksoax.wordpress.com/2008/07/31/installing-drbd-on-hardy/
>
> I've been waiting for a cluster project to test it out on. The above links recommend it with GFS, for "dual primary" (similar to "multi master") clusters where two servers can write at the same time. In this case, you'd have true high availability -- the "backup" would always be up and running. You could even load balance between the two running servers...
>
> But since you said you don't need "dual primary", you could also just use ext4 and then mount the DRBD device as a read-only mount on the backup server. Since the backup never writes to the filesystem (unless it was reconfigured to be the master), you wouldn't need to deal with GFS. Ext4 has proven to be a performance fiend... I switched to it a few months ago and I'm very happy with it. Clients can then just use NFS or Samba. (I don't always use NFS. But when I do, I prefer user-mode NFS.)
>
> Here's an article from 2007 which includes Ext4 performance data:
>
> http://ciar.org/ttk/zfs-xfs-ext4.html
>
> --Derek
>
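For what it's worth, the DRBD setup Derek describes comes down to one resource definition shared by both nodes. Below is a rough sketch only, assuming DRBD 8.3-era syntax and made-up hostnames, devices, and addresses; check the drbd.conf man page for your version before using anything like it:

  # DRBD resource definition (lives in drbd.conf or /etc/drbd.d/, depending on version)
  resource pdfstore {
    protocol C;                    # synchronous replication
    net {
      allow-two-primaries;         # only for the dual-primary + GFS variant;
    }                              # drop this line for plain primary/secondary + ext4
    syncer {
      rate 40M;                    # cap background resync bandwidth
    }
    on fileserver1 {
      device    /dev/drbd0;
      disk      /dev/sdb1;         # backing partition on node 1
      address   10.0.0.1:7789;
      meta-disk internal;
    }
    on fileserver2 {
      device    /dev/drbd0;
      disk      /dev/sdb1;         # backing partition on node 2
      address   10.0.0.2:7789;
      meta-disk internal;
    }
  }

With dual-primary you would put GFS on /dev/drbd0 and mount it read-write on both nodes (GFS also needs the cluster locking infrastructure set up, which is its own project); with primary/secondary you would format /dev/drbd0 as ext4, mount it only on the primary, and promote the standby with "drbdadm primary pdfstore" if the primary dies.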
> On 04/05/2011 07:59 PM, Glenn Stone wrote:
>> On Tue, Apr 05, 2011 at 07:14:51PM -0700, Alexandr Normuradov wrote:
>>> From real-world experience: we used IBM's GPFS. Very stable, has multiple client drivers and robust manuals. Costs a lot but actually works.
>>
>> How much is "a lot"? he asks curiously...
>>
>> Also, how good is IBM's support for this? That was a concern the boss had (and so do I), that Coda's support would be sketchy at best... I've dealt at the driver level before but it's been a *long* time...
>>
>> More info: Database servers write lots (maybe 10k/day?) of little PDF's to shared filesystem; apps servers read'em, print'em, etc. I've not seen anything indicating anyone has optimized for many small files; much to the contrary (optimized for database use, few large files).
>>
>> -- Glenn
>>
>>> On Tuesday, 5 April 2011, Derek Simkowiak <[email protected]> wrote:
>>>>
>>>> I looked at CODA for a cluster back in 2004 and decided I'd never use it.
>>>>
>>>> It's far more complex than any other filesystem I've worked with. It's the only one that requires a special "log" partition, a special "metadata" partition, and requires you to enter hex addresses for the starting locations of certain data blocks.
>>>>
>>>> Consider this paragraph from the CODA manual that tells you how big the RVM partition should be:
>>>>
>>>> As a rule of thumb, you will need about 3-5% of the total file data space for recoverable storage. We currently use 4% as a good value under most circumstances. In our systems the data segment is 90Meg for approximately 3.2 gigabytes of disk space. By making it smaller, you can reduce server startup time. However, if you run out of space on the RVM Data partition, you will be forced to reinitialize the system, a costly penalty. So, plan accordingly.
>>>>
>>>> Okay, so... I've never used CODA before, and I'm not sure what my filesystem will look like. There is no way their ancient example numbers for "3.2 gigabytes" scale up to today's filesystems that are closer to 3.2 terabytes. How am I supposed to know what initial values to choose? If I guess wrong, it'll only destroy the entire filesystem. I can understand inode size... but I can't understand this.
>>>>
>>>> And configuring the RVM (metadata) looks like this (again from the manual -- those hex values are supposed to be magically chosen and entered by the user):
>>>>
>>>> $ rdsinit /dev/hdc1 /dev/sdb1
>>>>
>>>> Enter the length of the device /dev/sdb1: 119070700
>>>>
>>>> Going to initialize data file to zero, could take awhile.
>>>> done.
>>>> rvm_initialize succeeded.
>>>>
>>>> starting address of rvm: 0x50000000
>>>> heap len: 0x1000000
>>>> static len: 0x100000
>>>> nlists: 80
>>>> chunksize: 32
>>>>
>>>> rds_zap_heap completed successfully.
>>>> rvm_terminate succeeded.
>>>>
>>>> [Note: Use of the decimal value for the length of the device and the use of hex values for the address and lengths of the next three values.]
>>>>
>>>> I absolutely love that little note... use a decimal value for the first value, and hex for everything else. Don't forget! :) And the manual says that they use 0x50000000 (or was it 0x500000000? can't remember) for Intel-based architectures running Linux or FreeBSD... but nothing about other platforms. The tools, documentation, and skilled technicians necessary in an emergency just don't seem to be there for CODA.
>>>>
>>>> In short, managing CODA seems to be about on par with managing a big database. Too complex, too many options, and you need an in-house expert to keep the thing running.
>>>>
>>>>> /A big item on our wishlist is the ability to both have multiple hosts writing to the distributed filesystem/
>>>>
>>>> NFS or Samba. Cross-platform, any error message you come across is guaranteed to show up in a Google search, and any version of Linux comes with either of these ready to go.
>>>>
>>>> (There are others, like SSHFS and WebDAV, but they don't support the concept of UIDs and GIDs.)
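On the NFS option: exporting the shared PDF directory to the app servers is only a few lines of configuration. A minimal sketch with made-up hostnames and paths (adjust the options to taste; "man exports" has the full list):

  # /etc/exports on the file server
  /srv/pdfs    appserver1(rw,sync,no_subtree_check) appserver2(rw,sync,no_subtree_check)

  # on each app server (or the equivalent entry in /etc/fstab)
  mount -t nfs fileserver:/srv/pdfs /mnt/pdfs

After editing /etc/exports, "exportfs -ra" reloads the export table without restarting the NFS server.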
>>>>> /*and* have read-only backup copies on standby hosts (which could then be converted to active in the event of catastrophe)/
>>>>
>>>> How will this filesystem be used? Is this for a company file server, or for some real-time shared storage for a public server cluster?
>>>>
>>>> Note that hot mirrors (like CODA or RAID) are only part of the solution. They won't protect you if you accidentally delete the wrong file, or if you get rooted by a script kiddie.
>>>>
>>>> I use rsnapshot (which does rsync incrementals like Apple's Time Machine), run once every hour, to an offsite backup server. My backups are always online in a read-only fashion, ready for use at any time. If my primary server melts, then I've lost (at most) an hour of work. Plus, I have incrementals -- I can instantly see any of my files as they existed 3 hours, 4 days, or 5 months ago. Using SSH keys it's all encrypted and fully automatic. And if there's a disaster, I'm not dealing with any magic partition sizes or other such nonsense -- it's just files on a disk.
>>>>
>>>> --Derek
>>>>
>>>> On 04/05/2011 04:27 PM, Glenn Stone wrote:
>>>>
>>>> $NEWCOMPANY is having major issues with OCFS, and I'm looking into alternatives for replacing it. A big item on our wishlist is the ability to both have multiple hosts writing to the distributed filesystem, *and* have read-only backup copies on standby hosts (which could then be converted to active in the event of catastrophe). Coda seems to fit this bill, from what I've been able to google up. I'm not, however, able to determine if this thing is still in an R&D phase, or ready for prime time; it seems to maybe kinda slowly still be being worked on? (Latest RPMs are dated 1/26/2010.)
>>>>
>>>> Exploring alternatives,
>>>> Glenn
>>>>
>>>> Listen: I'm a politician, which means I'm a cheat and a liar, and when I'm not kissing babies I'm stealing their lollipops. But it also means I keep my options open. -- Jeffery Pelt, "Red October"
>>>>
>>> --
>>> Sincerely,
>>> Alexandr Normuradov
>>> 425-522-3703
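P.S. The rsnapshot setup Derek describes above is basically a short config file plus cron entries on the backup box. A rough sketch, assuming the backup server pulls from the primary over SSH keys; hostnames and paths are made up, and note that rsnapshot.conf requires tabs (not spaces) between fields:

  # /etc/rsnapshot.conf on the offsite backup server
  snapshot_root   /backup/snapshots/
  cmd_ssh         /usr/bin/ssh
  # older rsnapshot versions spell "retain" as "interval"
  retain          hourly  24
  retain          daily   7
  retain          monthly 6
  backup          root@fileserver:/srv/pdfs/      fileserver/

  # /etc/cron.d/rsnapshot on the same machine
  0 * * * *     root    /usr/bin/rsnapshot hourly
  30 3 * * *    root    /usr/bin/rsnapshot daily
  45 4 1 * *    root    /usr/bin/rsnapshot monthly

The snapshots under /backup/snapshots/hourly.0, hourly.1, and so on are plain directory trees, so restoring is just an rsync or cp back to the primary; no magic partition sizes involved, as Derek says.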
