Re: [lopsa-tech] Question on SANs, blocksize, replication, and apps

John Stoffel Fri, 29 Jan 2010 09:13:38 -0800

Ski> I have been tussling with a SAN problem for several weeks now and
Ski> would like comments from you folks on it.


This sounds less like a SAN problem than a replication problem
between your Equallogic boxes.

Ski> Situation: Equallogic iscsi san. Primary site has a 2TB volume
Ski> (1.7TB used per the SAN, 1.5TB used per the operating system).
Ski> The volume is used as the datastore for a Scalix (used to be HP
Ski> OpenMail) mail server.  File system is ext3 mounted with defaults
Ski> and yes I plan to redo this over the weekend.  It has about 30
Ski> million files of which 60% are less than 4K in size.  Equallogic
Ski> uses a 64K strip on its arrays with a 256K block size (not
Ski> changeable).  DR site has 6TB allocated for replicas.

How are you doing the replication?  Block level?  Rsync?

Ski> My primary problem is that the replication keeps failing for
Ski> running out of space even though I have 6TB available.  I can do
Ski> the first replica, and sometimes a second or third, but then it
Ski> starts failing due to lack of space.  Change amounts ranges are
Ski> 200 - 500GB.  Even with that I should be able to create a few
Ski> replicas into 6TB (I would think).

I'm not totally surprised. It sounds like you're doing block-level
replication, and your files are all much smaller than the minimum
block size. When only 4K of a block changes, the array still has to
send the entire 64K stripe or 256K block over to the replica system.
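
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. It assumes each changed file sits in its own block and does not
straddle a block boundary, which is my simplification and not anything
Equallogic documents. The function names are made up for illustration.

```python
import math

KIB = 1024

def replicated_bytes(changed_files, file_size, block_size):
    """Bytes shipped to the replica if every changed file dirties its own block(s).

    Assumption: changed files never share a block and never straddle a
    block boundary, so each file dirties ceil(file_size / block_size) blocks.
    """
    blocks_per_file = math.ceil(file_size / block_size)
    return changed_files * blocks_per_file * block_size

def amplification(file_size, block_size):
    """Ratio of bytes replicated to bytes actually changed, for a single file."""
    return replicated_bytes(1, file_size, block_size) / file_size

if __name__ == "__main__":
    for block in (64 * KIB, 256 * KIB):
        factor = amplification(4 * KIB, block)
        print(f"4K change in a {block // KIB}K unit: {factor:.0f}x amplification")
```

Under those assumptions a 4K change costs 16x at 64K granularity and
64x at 256K granularity, so a million changed 4K messages (about 4GB of
real data) could account for roughly 262GB of replicated blocks.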

Does the initial replica take only 2TB of space?  And then the
follow-ons take lots more than the size of the changed files would
suggest?

Ski> What have you experienced with SAN's and applications that have
Ski> millions of small files?  What tricks did you use to make them
Ski> work?  Am I barking up the wrong tree and need to go in a totally
Ski> different direction?

I think you'll need to bite the bullet and do some sort of per-file
replication, just because your usage is killing your SAN replication.
I assume the Scalix mailstore is in maildir format, with each message
in its own file?  Not fun.

I'm in a Netapp world these days, and while I do replication of
volumes with lots and lots of small files, it's not at the level of
churn you're at, nor is it important that I keep multiple snapshots
around.

Turning off atime updates in ext3 might be a good first step; anything
you can do to limit changes to the filesystem would be a good thing.

If you can break your filesystem down into smaller sub-units, that
might let you do rsync-style file-level scans more efficiently.  Or
maybe you just do the initial replica using the block-level stuff, THEN
do a file-level scan on the replica so you don't impact your
production box and keep copies there.
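
As a rough illustration of the per-sub-unit scan idea, here is a
minimal Python sketch. It treats each top-level directory under a
mount point as one sub-unit and lists the files modified since a given
time. The function names and the one-directory-per-sub-unit layout are
assumptions for the example; I don't know how Scalix actually lays out
its store.

```python
import os

def changed_since(root, cutoff):
    """Yield paths under root whose mtime is newer than cutoff (epoch seconds)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_mtime > cutoff:
                    yield path
            except OSError:
                # The file vanished between listing and stat; skip it.
                continue

def scan_by_subunit(root, cutoff):
    """Map each top-level subdirectory of root to its sorted list of changed files.

    Assumption: every top-level directory is an independent sub-unit
    that can be scanned (and copied) on its own.
    """
    result = {}
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                result[entry.name] = sorted(changed_since(entry.path, cutoff))
    return result
```

Calling scan_by_subunit() with the mount point of the replica and a
cutoff of time.time() - 86400 would give one day's worth of changed
files per sub-unit, and each list could then be handed to whatever
file-level copy tool you prefer.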

More details though!

Thanks,
John
_______________________________________________
Tech mailing list
[email protected]
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
 http://lopsa.org/
