If you have one NetApp and can make it the 'centre of the data
universe', you can use snapshots to help your syncing without buying
another NetApp or paying a license fee. You don't get 'live' copies at
all times, but keeping live copies can cause significant overhead on a
pair of NetApps anyway.
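
As a rough sketch of what snapshot-assisted syncing can look like: on an
NFS mount, the filer normally exposes its snapshots under a read-only
.snapshot directory, so rsync can copy from a frozen view instead of the
live tree. The mount point, snapshot name, and destination below are
made-up examples, and the helper function is hypothetical; check what
your own filer actually calls its snapshots.

```python
def snapshot_rsync_command(mount_point, snapshot_name, dest):
    """Build an rsync command that copies from a snapshot, not the live tree.

    Assumes snapshots are visible under <mount_point>/.snapshot/<name>/.
    """
    source = f"{mount_point.rstrip('/')}/.snapshot/{snapshot_name}/"
    # -a preserves permissions and timestamps (and ownership when run as
    # root); the trailing slash on the source copies the directory's
    # contents rather than the directory itself.
    return ["rsync", "-a", source, dest]


if __name__ == "__main__":
    # Hypothetical paths and snapshot name, for illustration only.
    cmd = snapshot_rsync_command("/mnt/filer/vol1", "nightly.0", "/backup/vol1/")
    # Print the command; hand it to subprocess.run(cmd, check=True) to run it.
    print(" ".join(cmd))
```

Because the snapshot doesn't change while rsync walks it, the copy is
consistent as of the snapshot time, which is the trade for not having a
'live' replica.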

You can segment your rsyncs. Lots of rsyncs running at once will also
add to your overhead, but you may be able to find a workable balance
between parallel rsyncs and system load. Do make sure to delete any
snapshots that you are no longer using, especially if a lot of changes
are occurring on the filer. Keeping old snapshots around too long on a
busy filer can eat into your 'live' disk space, as the filer will
consume whatever it needs to maintain the snapshots. (For 'transient'
filers, i.e., work volumes, we turn off snapshots altogether to prevent
this issue.)
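
To make the 'segment your rsyncs' idea concrete, here is a minimal
sketch that runs one rsync per top-level directory with a cap on how
many run at once. The function names, paths, and segment names are all
hypothetical, and max_parallel is the knob you would tune against system
load. Note that --delete removes files on the destination that no longer
exist on the source, so leave it out if that isn't what you want.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_rsync_commands(source_root, dest_root, segments):
    """Return one rsync argument list per top-level segment."""
    commands = []
    for name in segments:
        src = f"{source_root.rstrip('/')}/{name}/"
        dst = f"{dest_root.rstrip('/')}/{name}/"
        commands.append(["rsync", "-a", "--delete", src, dst])
    return commands


def run_segmented(commands, max_parallel=2, runner=subprocess.call):
    """Run the commands, at most max_parallel at a time.

    Returns the exit codes in the same order as the commands.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(runner, commands))


if __name__ == "__main__":
    # Hypothetical layout: one rsync per directory under the volume.
    cmds = build_rsync_commands("/mnt/filer/vol1", "/backup/vol1",
                                ["home", "projects", "scratch"])
    for cmd in cmds:
        print(" ".join(cmd))
    # exit_codes = run_segmented(cmds, max_parallel=2)  # actually run them
```

Raising max_parallel shortens the sync window at the cost of more load
on the filer; the right number is something you find by measuring.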

It really depends on how much data is changing and at what rate - it's 
expensive to replicate a lot of change (in CPU overhead and bandwidth) 
no matter what the underlying technology is. You are always going to 
have trade-offs between the amount of data that you need to replicate, 
how often it gets updated, CPU overhead, and bandwidth consumed. (Oh, 
and how much you're willing to pay NetApp or other vendors. :-)

You may want to prioritise your data, and have different replication 
schedules based on this priority - if some data changes frequently but 
can stand having updates 'lag' for a longer period of time, you can save 
a lot of overhead by not replicating it as diligently as other data 
whose replication is more time sensitive.
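
One way to picture priority-based schedules is to give each dataset a
maximum acceptable lag and only replicate what has gone past it. This is
only a sketch: the Dataset fields, the names, and the lag figures are
invented for illustration, not taken from any real setup.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    max_lag_minutes: int      # how stale the replica is allowed to get
    last_synced_minute: int   # when it was last replicated


def due_for_replication(datasets, now_minute):
    """Return datasets whose replica has exceeded its allowed lag.

    The most time-sensitive data (smallest allowed lag) comes first.
    """
    due = [d for d in datasets
           if now_minute - d.last_synced_minute >= d.max_lag_minutes]
    return sorted(due, key=lambda d: d.max_lag_minutes)


if __name__ == "__main__":
    # Hypothetical datasets with invented lag tolerances.
    datasets = [
        Dataset("home", max_lag_minutes=15, last_synced_minute=100),
        Dataset("scratch", max_lag_minutes=240, last_synced_minute=0),
        Dataset("archive", max_lag_minutes=1440, last_synced_minute=0),
    ]
    for d in due_for_replication(datasets, now_minute=300):
        print(d.name)
```

Data that changes a lot but tolerates a long lag simply gets a large
max_lag_minutes, so it is picked up less often.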

Another trade-off: Your time and effort in setting up and maintaining 
the replication system - I assume that you don't work for free! (I 
certainly don't! :-)

You have other options with SAN storage, but it isn't necessarily better
or cheaper (usually not), just different. Replication is usually done at
a different level (typically blocks or LUNs rather than files), with
different overheads and impacts. You really do need to consider the
technology that you are using when looking at replication, because every
technology has different replication strengths and weaknesses - and
costs. Since the costs are significant, they can override all of the
other considerations, so it pays to investigate new technologies with
this in mind.

- Richard


Edward Ned Harvey wrote:
>> Other options that I can think of for a random "would like some kind
>> of replication thingy without thrashing my filesystem regularly"
>> thingy:
>> - Your SAN probably has a replication engine, use that (for vast
>> quantities of random unstructured end-user data, this is probably the
>> best/easiest method)
>>     
>
> In my present scenario (which I'm not necessarily trying to improve), I have 
> a NetApp.  Not a SAN.  I'd like this discussion to be general anyway, because 
> it helps me know how I can do better next time.  (Which is not to say what we 
> have is bad - it's good - but I always want to make it better.)  I am aware 
> of SnapMirror (or whatever they call it), which allows you to snapshot one 
> NetApp onto another NetApp.  But you have to buy another NetApp machine plus 
> the SnapMirror license.
>
> _______________________________________________
> Tech mailing list
> [email protected]
> http://lopsa.org/cgi-bin/mailman/listinfo/tech
> This list provided by the League of Professional System Administrators
>  http://lopsa.org/
>   

