>>>>> "Brad" == Brad Knowles <[email protected]> writes:
Brad> on 2/17/09 3:28 PM, John Stoffel said:

>> Has anyone done a migration from a mostly NFS based Netapp setup to
>> one with the new Sun ZFS solution on the 7x00 series arrays?  We're
>> thinking about doing this at $WORK due to the *large* cost savings.
>> We're looking at around 200Tb of disk spread across four to five
>> sites, with reliability and performance the two main drivers.

Brad> You say that reliability and performance are your main drivers,
Brad> but you give the main reason you're looking at ZFS as being
Brad> cost.  That sounds to me like cost is really the main driver.

Well, when the cost is so much lower, it's a huge incentive to really
look at your requirements and see what you can change, or whether your
assumptions are wrong and need to be updated.

In this case, I know that Netapp is solid (mostly; don't ever use
their X.0 or even X.1 releases if you can help it, they're just not as
solid as they used to be...) and performs well, but they're expensive.
And they have this limit of 16Tb for a contiguous chunk of data.  By
that I mean I can only create containers of 16Tb in size, and in each
container I can create one or more flexvols, with just a single level
of qtrees in each volume.  This used to be nice and flexible, but now
that single projects can take 4-8Tb of disk space, it's getting harder
and harder to allocate and manage disk space easily.

ZFS and Sun look to fix that, because they can grow past the 16Tb
level.  Yes, this can/will be a problem for backups, but if we can set
up a single pool of storage which lets us put volume(s) in there of
arbitrary sizes, we'll just back up the individual volumes and deal
with the performance issues.
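Roughly what I have in mind on the ZFS side is one big pool per site
with a filesystem per project, sized by quota instead of carved into
fixed 16Tb containers.  I haven't run this against a 7x00 box yet, so
this is just my reading of the zpool/zfs man pages, and the pool,
project and device names below are all made up:

    # one large pool across all the disks (device names invented)
    zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0

    # one filesystem per project; quotas can be grown or shrunk later
    zfs create tank/proj1
    zfs set quota=8T tank/proj1
    zfs create tank/proj2
    zfs set quota=4T tank/proj2

    # export over NFS straight from the filesystem property
    zfs set sharenfs=on tank/proj1
    zfs set sharenfs=on tank/proj2

That would give us one filesystem per project to back up and snapshot
separately, all drawing on the same pool of free space.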
Brad> In our case, we get really, really good prices from Dell on
Brad> their storage, in part because Michael Dell studied at UT
Brad> Austin, and that apparently really means something to their
Brad> salespeople.  Because of that, we also get really, really good
Brad> prices from NetApp, otherwise there wouldn't be any NetApp
Brad> equipment on campus.

We've gotten good prices from Netapp in the past as well, especially
when we place large multi-million dollar orders.  But management is
asking us to cut costs, and this looks to be one way to do so.  It
*is* a big leap, though, and possibly a huge problem just from the
migration of data from one system to the other.

Brad> We don't normally get such good deals from Sun on storage, so in
Brad> this space there's not that much price advantage for Sun versus
Brad> Netapp.

Have you looked at the new 7x00 series servers from Sun lately?  The
prices really are quite compelling, as are the features of ZFS.

Brad> Ironically, when we price out Sun servers using the standard
Brad> retail cost, that still comes out better than the discounted
Brad> pricing we get from Dell.  And Sun usually has a matching grant
Brad> program once a year for participating educational institutions,
Brad> so you can get twice as much equipment for the same price.

Yup, DEC used to have the same thing back when they were around and I
worked at a university.

Brad> And these Sun servers usually have more disk and more RAM than
Brad> the Dell equivalents, in half the rack space (1U instead of 2U,
Brad> 2U instead of 4U, etc...).  You can also manage these Sun
Brad> servers remotely using ALOM or ILOM from any web browser that
Brad> can do Java, whereas the Dell DRAC cards can only be managed
Brad> remotely from IE on Windows, because they do Active-X.

Brad> Can you tell that I've spent a lot of time on the respective
Brad> websites lately, trying every which way to maximize the amount
Brad> of equipment we can get for our budget?

Exactly, just what we're trying to do.

Brad> For my part, reliability and performance are two of the three
Brad> biggest questions I've got with regards to the Sun 7000 series.

I'm not as worried about performance, since we do large simulations
where the time to read/write the data is totally dominated by the
compute time.  So for us I think it's Reliability -> Cost ->
Performance.

Brad> The third question has to do with functionality, and how we get
Brad> the equivalent features of things like SnapMirror, SnapClone,
Brad> de-duplication, MetroCluster, etc....  HSM is nice, and ZFS can
Brad> do snapshots, but what about the rest?  How does their
Brad> replication compare to SnapMirror?  How does their clustering
Brad> compare?  How does their compression compare to NetApp de-dupe?

Yup, lots of questions.  And I do have questions about their
replication, especially across the WAN.  I need to pull down and fire
up some simulators on each coast to test this out.  They let you make
a 30Gb test area for simulation with the shipping code, which is more
than enough data to test out replication.
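The test I have in mind is basically raw snapshot-based send/receive
between the two coasts, which is ZFS's rough analogue of SnapMirror.
I don't know yet how much of this the 7x00 management interface wraps
for you versus doing it by hand, and the hostnames and pool names
below are made up:

    # seed the remote copy from an initial snapshot
    zfs snapshot tank/proj1@base
    zfs send tank/proj1@base | ssh east-filer zfs receive backup/proj1

    # afterwards, ship only the blocks changed since the last snapshot
    zfs snapshot tank/proj1@today
    zfs send -i tank/proj1@base tank/proj1@today | \
        ssh east-filer zfs receive backup/proj1

How well that incremental stream behaves over a T3 with our change
rate is exactly what I want to measure in the simulators.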
Brad> Traditionally, one problem with mixing NFS, CIFS and iSCSI on
Brad> the same platform is that you can only make storage available
Brad> via one of these protocols, and you can't easily share something
Brad> via both NFS and CIFS, because the NTFS ACLs interfere.  I'm
Brad> told that there is a way you can set up Netapp devices so that
Brad> they actually work in these situations, although this type of
Brad> configuration is rather rigid and brittle -- at least it's
Brad> do-able.

It's not a great idea to use mixed-security volumes which are accessed
by both Windows and Unix clients; it just doesn't work well over NFSv3
connections.  This might change with the arrival of NFSv4 and its ACL
support, but that's going to take time to settle down.  Again, this
isn't a big issue for me, since we don't use mixed-mode volumes at
this time, though we are thinking about it purely to let us use our
CommVault backup system to do HSM for us.

Brad> Can Sun do the same?

Don't care, it's not in my needs.

Brad> And where does iSCSI fit in?  Is it a bastard step child, or is
Brad> everything based on iSCSI internally?

iSCSI is there, but it's not clear how well it's integrated.  They say
they support it, and you can re-size volumes, etc.  This too needs to
be investigated, but that's why I'm asking around for any input from
the rest of you guys!  *grin*

>> We do have some iSCSI and some CIFS volumes, but not large numbers;
>> mostly we're NFS for compute clusters, home dirs, etc.  We generally
>> just export one *large* NFS mount point at each site for all data.
>> It makes life simpler so we don't have to shuffle data/volumes
>> around.

Brad> For doing compute clusters, note that Ranger uses Thumpers
Brad> running GLustre.  In the kind of situation you describe, I'd
Brad> definitely take a look at something like Lustre.

Who/what is Ranger?  Also, we're not talking *large* clusters, or high
IO clusters.  We're doing EDA designs: lots of data in some ways, but
mostly lots of simulations of chips.

>> - Unknown performance of volume replication across WAN (NetApp
>>   SnapVault sucks across WAN, known Con :-)

Brad> Have you tried WAN accelerators here?  There are some that are
Brad> specifically designed to optimize storage across the WAN, and
Brad> I'd check into whether or not they can help with SnapVault and
Brad> SnapMirror performance.

We've got WAN accelerators in place already and they do help, but
SnapVault still sucks.  I've been told that SnapMirror uses a
different TCP stack and gives better performance, but when we went
with SnapVault four years ago it was a cheaper license than SnapMirror
and supposedly fit our usage model (Disaster Recovery) better.  I've
also been told, but haven't confirmed, that if you set up SnapVault
using Multi-Path IP it uses yet another TCP stack which is also more
efficient.

But doing DR across a WAN using T3s and 16Tb of data still sucks,
especially when they can create/delete 2Tb in a single day.  It's
just amazing how quickly you fall behind in your replication when TCP
over a long, fat pipe just sucks performance-wise.

Thanks for the feedback,
John

_______________________________________________
Tech mailing list
[email protected]
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/
