>>>>> "Brad" == Brad Knowles <[email protected]> writes:
Brad> on 2/17/09 3:28 PM, John Stoffel said:

>> Has anyone done a migration from a mostly NFS based Netapp setup to
>> one with the new Sun ZFS solution on the 7x00 series arrays?  We're
>> thinking about doing this at $WORK due to the *large* cost savings.
>> We're looking at around 200Tb of disk spread across four to five
>> sites, with reliability and performance the two main drivers.

Brad> You say that reliability and performance are your main drivers,
Brad> but you give the main reason you're looking at ZFS as being
Brad> cost.  That sounds to me like cost is really the main driver.

Well, when the cost is so much lower, it's a huge incentive to really
look at your requirements and see what you can change, or whether your
assumptions are wrong and need to be updated.

In this case, I know that Netapp is solid (mostly; don't ever use
their X.0 or even X.1 releases if you can help it, they're just not as
solid as they used to be...) and performs well, but they're expensive.
And they have this limit of 16Tb for a contiguous chunk of data.  By
that I mean I can only create containers of 16Tb in size, and in each
container I can create one or more flexvols, with just a single level
of qtrees in each volume.  This used to be nice and flexible, but now
that single projects can take 4-8Tb of disk space, it's getting harder
and harder to allocate and manage disk space easily.

ZFS and Sun look to fix that, because they can grow past the 16Tb
level.  Yes, this can/will be a problem for backups, but if we can set
up a single pool of storage which lets us put volume(s) in there of
arbitrary sizes, we'll just back up the individual volumes and deal
with the performance issues.
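Roughly what I have in mind on the ZFS side is one big pool per site
with a filesystem per project, sized by quota instead of carved into
fixed 16Tb containers.  I haven't run this against a 7x00 box yet, so
this is just my reading of the zpool/zfs man pages, and the pool,
project and device names below are all made up:

    # one large pool across all the disks (device names invented)
    zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0

    # one filesystem per project; quotas can be grown or shrunk later
    zfs create tank/proj1
    zfs set quota=8T tank/proj1
    zfs create tank/proj2
    zfs set quota=4T tank/proj2

    # export over NFS straight from the filesystem property
    zfs set sharenfs=on tank/proj1
    zfs set sharenfs=on tank/proj2

That would give us one filesystem per project to back up and snapshot
separately, all drawing on the same pool of free space.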
Brad> In our case, we get really, really good prices from Dell on
Brad> their storage, in part because Michael Dell studied at UT
Brad> Austin, and that apparently really means something to their
Brad> salespeople.  Because of that, we also get really, really good
Brad> prices from NetApp, otherwise there wouldn't be any NetApp
Brad> equipment on campus.

We've gotten good prices from Netapp in the past as well, especially
when we place large multi-million dollar orders.  But management is
asking us to cut costs, and this looks to be one way to do so.  It
*is* a big leap, though, and possibly a huge problem just from the
migration of data from one system to the other.

Brad> We don't normally get such good deals from Sun on storage, so in
Brad> this space there's not that much price advantage for Sun versus
Brad> Netapp.

Have you looked at the new 7x00 series servers from Sun lately?  The
prices really are quite compelling, as are the features of ZFS.

Brad> Ironically, when we price out Sun servers using the standard
Brad> retail cost, that still comes out better than the discounted
Brad> pricing we get from Dell.  And Sun usually has a matching grant
Brad> program once a year for participating educational institutions,
Brad> so you can get twice as much equipment for the same price.

Yup, DEC used to have the same thing back when they were around and I
worked at a university.

Brad> And these Sun servers usually have more disk and more RAM than
Brad> the Dell equivalents, in half the rack space (1U instead of 2U,
Brad> 2U instead of 4U, etc...).  You can also manage these Sun
Brad> servers remotely using ALOM or ILOM from any web browser that
Brad> can do Java, whereas the Dell DRAC cards can only be managed
Brad> remotely from IE on Windows, because they do Active-X.

Brad> Can you tell that I've spent a lot of time on the respective
Brad> websites lately, trying every which way to maximize the amount
Brad> of equipment we can get for our budget?

Exactly, just what we're trying to do.

Brad> For my part, reliability and performance are two of the three
Brad> biggest questions I've got with regards to the Sun 7000 series.

I'm not as worried about performance, since we do large simulations
where the time to read/write the data is totally dominated by the
compute time.  So for us I think it's Reliability -> Cost ->
Performance.

Brad> The third question has to do with functionality, and how we get
Brad> the equivalent features of things like SnapMirror, SnapClone,
Brad> de-duplication, MetroCluster, etc....  HSM is nice, and ZFS can
Brad> do snapshots, but what about the rest?  How does their
Brad> replication compare to SnapMirror?  How does their clustering
Brad> compare?  How does their compression compare to NetApp de-dupe?

Yup, lots of questions.  And I do have questions about their
replication, especially across the WAN.  I need to pull down and fire
up some simulators on each coast to test this out.  They let you make
a 30Gb test area for simulation with the shipping code, which is more
than enough data to test out replication.
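The test I have in mind is basically raw snapshot-based send/receive
between the two coasts, which is ZFS's rough analogue of SnapMirror.
I don't know yet how much of this the 7x00 management interface wraps
for you versus doing it by hand, and the hostnames and pool names
below are made up:

    # seed the remote copy from an initial snapshot
    zfs snapshot tank/proj1@base
    zfs send tank/proj1@base | ssh east-filer zfs receive backup/proj1

    # afterwards, ship only the blocks changed since the last snapshot
    zfs snapshot tank/proj1@today
    zfs send -i tank/proj1@base tank/proj1@today | \
        ssh east-filer zfs receive backup/proj1

How well that incremental stream behaves over a T3 with our change
rate is exactly what I want to measure in the simulators.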
Brad> Traditionally, one problem with mixing NFS, CIFS and iSCSI on
Brad> the same platform is that you can only make storage available
Brad> via one of these protocols, and you can't easily share something
Brad> via both NFS and CIFS, because the NTFS ACLs interfere.  I'm
Brad> told that there is a way you can set up Netapp devices so that
Brad> they actually work in these situations, although this type of
Brad> configuration is rather rigid and brittle -- at least it's
Brad> do-able.

It's not a great idea to use mixed-security volumes which are accessed
by both Windows and Unix clients; it just doesn't work well over NFSv3
connections.  This might change with the arrival of NFSv4 and its ACL
support, but that's going to take time to settle down.  Again, this
isn't a big issue for me, since we don't use mixed-mode volumes at
this time, though we are thinking about it purely to let us use our
CommVault backup system to do HSM for us.

Brad> Can Sun do the same?

Don't care, it's not in my needs.

Brad> And where does iSCSI fit in?  Is it a bastard step child, or is
Brad> everything based on iSCSI internally?

iSCSI is there, but it's not clear how well it's integrated.  They say
they support it, and you can re-size volumes, etc.  This too needs to
be investigated, but that's why I'm asking around for any input from
the rest of you guys!  *grin*

>> We do have some iSCSI and some CIFS volumes, but not large numbers;
>> mostly we're NFS for compute clusters, home dirs, etc.  We generally
>> just export one *large* NFS mount point at each site for all data.
>> It makes life simpler so we don't have to shuffle data/volumes
>> around.

Brad> For doing compute clusters, note that Ranger uses Thumpers
Brad> running GLustre.  In the kind of situation you describe, I'd
Brad> definitely take a look at something like Lustre.

Who/what is Ranger?  Also, we're not talking *large* clusters, or high
IO clusters.  We're doing EDA designs: lots of data in some ways, but
mostly lots of simulations of chips.

>> - Unknown performance of volume replication across WAN (NetApp
>>   SnapVault sucks across WAN, known Con :-)

Brad> Have you tried WAN accelerators here?  There are some that are
Brad> specifically designed to optimize storage across the WAN, and
Brad> I'd check into whether or not they can help with SnapVault and
Brad> SnapMirror performance.

We've got WAN accelerators in place already and they do help, but
SnapVault still sucks.  I've been told that SnapMirror uses a
different TCP stack and gives better performance, but when we went
with SnapVault four years ago it was a cheaper license than SnapMirror
and supposedly fit our usage model (Disaster Recovery) better.  I've
also been told, but haven't confirmed, that if you set up SnapVault
using Multi-Path IP it uses yet another TCP stack which is also more
efficient.

But doing DR across a WAN using T3s and 16Tb of data still sucks,
especially when they can create/delete 2Tb in a single day.  It's
just amazing how quickly you fall behind in your replication when TCP
over a long, fat pipe just sucks performance-wise.

Thanks for the feedback,
John

_______________________________________________
Tech mailing list
[email protected]
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/
