Yes, you can use NAS. As with SAN, the key is adequate performance. This is the tricky part, because getting that performance is very difficult and very expensive. When internal policies and infrastructure dictate SAN or NAS, dedicated high-quality NAS can often be preferable to shared, under-provisioned SAN, and cheaper as well.
As Mike pointed out, can you maintain HA with your NAS setup? This is particular to the unit. Without a clustered file system, you won't have multiple nodes pointing at the same volume. Each node should receive dedicated pools and bandwidth. You should not stripe across all volumes and then thin provision out of a single pool.

No CIFS, Windows shares, or SMB. NFS has performance limitations even with 10GbE under Linux. Test, test, test. It is often the overlay services of "fancy" NAS that kill performance: dedup, compression, site-to-site replication, etc.

Is this a shared resource? If so, how do you ensure enough bandwidth for the MarkLogic nodes? How do you ensure you don't destroy the performance of other nodes? You should have explicit visibility and control of each volume.

An example of successful SLAs can be found in Amazon's Provisioned IOPS storage. While neither SAN nor NAS, it sets a standard for what you should expect/demand from shared storage:

- explicit bandwidth guarantee to the storage pool (110 MB/sec for most high-end instances, coincidentally the practical throughput limit for many NFS implementations).
- guaranteed IOPS at large block sizes for each volume. You need 20 MB/sec per forest. 16 forests a node, not unreasonable for a nice system with local storage, comes to 320 MB/sec, or about 80,000 IOPS at 4k blocks per node. For a three-node cluster that is 240,000 IOPS at 4k blocks from your NAS. I think you'll find local storage much more cost effective.
- sustained SLA compliance even if maxing out all guarantees.

A common pattern is that a MarkLogic user will ask for that much bandwidth (80K 4k IOPS per node) and then get laughed at by the storage admins. It's far outside anything they have experience with. MarkLogic can end up looking more like a video streaming load than like Oracle. It really uses that much bandwidth, and if the total provided is less, performance can drop off a cliff.

We are developing guidelines now for AWS storage, but one rule of thumb is probably useful for NAS also.
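To make the sizing arithmetic above easier to rerun with your own numbers, here is a minimal Python sketch. It assumes decimal units (1 MB = 1000 KB) and a three-node cluster, as in the original question; the constant and function names are made up for illustration and are not part of any MarkLogic tool.

```python
# Back-of-envelope storage sizing from the figures in this message.
# Assumption: decimal units, 1 MB = 1000 KB.
MB_PER_SEC_PER_FOREST = 20   # per-forest bandwidth quoted above
FORESTS_PER_NODE = 16        # example node from the text
NODES = 3                    # assumption: the three-server cluster in the question
BLOCK_KB = 4                 # 4k blocks


def iops_needed(mb_per_sec: float, block_kb: float = BLOCK_KB) -> int:
    """IOPS required to sustain mb_per_sec at the given block size."""
    return round(mb_per_sec * 1000 / block_kb)


node_mb_per_sec = MB_PER_SEC_PER_FOREST * FORESTS_PER_NODE
node_iops = iops_needed(node_mb_per_sec)
cluster_iops = node_iops * NODES

print(f"per node: {node_mb_per_sec} MB/sec, {node_iops:,} IOPS at {BLOCK_KB}k")
print(f"cluster of {NODES}: {cluster_iops:,} IOPS at {BLOCK_KB}k")
```

With the values above this gives 320 MB/sec and 80,000 IOPS per node, and 240,000 IOPS for the cluster. Change the forest count or node count to see what you would have to ask your storage admins for.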
If you can, provision one volume per forest so you can track and allocate performance by volume/forest with less effort. It will also make reallocating load easier.

Local disk replication will move the copies of forests around for HA. Don't try to do that with the disk subsystem.

If you pass along more details about your planned configuration, I may be of more help.

Aaron Rosenbaum
Director, Product Management
[email protected]

On 9 Feb 2013, at 11:57, "[email protected]" <[email protected]> wrote:

> Message: 1
> Date: Fri, 8 Feb 2013 22:26:44 +0000
> From: "Khan, Kashif" <[email protected]>
> Subject: [MarkLogic Dev General] Marklogic Cluster Setup
>
> Hello Everyone, We are creating a Marklogic Cluster for failover. I have a
> couple of questions.
>
> 1. We are planning to use NAS for data storage. Is there any performance
>    hit if we use NAS over SAN?
> 2. We do not have GFS setup.
>    * It is possible to attach One NAS file store to all 3 MarkLogic
>      Servers in the cluster?
>    * OR do we have to attach an Independent NAS with each Marklogic
>      Instance and set up a cloning job to transfer data to each of the
>      other 2 NAS instances.
>
> From the documentation it seems like we can not attach one NAS file store to
> all three MarkLogic servers unless we have GFS. Any info will be greatly
> appreciated.
>
> Kashif Khan
>
> ------------------------------
>
> Message: 2
> Date: Fri, 8 Feb 2013 14:51:30 -0800
> From: Michael Blakeley <[email protected]>
> Subject: Re: [MarkLogic Dev General] Marklogic Cluster Setup
>
> The question "which is faster?" is impossible to answer generically. It's
> possible to design local storage so that it is slower or faster than a given
> NAS. It's possible to design NAS so that it is slower or faster than given
> local storage. But in most cases it is cheaper to build out similar levels of
> performance from local disk than from NAS (or SAN).
>
> Performance aside, I would not use a NAS as part of a failover solution. The
> whole point of failover is high availability, and relying on a NAS simply
> introduces another system that can fail. Using a NAS also implies shared
> filesystems, which are cantankerous and require their own fencing mechanisms.
> This pulls in yet more systems that can fail, and probably will.
>
> I prefer to use local storage, with local replication of forests. This also
> avoids the strong probability that the I/O demands of the cluster will swamp
> the network link to the NAS, or the NAS controller.
>
> So I would size the number of forests needed, then the storage capacity and
> I/O performance needed, and finally specify local disk and network to meet
> those needs.
>
> -- Mike
>
> End of General Digest, Vol 104, Issue 22

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
