I've done some evaluations with smallish clusters. I'm cautious about planning a large cluster on AWS for a couple of reasons.
One of the basic rules of clustering is to avoid network latency within the cluster. But every AWS instance is a minimum of two hops from its closest neighbor, with the hypervisors acting as gateways. That closest neighbor will be co-resident on the same hypervisor, so for HA purposes you might even restrict yourself to instances at least 4-5 hops away. See http://blakeley.com/blogofile/2011/11/21/testing-aws-ec2-instances-for-co-residence/ for more on co-residency and how to avoid it.

So I'd try to minimize the cluster size by maximizing the size of the data host instances. This has the happy side-effect of making it more likely that you have the hypervisor all to yourself, which should remove the hypervisor co-residency problem. If possible I would use this strategy to run single-host clusters exclusively.

I'm not sure that you care about HA, but it is one potential driver for a multi-instance cluster. Some folks grow their clusters almost entirely for HA purposes, using replica forests within a cluster. As alluded to above, though, it requires care to avoid co-residency problems: instances mounting forest replicas on the same hypervisor aren't going to provide much redundancy. EBS is more of a black box, so it isn't clear that a replica forest will help in the event of EBS failures. So it may be better to treat AWS as the local HA solution, ignoring local failover. Instead worry only about what happens when AWS fails entirely - as it seems to do about once annually. Such events can be treated as "disasters" and handled via database replication or flex-rep to a foreign cluster in another data center.

Getting back to instance tuning, stock EBS can certainly be a bottleneck. With update-heavy workloads I've seen the CPU outstrip EBS enough to generate XDMP-TOOMANYSTANDS errors. Even with RAID-0 and many EBS volumes it's tricky to get consistent merge rates above 10 MB/sec.
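A rough way to see whether a volume can sustain that kind of write rate is to time a sequential write that is forced to the device. The sketch below is only an illustration: the function name `measure_write_mb_per_sec`, the 64 MB default size, and the 1 MB block size are assumptions, and a plain sequential write is not the same thing as a MarkLogic merge, so treat the number as a ceiling estimate at best.

```python
import os
import tempfile
import time


def measure_write_mb_per_sec(directory, total_mb=64, block_kb=1024):
    """Time a sequential write of total_mb megabytes into directory.

    Hypothetical helper: sizes and name are illustrative assumptions.
    Returns the observed rate in MB/sec.
    """
    block = os.urandom(block_kb * 1024)       # incompressible data
    blocks = (total_mb * 1024) // block_kb    # number of blocks to write
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            # Force the data to the device so the page cache
            # does not hide a slow volume.
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)                        # clean up the scratch file
    return total_mb / elapsed
```

Calling `measure_write_mb_per_sec("/path/to/forest/volume")` several times, at different times of day, gives a feel for how much the rate moves around. The path here is a placeholder for wherever the data volume is mounted.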
As Wayne and David mentioned, the new EBS options are supposed to help, but I don't have direct experience with that yet. I can tell you that EBS performance shifts around as demand changes: Christmas shopping season seems to have a pronounced effect. And no matter how many EBS volumes you mount or RAID together, the traffic all goes through the same network interface. So having 10-Gbit interfaces is supposed to help.

The OS itself can also be a bottleneck. I suspect Amazon has done some tuning of Linux for their http://aws.amazon.com/amazon-linux-ami/ offering, but my testing suggests that this only makes much of a difference when multiple demanding instances share a hypervisor. Still, it might be worth comparing the Amazon Linux AMIs with Windows AMIs. You might find a compelling reason to switch. This might be another reason to use a very large instance type to ensure exclusive use of the hypervisor.

You may also want to check the CPU type when bringing up new instances, and reject any that have older CPU models. Some zones still have quite a few older 4-digit Opteron CPUs, and you will notice the difference in performance if you get one of these.

-- Mike

On 8 Jan 2013, at 05:46 , Ron Hitchens <[email protected]> wrote:

>
> We tried the EBS Optimized option and that hasn't made
> much of a difference either. I suppose RAIDing across EBS
> is a way to go, but I'm afraid that would fall outside the
> comfort zone of the people administering this stuff.
>
> I'll have them look into the Provisioned IOPs thing. What
> I really want is high-performance local disk to meet the
> performance targets we have.
>
> Thanks for the help.
>
> Is anybody out there actually running large-ish production
> MarkLogic clusters in the cloud?
>
> On Jan 8, 2013, at 12:35 PM, David Lee wrote:
>
>> Almost certainly as Wayne suggests your bottleneck is IO.
>>
>> The default storage is EBS which is a type of network SAN.
>> Some instance types have "EBS Optimized" which you should try.
>> This gives a dedicated network channel to EBS.
>> Then add RAID across the EBS for extra fun.
>>
>> Even better as Wayne suggests is instances with "Provisioned IOPS"
>> or some of the truly amazing DB oriented instances with tons of local
>> storage.
>>
>> Also you could consider using Ephemeral Storage, however as the name
>> suggests it will not last beyond the instance life.
>>
>>
>> -----------------------------------------------------------------------------
>> David Lee
>> Lead Engineer
>> MarkLogic Corporation
>> [email protected]
>> Phone: +1 812-482-5224
>> Cell: +1 812-630-7622
>> www.marklogic.com
>>
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Wayne Feick
>> Sent: Tuesday, January 08, 2013 7:20 AM
>> To: General Mark Logic Developer Discussion
>> Subject: Re: [MarkLogic Dev General] MarkLogic in AWS Cloud
>>
>> I don't have a lot of experience with it, but EBS volumes have limited
>> bandwidth. Some people have had success striping across multiple EBS volumes
>> from within Linux instances. You could also look at the more recent
>> guaranteed IOPs capability Amazon now offers.
>>
>> Wayne
>>
>> Ron Hitchens <[email protected]> wrote:
>>
>>
>> Has anyone had any experience configuring and running non-trivial
>> MarkLogic clusters in the cloud? Specifically Amazon EC2 VMs?
>>
>> I've got a test cluster of three nodes setup in AWS and am trying
>> to figure out the best configuration for it. The system seems to be
>> quite slow at some things, but reasonably fast at others. Bumping
>> the VM up to bigger instances (more ram, more cores) doesn't seem to
>> have a significant impact on speed or throughput.
>>
>> I suspect I/O bandwidth may be the culprit, but that's just a
>> hunch. Does anyone have any experience with tuning EC2 VMs?
>>
>> The test environment I'm working with now is three m2.xlarge
>> instances (32gb RAM, 4 cores, "high" network speed). The OS is
>> Windows (groan, I don't have a choice there). Production cluster(s)
>> are likely to be similar, but probably six nodes or so.
>>
>> Any advice/war stories/dire warnings greatly appreciated.
>>
>> Thanks.
>>
>> ---
>> Ron Hitchens {mailto:[email protected]}   Ronsoft Technologies
>>      +44 7879 358 212 (voice)          http://www.ronsoft.com
>>      +1 707 924 3878 (fax)              Bit Twiddling At Its Finest
>> "No amount of belief establishes any fact." -Unknown
>>
>> _______________________________________________
>> General mailing list
>> [email protected]
>> http://developer.marklogic.com/mailman/listinfo/general
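
The suggestion earlier in this thread to check the CPU type on a fresh instance, and to reject older models, could be sketched roughly as follows. This is a sketch under assumptions: it parses text in the Linux `/proc/cpuinfo` format, the helper names `cpu_models` and `should_reject` are hypothetical, and the "Opteron followed by a 4-digit model number" pattern is just a literal reading of the heuristic above, not a vetted list of slow CPUs.

```python
import re

# Assumption: an "older" CPU is any Opteron whose model name
# carries a 4-digit model number, per the heuristic in the thread.
OLD_OPTERON = re.compile(r"Opteron.*?\b\d{4}\b")


def cpu_models(cpuinfo_text):
    """Return the set of distinct 'model name' values in cpuinfo text."""
    models = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            models.add(value.strip())
    return models


def should_reject(cpuinfo_text):
    """True if any reported CPU model matches the old-Opteron pattern."""
    return any(OLD_OPTERON.search(m) for m in cpu_models(cpuinfo_text))
```

On a Linux instance the input would come from reading `/proc/cpuinfo` at boot. If `should_reject` returns true, the instance can be terminated and a replacement requested. A Windows instance would need a different source for the model string, which is not covered here.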
