This is interesting! RAM cheaper than disk? (I know it's complicated... but it's an interesting evolution in the market.)
So 'how expensive' is provisioned IO? In my few tests, 1000 provisioned IOPS gave me a sustained throughput of 40 MB/sec, read and write, indefinitely. If this is really a RAM-backed, mostly read-only system, then your I/O operations will be few.

Costs for 1000 IOPS (http://aws.amazon.com/ebs/):

  $0.125 per GB-month of provisioned storage
  $0.10 per provisioned IOPS-month

So per month, for say 100 GB (the minimum volume size for 1000 IOPS):

  $12.50/month for storage
  $100/month for 1000 provisioned IOPS

(A quick back-of-envelope sketch of this arithmetic appears after the quoted message below.)

Is that beyond your budget?

-----------------------------------------------------------------------------
David Lee
Lead Engineer
MarkLogic Corporation
[email protected]
Phone: +1 812-482-5224
Cell:  +1 812-630-7622
www.marklogic.com

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Ron Hitchens
Sent: Saturday, February 16, 2013 1:50 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] RAM Rich, I/O Poor in the Cloud

I'm trying to work out the best way to deploy a system I'm designing into the cloud on AWS. We've been through various permutations of AWS configurations, and the main thing we've learned is that there is a lot of uncertainty and unpredictability around I/O performance in AWS. It's relatively expensive to provision guaranteed, high-performance I/O.

We're testing an SSD solution at the moment, but that storage is ephemeral (lost if the VM shuts down) and very expensive. That's not a deal-killer for our architecture, but it makes deployment more complicated and strains the ops budget.

RAM, on the other hand, is relatively cheap to add to an AWS instance. The total database size, at present, is under 20GB and will grow relatively slowly. Provisioning an AWS instance with ~64GB of RAM is fairly cost effective, but the persistent EBS storage is sloooow.

So, I have two questions:

1) Is there a best practice for tuning MarkLogic where RAM is plentiful (twice the size of the data or more) so as to maximize caching of data? Ideally, we'd like the whole database loaded into RAM. This system will run as a read-only replica of a master database located elsewhere. The goal is to maximize query performance, but updates of relatively low frequency will be coming in from the master. The client is a Windows shop, but Linux is an approved solution if need be. Are there exploitable differences at the OS level that can improve filesystem caching? Are there RAM disk or configuration tricks that would maximize RAM usage without affecting update persistence?

2) Given that #1 could lead to a mostly RAM-based configuration, does it make sense to go with a single high-RAM, high-CPU E+D-node that serves all requests with little or no actual I/O? Or would it be an overall win to cluster E-nodes in front of the big-RAM D-node to offload query evaluation and pay the (10 Gb) network latency penalty for inter-node comms?

We do have the option of deploying multiple standalone big-RAM E+D-nodes, each of which is a full replica of the data from the master. This would basically give us the equivalent of failover redundancy, but at the load balancer level rather than within the cluster. This would also let us disperse them across AZs and regions without worrying about split-brain cluster issues.

Thoughts? Recommendations?

---
Ron Hitchens {mailto:[email protected]}
Ronsoft Technologies
+44 7879 358 212 (voice)          http://www.ronsoft.com
+1 707 924 3878 (fax)             Bit Twiddling At Its Finest
"No amount of belief establishes any fact." -Unknown
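A quick back-of-envelope sketch of the cost arithmetic above, in Python. The prices and the 100 GB / 1000 IOPS figures are the ones quoted from http://aws.amazon.com/ebs/ in this message; treat them as illustrative examples, not current AWS pricing.

    # Monthly cost estimate for an EBS provisioned-IOPS volume,
    # using the figures quoted earlier in this message (illustrative only).
    storage_price_per_gb_month = 0.125   # USD per GB-month of provisioned storage
    iops_price_per_iops_month  = 0.10    # USD per provisioned IOPS-month

    volume_gb        = 100    # minimum volume size quoted for 1000 IOPS
    provisioned_iops = 1000

    storage_cost = volume_gb * storage_price_per_gb_month         # 12.50
    iops_cost    = provisioned_iops * iops_price_per_iops_month   # 100.00

    print(f"Storage: ${storage_cost:.2f}/month")
    print(f"IOPS:    ${iops_cost:.2f}/month")
    print(f"Total:   ${storage_cost + iops_cost:.2f}/month")

    # The observed 40 MB/sec sustained at 1000 IOPS implies an average
    # request size of roughly 40 MB / 1000 ops = ~40 KB per I/O operation.

Running it prints $12.50 for storage, $100.00 for IOPS, and $112.50 total per month, matching the figures above.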
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
