The info I have is that IOPS can be provisioned
from 1,000 to 10,000 and Google dragged back this
page which gives the same numbers:

http://aws.amazon.com/about-aws/whats-new/2012/09/25/announcing-provisioned-iops-for-amazon-rds/

   Someone else has done the initial costing on various
configurations and the moaning centers around the cost
of paying for database-server-level I/O bandwidth.
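
   To put rough numbers on that costing, here is a minimal sketch using the EBS prices quoted further down this thread ($0.125 per GB-month of provisioned storage, $0.10 per provisioned IOPS-month). The function name and the node and volume counts are hypothetical, and the prices are the figures from this thread, not necessarily current AWS pricing.

```python
# Illustrative only: prices are the figures quoted in this thread, and the
# node/volume counts used below are made-up examples.
STORAGE_PER_GB_MONTH = 0.125   # USD per GB-month of provisioned storage
PRICE_PER_IOPS_MONTH = 0.10    # USD per provisioned IOPS-month


def monthly_piops_cost(size_gb, iops, volumes_per_node=1, nodes=1):
    """Monthly USD cost of provisioned-IOPS EBS volumes across a deployment."""
    per_volume = size_gb * STORAGE_PER_GB_MONTH + iops * PRICE_PER_IOPS_MONTH
    return per_volume * volumes_per_node * nodes


if __name__ == "__main__":
    # One 100 GB volume at 1,000 IOPS (12.50 storage + 100.00 IOPS).
    print(f"single volume: ${monthly_piops_cost(100, 1000):.2f}/month")
    # Hypothetical deployment: 2 volumes on each of 6 nodes.
    total = monthly_piops_cost(100, 1000, volumes_per_node=2, nodes=6)
    print(f"2 volumes x 6 nodes: ${total:.2f}/month")
```

   The IOPS charge dominates the storage charge at these sizes, and the total scales linearly with every extra volume and node, which is where the moaning comes from.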

   But I agree with you: lots of RAM and careful tuning
may get it done without the cost and complexity.  Hence
this discussion.
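
   As a back-of-the-envelope check on the "lots of RAM" approach, the sketch below estimates how long one cold pass over the whole database would take, using figures from elsewhere in this thread (a database under 20 GB, about 40 MB/s sustained at 1,000 IOPS). It assumes a purely sequential read at the sustained rate, which is optimistic, and the function names are hypothetical.

```python
def warmup_seconds(db_size_gb, throughput_mb_per_s):
    """Seconds to read db_size_gb once at a sustained rate (1 GB = 1024 MB)."""
    return db_size_gb * 1024 / throughput_mb_per_s


def implied_io_size_kb(throughput_mb_per_s, iops):
    """Average KB moved per I/O implied by a throughput/IOPS pair (1 MB = 1024 KB)."""
    return throughput_mb_per_s * 1024 / iops


if __name__ == "__main__":
    secs = warmup_seconds(20, 40)
    print(f"cold read of 20 GB at 40 MB/s: {secs:.0f} s (about {secs / 60:.1f} min)")
    print(f"implied I/O size at 1,000 IOPS: {implied_io_size_kb(40, 1000):.2f} KB")
```

   Under those assumptions a full warm-up is a matter of minutes, not hours. The implied I/O size comes out larger than the 16k figure mentioned below, so treat the reported IOPS and bandwidth numbers as separate, approximate measurements.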

On Feb 17, 2013, at 12:14 AM, David Lee <[email protected]> wrote:

> I understand it's complicated (IT IS!).
> But I think you might have an off-by-10x error in your calculations.
> AWS Provisioned IOPS for EBS max out at 2,000 (not 10,000), and I have not
> been able to achieve rates faster than 1,000.
> Certainly this is nothing compared to SSDs, but it's an order of magnitude
> over regular EBS volumes and reasonably fast:
> 40 MB/sec sustained at about $100/TB/Month/Volume.  Couple this with lots
> of RAM and after a short period your system shouldn't be hitting the disk
> often (I hope ...).
> So IMHO I wouldn't discard this offhand as "too expensive" ...
> With 1,000 IOPS, a big-memory machine, and some warming up, I think you can
> achieve very good performance.
> But yes, it's complicated ... note that IOPS don't hint at bandwidth ... they
> seem to run 16k block I/O regardless of the filesystem block size.
> 
> Of course no one can beat local SSDs for speed ... but ... THEY are expensive.
> 
> 
> 
> 
> 
> -----------------------------------------------------------------------------
> David Lee
> Lead Engineer
> MarkLogic Corporation
> [email protected]
> Phone: +1 812-482-5224
> Cell:  +1 812-630-7622
> www.marklogic.com
> 
> 
> 
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Ron Hitchens
> Sent: Saturday, February 16, 2013 5:19 PM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] RAM Rich, I/O Poor in the Cloud
> 
> 
>   Right, but the cost needs to be multiplied by
> all the nodes in all the clusters.  And our ops people
> want to put multiple EBS volumes on each node.
> 
>   But the source of the debate comes from comparing I/O
> speed for PIOPs (1,000 to about 10,000) against SSDs (up to
> around 150,000) and in-RAM speed (even faster, obviously).
> 
>   Our needs are focused less on I/O throughput and more
> on fast data access.  In that light, 1,000 IOPs seems
> pretty slow.  That is why I'm looking at alternative ways
> of structuring the system to minimize I/O (so we can go
> with the lowest PIOP tier) but still get super-fast data
> access without paying for higher PIOP tiers (which shoot
> up in cost quickly) or SSDs which will require additional
> deployment complexity to get the data loaded onto them (and
> are not available in all AWS zones).
> 
>   Also, it's not a simple comparison of cost of RAM vs
> cost of disk (or more accurately I/O speed), it's also
> a complexity management issue.  Figuring out what means
> what in AWS and how the various options interact and making
> multiple instances talk to each other properly (both in
> terms of AWS configuration and corporate governance on
> our end) quickly becomes a tangle of dependencies.  This is
> why I'm trying to determine if just paying for a big RAM
> instance and a minimal guaranteed level of I/O performance
> will be a better cost/benefit ratio once everything is
> factored in.
> 
> On Feb 16, 2013, at 8:57 PM, David Lee <[email protected]> wrote:
> 
>> This is interesting! RAM cheaper than disk?  (I know it's complicated ...
>> but it's an interesting evolution in the market).
>> 
>> So 'how expensive' is provisioned I/O?  I have found in my few tests that
>> using 1,000 IOPS I can get a sustained throughput of 40 MB/sec, read and
>> write, indefinitely.  If this is really a RAM-backed, mostly read-only
>> system then your I/O operations will be few.
>> 
>> Costs for 1000 IOPS 
>> http://aws.amazon.com/ebs/
>> 
>> $0.125 per GB-month of provisioned storage
>> $0.10 per provisioned IOPS-month
>> 
>> So per month for, say, a 100 GB volume (minimum size for 1,000 IOPS):
>> $12.50 / month for storage  (100 GB x $0.125)
>> $100   / month for IOPS     (1,000 IOPS x $0.10)
>> 
>> 
>> Is that beyond your budget ?
>> 
>> 
>> 
>> -----------------------------------------------------------------------------
>> David Lee
>> Lead Engineer
>> MarkLogic Corporation
>> [email protected]
>> Phone: +1 812-482-5224
>> Cell:  +1 812-630-7622
>> www.marklogic.com
>> 
>> 
>> -----Original Message-----
>> From: [email protected] 
>> [mailto:[email protected]] On Behalf Of Ron Hitchens
>> Sent: Saturday, February 16, 2013 1:50 PM
>> To: MarkLogic Developer Discussion
>> Subject: [MarkLogic Dev General] RAM Rich, I/O Poor in the Cloud
>> 
>> 
>> I'm trying to work out the best way to deploy a system
>> I'm designing into the cloud on AWS.  We've been through
>> various permutations of AWS configurations and the main
>> thing we've learned is that there is a lot of uncertainty
>> and unpredictability around I/O performance in AWS.
>> 
>> It's relatively expensive to provision guaranteed, high
>> performance I/O.  We're testing an SSD solution at the
>> moment, but that is ephemeral (lost if the VM shuts down)
>> and very expensive.  That's not a deal-killer for our
>> architecture, but makes it more complicated to deploy
>> and strains the ops budget.
>> 
>> RAM, on the other hand, is relatively cheap to add to
>> an AWS instance.  The total database size, at present, is
>> under 20GB and will grow relatively slowly.  Provisioning
>> an AWS instance with ~64GB of RAM is fairly cost effective,
>> but the persistent EBS storage is sloooow.
>> 
>> So, I have two questions:
>> 
>> 1) Is there a best practice for tuning MarkLogic where
>> RAM is plentiful (twice the size of the data or more) so
>> as to maximize caching of data?  Ideally, we'd like the
>> whole database loaded into RAM.  This system will run as
>> a read-only replica of a master database located elsewhere.
>> The goal is to maximize query performance, but updates of
>> relatively low frequency will be coming in from the master.
>> 
>> The client is a Windows shop, but Linux is an approved
>> solution if need be.  Are there exploitable differences at
>> the OS level that can improve filesystem caching?  Are there
>> RAM disk or configuration tricks that would maximize RAM
>> usage without affecting update persistence?
>> 
>> 2) Given #1 could lead to a mostly RAM-based configuration,
>> does it make sense to go with a single high-RAM, high-CPU
>> E+D-node that serves all requests with little or no actual I/O?
>> Or would it be an overall win to cluster E-nodes in front of
>> the big-RAM D-node to offload query evaluation and pay the
>> (10-gb) network latency penalty for inter-node comms?
>> 
>> We do have the option of deploying multiple standalone
>> big-RAM E+D-nodes, each of which is a full replica of the data
>> from the master.  This would basically give us the equivalent
>> of failover redundancy, but at the load balancer level rather
>> than within the cluster.  This would also let us disperse
>> them across AZs and regions without worrying about split-brain
>> cluster issues.
>> 
>> Thoughts?  Recommendations?
>> 
>> ---
>> Ron Hitchens {mailto:[email protected]}   Ronsoft Technologies
>>   +44 7879 358 212 (voice)          http://www.ronsoft.com
>>   +1 707 924 3878 (fax)              Bit Twiddling At Its Finest
>> "No amount of belief establishes any fact." -Unknown
>> 
>> 
>> 
>> 
>> _______________________________________________
>> General mailing list
>> [email protected]
>> http://developer.marklogic.com/mailman/listinfo/general
> 
> ---
> Ron Hitchens {mailto:[email protected]}   Ronsoft Technologies
>     +44 7879 358 212 (voice)          http://www.ronsoft.com
>     +1 707 924 3878 (fax)              Bit Twiddling At Its Finest
> "No amount of belief establishes any fact." -Unknown
> 
> 
> 
> 
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general

---
Ron Hitchens {mailto:[email protected]}   Ronsoft Technologies
     +44 7879 358 212 (voice)          http://www.ronsoft.com
     +1 707 924 3878 (fax)              Bit Twiddling At Its Finest
"No amount of belief establishes any fact." -Unknown




_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general