I would *not* use 'local storage' (aka "Ephemeral Storage") for an ML DB/Forest.
It may be faster, but it's transient: your EC2 instance can crash at any time,
and you will lose it all.
If you care about data integrity (some apps may not), use EBS.
Last time I tried, EBS was limited to 750 GB/volume. However, you can easily
RAID these volumes (I set up a 1.5 TB RAID without trouble).
To make use of this much, though, you'll need your own paid ML license.
Neither the paid AMI nor the community license can access this much data.
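
To make the RAID arithmetic concrete, here is a rough sketch of how many volumes you'd need to stripe together for a given target size. It assumes a plain RAID 0 stripe (capacity is the sum of the members, with no redundancy), the 750 GB/volume figure I mentioned above, and 1 TB = 1000 GB. The function names are mine, purely for illustration; they are not part of any AWS or MarkLogic tool.

```python
import math

# Assumption: the per-volume cap reported in this message (750 GB).
EBS_VOLUME_LIMIT_GB = 750


def volumes_needed(target_gb, per_volume_gb=EBS_VOLUME_LIMIT_GB):
    """Number of equal-sized volumes a RAID 0 stripe needs to reach target_gb."""
    if target_gb <= 0 or per_volume_gb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(target_gb / per_volume_gb)


def stripe_capacity_gb(volume_count, per_volume_gb=EBS_VOLUME_LIMIT_GB):
    """Usable capacity of a RAID 0 stripe: the sum of its member volumes."""
    return volume_count * per_volume_gb


if __name__ == "__main__":
    # 1500 GB is the 1.5 TB array above; 5000 GB is the 5 TB case from the question.
    for target_gb in (1500, 5000):
        n = volumes_needed(target_gb)
        print(f"{target_gb} GB -> {n} volumes, {stripe_capacity_gb(n)} GB striped")
```

So the 1.5 TB array is two full-size volumes, and a 5 TB target would take seven under the same assumptions. Keep in mind that a plain stripe fails if any one member volume fails.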

As for a "recommended" instance type, I don't know. I've used the default
64-bit Amazon Linux instance and it works well.



----------------------------------------
David A. Lee
Senior Principal Software Engineer
Epocrates, Inc.
d...@epocrates.com
812-482-5224


-----Original Message-----
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Michael Blakeley
Sent: Monday, August 08, 2011 2:44 PM
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] EC2 AMI & ML

David, I think the "local instance storage" limit is an AWS limit specific to 
the instance type you select (m1.xlarge has 850 GB, etc.; see 
http://aws.amazon.com/ec2/instance-types/), not a limit imposed by either 
server license. You can always connect more storage using EBS, though. EBS 
acts something like NAS or SAN storage: it can persist across instance 
stop-start cycles and can be moved from instance to instance within an 
availability zone, which can be useful. Instance storage (aka ephemeral 
storage) is more like local disk. Its performance can be more predictable, 
but it isn't as flexible as EBS. If you want 5 TB of storage on one host, I 
think you'd have to use EBS (http://aws.amazon.com/ebs/).

-- Mike

On 8 Aug 2011, at 11:28 , Steiner, David J. (LNG-DAY) wrote:

> Just to make sure I understand correctly...
>  
> If I want to use MarkLogic in EC2, then I can either pick the pay-per-use 
> license or community license, both of which have restrictions on content 
> size, i.e., the most storage I can get is 1690 GB of local 
> instance storage.  Is this correct, or am I misunderstanding "local instance 
> storage" to be the total storage available for data in the ML database?
>  
> I'm assuming there is a limit, so alternatively, if I want to use the license 
> that I already have, so that I can create an ML cluster that holds 5 TB of 
> data, for instance, then I need to use an AMI that just contains an operating 
> system and I would then have to install MarkLogic on it myself so that I can 
> enter the license key.
>  
> The documentation at: 
> http://developer.marklogic.com/pubs/4.2/books/ec2.pdf only covers using the 
> established ML AMIs and not what to do if you want to use your existing 
> license.
>  
> A search on the EC2 AMI page for "RightScale CentOS Linux", which is what 
> the guide mentions, produces 19 results.  So, is there a recommended AMI 
> (or AMIs) to pick?
>  
> Thanks,
> David
>  
>  
> _______________________________________________
> General mailing list
> General@developer.marklogic.com
> http://developer.marklogic.com/mailman/listinfo/general

_______________________________________________
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general