I noticed recently that MarkLogic (ML) is supported on Amazon EC2. This is an
exciting possibility.

As an experiment to see whether my experimental database will run on
EC2, I am trying to load it into an EC2 instance with a Community edition
license (I have yet to get approval to purchase a "Standard License" for EC2).

I have a few questions.

1) License size restrictions.

In editions prior to 4.1.4, I noticed that license usage was related to
"Index Size", or at least that's how it seemed.

The same XML would use a different percentage of my license depending on
which indexing options I selected.

It doesn't appear to do that anymore. Is Community edition licensed
based on content size, index size, or both?
That is, is it possible to decrease the size for licensing purposes by
turning off various search features?

2) Backup/Restore

I first tried to load the data directly from XML using my program that
uses XCC. I have about 26 GB of XML data across 10 different data
sources (several million documents).
Loading XML directly is cumbersome, and I ran into lots of problems
trying to load it to EC2: network problems would cause a batch load to
abort, and I'd have to start over.
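
To illustrate the restart problem, here is a minimal sketch of what a
resumable batch load could look like. It is in Python rather than the
XCC client my program actually uses, and everything in it is an
assumption for illustration: `insert_document` is a hypothetical
stand-in for whatever call performs the real insert, and the checkpoint
file is just a plain text log of the paths already loaded.

```python
import os
import time


def load_with_checkpoint(doc_paths, insert_document, checkpoint_file,
                         max_retries=3, retry_delay=0.0):
    """Insert each document once, logging progress so a rerun can resume.

    insert_document is a hypothetical callable standing in for the real
    insert; it is expected to raise OSError (ConnectionError is a
    subclass) when the network fails.
    """
    # Read the paths that an earlier, aborted run already loaded.
    done = set()
    if os.path.exists(checkpoint_file):
        with open(checkpoint_file) as f:
            done = {line.rstrip("\n") for line in f}

    with open(checkpoint_file, "a") as log:
        for path in doc_paths:
            if path in done:
                continue  # loaded by a previous run, skip it
            for attempt in range(1, max_retries + 1):
                try:
                    insert_document(path)
                    break
                except OSError:
                    if attempt == max_retries:
                        raise  # give up; the log keeps what was loaded
                    time.sleep(retry_delay)
            # Record success immediately so an abort loses at most one doc.
            log.write(path + "\n")
            log.flush()
            done.add(path)
    return done
```

With something like this, rerunning the same command after an abort
would skip every path already in the log instead of starting from the
first document again.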

So instead I tried backing up my master DB, rsyncing the backup to EC2,
and then doing a restore.

I also tried the same thing on a local desktop server with an ML
Community edition license, just to test.

This worked fine, except that I know I have too much data (26 GB), so I
turned off many of the search options, such as 2-letter searches. On
both my desktop and my EC2 instance, after restoring the 26 GB from disk,
ML went into a reindexing and refragmenting phase (expected) and is
showing 254% above license use (also expected).

What was NOT expected is that 48 hours later it's still reindexing with
no end in sight.

ML is pegging the CPU and has been predicting that it will be done in
5-10 minutes for 2 days now.

This is happening on both my EC2 and desktop instances, so I know it is
not just an EC2 issue.

I want to let this run to completion to see if I can get the data set
under the Community edition size limit, so I can at least prove the
concept to my manager and try to justify an EC2 ML server.

But why is it taking so long? Is this expected? Does it have anything
to do with the license?

That is, because I'm over the license size, is it just going to run
forever, or will it eventually complete?
Any clues on how long this is going to take?

If this has nothing to do with licensing, then it is a huge problem if
it can take days to recover from a restore.

What would happen in production if I had to restore from backup or
transfer data from one server to another? Would the server be offline
for days?

----------------------------------------

David A. Lee

Senior Principal Software Engineer

Epocrates, Inc.

[email protected]

812-482-5224

 
