Re: Need deployment strategy

2010-01-14 Thread Jeremy Hinegardner
On Wed, Jan 13, 2010 at 05:38:33PM -0500, Paul Rosen wrote:
 Hi all,

 The way the indexing works on our system is as follows:

 We have a separate staging server with a copy of our web app. The clients 
 will index a number of documents in a batch on the staging server (this 
 happens about once a week), then they play with the results on the staging 
 server for a day until satisfied. Only then do they give the ok to deploy.

 What I've been doing is, when they want to deploy, I do the following:

 1) merge and optimize the index on the staging server,

 2) copy it to the production server,

 3) stop solr on production,

 4) copy the new index on top of the old one,

 5) start solr on production.

 This works, but has the following disadvantages:

 1) The index is getting bigger, so it takes longer to zip it and transfer 
 it.

If you are doing the optimize every time before pushing to production, you
will need to transfer the entire index each time anyway, since an optimize
rewrites every segment.  To transfer only the deltas you would need to NOT
optimize, and then use one of the replication strategies (rsync or the Java
ReplicationHandler) to copy over just the new segments.
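
Roughly, the Java-based pull would look like this (host names here are
made up, and I'm going off the SolrReplication wiki page for the
fetchindex/masterUrl parameters, so double-check them for your version):

```shell
# Hypothetical host names; adjust for your environment.
STAGING="http://staging:8983/solr"
PROD="http://production:8983/solr"

# Ask production (the slave) to pull index files from staging (the master).
# With an un-optimized index, only the new/changed segment files come
# over the wire, not the whole index.
curl "$PROD/replication?command=fetchindex&masterUrl=$STAGING/replication"

# Check how the pull is going:
curl "$PROD/replication?command=details"
```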

 2) The user has only added a few records, yet we copy over all of them. If a 
 bug happens that causes an unrelated document to get deleted or replaced on 
 staging, we wouldn't notice, and we'd propagate the problem to the server. 
 I'd sleep better if I were only moving the records that were new or changed 
 and leaving the records that already work in place.

 3) solr is down on production for about 5 minutes, so users during that 
 time are getting errors.

 I was looking for some kind of replication strategy where I can run a task 
 on the production server to tell it to merge a core from the staging 
 server. Is that possible?

 I can open up port 8983 on the staging server only to the production 
 server, but then what do I do on production to get the core?

Have you considered using a MultiCore approach with some of the commands 
from CoreAdmin[1] and SolrReplication[2]?

Start out with multicore enabled on the production server, and have the
production core running with the name 'prod' or something like that.  

The staging server can be multicore or not; either way works.

Then your deployment procedure would be:

1) On the production server use the CREATE admin command to create a new
   core 'deploy_MMDD' with configuration from the 'prod' core.   The
   configuration of this core should have replication enabled but with no poll
   interval so replication only happens on demand.

2) Trigger a replication from 'staging' server to the 'deploy_MMDD' core
   using the replication handler.

3) use the ALIAS core command to add the name 'staging' to the 
   'deploy_MMDD' core

4) use the SWAP core command to swap the 'staging' and 'prod' cores and make
   sure it all works.  If it doesn't work use SWAP to swap them back.

In the end, you have physical cores with names like 'deploy_MMDD', or
something else appropriate for your environment, and those would be the
instanceDirs and such on disk.  On top of those you have logical core aliases
of 'staging' and 'prod', etc.  Sort of like symlinks on the file system.
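
As an untested sketch, the four steps above might look something like this
with curl (the 'deploy_0114' name, config file names, and hosts are all
placeholders; check CoreAdmin[1] for the exact parameter names in your
version):

```shell
# All names and hosts below are placeholders; untested sketch.
PROD="http://production:8983/solr"
CORES="$PROD/admin/cores"

# 1) Create a dated physical core, reusing the prod core's configuration.
#    Its solrconfig should enable replication with no poll interval.
curl "$CORES?action=CREATE&name=deploy_0114&instanceDir=deploy_0114&config=solrconfig.xml&schema=schema.xml"

# 2) Pull the index from the staging server into the new core on demand.
curl "$PROD/deploy_0114/replication?command=fetchindex&masterUrl=http://staging:8983/solr/replication"

# 3) Add the logical name 'staging' to the new core.
curl "$CORES?action=ALIAS&core=deploy_0114&other=staging"

# 4) Swap 'staging' and 'prod'; running SWAP again rolls it back.
curl "$CORES?action=SWAP&core=staging&other=prod"
```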

I have not done a deployment like this yet, just thought about it a few times.
And I have not tested this out to see what, if any, complications there are.

enjoy,

-jeremy

[1] - http://wiki.apache.org/solr/CoreAdmin
[2] - http://wiki.apache.org/solr/SolrReplication

-- 

 Jeremy Hinegardner  jer...@hinegardner.org 



Need deployment strategy

2010-01-13 Thread Paul Rosen

Hi all,

The way the indexing works on our system is as follows:

We have a separate staging server with a copy of our web app. The 
clients will index a number of documents in a batch on the staging 
server (this happens about once a week), then they play with the results 
on the staging server for a day until satisfied. Only then do they give 
the ok to deploy.


What I've been doing is, when they want to deploy, I do the following:

1) merge and optimize the index on the staging server,

2) copy it to the production server,

3) stop solr on production,

4) copy the new index on top of the old one,

5) start solr on production.

This works, but has the following disadvantages:

1) The index is getting bigger, so it takes longer to zip it and 
transfer it.


2) The user has only added a few records, yet we copy over all of them. 
If a bug happens that causes an unrelated document to get deleted or 
replaced on staging, we wouldn't notice, and we'd propagate the problem 
to the server. I'd sleep better if I were only moving the records that 
were new or changed and leaving the records that already work in place.


3) solr is down on production for about 5 minutes, so users during that 
time are getting errors.


I was looking for some kind of replication strategy where I can run a 
task on the production server to tell it to merge a core from the 
staging server. Is that possible?


I can open up port 8983 on the staging server only to the production 
server, but then what do I do on production to get the core?


Thanks,
Paul