To be clear: option 1) is fine. Lucene sequences index updates carefully so that the on-disk index is never in an inconsistent state. All data files are written and flushed to disk first, and only then are the segments.* files that reference those data files written. You can capture the files with a set of hard links to create a backup.
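A minimal sketch of the hard-link snapshot in Python, in case it helps. The function name and paths are made up for illustration, and it assumes the backup directory is on the same filesystem as the index (hard links cannot cross filesystems). If a file disappears between the directory listing and the link call, for example because Lucene cleaned up after a merge, the sketch reports it so you can retake the snapshot.

```python
import os
from pathlib import Path


def snapshot_index(index_dir, backup_dir):
    """Hard-link every regular file in index_dir into a new backup_dir.

    Returns (linked, missing): names that were linked, and names that
    vanished before they could be linked.
    """
    src = Path(index_dir)
    dst = Path(backup_dir)
    # Refuse to reuse an existing directory so snapshots never get mixed.
    dst.mkdir(parents=True, exist_ok=False)

    linked, missing = [], []
    for entry in sorted(src.iterdir()):
        if not entry.is_file():
            continue  # skip subdirectories
        try:
            os.link(entry, dst / entry.name)
        except FileNotFoundError:
            # The file was deleted after we listed the directory.
            missing.append(entry.name)
            continue
        linked.append(entry.name)
    return linked, missing
```

If `missing` comes back non-empty, the safest course is probably to delete that snapshot and take another one, since the segments file you captured may refer to a file you did not get.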

The CheckIndex program will verify the index backup:

    java -cp yourcopy/lucene-core-SOMETHING.jar org.apache.lucene.index.CheckIndex collection/data/index

lucene-core-SOMETHING.jar is usually in the solr-webapp directory where Solr is unpacked.
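If you want to run that check from a backup script, something like the following Python sketch would do. The helper names are hypothetical. It assumes `java` is on the PATH and that CheckIndex exits with a non-zero status when it finds a problem, which is worth confirming against your Lucene version before relying on it.

```python
import subprocess

CHECK_INDEX_CLASS = "org.apache.lucene.index.CheckIndex"


def build_check_command(lucene_jar, index_dir):
    """Build the same java command line shown above as an argument list."""
    return ["java", "-cp", str(lucene_jar), CHECK_INDEX_CLASS, str(index_dir)]


def index_is_clean(lucene_jar, index_dir):
    """Run CheckIndex and treat exit status 0 as 'index looks fine'.

    Assumption: a non-zero exit status means CheckIndex found a problem.
    """
    result = subprocess.run(build_check_command(lucene_jar, index_dir))
    return result.returncode == 0
```

You would point `index_dir` at the snapshot directory rather than the live index, so the check does not compete with ongoing indexing.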

On 12/20/2012 02:16 AM, Andy D'Arcy Jewell wrote:
Hi all.

Can anyone advise me of a way to pause and resume SolR 4 so I can perform a backup? I need to be able to revert to a usable (though not necessarily complete) index after a crash or other "disaster" more quickly than a re-index operation would yield.

I can't yet afford the "extravagance" of a separate SolR replica just for backups, and I'm not sure if I'll ever have the luxury. I'm currently running with just one node, but we are not yet live.

I can think of the following ways to do this, each with various downsides:

1) Just back up the existing index files whilst indexing continues
    + Easy
    + Fast
    - Incomplete
    - Potential for corruption? (e.g. partial files)

2) Stop/Start Tomcat
    + Easy
    - Very slow and I/O, CPU intensive
    - Client gets errors when trying to connect

3) Block/unblock SolR port with IpTables
    + Fast
    - Client gets errors when trying to connect
    - Have to wait for existing transactions to complete (not sure how, maybe watch socket FDs in /proc)

4) Pause/Restart SolR service
    + Fast ? (hopefully)
    - Client gets errors when trying to connect

In any event, the web app will have to gracefully handle unavailability of SolR, probably by displaying a "down for maintenance" message, but this should preferably be only a very short amount of time.

Can anyone comment on my proposed solutions above, or provide any additional ones?

Thanks for any input you can provide!

-Andy

