To be clear: option 1) is fine. Lucene index updates are carefully
sequenced so that the index is never left in a bogus state. All data
files are written and flushed to disk first, and only then are the
segments.* files written that refer to those data files. You can
capture the current set of files with hard links to create a
consistent backup.
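As a minimal sketch (assuming GNU cp and an index living at
collection/data/index; adjust the paths to your layout):

  # create a hard-link snapshot of the index directory (fast, no data copied)
  cp -al collection/data/index backups/index-$(date +%Y%m%d-%H%M%S)

The hard links must be on the same filesystem as the index, so copy the
snapshot somewhere else afterwards if you need an off-machine backup.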
The CheckIndex program will verify the index backup:

java -cp yourcopy/lucene-core-SOMETHING.jar \
    org.apache.lucene.index.CheckIndex collection/data/index
lucene-core-SOMETHING.jar is usually in the solr-webapp directory where
Solr is unpacked.
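If you are not sure where that jar ended up, something like the
following should turn it up (the /opt/solr path here is only an
example, use wherever you unpacked Solr):

  # locate the Lucene core jar under the Solr install
  find /opt/solr -name 'lucene-core-*.jar'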
On 12/20/2012 02:16 AM, Andy D'Arcy Jewell wrote:
Hi all.
Can anyone advise me of a way to pause and resume Solr 4 so I can
perform a backup? I need to be able to revert to a usable (though not
necessarily complete) index after a crash or other "disaster" more
quickly than a re-index operation would yield.
I can't yet afford the "extravagance" of a separate Solr replica just
for backups, and I'm not sure if I'll ever have the luxury. I'm
currently running with just one node, but we are not yet live.
I can think of the following ways to do this, each with various
downsides:
1) Just back up the existing index files whilst indexing continues
+ Easy
+ Fast
- Incomplete
- Potential for corruption? (e.g. partial files)
2) Stop/Start Tomcat
+ Easy
- Very slow, and I/O and CPU intensive
- Client gets errors when trying to connect
3) Block/unblock the Solr port with iptables (see the sketch after this list)
+ Fast
- Client gets errors when trying to connect
- Have to wait for existing transactions to complete (not sure
how, maybe watch socket FD's in /proc)
4) Pause/restart the Solr service
+ Fast ? (hopefully)
- Client gets errors when trying to connect
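Rough sketches of what I mean for 3) and 4) (8983 is the default Solr
port and would differ under Tomcat; the pid file location is just a
guess at how the service might be run):

  # 3) temporarily reject new connections to the Solr port
  iptables -I INPUT -p tcp --dport 8983 -j REJECT
  # ... take the backup ...
  iptables -D INPUT -p tcp --dport 8983 -j REJECT

  # 4) pause and later resume the whole JVM process
  kill -STOP $(cat /var/run/solr.pid)
  # ... take the backup ...
  kill -CONT $(cat /var/run/solr.pid)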
In any event, the web app will have to gracefully handle the
unavailability of Solr, probably by displaying a "down for
maintenance" message, but that window should preferably be very
short.
Can anyone comment on my proposed solutions above, or provide any
additional ones?
Thanks for any input you can provide!
-Andy