I agree with the goals - having a replication module that was more integrated with Solr and worked on Windows would be nice.
The details are still a bit fuzzy though... I'm not sure if SolrJ & BinaryResponseWriter should be used as the overhead when transferring gigabytes of files would probably be significant. One would probably want to transfer the file in chunks also... a single gigabyte HTTP request is probably not the best idea.

-Yonik

On Tue, Apr 29, 2008 at 5:01 AM, Noble Paul നോബിള് नोब्ळ् <[EMAIL PROTECTED]> wrote:
> hi ,
> The current replication strategy in solr involves shell scripts . The
> following are the drawbacks
> * It does not work with windows
> * Replication works as a separate piece not integrated with solr.
> * Cannot control replication from solr admin/JMX
> * Each operation requires manual telnet to the host
>
> Doing the replication within java code has the following advantages
> * Platform independence
> * Manual steps can be completely eliminated. Everything can be driven
> from solrconfig.xml .
> ** Just put in the url of the master in the slaves that should be good
> enough to enable replication. Other things like frequency of
> snapshoot/snappull can also be configured
> * Start/stop can be triggered from solr/admin or JMX
> * Can get the status/progress while replication is going on
> * No need to have a login into the machine
>
> The implementation can be done as two components
> * A SolrEventListener which does a snapshoot . Same as done by the script
> * A ReplicationHandler which can act as a server to dish out the index
> snapshots (in the master)
> ** In the slave the same handler can poll at regular intervals and if
> there is a new snapshot fetch the index over http (it can use
> solrj+BinaryReponseWriter)
> * The same Handler can do a snap install
> * The Handler may expose all the operations over a REST interface or JMX
> * It may also show the current state of the master index through the console
>
> What do you think?
>
> --
> --Noble Paul
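
To make the chunking point concrete, here is a rough sketch in Java of how a large index file could be split into fixed-size byte ranges, so that each range could be served by a separate, modest HTTP request. This is only an illustration under my own assumptions: `ChunkPlanner`, `Chunk`, `plan` and `readChunk` are hypothetical names, not part of Solr, SolrJ or the proposed ReplicationHandler, and the chunk size is left to the caller.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a Solr class): plans and reads fixed-size file chunks. */
final class ChunkPlanner {

    /** A half-open byte range [offset, offset + length) within a file. */
    static final class Chunk {
        final long offset;
        final int length;

        Chunk(long offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /** Splits a file of fileSize bytes into chunks of at most chunkSize bytes. */
    static List<Chunk> plan(long fileSize, int chunkSize) {
        if (fileSize < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and chunkSize > 0");
        }
        List<Chunk> chunks = new ArrayList<>();
        for (long offset = 0; offset < fileSize; offset += chunkSize) {
            // The last chunk may be shorter than chunkSize.
            int length = (int) Math.min(chunkSize, fileSize - offset);
            chunks.add(new Chunk(offset, length));
        }
        return chunks;
    }

    /** Reads exactly one chunk from the file; the master side could serve this per request. */
    static byte[] readChunk(Path file, Chunk chunk) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(chunk.offset);
            byte[] buffer = new byte[chunk.length];
            raf.readFully(buffer);
            return buffer;
        }
    }
}
```

With something like this, a slave could request one range at a time and retry only the failed range, instead of restarting a multi-gigabyte download. How the ranges would actually travel over HTTP (plain streams, Range headers, or something else) is a separate design question that this sketch does not settle.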
