We can probably do away with hard-links if a core swap (rename) can be made to work without downtime.
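For what it's worth, the core-swap idea above can be sketched with java.nio: build the new index in a sibling directory and rename it into place. This is only a sketch under assumptions — the directory names, the retired-copy step, and the `CoreSwap` class are all hypothetical, not actual Solr code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CoreSwap {
    // Replace the live index directory with a freshly built one via rename.
    // ATOMIC_MOVE fails fast if the platform cannot rename atomically
    // (e.g. across filesystems), which is the case we'd need to detect.
    public static void swap(Path newIndex, Path liveIndex, Path retired) throws IOException {
        Files.move(liveIndex, retired, StandardCopyOption.ATOMIC_MOVE);
        Files.move(newIndex, liveIndex, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("coreswap");
        Path live = Files.createDirectory(base.resolve("index"));
        Path fresh = Files.createDirectory(base.resolve("index.new"));
        Files.writeString(fresh.resolve("segments"), "v2");
        swap(fresh, live, base.resolve("index.old"));
        System.out.println(Files.readString(live.resolve("segments")));
    }
}
```

The two renames are not atomic as a pair, so a searcher would still have to be paused or re-opened around the swap — which is exactly the "without downtime" question.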
On Tue, Apr 29, 2008 at 11:32 PM, Noble Paul നോബിള് नोब्ळ् <[EMAIL PROTECTED]> wrote:
> Solrj/BinaryResponseWriter should be used for calls to get metadata on
> the index. The actual index transfer must be done over simple HTTP. I
> may propose a simple BinaryRawResponseWriter for that.
>
> Sending a huge file in a single response is definitely a bad idea. It
> should be sent in chunks of, say, 10MB (configurable). There must also
> be some mechanism to generate checksums for the whole file and, if
> possible, for the chunks.
>
> A solution can look like this:
> * getFileList: get the names of the index files and their checksums
>   (NamedList response)
> * getFilePart: for parts 1..n of the configured chunk size (simple
>   binary output/HTTP)
> * Join parts 1..n and compare checksums
> * If it passes, keep the file and delete the parts
> * If it fails, get checksums for the individual chunks (NamedList
>   response) and re-fetch the corrupted chunks (simple binary output/HTTP)
>
> Once all the files are downloaded and the checksums match, trigger a
> snapinstall.
>
> The details of snapinstall on Windows (with or without hard links) are
> still a bit fuzzy. But in the worst case, a copy should be OK (better
> than having no replication at all).
>
> The solution may not be very optimal for replicating a non-optimized
> index, but in other cases we may be able to achieve comparable
> performance.
>
> --Noble
>
> On Tue, Apr 29, 2008 at 7:18 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> > I agree with the goals - having a replication module that was more
> > integrated with Solr and worked on Windows would be nice.
> >
> > The details are still a bit fuzzy though... I'm not sure if SolrJ &
> > BinaryResponseWriter should be used, as the overhead when transferring
> > gigabytes of files would probably be significant. One would probably
> > want to transfer the file in chunks also... a single gigabyte HTTP
> > request is probably not the best idea.
> >
> > -Yonik
> >
> > On Tue, Apr 29, 2008 at 5:01 AM, Noble Paul നോബിള് नोब्ळ्
> > <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > > The current replication strategy in Solr involves shell scripts.
> > > The following are the drawbacks:
> > > * It does not work on Windows
> > > * Replication works as a separate piece, not integrated with Solr
> > > * Replication cannot be controlled from the Solr admin console/JMX
> > > * Each operation requires a manual telnet to the host
> > >
> > > Doing the replication within Java code has the following advantages:
> > > * Platform independence
> > > * Manual steps can be completely eliminated. Everything can be
> > >   driven from solrconfig.xml.
> > > ** Just putting the URL of the master into the slaves should be
> > >    good enough to enable replication. Other things, like the
> > >    frequency of snapshoot/snappull, can also be configured.
> > > * Start/stop can be triggered from solr/admin or JMX
> > > * Can get the status/progress while replication is going on
> > > * No need to have a login on the machine
> > >
> > > The implementation can be done as two components:
> > > * A SolrEventListener which does a snapshoot, same as done by the script
> > > * A ReplicationHandler which can act as a server to dish out the
> > >   index snapshots (on the master)
> > > ** On the slave, the same handler can poll at regular intervals and,
> > >    if there is a new snapshot, fetch the index over HTTP (it can use
> > >    SolrJ + BinaryResponseWriter)
> > > * The same handler can do a snapinstall
> > > * The handler may expose all the operations over a REST interface or JMX
> > > * It may also show the current state of the master index through the
> > >   console
> > >
> > > What do you think?
> > >
> > > --
> > > --Noble Paul

--
Regards,
Shalin Shekhar Mangar.
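The per-chunk checksum scheme proposed in the thread (verify the joined file, and on mismatch compare chunk checksums to find which parts to re-fetch) can be sketched as below. This is a minimal illustration only: CRC32, the `ChunkChecksums` class, and the chunk size are assumptions, not anything from Solr.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

public class ChunkChecksums {
    // Checksum of a byte range; CRC32 is an assumed choice of algorithm.
    public static long checksum(byte[] data, int off, int len) {
        CRC32 crc = new CRC32();
        crc.update(data, off, len);
        return crc.getValue();
    }

    // One checksum per fixed-size chunk (last chunk may be shorter).
    public static List<Long> chunkChecksums(byte[] file, int chunkSize) {
        List<Long> sums = new ArrayList<>();
        for (int off = 0; off < file.length; off += chunkSize) {
            sums.add(checksum(file, off, Math.min(chunkSize, file.length - off)));
        }
        return sums;
    }

    // Compare master vs slave chunk checksums; return indexes to re-fetch.
    public static List<Integer> corruptedChunks(List<Long> expected, List<Long> actual) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < expected.size(); i++) {
            if (!expected.get(i).equals(actual.get(i))) bad.add(i);
        }
        return bad;
    }
}
```

The point of the two-level check is that the whole-file checksum is cheap to verify on every download, while the per-chunk checksums only need to be fetched and compared in the (rare) failure case, so only the bad chunks go over the wire again.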
