Hi Yonik and Upayavira,

Thank you both for your insightful responses. We now have a much better
understanding of how to implement distributed indexing, although no doubt
more issues will emerge along the way.

Just to clarify (and for critique), our approach goes something like this:
We will use a DistributedUpdateRequestHandler to process an update request
when a 'shards' parameter is present in the URL (as with distributed
search). For example:

http://localhost:8983/solr/collection1/update?shards=localhost:8983/solr,localhost:7574/solr

will index the docs across both servers specified. Of course, as Yonik
suggested, this could easily be extended (by using a different URL or
additional params) to handle an entire cluster or a logical shard.
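As a rough illustration of the first step, here is a minimal sketch of how the handler might split the 'shards' value into individual shard addresses. The class and method names (ShardsParam, parse) are hypothetical and not part of Solr; the only thing taken from the example above is the comma-separated format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an existing Solr class.
public class ShardsParam {

    /** Splits a comma-separated 'shards' value into trimmed, non-empty shard addresses. */
    public static List<String> parse(String shardsValue) {
        List<String> shards = new ArrayList<>();
        if (shardsValue == null) {
            return shards; // no 'shards' param: caller falls back to a local update
        }
        for (String part : shardsValue.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                shards.add(trimmed);
            }
        }
        return shards;
    }

    public static void main(String[] args) {
        // Value taken from the example URL above.
        for (String shard : parse("localhost:8983/solr,localhost:7574/solr")) {
            System.out.println(shard);
        }
    }
}
```

An empty result would mean the request is handled as an ordinary local update.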

The server would then use the information received from the request handler
to add the documents to the index. To do this, it would consult a
ShardPolicy/ShardDistributionPolicy (perhaps as specified in solrconfig.xml,
with a default if none is specified) to decide which shard each document
should be sent to. The documents would then be forwarded to their
respective shards to be indexed.
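To make the policy idea concrete, here is a minimal sketch of what such an interface and a default implementation might look like. Everything here is an assumption for discussion: the selectShard signature, the HashShardDistributionPolicy name, and the choice of hashing the document's unique key modulo the shard count as the default method.

```java
import java.util.Arrays;
import java.util.List;

public class ShardPolicySketch {

    /** Hypothetical interface: chooses the target shard for one document. */
    public interface ShardDistributionPolicy {
        String selectShard(String docId, List<String> shards);
    }

    /** Hypothetical default: hash of the document's unique key, modulo the shard count. */
    public static class HashShardDistributionPolicy implements ShardDistributionPolicy {
        @Override
        public String selectShard(String docId, List<String> shards) {
            if (shards.isEmpty()) {
                throw new IllegalArgumentException("no shards given");
            }
            // floorMod keeps the index non-negative even when hashCode() is negative.
            int index = Math.floorMod(docId.hashCode(), shards.size());
            return shards.get(index);
        }
    }

    public static void main(String[] args) {
        List<String> shards = Arrays.asList("localhost:8983/solr", "localhost:7574/solr");
        ShardDistributionPolicy policy = new HashShardDistributionPolicy();
        for (String id : new String[] {"doc1", "doc2", "doc3"}) {
            System.out.println(id + " -> " + policy.selectShard(id, shards));
        }
    }
}
```

A hash of the unique key has the advantage that a later update or delete for the same document lands on the same shard, as long as the shard list does not change. Other policies (round-robin, by field value) could implement the same interface.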

We'll be sure to keep the mailing list posted on our progress.

Thanks,

Alex
