We have a large index, split into multiple shards, that consists of
records exported from a database. One requirement is to support near
real-time synchronization with the database. To accomplish this, we are
considering creating a "daily" shard to which created and updated
documents are posted (records never get deleted). At the end of the day
we would "empty" the daily shard into the other shards and start afresh
the next day.

The problem with this approach is that when an existing database record
is updated, the daily shard contains an updated document whose id
duplicates that of a document in another shard. It is my understanding
that when multiple shards return documents with the same id, the
document received first is kept in the search results and the other
duplicates are discarded.
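
To make the behavior I am after concrete, here is a minimal sketch in
Python. It is not Solr's actual merge code; the names Hit, merge_hits
and preferred_shard are hypothetical and only illustrate the rule "a
duplicate from the daily shard wins, otherwise keep the first one seen".

```python
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    """One search hit as returned by a single shard (hypothetical type)."""
    doc_id: str
    score: float
    shard: str  # name of the shard that returned this hit


def merge_hits(hits: List[Hit], preferred_shard: str) -> List[Hit]:
    """Merge hits from all shards, keeping one hit per doc_id.

    A hit from preferred_shard replaces a duplicate from any other shard;
    otherwise the first hit seen for an id is kept.
    """
    chosen: Dict[str, Hit] = {}
    for hit in hits:
        existing = chosen.get(hit.doc_id)
        if existing is None:
            # First time this id is seen: keep it.
            chosen[hit.doc_id] = hit
        elif hit.shard == preferred_shard and existing.shard != preferred_shard:
            # Duplicate id: the preferred shard overrides the earlier hit.
            chosen[hit.doc_id] = hit
    # Rank the surviving hits by score, highest first.
    return sorted(chosen.values(), key=lambda h: h.score, reverse=True)
```

With this rule, the order in which the shard responses arrive no longer
decides which copy of a duplicated document is shown.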

My question is: where can I customize the Solr code to specify that
documents from a particular shard should be given precedence in the
search results? Any pointers would be very much appreciated.
                                          