We have a large index, split across multiple shards, that consists of records exported from a database. One requirement is to support near real-time synchronization with the database. To accomplish this we are considering a "daily" shard to which created and updated documents are posted (records never get deleted). At the end of the day we would "empty" the daily shard into the other shards and start afresh the next day.
The problem with this approach is that when an existing database record is updated, the daily shard ends up holding an updated document whose id duplicates that of a document in another shard. It is my understanding that when duplicate document ids come back from multiple shards, the document that arrives first is kept in the search results and the other documents with the same id are discarded. My question is: where can I customize the Solr code to specify that documents from a particular shard should be given precedence in the search results? Any pointers would be very much appreciated.
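
To make the behaviour I am after concrete, here is a minimal sketch in plain Java. It is not Solr code: the `Hit` class, the `merge` method and the shard names are all hypothetical and only illustrate the rule "when two shards return the same id, keep the hit from the preferred shard; otherwise keep the first one seen".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ShardPrecedenceMerge {

    /** One search hit as reported by a shard (hypothetical type, not a Solr class). */
    static final class Hit {
        final String id;
        final String shard;
        final float score;

        Hit(String id, String shard, float score) {
            this.id = id;
            this.shard = shard;
            this.score = score;
        }
    }

    /**
     * Collapses hits that share an id. A hit from preferredShard replaces a hit
     * from any other shard; in every other case the first hit seen is kept.
     */
    static List<Hit> merge(List<Hit> hits, String preferredShard) {
        Map<String, Hit> byId = new LinkedHashMap<>();
        for (Hit hit : hits) {
            Hit existing = byId.get(hit.id);
            if (existing == null) {
                // First time this id is seen: keep it for now.
                byId.put(hit.id, hit);
            } else if (hit.shard.equals(preferredShard)
                    && !existing.shard.equals(preferredShard)) {
                // Same id from the preferred shard: it wins. Re-putting an
                // existing key leaves its position in the LinkedHashMap unchanged.
                byId.put(hit.id, hit);
            }
        }
        return new ArrayList<>(byId.values());
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("42", "shard1", 1.5f));
        hits.add(new Hit("7", "shard2", 1.2f));
        hits.add(new Hit("42", "daily", 0.9f)); // updated copy of record 42
        for (Hit h : merge(hits, "daily")) {
            System.out.println(h.id + " from " + h.shard);
        }
    }
}
```

With "daily" as the preferred shard, the copy of document 42 from shard1 is dropped in favour of the one from the daily shard, even though the shard1 copy arrived first. The sketch does not re-sort by score after the replacement; a real implementation would have to decide how the surviving document is ranked.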