Hi,

We currently have a single Solr server, with a single index. There are a number of CMS processes distributed over a number of servers, with each CMS process sending an update to the Solr index when changes are made to a content object.
My concern is that a content object could be changed and reindexed concurrently by two CMS processes. The database ensures consistency within the CMS, and these transactions get committed as T1 and T2. But I cannot see how to ensure that the reindexing operations (each of which results in a delete and an add for the document) are processed in the order R1 then R2, rather than R2 then R1. In the second case the index record ends up inconsistent with the content object in the database.

I would like to supply a transaction id with the reindex request, and configure Solr such that a reindex operation is processed if and only if the supplied transaction id is greater than the currently indexed transaction id.

Otherwise the only ways I can see to guarantee consistency are to 1) have index operations processed by a single writer, or 2) commit the index operation between the database prepare and commit statements. The first is not desirable as we introduce a single point of failure (in addition to the single Solr server) and delay updating the index. The second is not desirable because it reduces the throughput of the database, and with a distributed Solr setup it would not solve the problem.

From what I can tell this conditional indexing feature is not supported by Solr. Might it be supported by Lucene but not exposed by Solr?

Thanks,

Laurence

2008/12/4 Shalin Shekhar Mangar <[EMAIL PROTECTED]>:
> It is not clear how you are using Solr i.e. distributed vs single index.
>
> Summarily, Solr does not update documents. It overwrites the old document
> with the new one if an old document with the same uniqueKey exists in the
> index.
>
> Does that answer your question?
>
> On Thu, Dec 4, 2008 at 1:46 AM, Laurence Rowe <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> Our CMS is distributed over a cluster and I was wondering how I can
>> ensure that index records of newer versions of documents are never
>> overwritten by older ones. Amazon AWS uses a timestamp on requests to
>> ensure 'eventual consistency' of operations. Is there a way to supply
>> a transaction ID with an update so an update is conditional on the
>> supplied transaction id being greater than the existing indexed
>> transaction id?
>>
>> Laurence
>>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
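
For concreteness, the conditional reindex rule asked about in this thread (apply an update if and only if its transaction id is greater than the one already indexed) can be sketched as a small in-memory model. This is not a Solr or Lucene API: the `ConditionalIndex` class, its `reindex` and `get` methods, and the `doc-1` key are hypothetical names used only to illustrate the semantics.

```python
class ConditionalIndex:
    """In-memory stand-in for an index keyed by uniqueKey (hypothetical, not Solr)."""

    def __init__(self):
        # Maps uniqueKey -> (transaction id, document).
        self._docs = {}

    def reindex(self, key, txid, doc):
        """Replace the document only if txid is greater than the indexed txid.

        Returns True if the update was applied, False if it was stale.
        """
        current = self._docs.get(key)
        if current is not None and txid <= current[0]:
            return False  # an equal or newer version is already indexed
        self._docs[key] = (txid, doc)
        return True

    def get(self, key):
        entry = self._docs.get(key)
        return None if entry is None else entry[1]


index = ConditionalIndex()
# R2 (from transaction T2) reaches the index before R1 (from T1).
applied_r2 = index.reindex("doc-1", txid=2, doc={"title": "new"})
applied_r1 = index.reindex("doc-1", txid=1, doc={"title": "old"})
```

With this rule the late-arriving R1 is dropped, so the index keeps the T2 version whichever order the requests arrive in. A real implementation would also need the compare and the write to happen atomically inside the index, which this single-threaded model does not address.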