Thanks Shalin!

Would you not expect
req.getSearcher().docFreq(t);
to be slightly faster? Or maybe even
req.getSearcher().getFirstMatch(t) != -1;

Which one should be faster, and are there any known side effects?

On Wed, Jun 29, 2011 at 1:45 PM, Shalin Shekhar Mangar
<shalinman...@gmail.com> wrote:
> On Wed, Jun 29, 2011 at 2:01 AM, eks dev <eks...@yahoo.co.uk> wrote:
>
>> Quick question,
>> Is there a way with solr to conditionally update document on unique
>> id? Meaning, default, add behavior if id is not already in index and
>> *not to touch index* if already there.
>>
>> Deletes are not important (no sync issues).
>>
>> I am asking because I noticed with deduplication turned on,
>> index-files get modified even if I update the same documents again
>> (same signatures).
>> I am facing very high dupes rate (40-50%), and setup is going to be
>> master-slave with high commit rate (requirement is to reduce
>> propagation latency for updates). Having unnecessary index
>> modifications is going to waste "effort" to ship the same information
>> again and again.
>>
>> if there is no standard way, what would be the fastest way to check if
>> Term exists in index from UpdateRequestProcessor?
>>
>
> I'd suggest that you use the searcher's getDocSet with a TermQuery.
>
> Use the SolrQueryRequest#getSearcher so you don't need to worry about ref
> counting.
>
> e.g. req.getSearcher().getDocSet(new TermQuery(new Term(signatureField,
> sigString))).size();
>
>
>> I intend to extend SignatureUpdateProcessor to prevent a document from
>> propagating down the chain if this happens?
>> Would that be a way to deal with it? I repeat, there are no deletes to
>> make headaches with synchronization
>>
>
> Yes, that should be fine.
>
> --
> Regards,
> Shalin Shekhar Mangar.
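
P.S. To make the "do not propagate down the chain if the signature already exists" idea concrete, here is a minimal sketch of the control flow only. It deliberately does not use the Solr classes: TermLookup, NextInChain and SkipExistingProcessor are hypothetical stand-ins (for the searcher lookup, the next processor, and a SignatureUpdateProcessor extension respectively), and the in-memory set merely plays the role of the index.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SkipExistingSketch {

    /** Hypothetical stand-in for an existence check against the index. */
    interface TermLookup {
        boolean exists(String field, String value);
    }

    /** Hypothetical stand-in for the rest of the update chain. */
    interface NextInChain {
        void add(String signature);
    }

    /** Forwards an add only when its signature is not already indexed. */
    static class SkipExistingProcessor {
        private final String signatureField;
        private final TermLookup lookup;
        private final NextInChain next;

        SkipExistingProcessor(String signatureField, TermLookup lookup, NextInChain next) {
            this.signatureField = signatureField;
            this.lookup = lookup;
            this.next = next;
        }

        /** Returns true if the document was forwarded, false if it was dropped. */
        boolean processAdd(String signature) {
            if (lookup.exists(signatureField, signature)) {
                return false; // already indexed: leave the index untouched
            }
            next.add(signature);
            return true;
        }
    }

    public static void main(String[] args) {
        // The set plays the role of the index; "sig-1" is already present.
        Set<String> indexed = new HashSet<>();
        indexed.add("sig-1");
        List<String> forwarded = new ArrayList<>();

        SkipExistingProcessor p = new SkipExistingProcessor(
                "signature",
                (field, value) -> indexed.contains(value),
                forwarded::add);

        System.out.println(p.processAdd("sig-1")); // duplicate signature
        System.out.println(p.processAdd("sig-2")); // new signature
        System.out.println(forwarded);
    }
}
```

In the real processor the lookup would be one of the searcher calls discussed above. One thing to keep in mind with any of them: the searcher obtained from the request only reflects the index as of the last opened searcher, so a duplicate added since then would presumably not be caught by this check.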