Thanks Shalin!

Would you not expect

req.getSearcher().docFreq(t);

to be slightly faster? Or maybe even

req.getSearcher().getFirstMatch(t) != -1;

Which one should be faster? Are there any known side effects?
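
To make the intended control flow concrete, here is a minimal, dependency-free sketch of "forward the add only if the signature is not already indexed". It does not use Solr classes: CommittedIndex and SkipExistingProcessor are hypothetical names, and docFreq(String) here only stands in for the req.getSearcher().docFreq(t) call above. One side effect I would expect, assuming the request's searcher only sees committed documents, is that duplicates arriving between two commits pass the existence check, so the sketch also remembers signatures seen since the last commit.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the committed view exposed by the request's searcher. */
interface CommittedIndex {
    /** Number of committed documents carrying this signature (models docFreq(t)). */
    int docFreq(String signature);
}

/** Hypothetical model of a processor that drops adds whose signature is already known. */
class SkipExistingProcessor {
    private final CommittedIndex index;
    // Signatures forwarded since the last commit; the committed view cannot see these yet.
    private final Set<String> pending = new HashSet<>();
    // Records what would have been passed down the update chain.
    private final List<String> forwarded = new ArrayList<>();

    SkipExistingProcessor(CommittedIndex index) {
        this.index = index;
    }

    /** Returns true if the document was forwarded, false if it was skipped. */
    boolean processAdd(String signature) {
        if (index.docFreq(signature) > 0) {
            return false; // already committed: do not touch the index
        }
        if (!pending.add(signature)) {
            return false; // already added since the last commit
        }
        forwarded.add(signature);
        return true;
    }

    /** Call after a commit, assuming the new searcher now sees the pending documents. */
    void onCommit() {
        pending.clear();
    }

    List<String> forwarded() {
        return forwarded;
    }
}
```

In a real UpdateRequestProcessor the same decision would sit in processAdd, calling the next processor only when the check says the signature is new.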




On Wed, Jun 29, 2011 at 1:45 PM, Shalin Shekhar Mangar
<shalinman...@gmail.com> wrote:
> On Wed, Jun 29, 2011 at 2:01 AM, eks dev <eks...@yahoo.co.uk> wrote:
>
>> Quick question,
>> Is there a way with solr to conditionally update document on unique
>> id? Meaning, default, add behavior if id is not already in index and
>> *not to touch index* if already there.
>>
>> Deletes are not important (no sync issues).
>>
>> I am asking because I noticed with deduplication turned on,
>> index-files get modified even if I update the same documents again
>> (same signatures).
>> I am facing very high dupes rate (40-50%), and setup is going to be
>> master-slave with high commit rate (requirement is to reduce
>> propagation latency for updates). Having unnecessary index
>> modifications is going to waste "effort" to ship the same information
>> again and again.
>>
>> if there is no standard way, what would be the fastest way to check if
>> Term exists in index from UpdateRequestProcessor?
>>
>>
> I'd suggest that you use the searcher's getDocSet with a TermQuery.
>
> Use the SolrQueryRequest#getSearcher so you don't need to worry about ref
> counting.
>
> e.g. req.getSearcher().getDocSet(new TermQuery(new Term(signatureField,
> sigString))).size();
>
>
>
>> I intend to extend SignatureUpdateProcessor to prevent a document from
>> propagating down the chain if this happens?
>> Would that be a way to deal with it? I repeat, there are no deletes to
>> make headaches with synchronization
>>
>>
> Yes, that should be fine.
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
