List,

I've stumbled upon an issue with the deduplication mechanism. Depending on the 
overwriteDupes setting, it either deletes all documents (true) or does nothing 
at all (false).

I use a slightly modified configuration:

  <updateRequestProcessorChain name="dedupe">
    <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
      <bool name="enabled">true</bool>
      <str name="signatureField">sig</str>
      <bool name="overwriteDupes">true</bool>
      <str name="fields">content</str>
      <str name="signatureClass">org.apache.solr.update.processor.Lookup3Signature</str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>


  <field name="sig" type="string" stored="true" indexed="false" multiValued="true" />

After importing new documents with overwriteDupes=false, I can clearly see the 
correct signatures. Most documents have a distinct signature, and some share 
the same one because the content field's value is identical for those 
documents.
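
To make that observation concrete, here is a minimal Python sketch of the 
grouping I would expect from the signature step. It is only an illustration 
under assumptions: MD5 stands in for Lookup3Signature (which is not available 
in Python's standard library), and the sample documents, their ids, and the 
helper names signature and group_by_signature are made up.

```python
import hashlib
from collections import defaultdict


def signature(content: str) -> str:
    # Stand-in hash: MD5, NOT Solr's Lookup3Signature (assumption for illustration).
    return hashlib.md5(content.encode("utf-8")).hexdigest()


def group_by_signature(docs):
    """Map each signature to the ids of the documents that produced it."""
    groups = defaultdict(list)
    for doc in docs:
        groups[signature(doc["content"])].append(doc["id"])
    return dict(groups)


# Hypothetical sample documents; ids and content are invented.
docs = [
    {"id": "a", "content": "same text"},
    {"id": "b", "content": "same text"},
    {"id": "c", "content": "different text"},
]

groups = group_by_signature(docs)
# Only groups with more than one id are duplicates of each other.
duplicates = sorted(ids for ids in groups.values() if len(ids) > 1)
```

In this sketch only "a" and "b" end up in a shared group, so I would expect 
at most one of those two to be overwritten, never the whole index.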


Anyway, why does it delete all my documents? Any clues? The wiki is not very 
helpful on this subject.


Cheers.


Markus Jelsma - Technisch Architect - Buyways BV
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
