I have configured de-duplication according to the Wiki. My signature field is defined thus:

    <field name="signature" type="string" stored="true" indexed="true" multiValued="false" />

and my updateRequestProcessor as follows:

    <updateRequestProcessorChain name="dedupe">
      <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
        <bool name="enabled">true</bool>
        <bool name="overwriteDupes">false</bool>
        <str name="signatureField">signature</str>
        <str name="fields">content</str>
        <str name="signatureClass">org.apache.solr.update.processor.Lookup3Signature</str>
      </processor>
      <processor class="solr.LogUpdateProcessorFactory" />
      <processor class="solr.RunUpdateProcessorFactory" />
    </updateRequestProcessorChain>

I am using SolrJ to write to the index with the binary format (as opposed to XML), so my update handler is defined as below:

    <requestHandler name="/update/javabin" class="solr.BinaryUpdateRequestHandler" >
      <lst name="defaults">
        <str name="update.processor">dedupe</str>
      </lst>
    </requestHandler>

However, I was expecting Solr to allow only 1 instance of a duplicate document into the index, but I get the following results when I query my index. I have deliberately added my ISA Letter file 4 times and can see it has correctly generated an identical signature for the first 4 entries (d91a5ce933457fd5). The fifth entry is a different document and correctly has a different signature. I was expecting to see only 1 instance of the duplicate. Am I misinterpreting the way it works?

Many Thanks.

    <result name="response" numFound="36" start="0">
      <doc>
        <str name="doctitle">ISA Letter</str>
        <str name="signature">d91a5ce933457fd5</str>
      </doc>
      <doc>
        <str name="doctitle">ISA Letter</str>
        <str name="signature">d91a5ce933457fd5</str>
      </doc>
      <doc>
        <str name="doctitle">ISA Letter</str>
        <str name="signature">d91a5ce933457fd5</str>
      </doc>
      <doc>
        <str name="doctitle">ISA Letter</str>
        <str name="signature">d91a5ce933457fd5</str>
      </doc>
      <doc>
        <str name="doctitle">ISA Mailing pack letter</str>
        <str name="signature">fd9d9e1c0de32fb5</str>
      </doc>
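
To make my expectation concrete, here is a toy sketch in plain Java of the behaviour I assumed. It is not Solr or SolrJ code, and the class and method names (DedupeExpectation, indexBySignature) are made up purely for illustration. It keys each document on its signature, so a later document with the same signature replaces the earlier one and only one copy remains:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: this is NOT Solr code. All names here are hypothetical.
public class DedupeExpectation {

    // Each doc is a {doctitle, signature} pair. The map keeps one title per
    // signature, so a later duplicate replaces the earlier entry.
    public static Map<String, String> indexBySignature(List<String[]> docs) {
        Map<String, String> index = new LinkedHashMap<>();
        for (String[] doc : docs) {
            String title = doc[0];
            String signature = doc[1];
            index.put(signature, title);
        }
        return index;
    }

    public static void main(String[] args) {
        List<String[]> docs = new ArrayList<>();
        // The same file added 4 times, with the signature from my results.
        for (int i = 0; i < 4; i++) {
            docs.add(new String[] {"ISA Letter", "d91a5ce933457fd5"});
        }
        // A different document with a different signature.
        docs.add(new String[] {"ISA Mailing pack letter", "fd9d9e1c0de32fb5"});

        Map<String, String> index = indexBySignature(docs);
        System.out.println(index.size() + " documents: " + index);
    }
}
```

In this model the five adds leave two entries, one per signature, which is what I expected to see in my query results instead of four copies of ISA Letter.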