Running 4.8.1. I am experiencing the same problem: I get duplicates on
index update despite using overwrite=true when adding existing documents.
My duplicate ratio is a lot higher, with maybe 25-50% of records having
duplicates (and as indexing continues to run, the number of copies per
record grows from 2 to 3, 4, 5, etc.).

<field name="key" type="string" indexed="true" stored="true" required="true"/>

and

<uniqueKey>key</uniqueKey>

are both set in schema.xml, but even together with overwrite="true" this
still doesn't guarantee uniqueness.
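
For anyone who wants to measure how many keys are affected, here is a rough
sketch of how the duplicates could be counted by faceting on the unique key
field with facet.mincount=2. It is only an illustration: the core URL passed
to find_duplicates() is a placeholder, the helper names are made up for this
example, and it assumes the default flat JSON facet layout
([term, count, term, count, ...]).

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def duplicate_query_url(core_url, field="key"):
    """Build a /select URL that facets on `field` and keeps terms seen 2+ times."""
    params = {
        "q": "*:*",
        "rows": 0,            # only the facet counts are needed, not the documents
        "wt": "json",
        "facet": "true",
        "facet.field": field,
        "facet.mincount": 2,  # a unique key should never occur more than once
        "facet.limit": -1,    # return every matching term
    }
    return core_url.rstrip("/") + "/select?" + urlencode(params)


def duplicate_keys(response, field="key"):
    """Turn the flat [term, count, term, count, ...] list into {term: count}."""
    flat = response["facet_counts"]["facet_fields"][field]
    return {flat[i]: flat[i + 1]
            for i in range(0, len(flat), 2)
            if flat[i + 1] > 1}


def find_duplicates(core_url, field="key"):
    """Query Solr and return the keys that occur more than once.

    `core_url` is a placeholder such as http://localhost:8983/solr/collection1
    (an assumption for illustration, not taken from this thread).
    """
    with urlopen(duplicate_query_url(core_url, field)) as resp:
        return duplicate_keys(json.load(resp), field)
```

Running find_duplicates() periodically while the indexer is active should
show whether the per-key counts keep growing.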

On 5 August 2015 at 14:29, Tarala, Magesh <mtar...@bh.com> wrote:

> I deleted the index and re-indexed. Duplicates went away. Have not
> identified root cause, but looks like updating documents is causing it
> sporadically. Going to try deleting the document and then update.
>
>
> -----Original Message-----
> From: Tarala, Magesh
> Sent: Monday, August 03, 2015 8:27 AM
> To: solr-user@lucene.apache.org
> Subject: Duplicate Documents
>
> I'm using solr 4.10.2. I'm using "id" field as the unique key - it is
> passed in with the document when ingesting the documents into solr. When
> querying I get duplicate documents with different "_version_". Out of
> approx. 25K unique documents ingested into solr, I see approx. 300
> duplicates.
>
> It is a 3 node solr cloud with one shard and 2 replicas.
> I'm also using nested documents.
>
> Thanks in advance for any insights.
>
> --Magesh
>
>
