Thanks for the suggestions. No, I'm not using MERGEINDEXES or MapReduceIndexerTool.
I've pasted the <add/> XML in case there is something broken there (cut down for brevity, i.e. the "..."):

<add overwrite="true" commitWithin="10000"><doc>
<field name="handle_s">123456789/3</field>
<field name="title">Test Submission</field>
<field name="title_sort">Test Submission</field>
<field name="access">1</field>
<field name="parent_id">1</field>
<field name="collection_s">Test Collection</field>
<field name="collection_fc">test collection|||Test Collection</field>
<field name="collection_sort">Test Collection</field>
<field name="dc.contributor.author_fc">young, hayden|||Young, Hayden</field>
<field name="author">Young, Hayden</field>
<field name="dc.contributor.author_sm">Young, Hayden</field>
...
<field name="key">archive.item.1</field>
...
</doc></add>

On 11 September 2015 at 18:06, Erick Erickson <erickerick...@gmail.com> wrote:

> Are you by any chance using the MERGEINDEXES
> core admin call? Or using MapReduceIndexerTool?
>
> Neither of those delete duplicates....
>
> This is a fundamental part of Solr though, so it's
> virtually certain that there's some innocent-seeming
> thing you're doing that's causing this...
>
> Best,
> Erick
>
> On Fri, Sep 11, 2015 at 8:55 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> > On 9/11/2015 9:10 AM, Mr Havercamp wrote:
> >> fieldType def:
> >>
> >> <!-- The StrField type is not analyzed, but indexed/stored
> >> verbatim. -->
> >> <fieldType name="string" class="solr.StrField"
> >> sortMissingLast="true" />
> >>
> >> It is not SolrCloud.
> >
> > As long as it's not a distributed index, I can't think of any problem
> > those field/type definitions might cause. Even if it were distributed
> > and you had the same document in multiple shards, duplicates should be
> > removed at query time, if each shard has the same schema as the others.
> >
> > I don't have any further ideas. There may be something wrong that I
> > haven't thought of.
> >
> > Thanks,
> > Shawn
> >
>
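
A possible sanity check on an <add/> payload like the one pasted above: Solr only overwrites an existing document when the value of the schema's uniqueKey field matches, so it can help to confirm that each <doc> carries exactly one value for that field. The rough Python sketch below does that check offline. It assumes "key" is the uniqueKey (the schema is not shown in this message, so that is a guess), and the helper names key_values_per_doc and repeated_keys are made up for illustration. The sample payload is a trimmed copy of the document above.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def key_values_per_doc(add_xml, key_field="key"):
    """Return, for each <doc> in the payload, the list of values of key_field.

    key_field="key" is an assumption about which field is the uniqueKey.
    """
    root = ET.fromstring(add_xml)
    return [
        [f.text for f in doc.findall("field") if f.get("name") == key_field]
        for doc in root.iter("doc")
    ]


def repeated_keys(add_xml, key_field="key"):
    """Return key values that appear in more than one <doc> of the payload."""
    counts = Counter(
        values[0]
        for values in key_values_per_doc(add_xml, key_field)
        if values
    )
    return sorted(k for k, n in counts.items() if n > 1)


# Trimmed copy of the pasted document (most fields left out).
sample = """<add overwrite="true" commitWithin="10000">
  <doc>
    <field name="handle_s">123456789/3</field>
    <field name="title">Test Submission</field>
    <field name="key">archive.item.1</field>
  </doc>
</add>"""

# One inner list per <doc>; an empty list or more than one value is suspicious.
print(key_values_per_doc(sample))
# Key values shared by several docs in the same payload.
print(repeated_keys(sample))
```

If a <doc> comes back with no value, or with several, for the assumed key field, that would be the first thing to look at. A clean result here only says the payload is consistent. It says nothing about what the schema actually declares as uniqueKey.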