Then how about: if the UpdateLog is in effect, then overwrite=false is ignored (uniqueKey constraint honored). If we've already agreed it's "wrong" to expect duplicated docs with overwrite=false, then we can choose to ignore this performance hint in certain cases.
RE your aside: agreed!

On Thu, Sep 20, 2018 at 3:33 PM Yonik Seeley <[email protected]> wrote:

> On Thu, Sep 20, 2018 at 3:18 PM David Smiley <[email protected]>
> wrote:
>
>> Is it even sensible to want overwrite=false and have an UpdateLog? That
>> is, isn't the weight of the UpdateLog well more than whatever savings are
>> had with overwrite=false? I suspect that combining these two today
>> has edge cases we don't even realize, despite the apparent lack of
>> exceptions.
>>
>
> They don't seem mutually exclusive or anything. It's true that you would
> gain more in performance by not using the update log than by
> overwrite=false. But it's also true that overwrite=false is implemented in
> a way that has no functional impact and the same can't be said for not
> using the update log (i.e. people can't just remove the update log and have
> everything work like it did).
>
> Aside: bulk indexing is enough of a pain point / important use case, we
> *should* figure out how to do it better by skipping as much as possible
> (update log included) and still have explainable rational behavior. That's
> something for the future though...
>
> -Yonik
>

--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
