[ https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213093#comment-17213093 ]
David Smiley commented on SOLR-14923:
-------------------------------------

I am responsible for this bug, along with [~moshebla], the contributor of SOLR-12638. The few lines of code you have found, Thomas, are perhaps the code I most regret committing on behalf of another contributor. I expressed my reservations at the time:
https://issues.apache.org/jira/browse/SOLR-12638?focusedCommentId=16872898&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16872898

bq. What gnaws at me is that this "UpdateLog.openRealtimeSearcher" is being called optimistically on a new doc because maaaayyyybeee some future atomic update will need to see it. And not just any type of atomic update; one that is directly to a nested child doc (something I consider highly experimental). It's as if we're optimizing for making that future atomic update faster by doing work in advance that will, I think, very rarely actually be used. It's a tragedy, if I'm understanding this right.

There is also some earlier conversation about it in that issue.

It's difficult for me to say at the moment what the fix is, because this is fairly complex, low-level Solr code that I think few people understand well. Nonetheless, I'll look into it further this week.

> Indexing performance is unacceptable when child documents are involved
> ----------------------------------------------------------------------
>
>                 Key: SOLR-14923
>                 URL: https://issues.apache.org/jira/browse/SOLR-14923
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: update, UpdateRequestProcessors
>    Affects Versions: master (9.0), 8.3, 8.4, 8.5, 8.6
>            Reporter: Thomas Wöckinger
>            Priority: Critical
>              Labels: performance
>
> Parallel indexing does not make sense at the moment when child documents are
> used.
> The org.apache.solr.update.processor.DistributedUpdateProcessor checks at the
> end of the method doVersionAdd if Ulog caches should be refreshed.
> This check will return true if any child document is included in the
> AddUpdateCommand.
> If so, ulog.openRealtimeSearcher() is called. This call is very expensive
> and is executed in a synchronized block on the UpdateLog instance, so all
> other operations on the UpdateLog are blocked too.
> Because every important UpdateLog method (add, delete, ...) runs in a
> synchronized block, almost every operation is blocked.
> This reduces multithreaded index updates to single-threaded behavior.
> The described behavior does not depend on any option of the UpdateRequest,
> so it does not make any difference if 'waitFlush', 'waitSearcher' or
> 'softCommit' is true or false.
> The described behavior makes the usage of ChildDocuments useless, because the
> performance is unacceptable.
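
The contention pattern described in the issue can be illustrated with a small toy program. This is only a sketch under stated assumptions, not Solr code: ToyUpdateLog, reopenSearcher and run are hypothetical names, and the expensive searcher reopen is simulated with Thread.sleep. It shows that when a cheap add and an expensive call share one monitor, at most one worker thread is ever inside the log, regardless of how many threads are indexing.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of the contention pattern; hypothetical names, not Solr's classes. */
public class SharedLogContention {

    /** Stand-in for an update log whose operations all lock the same instance. */
    static class ToyUpdateLog {
        private final AtomicInteger inside = new AtomicInteger();
        private final AtomicInteger maxInside = new AtomicInteger();
        private int adds;

        /** Cheap operation, but it still has to acquire the monitor. */
        synchronized void add() {
            enter();
            adds++;
            exit();
        }

        /** Expensive operation (simulated by sleeping) holding the same monitor. */
        synchronized void reopenSearcher(long millis) throws InterruptedException {
            enter();
            try {
                Thread.sleep(millis);
            } finally {
                exit();
            }
        }

        /** Records how many threads are inside the log at the same time. */
        private void enter() {
            int now = inside.incrementAndGet();
            maxInside.accumulateAndGet(now, Math::max);
        }

        private void exit() {
            inside.decrementAndGet();
        }

        synchronized int adds() {
            return adds;
        }

        int maxConcurrent() {
            return maxInside.get();
        }
    }

    /**
     * Starts the given number of workers; each adds docsPerThread documents and
     * triggers the expensive call after every add, as the issue describes for
     * documents with children. Returns {total adds, max threads inside the log}.
     */
    static int[] run(int threads, int docsPerThread, long reopenMillis)
            throws InterruptedException {
        ToyUpdateLog log = new ToyUpdateLog();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < docsPerThread; i++) {
                    log.add();
                    try {
                        log.reopenSearcher(reopenMillis);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return new int[] { log.adds(), log.maxConcurrent() };
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        int[] result = run(4, 5, 20);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("adds=" + result[0]
                + " maxConcurrent=" + result[1]
                + " elapsedMs=" + elapsedMs);
    }
}
```

In this toy, the elapsed time grows with the total number of expensive calls across all threads rather than with the per-thread count, which is the single-threaded behavior the reporter describes. How the real UpdateLog should avoid this is a separate question that the sketch does not attempt to answer.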