[
https://issues.apache.org/jira/browse/SOLR-14923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17248316#comment-17248316
]
Thomas Wöckinger commented on SOLR-14923:
-----------------------------------------
{quote}in RTG.getInputDocument, you added the potential open _before_ the check
if the doc from the updateLog is null... ({{sid == null}}) but shouldn't we not
if sid isn't null? Thus move it down right below, indented. Also, maybe in DUP,
we can sometimes further restrict when this logic happens – perhaps only when
the document coming in is an atomic update. I'll investigate that.
{quote}
That was my first intention, but
org.apache.solr.cloud.NestedShardedAtomicUpdateTest.test fails when I do so.
I have not found the cause yet.
{quote}Earlier in this issue, you tried modifying
UpdateLog.openRealtimeSearcher to move the searcher re-open outside of the
synchronized block. That makes sense to me; we should do that.
{quote}
This will require modifying some tests, because extra commits will be needed
or existing ones must be moved, so the behavior will definitely change.
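
To illustrate the general idea of moving an expensive re-open out of a
synchronized block, here is a minimal sketch. It is not Solr's UpdateLog code:
the class UpdateLogSketch, its methods, and the expensiveOpen callback are all
hypothetical names chosen for this example, and the real change may look quite
different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for an update log; NOT Solr's UpdateLog. */
public class UpdateLogSketch {
    private final Object lock = new Object();
    private List<String> pending = new ArrayList<>();
    private List<String> visible = new ArrayList<>();

    /** Cheap operation that needs the lock, comparable to add or delete. */
    public void add(String id) {
        synchronized (lock) {
            pending.add(id);
        }
    }

    /** Pattern from the issue: the expensive re-open runs while the lock is held. */
    public void reopenInsideLock(Function<List<String>, List<String>> expensiveOpen) {
        synchronized (lock) {
            List<String> batch = pending;
            pending = new ArrayList<>();
            visible = expensiveOpen.apply(batch); // every add() waits for this call
        }
    }

    /** Alternative: hold the lock only to swap state, not during the expensive call. */
    public void reopenOutsideLock(Function<List<String>, List<String>> expensiveOpen) {
        List<String> batch;
        synchronized (lock) {
            batch = pending;            // take ownership of the current batch
            pending = new ArrayList<>();
        }
        List<String> opened = expensiveOpen.apply(batch); // no lock held here
        synchronized (lock) {
            visible = opened;           // publish the result under the lock
        }
    }

    /** Returns a copy of the entries published by the last re-open. */
    public List<String> visible() {
        synchronized (lock) {
            return new ArrayList<>(visible);
        }
    }

    /** True if the calling thread currently holds this sketch's lock. */
    public boolean lockHeldByCurrentThread() {
        return Thread.holdsLock(lock);
    }
}
```

In the second variant, entries in the batch are in neither list until the
second locked section runs. A window of that kind is one way the observable
behavior can differ from the fully synchronized version.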
> Indexing performance is unacceptable when child documents are involved
> ----------------------------------------------------------------------
>
> Key: SOLR-14923
> URL: https://issues.apache.org/jira/browse/SOLR-14923
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: update, UpdateRequestProcessors
> Affects Versions: 8.3, 8.4, 8.5, 8.6, master (9.0)
> Reporter: Thomas Wöckinger
> Priority: Critical
> Labels: performance, pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Parallel indexing currently gives no benefit when child documents are used.
> At the end of the method doVersionAdd, the
> org.apache.solr.update.processor.DistributedUpdateProcessor checks whether the
> ulog caches should be refreshed.
> This check returns true if the AddUpdateCommand includes any child document.
> If so, ulog.openRealtimeSearcher() is called. This call is very expensive and
> runs in a block synchronized on the UpdateLog instance, so all other
> operations on the UpdateLog are blocked as well.
> Because every important UpdateLog method (add, delete, ...) also runs in a
> synchronized block, almost every operation is blocked.
> This reduces multi-threaded index updates to single-threaded behavior.
> This behavior does not depend on any option of the UpdateRequest, so it makes
> no difference whether 'waitFlush', 'waitSearcher' or 'softCommit' is true or
> false.
> It makes child documents practically unusable, because the performance is
> unacceptable.
>
>