[
https://issues.apache.org/jira/browse/SOLR-16561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638026#comment-17638026
]
Hang Sun commented on SOLR-16561:
---------------------------------
Added PR: https://github.com/apache/solr/pull/1189
> Use autoSoftCommmitMaxTime as preferred poll interval of IndexFetcher
> ---------------------------------------------------------------------
>
> Key: SOLR-16561
> URL: https://issues.apache.org/jira/browse/SOLR-16561
> Project: Solr
> Issue Type: Improvement
> Components: replication (java)
> Affects Versions: 8.8.2
> Reporter: Hang Sun
> Priority: Minor
> Labels: replication-performance
> Attachments: SOLR-16561.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> TLOG/PULL replicas use *IndexFetcher* to fetch segment files from leaders.
> Once new segment files are downloaded and merged into the existing index, a
> new Searcher is opened so the updated data becomes available to clients. The
> poll interval is determined by the following code in *ReplicateFromLeader*:
> {code:java}
> if (uinfo.autoCommmitMaxTime != -1) {
>   pollIntervalStr = toPollIntervalStr(uinfo.autoCommmitMaxTime/2);
> } else if (uinfo.autoSoftCommmitMaxTime != -1) {
>   pollIntervalStr = toPollIntervalStr(uinfo.autoSoftCommmitMaxTime/2);
> }
> {code}
>
> In a typical config for replication using TLOG/PULL replicas, where data
> visibility is less important (a trade-off to avoid NRT replicas), we set a
> short commit time to persist changes and a long soft-commit time to make
> changes visible.
>
> {code:xml}
> <autoCommit>
>   <maxTime>15000</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
> <autoSoftCommit>
>   <maxTime>3600000</maxTime>
> </autoSoftCommit>
> {code}
>
> With the above config, the poll interval will be 15000 ms / 2 = 7500 ms,
> i.e. about 7 seconds. This leads to frequent opening of new Searchers, which
> has a large impact on realtime user queries, especially if a new Searcher
> takes a long time to warm up. It also makes changes visible on followers
> ahead of leaders.
> Polling for new segment files is mostly about visibility, since TLOG
> replicas still receive updates to their tlog files via UpdateHandler (this is
> my understanding). It therefore seems more appropriate to use
> *autoSoftCommmitMaxTime* as the poll interval.
> I propose the change below, where *autoSoftCommmitMaxTime* is chosen as
> the preferred interval. This will make the poll interval much longer and
> bring the visibility order more in line with the eventual-consistency pattern.
>
> {code:java}
> if (uinfo.autoSoftCommmitMaxTime != -1) {
>   pollIntervalStr = toPollIntervalStr(uinfo.autoSoftCommmitMaxTime);
> } else if (uinfo.autoCommmitMaxTime != -1) {
>   pollIntervalStr = toPollIntervalStr(uinfo.autoCommmitMaxTime);
> }
> {code}
> The change has been tried and showed much less impact on realtime queries.
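
For illustration, here is a minimal, self-contained Java sketch of the two selection orders quoted above, applied to the example config (autoCommit 15000 ms, autoSoftCommit 3600000 ms). The class name, the method names, and the hh:mm:ss formatting helper are assumptions made for this sketch and are not taken from the Solr source. The sketch returns null where neither value is set, because the real fallback is not shown in the description.

```java
import java.util.Locale;

public class PollIntervalSketch {

    // Hypothetical stand-in for toPollIntervalStr: formats milliseconds as hh:mm:ss,
    // truncating any fraction of a second.
    static String toPollIntervalStr(int ms) {
        int sec = ms / 1000;
        int hour = sec / 3600;
        sec %= 3600;
        int min = sec / 60;
        sec %= 60;
        return String.format(Locale.ROOT, "%02d:%02d:%02d", hour, min, sec);
    }

    // Order quoted in the description as the existing behavior:
    // hard commit first, then soft commit, each halved.
    static String currentInterval(int autoCommitMaxTime, int autoSoftCommitMaxTime) {
        if (autoCommitMaxTime != -1) {
            return toPollIntervalStr(autoCommitMaxTime / 2);
        } else if (autoSoftCommitMaxTime != -1) {
            return toPollIntervalStr(autoSoftCommitMaxTime / 2);
        }
        return null; // neither set; the real default is not modeled here
    }

    // Order proposed in the description: soft commit first, then hard commit, not halved.
    static String proposedInterval(int autoCommitMaxTime, int autoSoftCommitMaxTime) {
        if (autoSoftCommitMaxTime != -1) {
            return toPollIntervalStr(autoSoftCommitMaxTime);
        } else if (autoCommitMaxTime != -1) {
            return toPollIntervalStr(autoCommitMaxTime);
        }
        return null; // neither set; the real default is not modeled here
    }

    public static void main(String[] args) {
        int autoCommit = 15000;       // ms, from the example config
        int autoSoftCommit = 3600000; // ms, from the example config
        System.out.println("current:  " + currentInterval(autoCommit, autoSoftCommit));
        System.out.println("proposed: " + proposedInterval(autoCommit, autoSoftCommit));
    }
}
```

Under these assumptions the existing order yields an interval of about 7 seconds for the example config, while the proposed order yields one hour, which matches the reasoning in the description.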