Migration from NRT to TLOG performance issues

Nick Vladiceanu Fri, 11 Jun 2021 05:56:20 -0700

Hello,
I’m facing some performance issues when moving from NRT replica types to TLOG + 
PULL. We’re constantly indexing new data and heavily querying (~2k rps).


- index size: ~2.5Gi
- number of docs: ~4.6M
- 2 shards
- 7 cores and 14Gi of memory
- 30 instances
- JVM heap: 12Gi

When running on NRT replicas only, the response time is ~150ms at p99 and
~40ms at p95. After changing to 6 TLOG + 24 PULL replicas, the response time
grows to ~350ms at p99 and ~120ms at p95.

Here are some fragments from our solrconfig:

 
>     <updateHandler class="solr.DirectUpdateHandler2">
>         <updateLog>
>             <str name="dir">${solr.data.dir:}</str>
>             <int name="tlogDfsReplication">${solr.ulog.tlogDfsReplication:3}</int>
>         </updateLog>
> 
>         <autoCommit>
>             <maxTime>${solr.autoCommit.maxTime:60000}</maxTime>
>             <maxDocs>${solr.autoCommit.maxDocs:10000}</maxDocs>
>             <openSearcher>true</openSearcher>
>         </autoCommit>
> 
>         <autoSoftCommit>
>             <maxTime>${solr.autoSoftCommit.maxTime:300000}</maxTime>
>         </autoSoftCommit>
>     </updateHandler>

>       <query>
>         <maxBooleanClauses>1000</maxBooleanClauses>
>         <filterCache class="solr.CaffeineCache"
>                      size="${filterCache.size:32768}"
>                      initialSize="${filterCache.initialSize:32768}"
>                      autowarmCount="20%"/>
> 
>         <queryResultCache class="solr.CaffeineCache"
>                           size="${queryResultCache.size:32768}"
>                           initialSize="${queryResultCache.initialSize:32768}"
>                           autowarmCount="0%"/>
> 
>         <documentCache class="solr.CaffeineCache"
>                        size="${documentCache.size:150000}"
>                        initialSize="${documentCache.initialSize:150000}"
>                        autowarmCount="0%"/>
> 
>         <enableLazyFieldLoading>true</enableLazyFieldLoading>
>         <useFilterForSortedQuery>true</useFilterForSortedQuery>
> 
>         <queryResultWindowSize>160</queryResultWindowSize>
>         <queryResultMaxDocsCached>300</queryResultMaxDocsCached>
> 
>         <listener event="newSearcher" class="solr.QuerySenderListener">
>         </listener>
>         <listener event="firstSearcher" class="solr.QuerySenderListener">
>         </listener>
> 
>         <useColdSearcher>false</useColdSearcher>
>         <maxWarmingSearchers>8</maxWarmingSearchers> 
>     </query>

One of my assumptions was that I should reduce maxWarmingSearchers and increase
the autoCommit maxTime, since soft commits are no longer available on TLOG
replicas. Is that valid?
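
As a rough sketch, that change would look something like the following. The
values 120000 and 2 are illustrative assumptions, not tested recommendations,
and only the changed elements are shown; everything else stays as in the
fragments above.

```xml
<!-- Sketch only: 120000 and 2 are placeholder values, not measured recommendations. -->
<updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
        <!-- hard commit interval raised from the current 60000 ms default -->
        <maxTime>${solr.autoCommit.maxTime:120000}</maxTime>
        <maxDocs>${solr.autoCommit.maxDocs:10000}</maxDocs>
        <openSearcher>true</openSearcher>
    </autoCommit>
</updateHandler>

<query>
    <!-- lowered from the current value of 8 -->
    <maxWarmingSearchers>2</maxWarmingSearchers>
</query>
```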

I couldn’t find any documentation on the differences between NRT and TLOG
replicas or on what needs to be taken into account when switching. Could you
please help? Thanks a lot in advance. Please let me know if any other
information is required.

Best regards,
Nick Vladiceanu
