Re: Suggestion or recommendation for NRT
Hi team, do you have any suggestions or recommendations on the approach described above for getting better search performance?
-- Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
Re: Suggestion or recommendation for NRT
Thanks a lot for taking the time to respond to my questions.

We have two environments, ENV A and ENV B. Both have the same RAM capacity (r5.2xlarge), the same number of shards, and the same replica type (NRT) for the collection.

ENV A: holds a collection that is optimized (segment count 1 and numdocs = maxdocs) and is used only for search requests. No delta updates are triggered.

ENV B: holds the same collection, copied from ENV A, with continuous delta updates in progress, so it is used for both indexing and search requests. Indexing goes through the Kafka Connect plugin, which uses SolrJ with solr.commit.within=30 (milliseconds).

We are comparing search performance between the two environments with an automated test that runs a batch of queries.

Regarding search warmup: 1 true 20 200 *:* true *:* true false 24
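To put the two commit settings mentioned in this thread side by side (commitWithin of 30 ms here, a commit every 5 minutes in the original post), here is a minimal Python sketch of the arithmetic. It assumes updates arrive continuously, so that at most one commit fires per commitWithin window, and it ignores how long a commit itself takes. The result is a rough upper bound, not a measurement. The function name `max_commits_per_hour` is made up for illustration.

```python
def max_commits_per_hour(commit_within_ms: int) -> float:
    """Rough upper bound on commits per hour for a given commitWithin.

    Assumption: documents arrive continuously, so a new commit window
    starts as soon as the previous one ends.
    """
    if commit_within_ms <= 0:
        raise ValueError("commit_within_ms must be positive")
    ms_per_hour = 3_600_000
    return ms_per_hour / commit_within_ms


# The two settings mentioned in this thread.
for label, window_ms in [("commitWithin=30 ms", 30), ("commit every 5 min", 5 * 60 * 1000)]:
    print(f"{label}: at most {max_commits_per_hour(window_ms):,.0f} commits/hour")
```

If each of those commits opens a new searcher, the shorter window leaves far less time for caches and warmup queries to pay off between commits, which may be worth checking when comparing ENV A and ENV B.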
Re: Suggestion or recommendation for NRT
That seems high. It can be tricky to get tests right. Are you running with some kind of test runner? Do you have, say, 3-4 thousand queries you run? Are you running the tests after warming the searchers?

Also, if you have indexed down to one segment, _then_ tried adding docs and measuring, you are not getting accurate results. See: https://lucidworks.com/post/segment-merging-deleted-documents-optimize-may-bad/

Best,
Erick

> On Jul 1, 2020, at 5:55 PM, ramyogi wrote:
>
> Thanks Erick for the details and reference to understand better about merging
> segment stuff.
> When I compare performance of uninterrupted/optimized ( segment count 1)
> collection for search request vs (indexing + search) in parallel going on
> collection performance is 3 times higher,
> for example : first one is responding 100ms in average but second one around
> 400ms.
>
> is that expected behaviour like we need to tradeoff if we do Indexing and
> search in the same collection parallel.
> or we can still fine tune with some parameters for better performance then
> please suggest some.
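The testing questions in the reply above (a large query set, measuring only after warmup) can be sketched as a small harness. This is a minimal Python sketch, not anyone's actual test runner: `run_query` is a hypothetical callable that you would replace with whatever sends one query to Solr, and the stand-in used in the demo just does trivial local work.

```python
import math
import statistics
import time
from typing import Callable, Dict, Iterable, List


def measure_latencies(run_query: Callable[[str], object],
                      queries: Iterable[str],
                      warmup_rounds: int = 1) -> List[float]:
    """Run every query untimed for warmup, then time one pass (milliseconds)."""
    query_list = list(queries)
    for _ in range(warmup_rounds):
        for query in query_list:
            run_query(query)  # warmup pass: results and timings are discarded
    latencies_ms = []
    for query in query_list:
        start = time.perf_counter()
        run_query(query)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    """Mean, median, and nearest-rank 95th percentile of the timed pass."""
    ordered = sorted(latencies_ms)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "count": len(ordered),
        "mean_ms": statistics.fmean(ordered),
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
    }


# Demo with a stand-in query function (hypothetical; no Solr involved).
demo_queries = [f"title:term{i}" for i in range(3000)]
print(summarize(measure_latencies(lambda q: len(q), demo_queries)))
```

Reporting a percentile alongside the average helps separate a general slowdown from a few slow queries, for example ones that land just after a new searcher opens.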
Re: Suggestion or recommendation for NRT
Thanks, Erick, for the details and the reference; they help me understand segment merging better.

When I compare search performance on the uninterrupted, optimized collection (segment count 1) against the collection where indexing and search run in parallel, the second is about 3 times slower. For example, the first responds in 100 ms on average, while the second takes around 400 ms.

Is that expected behaviour, i.e. a tradeoff we have to accept when indexing and searching on the same collection in parallel? Or can we still tune some parameters for better performance? If so, please suggest some.
Re: Suggestion or recommendation for NRT
Updated documents are marked as deleted in the old segment and added to a new segment. When commits happen, merges occur and only then is the space occupied by the deleted document reclaimed. Which segments are merged on commit depends on a number of factors.

Unless you can prove the extra space is a problem, you should just ignore the issue. The percentage of deleted documents should max out at around 33% on Solr 7.5+.

For background on merging, see: http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
The third animation (TieredMergePolicy) is the default.

Best,
Erick

> On Jul 1, 2020, at 3:51 PM, ramyogi wrote:
>
> Even though same document indexed over and over again due to incremental
> update. Index size is being increased.
> Do I miss any configuration to make optimization occur by internally ?
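To check where an index stands against the roughly 33% figure in the reply above, the deleted share can be computed from two numbers Solr reports for a core, numDocs (live documents) and maxDoc (live plus deleted). The following is a minimal Python sketch; the function name and the sample figures are made up for illustration.

```python
def deleted_percentage(num_docs: int, max_doc: int) -> float:
    """Percentage of the index occupied by deleted (superseded) documents."""
    if num_docs < 0 or max_doc < num_docs:
        raise ValueError("expected 0 <= num_docs <= max_doc")
    if max_doc == 0:
        return 0.0  # empty index: nothing deleted
    return 100.0 * (max_doc - num_docs) / max_doc


# Hypothetical figures: a freshly optimized index, then the same index
# after many updates that each left a deleted copy behind.
print(deleted_percentage(1_000_000, 1_000_000))
print(deleted_percentage(1_000_000, 1_400_000))
```

A freshly optimized collection has numDocs equal to maxDoc, so the value starts at zero and climbs as updates accumulate, until merges reclaim the space.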
Re: Suggestion or recommendation for NRT
Even though the same documents are indexed over and over again through incremental updates, the index size keeps increasing. Am I missing some configuration that would make optimization happen internally?
Suggestion or recommendation for NRT
Hi,

We are using Solr 7.5.0 and are testing one collection for both search and indexing. Our collection was created with the index configuration below. For indexing we use the Kafka Connect plugin (cloud SolrJ), committing every 5 minutes: https://github.com/jcustenborder/kafka-connect-solr

Our collection has 30 shards and 3 replicas on EC2 nodes with plenty of RAM (90 nodes). It is almost 2.5 TB in size. We see a performance impact on search requests while indexing is in progress. Are there any recommendations or fine-tuning steps we should consider? Please share any references that might help.

150 8000 100 10 10 ${solr.lock.type:native} true