Re: Suggestion or recommendation for NRT

2020-07-08 Thread ramyogi
Hi team, do you have any suggestions or recommendations on the approach
described in my previous message for getting better search performance?



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Suggestion or recommendation for NRT

2020-07-02 Thread ramyogi
Thanks a lot for taking the time to answer my questions.

We have two environments, ENV A and ENV B. Both run on the same instance
type (r5.2xlarge, so the same RAM) and have the same number of shards and the
same replica type (NRT) for the collection.

ENV A - It has a collection that is optimized (segment count 1 and
numDocs = maxDoc). It is used only for search requests. No delta updates are
being triggered.


ENV B - It has the same collection, copied from ENV A, with continuous delta
updates in progress, so it serves both indexing and search requests. Indexing
uses the Kafka Connect plugin, which uses SolrJ, with
solr.commit.within=30 (milliseconds)
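
To put the commit interval in perspective, here is a rough sketch of the arithmetic. It assumes a steady stream of updates, so that every commitWithin window contains at least one new document; the function name is hypothetical, and the result is only an upper bound on how often a commit could be triggered, not a measurement.

```python
def max_commits_per_minute(commit_within_ms: int) -> float:
    """Rough upper bound on commits per minute for a given commitWithin.

    Assumption: updates arrive continuously, so each commitWithin window
    ends with a commit. Real clusters may commit less often.
    """
    if commit_within_ms <= 0:
        raise ValueError("commit_within_ms must be positive")
    return 60_000 / commit_within_ms


# The value quoted in this message (30 ms) versus the 5-minute interval
# mentioned in the original post (300,000 ms).
print(max_commits_per_minute(30))       # 2000.0
print(max_commits_per_minute(300_000))  # 0.2
```

The two settings differ by four orders of magnitude, so it is worth confirming which one the connector is actually running with.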


We are comparing search performance between these environments using an
automated test that runs a batch of queries.

Regarding search warmup:



1








true

20

200




*:*
true






*:*
true




false

24







Re: Suggestion or recommendation for NRT

2020-07-01 Thread Erick Erickson
That seems high. It can be tricky to get reliable test results. Are you
running with some kind of test runner? Do you have, say, 3-4 thousand
queries you run? Are you running the tests after warming the searchers?
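
A minimal sketch of the kind of measurement these questions point at: run a warm-up batch untimed, then time the remaining queries and report summary statistics. The `run_query` callable, the function name, and the default warm-up count are all hypothetical placeholders; plug in whatever client actually sends the query to Solr.

```python
import math
import statistics
import time
from typing import Callable, Dict, Sequence


def measure_latencies(
    run_query: Callable[[str], object],
    queries: Sequence[str],
    warmup: int = 100,
) -> Dict[str, float]:
    """Time queries after an untimed warm-up pass; latencies are in ms."""
    if warmup < 0:
        raise ValueError("warmup must be non-negative")

    # Warm-up pass: executed but not timed, so cold caches do not skew results.
    for q in queries[:warmup]:
        run_query(q)

    samples = []
    for q in queries[warmup:]:
        start = time.perf_counter()
        run_query(q)
        samples.append((time.perf_counter() - start) * 1000.0)

    if not samples:
        raise ValueError("need more queries than the warm-up count")

    samples.sort()
    # Nearest-rank 95th percentile on the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(samples)) - 1)
    return {
        "count": float(len(samples)),
        "mean_ms": statistics.fmean(samples),
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
    }
```

Running the same query list against both environments and comparing the median and 95th percentile, rather than a single average, makes it easier to see whether the slowdown is uniform or driven by a few slow queries.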

Also, if you have indexed down to one segment and _then_ tried
adding docs and measuring, you are not getting accurate results.

See: 
https://lucidworks.com/post/segment-merging-deleted-documents-optimize-may-bad/

Best,
Erick

> On Jul 1, 2020, at 5:55 PM, ramyogi  wrote:
> 
> Thanks, Erick, for the details and the reference; they helped me understand
> segment merging better.
> When I compare search performance on the untouched, optimized collection
> (segment count 1) against the collection where indexing and search run in
> parallel, search is much slower: the first responds in about 100 ms on
> average, the second in around 400 ms (roughly four times the latency).
> 
> Is that expected behaviour, i.e. a trade-off we have to accept when indexing
> and searching the same collection in parallel? Or can we still tune some
> parameters for better performance? If so, please suggest some.



Re: Suggestion or recommendation for NRT

2020-07-01 Thread ramyogi
Thanks, Erick, for the details and the reference; they helped me understand
segment merging better.
When I compare search performance on the untouched, optimized collection
(segment count 1) against the collection where indexing and search run in
parallel, search is much slower: the first responds in about 100 ms on
average, the second in around 400 ms (roughly four times the latency).

Is that expected behaviour, i.e. a trade-off we have to accept when indexing
and searching the same collection in parallel? Or can we still tune some
parameters for better performance? If so, please suggest some.





Re: Suggestion or recommendation for NRT

2020-07-01 Thread Erick Erickson
Updated documents are marked as deleted in the
old segment and added to a new segment. When
commits happen, merges occur and only then is the
space occupied by the deleted document reclaimed.

Which segments are merged on commit depends
on a number of factors.

Unless you can prove the extra space is a problem,
you should just ignore the issue. The percentage of
deleted documents should max out at around 33%
on Solr 7.5+.
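
As a quick way to check where an index stands relative to that figure, the deleted-document share can be computed from the numDocs and maxDoc values Solr reports for a core. This is only a sketch: the function name and the sample numbers are made up for illustration, and it assumes the percentage is measured against maxDoc.

```python
from typing import Tuple


def deleted_doc_stats(num_docs: int, max_doc: int) -> Tuple[int, float]:
    """Return (deleted document count, deleted percentage of maxDoc)."""
    if num_docs < 0 or max_doc < num_docs:
        raise ValueError("expected 0 <= num_docs <= max_doc")
    deleted = max_doc - num_docs
    pct = 100.0 * deleted / max_doc if max_doc else 0.0
    return deleted, pct


# Illustrative numbers only: 1,000,000 live docs in an index whose
# maxDoc is 1,500,000 means one third of the entries are deleted.
deleted, pct = deleted_doc_stats(1_000_000, 1_500_000)
print(deleted, round(pct, 1))  # 500000 33.3

# A freshly optimized index has numDocs == maxDoc, so nothing is deleted.
print(deleted_doc_stats(1_000, 1_000))  # (0, 0.0)
```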

For background on merging, see:
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

The third animation (TieredMergePolicy) is the default.

Best,
Erick

> On Jul 1, 2020, at 3:51 PM, ramyogi  wrote:
> 
> Even though the same documents are indexed over and over again by
> incremental updates, the index size keeps increasing.
> Am I missing some configuration that would make optimization happen
> internally?



Re: Suggestion or recommendation for NRT

2020-07-01 Thread ramyogi
Even though the same documents are indexed over and over again by
incremental updates, the index size keeps increasing.
Am I missing some configuration that would make optimization happen
internally?





Suggestion or recommendation for NRT

2020-06-29 Thread ramyogi
Hi,

We are using Solr 7.5.0 and are testing one collection for both search and
indexing.
Our collection was created with the index configuration below. For indexing
we use the Kafka Connect plugin (cloud SolrJ), committing every 5 minutes:
https://github.com/jcustenborder/kafka-connect-solr

Our collection has 30 shards and 3 replicas on EC2 nodes with plenty of RAM
(90 nodes), and it is almost 2.5 TB in size. We see a performance impact on
search requests while indexing is in progress. Are there any recommendations
or fine-tuning steps we should consider? Please share any references that
would help.



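[Index configuration excerpt from solrconfig.xml: the XML tags were stripped
by the mail archive; only the element values survive, in this order]

150, 8000, 100; 10, 10; ${solr.lock.type:native}, true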






