I’ve been running Solr for a dozen years and I’ve never needed a heap larger 
than 8 GB.
>> What is your data size? Is it around 1 TB like ours? Do you search or
>> index frequently? Do you use an NRT model?

My question is: why is the replica going into recovery? When the replica went 
down, I checked the GC log, but no GC pause was longer than 2 seconds.
I also cannot find any reason for the recovery in the Solr log file. I want to 
know why the replica goes into recovery.
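On Windows I search the log with something like this (the exact log path 
varies by install):

findstr /i /c:"recovery" /c:"RecoveryStrategy" /c:"session expired" server\logs\solr.log

but it does not show a clear cause.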

Regards,
Vishal Patel
________________________________
From: Walter Underwood <wun...@wunderwood.org>
Sent: Friday, July 10, 2020 3:03 AM
To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
Subject: Re: Replica goes into recovery mode in Solr 6.1.0

Those are extremely large JVMs. Unless you have proven that you MUST
have 55 GB of heap, use a smaller heap.

I’ve been running Solr for a dozen years and I’ve never needed a heap
larger than 8 GB.

Also, there is usually no need to use one JVM per replica.

Your configuration is using 110 GB (two JVMs) just for Java
where I would configure it with a single 8 GB JVM. That would
free up 100 GB for file caches.
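In solr.in.cmd (or solr.in.sh on Linux) that would look something like this
(a sketch; SOLR_HEAP is the variable used by the bin/solr start scripts):

set SOLR_HEAP=8g

and run a single Solr node per server instead of one JVM per replica.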

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jul 8, 2020, at 10:10 PM, vishal patel <vishalpatel200...@outlook.com> 
> wrote:
>
> Thanks for reply.
>
> what you mean by "Shard1 Allocated memory”
>>> It means the JVM heap memory of one Solr node (instance).
>
> How many Solr JVMs are you running?
>>> On one server there are 2 Solr JVMs: one hosts a shard and the other a replica.
>
> What is the heap size for your JVMs?
>>> 55 GB per Solr JVM.
>
> Regards,
> Vishal Patel
>
> Sent from Outlook<http://aka.ms/weboutlook>
> ________________________________
> From: Walter Underwood <wun...@wunderwood.org>
> Sent: Wednesday, July 8, 2020 8:45 PM
> To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
>
> I don’t understand what you mean by "Shard1 Allocated memory”. I don’t know of
> any way to dedicate system RAM to an application object like a replica.
>
> How many Solr JVMs are you running?
>
> What is the heap size for your JVMs?
>
> Setting soft commit max time to 100 ms does not magically make Solr super 
> fast.
> It makes Solr do too much work, makes the work queues fill up, and makes it 
> fail.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>> On Jul 7, 2020, at 10:55 PM, vishal patel <vishalpatel200...@outlook.com> 
>> wrote:
>>
>> Thanks for your reply.
>>
>> One server has 320 GB of RAM in total. It runs 2 Solr nodes: one is shard1 
>> and the other is the shard2 replica. Each Solr node has 55 GB of heap 
>> allocated. Shard1 has 585 GB of data and the shard2 replica has 492 GB, so 
>> there is almost 1 TB of data on this server. The server also hosts other 
>> applications with 60 GB of memory allocated to them, so about 150 GB of 
>> memory is left.
>>
>> Proper formatting details:
>> https://drive.google.com/file/d/1K9JyvJ50Vele9pPJCiMwm25wV4A6x4eD/view
>>
>> Are you running multiple huge JVMs?
>>>> Not huge, but 60 GB of memory is allocated across our 11 other 
>>>> applications. 150 GB of memory is still free.
>>
>> The servers will be doing a LOT of disk IO, so look at the read and write 
>> iops. I expect that the solr processes are blocked on disk reads almost all 
>> the time.
>>>> Could heavy or blocked IO reads/writes cause a replica to go into recovery mode?
>>
>> "-Dsolr.autoSoftCommit.maxTime=100” is way too short (100 ms).
>>>> Our requirement is NRT, so we keep this time low.
>>
>> Regards,
>> Vishal Patel
>> ________________________________
>> From: Walter Underwood <wun...@wunderwood.org>
>> Sent: Tuesday, July 7, 2020 8:15 PM
>> To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
>> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
>>
>> This isn’t a support list, so nobody looks at issues. We do try to help.
>>
>> It looks like you have 1 TB of index on a system with 320 GB of RAM.
>> I don’t know what "Shard1 Allocated memory” is, but maybe half of
>> that RAM is used by JVMs or some other process, I guess. Are you
>> running multiple huge JVMs?
>>
>> The servers will be doing a LOT of disk IO, so look at the read and
>> write iops. I expect that the solr processes are blocked on disk reads
>> almost all the time.
>>
>> "-Dsolr.autoSoftCommit.maxTime=100” is way too short (100 ms).
>> That is probably causing your outages.
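>> If you need NRT, start with something one or two orders of magnitude
>> longer, for example:
>>
>> set SOLR_OPTS=%SOLR_OPTS% -Dsolr.autoSoftCommit.maxTime=10000
>>
>> (10 seconds; a sketch, tune it to the longest staleness your searches can
>> tolerate).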
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>>
>>> On Jul 7, 2020, at 5:18 AM, vishal patel <vishalpatel200...@outlook.com> 
>>> wrote:
>>>
>>> Is anyone looking at my issue? Please guide me.
>>>
>>> Regards,
>>> Vishal Patel
>>>
>>>
>>> ________________________________
>>> From: vishal patel <vishalpatel200...@outlook.com>
>>> Sent: Monday, July 6, 2020 7:11 PM
>>> To: solr-user@lucene.apache.org <solr-user@lucene.apache.org>
>>> Subject: Replica goes into recovery mode in Solr 6.1.0
>>>
>>> I am using Solr 6.1.0 with Java 8 and G1GC in production. We have 2 shards 
>>> and each shard has 1 replica. We have 3 collections.
>>> We do not use any caches; they are disabled in solrconfig.xml. Search and 
>>> update requests come in frequently on our live platform.
>>>
>>> *Our commit configuration in solrconfig.xml is below:
>>> <autoCommit>
>>>     <maxTime>600000</maxTime>
>>>     <maxDocs>20000</maxDocs>
>>>     <openSearcher>false</openSearcher>
>>> </autoCommit>
>>> <autoSoftCommit>
>>>     <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
>>> </autoSoftCommit>
>>>
>>> *We use Near Real Time searching, so we set the following in solr.in.cmd:
>>> set SOLR_OPTS=%SOLR_OPTS% -Dsolr.autoSoftCommit.maxTime=100
>>>
>>> *Our collection details are below (number of documents / size in GB):
>>>
>>> Collection    Shard1             Shard1 Replica     Shard2             Shard2 Replica
>>> collection1   26913364 / 201     26913379 / 202     26913380 / 198     26913379 / 198
>>> collection2   13934360 / 310     13934367 / 310     13934368 / 219     13934367 / 219
>>> collection3   351539689 / 73.5   351540040 / 73.5   351540136 / 75.2   351539722 / 75.2
>>>
>>> *My server configurations are below:
>>>
>>> CPU (both servers): Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz, 2301 MHz, 
>>> 10 Core(s), 20 Logical Processor(s)
>>>
>>>                                          Server1            Server2
>>> HardDisk                                 3845 GB (3.84 TB)  3485 GB (3.48 TB)
>>> Total memory(GB)                         320                320
>>> Shard1 Allocated memory(GB)              55                 -
>>> Shard2 Replica Allocated memory(GB)      55                 -
>>> Shard2 Allocated memory(GB)              -                  55
>>> Shard1 Replica Allocated memory(GB)      -                  55
>>> Other Applications Allocated Memory(GB)  60                 22
>>> Other Number Of Applications             11                 7
>>>
>>>
>>> Sometimes one of the replicas goes into recovery mode. Why does a replica 
>>> go into recovery? Is it due to heavy search traffic, heavy updates/inserts, 
>>> or long GC pauses? If so, what should we change in our configuration?
>>> Should we add more shards to avoid the recovery issue?
>>>
>>> Regards,
>>> Vishal Patel
>>>
>>
>
