Re: Replica goes into recovery mode in Solr 6.1.0

2020-07-09 Thread vishal patel
I’ve been running Solr for a dozen years and I’ve never needed a heap larger 
than 8 GB.
>> What is your data size? Is it 1 TB like ours? Do you search or index
>> frequently? Do you use the NRT model?

My question is: why does the replica go into recovery? When the replica went
down, I checked the GC log, but no GC pause was longer than 2 seconds.
I also cannot find any reason for the recovery in the Solr log file. I want to
know why the replica goes into recovery.
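
A hedged aside on finding the trigger: the recovery decision is normally
logged on both the leader and the recovering replica, so searching both logs
around the incident helps. A Windows one-liner along these lines can surface
the relevant entries; the log path is illustrative, adjust it to your install:

findstr /i /c:"recovery" /c:"PeerSync" /c:"leader" server\logs\solr.log

Lines mentioning PeerSync failures or leader-initiated recovery usually name
the cause directly.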

Regards,
Vishal Patel

From: Walter Underwood 
Sent: Friday, July 10, 2020 3:03 AM
To: solr-user@lucene.apache.org 
Subject: Re: Replica goes into recovery mode in Solr 6.1.0

Those are extremely large JVMs. Unless you have proven that you MUST
have 55 GB of heap, use a smaller heap.

I’ve been running Solr for a dozen years and I’ve never needed a heap
larger than 8 GB.

Also, there is usually no need to use one JVM per replica.

Your configuration is using 110 GB (two JVMs) just for Java
where I would configure it with a single 8 GB JVM. That would
free up 100 GB for file caches.
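
A hedged sketch of that recommendation, assuming the Windows solr.in.cmd used
elsewhere in this thread; the 8g figure is illustrative and should be
load-tested, not copied blindly:

REM solr.in.cmd: fixed 8 GB heap, min and max equal to avoid heap resizing
set SOLR_JAVA_MEM=-Xms8g -Xmx8g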

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jul 8, 2020, at 10:10 PM, vishal patel  
> wrote:
>
> Thanks for the reply.
>
> what you mean by "Shard1 Allocated memory"
>>> It means the JVM memory of one Solr node (instance).
>
> How many Solr JVMs are you running?
>>> One server runs 2 Solr JVMs: one is a shard and the other is a replica.
>
> What is the heap size for your JVMs?
>>> 55 GB for each Solr JVM.
>
> Regards,
> Vishal Patel
>
> Sent from Outlook
> 
> From: Walter Underwood 
> Sent: Wednesday, July 8, 2020 8:45 PM
> To: solr-user@lucene.apache.org 
> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
>
> I don’t understand what you mean by "Shard1 Allocated memory". I don’t know of
> any way to dedicate system RAM to an application object like a replica.
>
> How many Solr JVMs are you running?
>
> What is the heap size for your JVMs?
>
> Setting soft commit max time to 100 ms does not magically make Solr super 
> fast.
> It makes Solr do too much work, makes the work queues fill up, and makes it 
> fail.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>> On Jul 7, 2020, at 10:55 PM, vishal patel  
>> wrote:
>>
>> Thanks for your reply.
>>
>> One server has 320 GB of RAM in total. It runs 2 Solr nodes: one is shard1
>> and the other is the shard2 replica. Each Solr node has 55 GB of memory
>> allocated. Shard1 has 585 GB of data and the shard2 replica has 492 GB,
>> so there is almost 1 TB of data on this server. The server also hosts other
>> applications, which have 60 GB of memory allocated, so 150 GB is left.
>>
>> Proper formatting details:
>> https://drive.google.com/file/d/1K9JyvJ50Vele9pPJCiMwm25wV4A6x4eD/view
>>
>> Are you running multiple huge JVMs?
 Not huge, but 60 GB of memory is allocated for our 11 applications. 150 GB
 of memory is still free.
>>
>> The servers will be doing a LOT of disk IO, so look at the read and write 
>> iops. I expect that the solr processes are blocked on disk reads almost all 
>> the time.
 Is there a chance of going into recovery mode when there is heavy IO (read/write) or blocking?
>>
>> "-Dsolr.autoSoftCommit.maxTime=100” is way too short (100 ms).
 Our requirement is NRT, so we keep the time short.
>>
>> Regards,
>> Vishal Patel
>> 
>> From: Walter Underwood 
>> Sent: Tuesday, July 7, 2020 8:15 PM
>> To: solr-user@lucene.apache.org 
>> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
>>
>> This isn’t a support list, so nobody looks at issues. We do try to help.
>>
>> It looks like you have 1 TB of index on a system with 320 GB of RAM.
>> I don’t know what "Shard1 Allocated memory" is, but maybe half of
>> that RAM is used by JVMs or some other process, I guess. Are you
>> running multiple huge JVMs?
>>
>> The servers will be doing a LOT of disk IO, so look at the read and
>> write iops. I expect that the solr processes are blocked on disk reads
>> almost all the time.
>>
>> "-Dsolr.autoSoftCommit.maxTime=100” is way too short (100 ms).
>> That is probably causing your outages.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>>
>>> On Jul 7, 2020, at 5:18 AM, vishal patel  
>>> wrote:
>>>
>>> Is anyone looking at my issue? Please guide me.
>>>
>>> Regards,
>>> Vishal Patel
>>>
>>>
>>> 
>>> From: vishal patel 
>>> Sent: Monday, July 6, 2020 7:11 PM
>>> To: solr-user@lucene.apache.org 
>>> Subject: Replica goes into recovery mode in Solr 6.1.0
>>>
>>> I am using Solr 6.1.0 with Java 8 and G1GC in production. We
>>> have 2 shards and each shard has 1 replica. We have 3 collections.
>>> We do not use any caches; they are disabled in solrconfig.xml. Search and
>>> update requests come in frequently on our live platform.
>>>
>>> *Our commit 

Re: Replica goes into recovery mode in Solr 6.1.0

2020-07-09 Thread Walter Underwood
Those are extremely large JVMs. Unless you have proven that you MUST
have 55 GB of heap, use a smaller heap.

I’ve been running Solr for a dozen years and I’ve never needed a heap
larger than 8 GB.

Also, there is usually no need to use one JVM per replica.

Your configuration is using 110 GB (two JVMs) just for Java
where I would configure it with a single 8 GB JVM. That would
free up 100 GB for file caches.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jul 8, 2020, at 10:10 PM, vishal patel  
> wrote:
> 
> Thanks for the reply.
> 
> what you mean by "Shard1 Allocated memory"
>>> It means the JVM memory of one Solr node (instance).
>
> How many Solr JVMs are you running?
>>> One server runs 2 Solr JVMs: one is a shard and the other is a replica.
>
> What is the heap size for your JVMs?
>>> 55 GB for each Solr JVM.
> 
> Regards,
> Vishal Patel
> 
> Sent from Outlook
> 
> From: Walter Underwood 
> Sent: Wednesday, July 8, 2020 8:45 PM
> To: solr-user@lucene.apache.org 
> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
> 
> I don’t understand what you mean by "Shard1 Allocated memory". I don’t know of
> any way to dedicate system RAM to an application object like a replica.
> 
> How many Solr JVMs are you running?
> 
> What is the heap size for your JVMs?
> 
> Setting soft commit max time to 100 ms does not magically make Solr super 
> fast.
> It makes Solr do too much work, makes the work queues fill up, and makes it 
> fail.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Jul 7, 2020, at 10:55 PM, vishal patel  
>> wrote:
>> 
>> Thanks for your reply.
>> 
>> One server has 320 GB of RAM in total. It runs 2 Solr nodes: one is shard1
>> and the other is the shard2 replica. Each Solr node has 55 GB of memory
>> allocated. Shard1 has 585 GB of data and the shard2 replica has 492 GB,
>> so there is almost 1 TB of data on this server. The server also hosts other
>> applications, which have 60 GB of memory allocated, so 150 GB is left.
>> 
>> Proper formatting details:
>> https://drive.google.com/file/d/1K9JyvJ50Vele9pPJCiMwm25wV4A6x4eD/view
>> 
>> Are you running multiple huge JVMs?
 Not huge, but 60 GB of memory is allocated for our 11 applications. 150 GB
 of memory is still free.
>> 
>> The servers will be doing a LOT of disk IO, so look at the read and write 
>> iops. I expect that the solr processes are blocked on disk reads almost all 
>> the time.
 Is there a chance of going into recovery mode when there is heavy IO (read/write) or blocking?
>> 
>> "-Dsolr.autoSoftCommit.maxTime=100” is way too short (100 ms).
 Our requirement is NRT, so we keep the time short.
>> 
>> Regards,
>> Vishal Patel
>> 
>> From: Walter Underwood 
>> Sent: Tuesday, July 7, 2020 8:15 PM
>> To: solr-user@lucene.apache.org 
>> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
>> 
>> This isn’t a support list, so nobody looks at issues. We do try to help.
>> 
>> It looks like you have 1 TB of index on a system with 320 GB of RAM.
>> I don’t know what "Shard1 Allocated memory" is, but maybe half of
>> that RAM is used by JVMs or some other process, I guess. Are you
>> running multiple huge JVMs?
>> 
>> The servers will be doing a LOT of disk IO, so look at the read and
>> write iops. I expect that the solr processes are blocked on disk reads
>> almost all the time.
>> 
>> "-Dsolr.autoSoftCommit.maxTime=100” is way too short (100 ms).
>> That is probably causing your outages.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>>> On Jul 7, 2020, at 5:18 AM, vishal patel  
>>> wrote:
>>> 
>>> Is anyone looking at my issue? Please guide me.
>>> 
>>> Regards,
>>> Vishal Patel
>>> 
>>> 
>>> 
>>> From: vishal patel 
>>> Sent: Monday, July 6, 2020 7:11 PM
>>> To: solr-user@lucene.apache.org 
>>> Subject: Replica goes into recovery mode in Solr 6.1.0
>>> 
>>> I am using Solr 6.1.0 with Java 8 and G1GC in production. We
>>> have 2 shards and each shard has 1 replica. We have 3 collections.
>>> We do not use any caches; they are disabled in solrconfig.xml. Search and
>>> update requests come in frequently on our live platform.
>>> 
>>> *Our commit configuration in solrconfig.xml is below
>>>
>>> <autoCommit>
>>>   <maxTime>60</maxTime>
>>>   <maxDocs>2</maxDocs>
>>>   <openSearcher>false</openSearcher>
>>> </autoCommit>
>>>
>>> <autoSoftCommit>
>>>   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
>>> </autoSoftCommit>
>>> *We use Near Real Time searching, so we set the configuration below in
>>> solr.in.cmd:
>>> set SOLR_OPTS=%SOLR_OPTS% -Dsolr.autoSoftCommit.maxTime=100
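
A hedged aside: as noted earlier in this thread, 100 ms is an aggressively
short soft-commit interval, and every soft commit opens a new searcher. A
sketch of a gentler NRT setting, where the 10-second figure is illustrative
rather than a tested recommendation:

set SOLR_OPTS=%SOLR_OPTS% -Dsolr.autoSoftCommit.maxTime=10000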
>>> 
>>> *Our collection details are below:
>>>
>>> Collection    Shard1                         Shard1 Replica                 Shard2                         Shard2 Replica
>>>               Number of Documents  Size(GB)  Number of Documents  Size(GB)  Number of Documents  Size(GB)  Number of Documents  Size(GB)
>>> collection1   26913364             201       26913379             202

Re: Multiple fq vs combined fq performance

2020-07-09 Thread Alexandre Rafalovitch
I _think_ it will run all 3 and then do index hopping. But if you know one
fq is super expensive, you could assign it a cost. A value over 100 will make
Solr try to use a PostFilter and apply that query on top of the results
from the other queries.


https://lucene.apache.org/solr/guide/8_4/common-query-parameters.html#cache-parameter
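
For example, a hedged sketch reusing the filter queries from the question
quoted below, with an illustrative cost value:

"fq":["{!cache=false}_class:taggedTickets",
      "{!cache=false}taggedTickets_ticketId:100241",
      "{!cache=false cost=200}companyId:22476"]

True post-filtering only kicks in for query types that implement the
PostFilter interface (e.g. {!frange} or {!collapse}); for the rest, the cost
still orders non-cached filters so the cheap, selective ones run first.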

Hope it helps,
Alex.

On Thu., Jul. 9, 2020, 2:05 p.m. Chris Dempsey,  wrote:

> Hi all! In a collection with ~54 million documents, we've noticed that
> running a query with the following:
>
> "fq":["{!cache=false}_class:taggedTickets",
>   "{!cache=false}taggedTickets_ticketId:100241",
>   "{!cache=false}companyId:22476"]
>
> when I debugQuery I see:
>
> "parsed_filter_queries":[
>   "{!cache=false}_class:taggedTickets",
>   "{!cache=false}IndexOrDocValuesQuery(taggedTickets_ticketId:[100241
> TO 100241])",
>   "{!cache=false}IndexOrDocValuesQuery(companyId:[22476 TO 22476])"
> ]
>
> runs in roughly ~450ms but if we remove `{!cache=false}companyId:22476` it
> drops down to ~5ms (it's important to note that `taggedTickets_ticketId` is
> globally unique).
>
> If we change the fqs to:
>
> "fq":["{!cache=false}_class:taggedTickets",
>   "{!cache=false}+companyId:22476 +taggedTickets_ticketId:100241"]
>
> when I debugQuery I see:
>
> "parsed_filter_queries":[
>   "{!cache=false}_class:taggedTickets",
>   "{!cache=false}+IndexOrDocValuesQuery(companyId:[22476 TO 22476])
> +IndexOrDocValuesQuery(taggedTickets_ticketId:[100241 TO 100241])"
> ]
>
> we get the correct result back in ~5ms.
>
> My current thought is that in the slow scenario Solr is still running
> `{!cache=false}IndexOrDocValuesQuery(companyId:[22476
> TO 22476])` even though it "has the answer" from the first two fq.
>
> Am I off base, or am I misunderstanding how `fq`s are processed?
>


Re: Solr docker image works with image option but not with build option in docker-compose

2020-07-09 Thread Shawn Heisey

On 7/8/2020 3:36 PM, gnandre wrote:

I am using Solr docker image 8.5.2-slim from https://hub.docker.com/_/solr.
I use it as a base image and then add some more stuff to it with my custom
Dockerfile. When I build the final docker image, it is built successfully.
After that, when I try to use it in docker-compose.yml (with the build option)
to start a Solr service, it complains about missing permission to create
directories under the /var/solr path. I have given the solr user read/write
permission on /var/solr in the Dockerfile. Also, when I use the image option
instead of the build option in docker-compose.yml for the same image, it does
not throw any such errors and Solr starts without issues. Any clue why this
might be happening?


The docker images for Solr are not created by this project.  They are 
made by third parties.


We are in discussions about bringing one of the docker images into the
project, but until that happens, support for it will have to come from
the people who made it.  We know very little about how to deal with any
problems that are occurring.


I would really like to help, but I do not know what might be wrong, and 
I do not know what questions to ask.


Thanks,
Shawn


Multiple fq vs combined fq performance

2020-07-09 Thread Chris Dempsey
Hi all! In a collection with ~54 million documents, we've noticed that
running a query with the following:

"fq":["{!cache=false}_class:taggedTickets",
  "{!cache=false}taggedTickets_ticketId:100241",
  "{!cache=false}companyId:22476"]

when I debugQuery I see:

"parsed_filter_queries":[
  "{!cache=false}_class:taggedTickets",
  "{!cache=false}IndexOrDocValuesQuery(taggedTickets_ticketId:[100241
TO 100241])",
  "{!cache=false}IndexOrDocValuesQuery(companyId:[22476 TO 22476])"
]

runs in roughly ~450ms but if we remove `{!cache=false}companyId:22476` it
drops down to ~5ms (it's important to note that `taggedTickets_ticketId` is
globally unique).

If we change the fqs to:

"fq":["{!cache=false}_class:taggedTickets",
  "{!cache=false}+companyId:22476 +taggedTickets_ticketId:100241"]

when I debugQuery I see:

"parsed_filter_queries":[
   "{!cache=false}_class:taggedTickets",
   "{!cache=false}+IndexOrDocValuesQuery(companyId:[22476 TO 22476])
+IndexOrDocValuesQuery(taggedTickets_ticketId:[100241 TO 100241])"
]

we get the correct result back in ~5ms.

My current thought is that in the slow scenario Solr is still running
`{!cache=false}IndexOrDocValuesQuery(companyId:[22476
TO 22476])` even though it "has the answer" from the first two fq.

Am I off base, or am I misunderstanding how `fq`s are processed?


Re: QTime lesser for facet.limit=-1 than facet.limit=5000/10000

2020-07-09 Thread Mikhail Khludnev
Hi,
Usually, limit=-1 works as a single pass-through, accumulating counts,
but limit > 0 causes collecting a docset per value, which might take
longer. There's a note about this effect in the uniqueBlock() description.
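
Illustrative only, reusing the abc_s field from the question quoted below;
the 50000 is an arbitrary large limit:

f.abc_s.facet.limit=-1       (single sweep, counts accumulated in one pass)
f.abc_s.facet.limit=50000    (top-N collection; the per-value work can take longer)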

On Wed, Jul 8, 2020 at 11:29 AM ana  wrote:

> Hi Team,
> Which is more optimized: facet.limit=-1 OR facet.limit=1/5/4?
> For a high-cardinality string field, with no cache enabled and no docValues
> enabled, after every RELOAD on the Solr admin UI, for each query with a
> different facet.limit, why is the QTime for "facet.limit=-1" lower than
> that of "facet.limit=5000/10000"? What factors, apart from those listed
> above, matter in calculating QTime?
>
> My understanding is that, per the Solr ref guide, facet.limit=-1 should have
> a higher response time than any other, higher facet.limit specified.
>
> Experiment:
>
> field: abc_s
> cardinality: 71520
> num of docs: 52055449
> total num of facets: 70657
> appliedMethod: FC
> Test query:
> http://localhost:8983/solr/<collection>/select?facet.field=abc_s&facet=on&q=*:*&rows=0&debug=true&f.abc_s.facet.limit=-1
>
> facet.limit   -1    100   5000   1      4      5
> QTime         983   857   3429   5324   1006   1027
>
> Debug response for facet.limit=1 is attached
> facet_Response_1.txt
> 
>
>
>
>
> --
> Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>


-- 
Sincerely yours
Mikhail Khludnev