Re: Multiple fq vs combined fq performance

2020-07-10 Thread Tomás Fernández Löbbe
All non-cached filters will be executed together (leapfrogging between them)
and will be sorted by filter cost (since you aren't setting a cost, I'd guess
the order of the input matters). You can try setting a cost on your filters
(lower than 100, so that they don't become post filters).
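
For example (illustrative cost values for the fields from your query; any
value below 100 keeps them as regular filters rather than post filters):

fq={!cache=false cost=10}taggedTickets_ticketId:100241
fq={!cache=false cost=20}_class:taggedTickets
fq={!cache=false cost=30}companyId:22476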

One other thing though, I guess you are using Point fields? If you
typically query for a single value like in this example (vs. ranges), you
may want to use string fields for those. See
https://issues.apache.org/jira/browse/SOLR-11078.
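
For example, a rough sketch of what that schema change might look like (the
attribute choices here are assumptions; adjust to your existing schema, and
note that changing field types requires reindexing):

<field name="taggedTickets_ticketId" type="string" indexed="true" stored="true" docValues="true"/>
<field name="companyId" type="string" indexed="true" stored="true" docValues="true"/>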




On Fri, Jul 10, 2020 at 7:51 AM Chris Dempsey  wrote:

> Thanks for the suggestion, Alex. It doesn't appear that
> IndexOrDocValuesQuery (at least in Solr 7.7.1) supports the PostFilter
> interface. I've tried various values for cost on each of the fq and it
> doesn't change the QTime.
>
> So, after digging around a bit even though
> {!cache=false}taggedTickets_ticketId:100241 only matches one and only
> one document in the collection that doesn't matter for the other two fq who
> continue to look over the index of the collection, correct?
>
> On Thu, Jul 9, 2020 at 4:24 PM Alexandre Rafalovitch 
> wrote:
>
> > I _think_ it will run all 3 and then do index hopping. But if you know
> one
> > fq is super expensive, you could assign it a cost
> > Value over 100 will try to use PostFilter then and apply the query on top
> > of results from other queries.
> >
> >
> >
> >
> https://lucene.apache.org/solr/guide/8_4/common-query-parameters.html#cache-parameter
> >
> > Hope it helps,
> > Alex.
> >
> > On Thu., Jul. 9, 2020, 2:05 p.m. Chris Dempsey, 
> wrote:
> >
> > > Hi all! In a collection where we have ~54 million documents we've
> noticed
> > > running a query with the following:
> > >
> > > "fq":["{!cache=false}_class:taggedTickets",
> > >   "{!cache=false}taggedTickets_ticketId:100241",
> > >   "{!cache=false}companyId:22476"]
> > >
> > > when I debugQuery I see:
> > >
> > > "parsed_filter_queries":[
> > >   "{!cache=false}_class:taggedTickets",
> > >
>  "{!cache=false}IndexOrDocValuesQuery(taggedTickets_ticketId:[100241
> > > TO 100241])",
> > >   "{!cache=false}IndexOrDocValuesQuery(companyId:[22476 TO 22476])"
> > > ]
> > >
> > > runs in roughly ~450ms but if we remove `{!cache=false}companyId:22476`
> > it
> > > drops down to ~5ms (it's important to note that
> `taggedTickets_ticketId`
> > is
> > > globally unique).
> > >
> > > If we change the fqs to:
> > >
> > > "fq":["{!cache=false}_class:taggedTickets",
> > >   "{!cache=false}+companyId:22476
> > +taggedTickets_ticketId:100241"]
> > >
> > > when I debugQuery I see:
> > >
> > > "parsed_filter_queries":[
> > >"{!cache=false}_class:taggedTickets",
> > >"{!cache=false}+IndexOrDocValuesQuery(companyId:[22476 TO 22476])
> > > +IndexOrDocValuesQuery(taggedTickets_ticketId:[100241 TO
> > 100241])"
> > > ]
> > >
> > > we get the correct result back in ~5ms.
> > >
> > > My current thought is that in the slow scenario Solr is still running
> > > `{!cache=false}IndexOrDocValuesQuery(companyId:[22476
> > > TO 22476])` even though it "has the answer" from the first two fq.
> > >
> > > Am I off-base or misunderstanding how `fq` are processed?
> > >
> >
>


Re: Replica goes into recovery mode in Solr 6.1.0

2020-07-10 Thread Walter Underwood
Sorting and faceting take a lot of memory. From your charts, I would try
a 31 GB heap. That would make GC faster. 680 ms is very long for a GC
and can cause problems.

Combine a 680 ms GC with a 100 ms soft commit time and you can have
lots of trouble.

Change your soft commit time to 10000 ms (ten seconds) or longer.
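
For example, in solrconfig.xml (a sketch only; 10000 ms is a starting point,
tune it to the slowest freshness your application can actually accept):

<autoSoftCommit>
  <maxTime>10000</maxTime>
</autoSoftCommit>

Since you already set this at startup, the same value can instead be passed
via -Dsolr.autoSoftCommit.maxTime=10000.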

Look at a 24-hour graph of heap usage. It should look like a sawtooth,
increasing, then dropping after every full GC. The bottom of the sawtooth
is the memory that Solr actually needs. Take the highest number from
the bottom of the sawtooth and add some extra, maybe 2 GB. Try that
heap size.
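
For example, the heap is usually set via SOLR_HEAP in solr.in.sh (solr.in.cmd
on Windows); the 31g below is just the starting point suggested above, to be
replaced by whatever the sawtooth analysis shows:

SOLR_HEAP="31g"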

Upgrade to 6.6.2. That includes all bug fixes for the 6.x release. The 6.x 
release had several bad bugs, especially in the middle releases. We were
switching prod to SolrCloud while those were being released and it was
not fun.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jul 10, 2020, at 4:59 AM, vishal patel  
> wrote:
> 
> Thanks for quick reply.
> 
> I assume caches (are they too large?), perhaps uninverted indexes.
> Docvalues would help with latter ones. Do you use them?
>>> We do not use any cache. we disabled the cache from solrconfig.xml
> here is my solrconfig .xml and schema.xml
> https://drive.google.com/file/d/12SHl3YGP7jT4goikBkeyB2s1NX5_C2gz/view
> https://drive.google.com/file/d/1LwA1d4OiMhQQv806tR0HbZoEjA8IyfdR/view
> 
> We used Docvalues on that field which is used for sorting or faceting.
> 
> You could also try upgrading to the latest version in 6.x series as a starter.
>>> I will surely try.
> 
> So, the node in question isn't responding quickly enough to http requests and 
> gets put into recovery. The log for the recovering node starts too late, so I 
> can't say anything about what happened before 14:42:43.943 that lead to 
> recovery.
>>> There is no error before 14:42:43.943 just search and insert requests are 
>>> there. I got that node is responding but why it is not responding? Due to 
>>> lack of memory or any other cause
> why we cannot get idea from log for reason of not responding.
> 
> Is there any monitor for Solr from where we can find the root cause?
> 
> Regards,
> Vishal Patel
> 
> 
> 
> From: Ere Maijala 
> Sent: Friday, July 10, 2020 4:27 PM
> To: solr-user@lucene.apache.org 
> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
> 
> vishal patel kirjoitti 10.7.2020 klo 12.45:
>> Thanks for your input.
>> 
>> Walter already said that setting soft commit max time to 100 ms is a recipe 
>> for disaster
 I know that but our application is already developed and run on live 
 environment since last 5 years. Actually, we want to show a data very 
 quickly after the insert.
>> 
>> you have huge JVM heaps without an explanation for the reason
 We gave the 55GB ram because our usage is like that large query search and 
 very frequent searching and indexing.
>> Here is my memory snapshot which I have taken from GC.
> 
> Yes, I can see that a lot of memory is in use, but the question is why.
> I assume caches (are they too large?), perhaps uninverted indexes.
> Docvalues would help with latter ones. Do you use them?
> 
>> I have tried Solr upgrade from 6.1.0 to 8.5.1 but due to some issue we 
>> cannot do. I have also asked in here
>> https://lucene.472066.n3.nabble.com/Sorting-in-other-collection-in-Solr-8-5-1-td4459506.html#a4459562
> 
> You could also try upgrading to the latest version in 6.x series as a
> starter.
> 
>> Why we cannot find the reason of recovery from log? like memory or CPU 
>> issue, frequent index or search, large query hit,
>> My log at the time of recovery
>> https://drive.google.com/file/d/1F8Bn7jSXspe2HRelh_vJjKy9DsTRl9h0/view
>> recovery_shard.txt
> 
> Isn't it right there on the first lines?
> 
> 2020-07-09 14:42:43.943 ERROR
> (updateExecutor-2-thread-21007-processing-http:11.200.212.305:8983//solr//products
> x:products r:core_node1 n:11.200.212.306:8983_solr s:shard1 c:products)
> [c:products s:shard1 r:core_node1 x:products]
> o.a.s.u.StreamingSolrClients error
> org.apache.http.NoHttpResponseException: 11.200.212.305:8983 failed to
> respond
> 
> followed by a couple more error messages about the same problem and then
> initiation of recovery:
> 
> 2020-07-09 14:42:44.002 INFO  (qtp1239731077-771611) [c:products
> s:shard1 r:core_node1 x:products] o.a.s.c.ZkController Put replica
> core=products coreNodeName=core_node3 on 11.200.212.305:8983_solr into
> leader-initiated recovery.
> 
> So the node in question isn't responding quickly enough to http requests
> and gets put into recovery. The log for the recovering node starts too
> late, so I can't say 

Re: Multiple fq vs combined fq performance

2020-07-10 Thread Chris Dempsey
Thanks for the suggestion, Alex. It doesn't appear that
IndexOrDocValuesQuery (at least in Solr 7.7.1) supports the PostFilter
interface. I've tried various values for cost on each of the fq and it
doesn't change the QTime.

So, after digging around a bit: even though
{!cache=false}taggedTickets_ticketId:100241 matches one and only one
document in the collection, that doesn't matter to the other two fq, which
continue to scan the index of the whole collection, correct?

On Thu, Jul 9, 2020 at 4:24 PM Alexandre Rafalovitch 
wrote:

> I _think_ it will run all 3 and then do index hopping. But if you know one
> fq is super expensive, you could assign it a cost
> Value over 100 will try to use PostFilter then and apply the query on top
> of results from other queries.
>
>
>
> https://lucene.apache.org/solr/guide/8_4/common-query-parameters.html#cache-parameter
>
> Hope it helps,
> Alex.
>
> On Thu., Jul. 9, 2020, 2:05 p.m. Chris Dempsey,  wrote:
>
> > Hi all! In a collection where we have ~54 million documents we've noticed
> > running a query with the following:
> >
> > "fq":["{!cache=false}_class:taggedTickets",
> >   "{!cache=false}taggedTickets_ticketId:100241",
> >   "{!cache=false}companyId:22476"]
> >
> > when I debugQuery I see:
> >
> > "parsed_filter_queries":[
> >   "{!cache=false}_class:taggedTickets",
> >   "{!cache=false}IndexOrDocValuesQuery(taggedTickets_ticketId:[100241
> > TO 100241])",
> >   "{!cache=false}IndexOrDocValuesQuery(companyId:[22476 TO 22476])"
> > ]
> >
> > runs in roughly ~450ms but if we remove `{!cache=false}companyId:22476`
> it
> > drops down to ~5ms (it's important to note that `taggedTickets_ticketId`
> is
> > globally unique).
> >
> > If we change the fqs to:
> >
> > "fq":["{!cache=false}_class:taggedTickets",
> >   "{!cache=false}+companyId:22476
> +taggedTickets_ticketId:100241"]
> >
> > when I debugQuery I see:
> >
> > "parsed_filter_queries":[
> >"{!cache=false}_class:taggedTickets",
> >"{!cache=false}+IndexOrDocValuesQuery(companyId:[22476 TO 22476])
> > +IndexOrDocValuesQuery(taggedTickets_ticketId:[100241 TO
> 100241])"
> > ]
> >
> > we get the correct result back in ~5ms.
> >
> > My current thought is that in the slow scenario Solr is still running
> > `{!cache=false}IndexOrDocValuesQuery(companyId:[22476
> > TO 22476])` even though it "has the answer" from the first two fq.
> >
> > Am I off-base or misunderstanding how `fq` are processed?
> >
>


Re: Replica goes into recovery mode in Solr 6.1.0

2020-07-10 Thread vishal patel
Thanks for quick reply.

I assume caches (are they too large?), perhaps uninverted indexes.
Docvalues would help with latter ones. Do you use them?
>> We do not use any cache; we disabled the caches in solrconfig.xml.
Here are my solrconfig.xml and schema.xml:
https://drive.google.com/file/d/12SHl3YGP7jT4goikBkeyB2s1NX5_C2gz/view
https://drive.google.com/file/d/1LwA1d4OiMhQQv806tR0HbZoEjA8IyfdR/view

We have used docValues on the fields that are used for sorting or faceting.

You could also try upgrading to the latest version in 6.x series as a starter.
>> I will surely try.

So, the node in question isn't responding quickly enough to http requests and 
gets put into recovery. The log for the recovering node starts too late, so I 
can't say anything about what happened before 14:42:43.943 that lead to 
recovery.
>> There is no error before 14:42:43.943; there are just search and insert
>> requests. I understand that the node is not responding, but why is it not
>> responding? Is it due to lack of memory or some other cause? And why can we
>> not get an idea of the reason from the log?

Is there any monitoring tool for Solr from which we can find the root cause?

Regards,
Vishal Patel



From: Ere Maijala 
Sent: Friday, July 10, 2020 4:27 PM
To: solr-user@lucene.apache.org 
Subject: Re: Replica goes into recovery mode in Solr 6.1.0

vishal patel kirjoitti 10.7.2020 klo 12.45:
> Thanks for your input.
>
> Walter already said that setting soft commit max time to 100 ms is a recipe 
> for disaster
>>> I know that but our application is already developed and run on live 
>>> environment since last 5 years. Actually, we want to show a data very 
>>> quickly after the insert.
>
> you have huge JVM heaps without an explanation for the reason
>>> We gave the 55GB ram because our usage is like that large query search and 
>>> very frequent searching and indexing.
> Here is my memory snapshot which I have taken from GC.

Yes, I can see that a lot of memory is in use, but the question is why.
I assume caches (are they too large?), perhaps uninverted indexes.
Docvalues would help with latter ones. Do you use them?

> I have tried Solr upgrade from 6.1.0 to 8.5.1 but due to some issue we cannot 
> do. I have also asked in here
> https://lucene.472066.n3.nabble.com/Sorting-in-other-collection-in-Solr-8-5-1-td4459506.html#a4459562

You could also try upgrading to the latest version in 6.x series as a
starter.

> Why we cannot find the reason of recovery from log? like memory or CPU issue, 
> frequent index or search, large query hit,
> My log at the time of recovery
> https://drive.google.com/file/d/1F8Bn7jSXspe2HRelh_vJjKy9DsTRl9h0/view
> recovery_shard.txt

Isn't it right there on the first lines?

2020-07-09 14:42:43.943 ERROR
(updateExecutor-2-thread-21007-processing-http:11.200.212.305:8983//solr//products
x:products r:core_node1 n:11.200.212.306:8983_solr s:shard1 c:products)
[c:products s:shard1 r:core_node1 x:products]
o.a.s.u.StreamingSolrClients error
org.apache.http.NoHttpResponseException: 11.200.212.305:8983 failed to
respond

followed by a couple more error messages about the same problem and then
initiation of recovery:

2020-07-09 14:42:44.002 INFO  (qtp1239731077-771611) [c:products
s:shard1 r:core_node1 x:products] o.a.s.c.ZkController Put replica
core=products coreNodeName=core_node3 on 11.200.212.305:8983_solr into
leader-initiated recovery.

So the node in question isn't responding quickly enough to http requests
and gets put into recovery. The log for the recovering node starts too
late, so I can't say anything about what happened before 14:42:43.943
that lead to recovery.

--Ere

>
> 
> From: Ere Maijala 
> Sent: Friday, July 10, 2020 2:10 PM
> To: solr-user@lucene.apache.org 
> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
>
> Walter already said that setting soft commit max time to 100 ms is a
> recipe for disaster. That alone can be the issue, but if you're not
> willing to try higher values, there's no way of being sure. And you have
> huge JVM heaps without an explanation for the reason. If those do not
> cause problems, you indicated that you also run some other software on
> the same server. Is it possible that the other processes hog CPU, disk
> or network and starve Solr?
>
> I must add that Solr 6.1.0 is over four years old. You could be hitting
> a bug that has been fixed for years, but even if you encounter an issue
> that's still present, you will need to upgrade to get it fixed. If you
> look at the number of fixes done in subsequent 6.x versions alone in the
> changelog (https://lucene.apache.org/solr/8_5_1/changes/Changes.html)
> you'll see that there are a lot of 

Re: SOLR / Zookeeper Compatibility

2020-07-10 Thread mithunseal
I am new to Solr and ZooKeeper, and I am not able to understand the
compatibility requirements. For example, I am using Solr 7.5.0, which uses ZK
3.4.11. Does that mean Solr 7.5.0 will not work with ZK 3.4.10?

Can someone please confirm this?

Thanks,
Mithun Seal



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: Replica goes into recovery mode in Solr 6.1.0

2020-07-10 Thread Ere Maijala
vishal patel kirjoitti 10.7.2020 klo 12.45:
> Thanks for your input.
> 
> Walter already said that setting soft commit max time to 100 ms is a recipe 
> for disaster
>>> I know that but our application is already developed and run on live 
>>> environment since last 5 years. Actually, we want to show a data very 
>>> quickly after the insert.
> 
> you have huge JVM heaps without an explanation for the reason
>>> We gave the 55GB ram because our usage is like that large query search and 
>>> very frequent searching and indexing.
> Here is my memory snapshot which I have taken from GC.

Yes, I can see that a lot of memory is in use, but the question is why.
I assume caches (are they too large?), perhaps uninverted indexes.
DocValues would help with the latter. Do you use them?

> I have tried Solr upgrade from 6.1.0 to 8.5.1 but due to some issue we cannot 
> do. I have also asked in here
> https://lucene.472066.n3.nabble.com/Sorting-in-other-collection-in-Solr-8-5-1-td4459506.html#a4459562

You could also try upgrading to the latest version in 6.x series as a
starter.

> Why we cannot find the reason of recovery from log? like memory or CPU issue, 
> frequent index or search, large query hit,
> My log at the time of recovery
> https://drive.google.com/file/d/1F8Bn7jSXspe2HRelh_vJjKy9DsTRl9h0/view
> recovery_shard.txt

Isn't it right there on the first lines?

2020-07-09 14:42:43.943 ERROR
(updateExecutor-2-thread-21007-processing-http:11.200.212.305:8983//solr//products
x:products r:core_node1 n:11.200.212.306:8983_solr s:shard1 c:products)
[c:products s:shard1 r:core_node1 x:products]
o.a.s.u.StreamingSolrClients error
org.apache.http.NoHttpResponseException: 11.200.212.305:8983 failed to
respond

followed by a couple more error messages about the same problem and then
initiation of recovery:

2020-07-09 14:42:44.002 INFO  (qtp1239731077-771611) [c:products
s:shard1 r:core_node1 x:products] o.a.s.c.ZkController Put replica
core=products coreNodeName=core_node3 on 11.200.212.305:8983_solr into
leader-initiated recovery.

So the node in question isn't responding quickly enough to http requests
and gets put into recovery. The log for the recovering node starts too
late, so I can't say anything about what happened before 14:42:43.943
that led to recovery.

--Ere

> 
> 
> From: Ere Maijala 
> Sent: Friday, July 10, 2020 2:10 PM
> To: solr-user@lucene.apache.org 
> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
> 
> Walter already said that setting soft commit max time to 100 ms is a
> recipe for disaster. That alone can be the issue, but if you're not
> willing to try higher values, there's no way of being sure. And you have
> huge JVM heaps without an explanation for the reason. If those do not
> cause problems, you indicated that you also run some other software on
> the same server. Is it possible that the other processes hog CPU, disk
> or network and starve Solr?
> 
> I must add that Solr 6.1.0 is over four years old. You could be hitting
> a bug that has been fixed for years, but even if you encounter an issue
> that's still present, you will need to upgrade to get it fixed. If you
> look at the number of fixes done in subsequent 6.x versions alone in the
> changelog (https://lucene.apache.org/solr/8_5_1/changes/Changes.html)
> you'll see that there are a lot of them. You could be hitting something
> like SOLR-10420, which has been fixed for over three years.
> 
> Best,
> Ere
> 
> vishal patel kirjoitti 10.7.2020 klo 7.52:
>> I’ve been running Solr for a dozen years and I’ve never needed a heap larger 
>> than 8 GB.
 What is your data size? same like us 1 TB? is your searching or indexing 
 frequently? NRT model?
>>
>> My question is why replica is going into recovery? When replica went down, I 
>> checked GC log but GC pause was not more than 2 seconds.
>> Also, I cannot find out any reason for recovery from Solr log file. i want 
>> to know the reason why replica goes into recovery.
>>
>> Regards,
>> Vishal Patel
>> 
>> From: Walter Underwood 
>> Sent: Friday, July 10, 2020 3:03 AM
>> To: solr-user@lucene.apache.org 
>> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
>>
>> Those are extremely large JVMs. Unless you have proven that you MUST
>> have 55 GB of heap, use a smaller heap.
>>
>> I’ve been running Solr for a dozen years and I’ve never needed a heap
>> larger than 8 GB.
>>
>> Also, there is usually no need to use one JVM per replica.
>>
>> Your configuration is using 110 GB (two JVMs) just for Java
>> where I would configure it with a single 8 GB JVM. That would
>> free up 100 GB for file caches.
>>
>> wunder
>> Walter 

Re: Replica goes into recovery mode in Solr 6.1.0

2020-07-10 Thread vishal patel
Thanks for your input.

Walter already said that setting soft commit max time to 100 ms is a recipe for 
disaster
>> I know that, but our application was already developed and has been running
>> in a live environment for the last 5 years. Actually, we want to show the
>> data very quickly after the insert.

you have huge JVM heaps without an explanation for the reason
>> We gave 55GB of RAM because our usage involves large query searches and
>> very frequent searching and indexing.
Here is my memory snapshot, which I have taken from GC:

https://drive.google.com/file/d/1WPYqg-wPFGnnMu8FopXs4EAGAgSq8ZEG/view
heapusage_before_gc.PNG


https://drive.google.com/file/d/1LYEdcY9Om_0u8ltIHikU7hsuuKYQPh_m/view
JVM_memory.PNG



you indicated that you also run some other software on the same server. Is it 
possible that the other processes hog CPU, disk or network and starve Solr?
>> I will check that

I have tried upgrading Solr from 6.1.0 to 8.5.1, but due to some issues we
cannot do it yet. I have also asked about it here:
https://lucene.472066.n3.nabble.com/Sorting-in-other-collection-in-Solr-8-5-1-td4459506.html#a4459562

https://lucene.472066.n3.nabble.com/Query-takes-more-time-in-Solr-8-5-1-compare-to-6-1-0-version-td4458153.html


Why can we not find the reason for the recovery in the log? For example, a
memory or CPU issue, frequent indexing or searching, or a large query hit.
My log at the time of recovery:
https://drive.google.com/file/d/1F8Bn7jSXspe2HRelh_vJjKy9DsTRl9h0/view
recovery_shard.txt


https://drive.google.com/file/d/1y0fC_n5u3MBMQbXrvxtqaD8vBBXDLR6I/view
recovery_replica.txt

Regards,
Vishal Patel



From: Ere Maijala 
Sent: Friday, July 10, 2020 2:10 PM
To: solr-user@lucene.apache.org 
Subject: Re: Replica goes into recovery mode in Solr 6.1.0

Walter already said that setting soft commit max time to 100 ms is a
recipe for disaster. That alone can be the issue, but if you're not
willing to try higher values, there's no way of being sure. And you have
huge JVM heaps without an explanation for the reason. If those do not
cause problems, you indicated that you also run some other software on
the same server. Is it possible that the other processes hog CPU, disk
or network and starve Solr?

I must add that Solr 6.1.0 is over four years old. You could be hitting
a bug that has been fixed for years, but even if you encounter an issue
that's still present, you will need to upgrade to get it fixed. If you
look at the number of fixes done in subsequent 6.x versions alone in the
changelog (https://lucene.apache.org/solr/8_5_1/changes/Changes.html)
you'll see that there are a lot of them. You could be hitting something
like SOLR-10420, which has been fixed for over three years.

Best,
Ere

vishal patel kirjoitti 10.7.2020 klo 7.52:
> I’ve been running Solr for a dozen years and I’ve never needed a heap larger 
> than 8 GB.
>>> What is your data size? same like us 1 TB? is your searching or indexing 
>>> frequently? NRT model?
>
> My question is why replica is going into recovery? When replica went down, I 
> checked GC log but GC pause was not more than 2 seconds.
> Also, I cannot find out any reason for recovery from Solr log file. i want to 
> know the reason why replica goes into recovery.
>
> Regards,
> Vishal Patel
> 
> From: Walter Underwood 
> Sent: Friday, July 10, 2020 3:03 AM
> To: solr-user@lucene.apache.org 
> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
>
> Those are extremely large JVMs. Unless you have proven that you MUST
> have 55 GB of heap, use a smaller heap.
>
> I’ve been running Solr for a dozen years and I’ve never needed a heap
> larger than 8 GB.
>
> Also, there is usually no need to use one JVM per replica.
>
> Your configuration is using 110 GB (two JVMs) just for Java
> where I would configure it with a single 8 GB JVM. That would
> free up 100 GB for file caches.
>
> wunder
> 

Re: I Became a Solr Committer in 4662 Days. Here’s how you can do it faster!

2020-07-10 Thread Wesley

Thanks for sharing, nice article.


Charlie Hull wrote:
Thought you might enjoy Eric's blog, it's taken him a while! Some good 
hints here for those of you interested in contributing more to Solr.


https://opensourceconnections.com/blog/2020/07/10/i-became-a-solr-committer-in-4662-days-heres-how-you-can-do-it-faster/ 



Re: I Became a Solr Committer in 4662 Days. Here’s how you can do it faster!

2020-07-10 Thread Vincenzo D'Amore
Thanks, interesting reading

On Fri, Jul 10, 2020 at 11:18 AM Charlie Hull  wrote:

> Hi all,
>
> Thought you might enjoy Eric's blog, it's taken him a while! Some good
> hints here for those of you interested in contributing more to Solr.
>
>
> https://opensourceconnections.com/blog/2020/07/10/i-became-a-solr-committer-in-4662-days-heres-how-you-can-do-it-faster/
>
> Cheers
>
> Charlie
>
> --
> Charlie Hull
> OpenSource Connections, previously Flax
>
> tel/fax: +44 (0)8700 118334
> mobile:  +44 (0)7767 825828
> web: www.o19s.com
>
>

-- 
Vincenzo D'Amore


PayloadCheckQParserPlugin Increase the operator parameter

2020-07-10 Thread Dawn
Hi:
I hope PayloadCheckQParserPlugin can support an operator parameter, so that
the operator can be either "phrase" or "or" (for multiple values),
as PayloadScoreQParserPlugin does:

	1. Parse the parameter: localParams.get("operator", DEFAULT_OPERATOR);

	2. Create the SpanQuery by calling PayloadUtils.createSpanQuery(String field,
String value, Analyzer analyzer, String operator)
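
For illustration, a hypothetical query once such a parameter existed (the
operator local param is not something payload_check supports today; the field
name, payloads, and query terms below are made up):

q={!payload_check f=words payloads="NOUN VERB" operator="or"}cat dog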




I Became a Solr Committer in 4662 Days. Here’s how you can do it faster!

2020-07-10 Thread Charlie Hull

Hi all,

Thought you might enjoy Eric's blog, it's taken him a while! Some good 
hints here for those of you interested in contributing more to Solr.


https://opensourceconnections.com/blog/2020/07/10/i-became-a-solr-committer-in-4662-days-heres-how-you-can-do-it-faster/

Cheers

Charlie

--
Charlie Hull
OpenSource Connections, previously Flax

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.o19s.com



Re: Replica goes into recovery mode in Solr 6.1.0

2020-07-10 Thread Ere Maijala
Walter already said that setting soft commit max time to 100 ms is a
recipe for disaster. That alone can be the issue, but if you're not
willing to try higher values, there's no way of being sure. And you have
huge JVM heaps without an explanation for the reason. If those do not
cause problems, you indicated that you also run some other software on
the same server. Is it possible that the other processes hog CPU, disk
or network and starve Solr?

I must add that Solr 6.1.0 is over four years old. You could be hitting
a bug that has been fixed for years, but even if you encounter an issue
that's still present, you will need to upgrade to get it fixed. If you
look at the number of fixes done in subsequent 6.x versions alone in the
changelog (https://lucene.apache.org/solr/8_5_1/changes/Changes.html)
you'll see that there are a lot of them. You could be hitting something
like SOLR-10420, which has been fixed for over three years.

Best,
Ere

vishal patel kirjoitti 10.7.2020 klo 7.52:
> I’ve been running Solr for a dozen years and I’ve never needed a heap larger 
> than 8 GB.
>>> What is your data size? same like us 1 TB? is your searching or indexing 
>>> frequently? NRT model?
> 
> My question is why replica is going into recovery? When replica went down, I 
> checked GC log but GC pause was not more than 2 seconds.
> Also, I cannot find out any reason for recovery from Solr log file. i want to 
> know the reason why replica goes into recovery.
> 
> Regards,
> Vishal Patel
> 
> From: Walter Underwood 
> Sent: Friday, July 10, 2020 3:03 AM
> To: solr-user@lucene.apache.org 
> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
> 
> Those are extremely large JVMs. Unless you have proven that you MUST
> have 55 GB of heap, use a smaller heap.
> 
> I’ve been running Solr for a dozen years and I’ve never needed a heap
> larger than 8 GB.
> 
> Also, there is usually no need to use one JVM per replica.
> 
> Your configuration is using 110 GB (two JVMs) just for Java
> where I would configure it with a single 8 GB JVM. That would
> free up 100 GB for file caches.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Jul 8, 2020, at 10:10 PM, vishal patel  
>> wrote:
>>
>> Thanks for reply.
>>
>> what you mean by "Shard1 Allocated memory”
 It means JVM memory of one solr node or instance.
>>
>> How many Solr JVMs are you running?
 In one server 2 solr JVMs in which one is shard and other is replica.
>>
>> What is the heap size for your JVMs?
 55GB of one Solr JVM.
>>
>> Regards,
>> Vishal Patel
>>
>> Sent from Outlook
>> 
>> From: Walter Underwood 
>> Sent: Wednesday, July 8, 2020 8:45 PM
>> To: solr-user@lucene.apache.org 
>> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
>>
>> I don’t understand what you mean by "Shard1 Allocated memory”. I don’t know 
>> of
>> any way to dedicate system RAM to an application object like a replica.
>>
>> How many Solr JVMs are you running?
>>
>> What is the heap size for your JVMs?
>>
>> Setting soft commit max time to 100 ms does not magically make Solr super 
>> fast.
>> It makes Solr do too much work, makes the work queues fill up, and makes it 
>> fail.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>>
>>> On Jul 7, 2020, at 10:55 PM, vishal patel  
>>> wrote:
>>>
>>> Thanks for your reply.
>>>
>>> One server has total 320GB ram. In this 2 solr node one is shard1 and 
>>> second is shard2 replica. Each solr node have 55GB memory allocated. shard1 
>>> has 585GB data and shard2 replica has 492GB data. means almost 1TB data in 
>>> this server. server has also other applications and for that 60GB memory 
>>> allocated. So total 150GB memory is left.
>>>
>>> Proper formatting details:
>>> https://drive.google.com/file/d/1K9JyvJ50Vele9pPJCiMwm25wV4A6x4eD/view
>>>
>>> Are you running multiple huge JVMs?
> Not huge but 60GB memory allocated for our 11 application. 150GB memory 
> are still free.
>>>
>>> The servers will be doing a LOT of disk IO, so look at the read and write 
>>> iops. I expect that the solr processes are blocked on disk reads almost all 
>>> the time.
> is it chance to go in recovery mode if more IO read and write or blocked?
>>>
>>> "-Dsolr.autoSoftCommit.maxTime=100” is way too short (100 ms).
> Our requirement is NRT so we keep the less time
>>>
>>> Regards,
>>> Vishal Patel
>>> 
>>> From: Walter Underwood 
>>> Sent: Tuesday, July 7, 2020 8:15 PM
>>> To: solr-user@lucene.apache.org 
>>> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
>>>
>>> This isn’t a support list, so nobody looks at issues. We do try to help.
>>>
>>> It looks like you have 1 TB of index on a system with 320 GB of RAM.
>>> I don’t know what "Shard1 Allocated memory” is, but maybe half of
>>> that RAM is