Re: json.facet floods the filterCache

2020-10-22 Thread Christine Poerschke (BLOOMBERG/ LONDON)
Hi Damien,

You mention JSON term facets. I haven't explored those, but we have
observed what you describe for JSON range facets, and I've opened
https://issues.apache.org/jira/browse/SOLR-14939 about it.

Hope that helps.

Regards,
Christine

From: solr-user@lucene.apache.org At: 10/22/20 01:07:59 To: solr-user@lucene.apache.org
Subject: json.facet floods the filterCache

Hi,

I'm using a json.facet query on nested facets terms and am seeing very high
filterCache usage. Is it possible to somehow control this? With an fq it's
possible to specify fq={!cache=false}... but I don't see a similar option
for json.facet.

Kind regards,
Damien




When are the score values evaluated?

2020-10-22 Thread Taisuke Miyazaki
Hi,

If you use a high value for the score, the values on the smaller scale are
ignored.

Example :
bq = foo:(1.0)^1.0
bf = sum(200)

When I do this, the additional score for "foo" at 1.0 does not affect the
sort order.

I'm assuming this is an issue with the precision of the score floating
point, is that correct?

As a test, if we change the query as follows, the order will change as you
would expect, reflecting the additional score of "foo" when it is 1.0
bq = foo:(1.0)^10
bf = sum(200)

How can I avoid this?
The idea I'm thinking of at the moment is to divide the whole thing by an
appropriate number, such as bf= div(sum(200),100).
However, this may or may not work as expected depending on when the
floating point operations are done and rounded off.

At what point are the score's floats rounded?

1. when sorting
2. when calculating the score
3. when evaluating each function for each bq and bf

Regards,
Taisuke


Re: When are the score values evaluated?

2020-10-22 Thread Erick Erickson
You’d get a much better idea of what goes on
if you added &explain=true and analyzed the
output. That’d show you exactly what is
calculated when.

Best,
Erick

> On Oct 22, 2020, at 4:05 AM, Taisuke Miyazaki wrote:
> 
> Hi,
> 
> If you use a high value for the score, the values on the smaller scale are
> ignored.
> 
> Example :
> bq = foo:(1.0)^1.0
> bf = sum(200)
> 
> When I do this, the additional score for "foo" at 1.0 does not affect the
> sort order.
> 
> I'm assuming this is an issue with the precision of the score floating
> point, is that correct?
> 
> As a test, if we change the query as follows, the order will change as you
> would expect, reflecting the additional score of "foo" when it is 1.0
> bq = foo:(1.0)^10
> bf = sum(200)
> 
> How can I avoid this?
> The idea I'm thinking of at the moment is to divide the whole thing by an
> appropriate number, such as bf= div(sum(200),100).
> However, this may or may not work as expected depending on when the
> floating point operations are done and rounded off.
> 
> At what point are score's floats rounded?
> 
> 1. when sorting
> 2. when calculating the score
> 3. when evaluating each function for each bq and bf
> 
> Regards,
> Taisuke



Re: Solr 8.6.3

2020-10-22 Thread David Smiley
Kris,

From a user's standpoint, the DIH is not deprecated.  I think we as a
project screwed up the messaging around components in Solr that are
*moving* in terms of code maintenance.  That is not deprecation yet we
referred to it as such, hence your understandable confusion.  I corrected
the warning about this in 8.7, so you won't see that again.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Oct 15, 2020 at 4:13 PM Kris Gurusamy wrote:

> I've just downloaded Solr 8.6.3 and am trying to set up DIH for loading
> structured XML. I found out that DIH will be deprecated soon, with version
> 9.0. What is the equivalent of DIH in the new Solr version? How do I import
> and index very custom structured XML data in the new Solr version? Any
> help is appreciated.
>
> Regards
>
> Kris Gurusamy
> Director, Engineering
> kgurus...@xpanse.com
> www.xpanse.com
>
> On 10/15/20, 1:08 PM, "Anshum Gupta (Jira)"  wrote:
>
>
>  [
> https://issues.apache.org/jira/browse/SOLR-14938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> ]
>
> Anshum Gupta resolved SOLR-14938.
> -
> Resolution: Invalid
>
> [~krisgurusamy] - Please ask questions regarding usage on the Solr
> user mailing list.
>
> JIRA is meant for issue tracking purposes.
>
> > Solr 8.6.3
> > --
> >
> > Key: SOLR-14938
> > URL: https://issues.apache.org/jira/browse/SOLR-14938
> > Project: Solr
> > Issue Type: Bug
> > Security Level: Public (Default Security Level. Issues are Public)
> > Components: contrib - DataImportHandler
> > Reporter: Krishnan
> > Priority: Major
> >
> > I've just downloaded solr 8.6.3 and trying to create DIH for loading
> > structured XML. I found out that DIH will be deprecated soon with version
> > 9.0. What is the equivalent of DIH in new solr version? How do I import
> > structured XML data which is very custom and index in Solr new version? Any
> > help is appreciated.


Re: json.facet floods the filterCache

2020-10-22 Thread Michael Gibney
Damien,
Are you able to share the actual json.facet request that you're using
(at least just the json.facet part)? I'm having a hard time being
confident that I'm correctly interpreting when you say "a json.facet
query on nested facets terms".
Michael

On Thu, Oct 22, 2020 at 3:52 AM Christine Poerschke (BLOOMBERG/
LONDON)  wrote:
>
> Hi Damien,
>
> You mention about JSON term facets, I haven't explored w.r.t. that but we 
> have observed what you describe for JSON range facets and I've started 
> https://issues.apache.org/jira/browse/SOLR-14939 about it.
>
> Hope that helps.
>
> Regards,
> Christine
>
> From: solr-user@lucene.apache.org At: 10/22/20 01:07:59 To: solr-user@lucene.apache.org
> Subject: json.facet floods the filterCache
>
> Hi,
>
> I'm using a json.facet query on nested facets terms and am seeing very high
> filterCache usage. Is it possible to somehow control this? With a fq it's
> possible to specify fq={!cache=false}... but I don't see a similar thing
> json.facet.
>
> Kind regards,
> Damien
>
>


Re: Backup fails despite allowPaths=* being set

2020-10-22 Thread Philipp Trulson
I'm sure that this is not the case. On the Java Properties page it says
"solr.allowPaths  *", on the dashboard I can verify that the
"-Dsolr.allowPaths=*" option is present.

On Wed, 21 Oct 2020 at 19:10, Jan Høydahl <jan@cominvent.com> wrote:

> Are you sure the * is not eaten by the shell since it’s a special char?
> You can view the sys props in admin UI to check.
>
> Jan Høydahl
>
> > On 16 Oct 2020, at 19:39, Philipp Trulson wrote:
> >
> > Hello everyone,
> >
> > we are having problems with our backup script since we upgraded to Solr
> > 8.6.2 on kubernetes. To be more precise the message is
> > "Path /data/backup/2020-10-16/collection must be relative to SOLR_HOME,
> > SOLR_DATA_HOME coreRootDirectory. Set system property 'solr.allowPaths'
> > to add other allowed paths."
> >
> > I executed the script by calling this endpoint:
> > curl 'http://solr.default.svc.cluster.local/solr/admin/collections?action=BACKUP&name=collection&collection=collection&location=/data/backup/2020-10-16&async=1114'
> >
> > The strange thing is that all 5 nodes are started with
> > -Dsolr.allowPaths=*, so in theory it should work. The folder is an AWS
> > EFS share, that's the only reason I can imagine. Or can I check any
> > other options?
> >
> > Thank you for your help!
> > Philipp
> >
> > --
> >
> >
> > 
> >
> > reBuy reCommerce GmbH* · *Potsdamer Str. 188* ·
> > *10783 Berlin* · *Geschäftsführer: Dr. Philipp GattnerSitz und
> > Registergericht: Berlin, Amtsgericht Charlottenburg, HRB 109344 B,
> > *USt-ID-Nr.:* DE237458635
>


-- 

Philipp Trulson

Platform Engineer
mail: p.trul...@rebuy.com · web: www.reBuy.de 



Re: SolrCore Initialization Failures in Solr 8.0.0

2020-10-22 Thread Hari Krishnan
Hey, we face a similar kind of issue. Our version of Solr is 8.5.1. We also
run it as Docker containers on an Ubuntu server.
Could you provide a similar solution?





Possible to add a default "appends" fq except for queries in the admin GUI?

2020-10-22 Thread Batanun B
Hi,

We have multiple components that use the Solr search feature on our websites.
But we have some documents in the index that we never want to display in the
search results (nothing secret or anything, just uninteresting for the user to
see). So far, we have added an fq to all our queries that filters out these
documents. But we would like to not have to do this, since there is always a
risk of us forgetting to add that fq parameter.

So, today I tried adding this fq in an "appends" list in the standard
requestHandler. I needed to add it to the standard one, since that's the one
that all the search components use (i.e. no qt parameter defined, and I would
prefer not to have to change that). That worked fine, until I needed to do a
query in the Solr admin GUI and realized that this filter query was used there
too, effectively hiding a bunch of documents that I as an administrator need to
see.

Is there a way to avoid this problem? Can I somehow configure Solr to not use
this filter query when in the admin GUI? If I define a separate request handler
in solrconfig, can I make the admin GUI always use this by default? I don't
want to have to manually change the request handler in the admin GUI every time.

What I tried so far:

* Adding the fq in the "appends" in the standard request handler, as mentioned
above. This causes the filter to always be in effect, even in the admin GUI.
* Keeping the configuration as above, but also adding a request handler with
name="/select" that doesn't have this fq defined. Then the filter was never
applied, neither in the admin GUI nor on any website search.


Re: Possible to add a default "appends" fq except for queries in the admin GUI?

2020-10-22 Thread Alexandre Rafalovitch
Why not have a custom handler endpoint for your online queries? You
will be modifying them anyway to remove fq.

Or even create individual endpoints for every significant use-case.
You can share the configuration between them with initParams or
useParams, but have more flexibility going forward.

Admin UI allows you to change /select, but - like you said - manually
and every time.

Regards,
  Alex.

On Thu, 22 Oct 2020 at 14:18, Batanun B  wrote:
>
> Hi,
>
> We have multiple components that uses the Solr search feature on our 
> websites. But we have some documents in the index that we never want to 
> display in the search results (nothing secret or anything, just uninteresting 
> for the user to see). So far, we have added a fq to all our queries, that 
> filters out these documents. But we would like to not have to do this, since 
> there is always a risk of us forgetting to add that fq parameter.
>
> So, today i tried adding this fq in a "appends" list in the standard 
> requestHandler. I needed to add it to the standard one, since that's the one 
> that all the search components use (ie, no qt parameter defined, and i would 
> prefer not to have to change that). That worked fine. Until I needed to do a 
> query in the solr admin GUI, and realized that this filter query was used 
> there too, effectively hiding a bunch of documents that I as an administrator 
> need to see.
>
> Is there a way to avoid this problem? Can I somehow configure Solr to not use 
> this filter query when in the admin GUI? If I define a separate request 
> handler in solrconfig, can i make the admin GUI always use this by default? I 
> don't want to have to manually change the request handler in the admin GUI 
> every time.
>
> What I tried so far:
>
> * Adding the fq in the "appends" in the standard request handler, as 
> mentioned above. Causing the filter to always be in effect, even in admin GUI
> * Keeping the configuration as above, but also adding a request handler with 
> name="/select", that doesn't have this fq defined. Then the filter was never 
> applied, not in admin GUI and not on any website search


Re: Possible to add a default "appends" fq except for queries in the admin GUI?

2020-10-22 Thread Batanun B
Well, we are not making the HTTP requests to Solr ourselves; we are using a
3rd-party component for that, which as configuration basically only takes a
base URL (i.e. domain, port etc., without path) and the name of the
collection. So it is not possible to define the request handler there. We
would have to specify it as a qt parameter when performing the search, but
that is more or less the same as having to define the fq parameter to filter
out unwanted documents. We would like the fundamental "no frills" default
search to include this fq, while still not being hindered by it when using
the Solr admin GUI. But if I interpret you correctly, that is impossible,
right?

From: Alexandre Rafalovitch 
Sent: Thursday, October 22, 2020 8:50 PM
To: solr-user 
Subject: Re: Possible to add a default "appends" fq except for queries in the 
admin GUI?

Why not have a custom handler endpoint for your online queries? You
will be modifying them anyway to remove fq.

Or even create individual endpoints for every significant use-case.
You can share the configuration between them with initParams or
useParams, but have more flexibility going forward.

Admin UI allows you to change /select, but - like you said - manually
and every time.

Regards,
  Alex.

On Thu, 22 Oct 2020 at 14:18, Batanun B  wrote:
>
> Hi,
>
> We have multiple components that uses the Solr search feature on our 
> websites. But we have some documents in the index that we never want to 
> display in the search results (nothing secret or anything, just uninteresting 
> for the user to see). So far, we have added a fq to all our queries, that 
> filters out these documents. But we would like to not have to do this, since 
> there is always a risk of us forgetting to add that fq parameter.
>
> So, today i tried adding this fq in a "appends" list in the standard 
> requestHandler. I needed to add it to the standard one, since that's the one 
> that all the search components use (ie, no qt parameter defined, and i would 
> prefer not to have to change that). That worked fine. Until I needed to do a 
> query in the solr admin GUI, and realized that this filter query was used 
> there too, effectively hiding a bunch of documents that I as an administrator 
> need to see.
>
> Is there a way to avoid this problem? Can I somehow configure Solr to not use 
> this filter query when in the admin GUI? If I define a separate request 
> handler in solrconfig, can i make the admin GUI always use this by default? I 
> don't want to have to manually change the request handler in the admin GUI 
> every time.
>
> What I tried so far:
>
> * Adding the fq in the "appends" in the standard request handler, as 
> mentioned above. Causing the filter to always be in effect, even in admin GUI
> * Keeping the configuration as above, but also adding a request handler with 
> name="/select", that doesn't have this fq defined. Then the filter was never 
> applied, not in admin GUI and not on any website search


Re: solr performance with >1 NUMAs

2020-10-22 Thread Wei
Hi Shawn,

I'm circling back with some new findings on our 2 NUMA issue.  After a
few iterations, we do see improvement with the UseNUMA flag and other JVM
setting changes. Here are the current settings, with Java 11:

-XX:+UseNUMA
-XX:+UseG1GC
-XX:+AlwaysPreTouch
-XX:+UseTLAB
-XX:G1MaxNewSizePercent=20
-XX:MaxGCPauseMillis=150
-XX:+DisableExplicitGC
-XX:+DoEscapeAnalysis
-XX:+ParallelRefProcEnabled
-XX:+UnlockDiagnosticVMOptions
-XX:+UnlockExperimentalVMOptions

Compared to the previous Java 8 + CMS setup on 2 NUMA servers, P99 latency
has improved by over 20%.


Thanks,

Wei




On Mon, Sep 28, 2020 at 4:02 PM Shawn Heisey  wrote:

> On 9/28/2020 12:17 PM, Wei wrote:
> > Thanks Shawn. Looks like Java 11 is the way to go with -XX:+UseNUMA. Do
> > you see any backward compatibility issue for Solr 8 with Java 11? Can we
> > run Solr 8 built with JDK 8 in Java 11 JRE, or need to rebuild solr with
> > Java 11 JDK?
>
> I do not know of any problems running the binary release of Solr 8
> (which is most likely built with the Java 8 JDK) with a newer release
> like Java 11 or higher.
>
> I think Sun was really burned by such problems cropping up in the days
> of Java 5 and 6, and their developers have worked really hard to make
> sure that never happens again.
>
> If you're running Java 11, you will need to pick a different garbage
> collector if you expect the NUMA flag to function.  The most recent
> releases of Solr are defaulting to G1GC, which as previously mentioned,
> did not gain NUMA optimizations until Java 14.
>
> It is not clear to me whether the NUMA optimizations will work with any
> collector other than Parallel until Java 14.  You would need to check
> Java documentation carefully or ask someone involved with development of
> Java.
>
> If you do see an improvement using the NUMA flag with Java 11, please
> let us know exactly what options Solr was started with.
>
> Thanks,
> Shawn
>


Re: solr performance with >1 NUMAs

2020-10-22 Thread matthew sporleder
Great updates.  Thanks for keeping us all in the loop!

On Thu, Oct 22, 2020 at 7:43 PM Wei  wrote:
>
> Hi Shawn,
>
> I.m circling back with some new findings with our 2 NUMA issue.  After a
> few iterations, we do see improvement with the useNUMA flag and other JVM
> setting changes. Here are the current settings, with Java 11:
>
> -XX:+UseNUMA
>
> -XX:+UseG1GC
>
> -XX:+AlwaysPreTouch
>
> -XX:+UseTLAB
>
> -XX:G1MaxNewSizePercent=20
>
> -XX:MaxGCPauseMillis=150
>
> -XX:+DisableExplicitGC
>
> -XX:+DoEscapeAnalysis
>
> -XX:+ParallelRefProcEnabled
>
> -XX:+UnlockDiagnosticVMOptions
>
> -XX:+UnlockExperimentalVMOptions
>
>
> Compared to previous Java 8 + CMS on 2 NUMA servers,  P99 latency has
> improved over 20%.
>
>
> Thanks,
>
> Wei
>
>
>
>
> On Mon, Sep 28, 2020 at 4:02 PM Shawn Heisey  wrote:
>
> > On 9/28/2020 12:17 PM, Wei wrote:
> > > Thanks Shawn. Looks like Java 11 is the way to go with -XX:+UseNUMA. Do
> > you
> > > see any backward compatibility issue for Solr 8 with Java 11? Can we run
> > > Solr 8 built with JDK 8 in Java 11 JRE, or need to rebuild solr with Java
> > > 11 JDK?
> >
> > I do not know of any problems running the binary release of Solr 8
> > (which is most likely built with the Java 8 JDK) with a newer release
> > like Java 11 or higher.
> >
> > I think Sun was really burned by such problems cropping up in the days
> > of Java 5 and 6, and their developers have worked really hard to make
> > sure that never happens again.
> >
> > If you're running Java 11, you will need to pick a different garbage
> > collector if you expect the NUMA flag to function.  The most recent
> > releases of Solr are defaulting to G1GC, which as previously mentioned,
> > did not gain NUMA optimizations until Java 14.
> >
> > It is not clear to me whether the NUMA optimizations will work with any
> > collector other than Parallel until Java 14.  You would need to check
> > Java documentation carefully or ask someone involved with development of
> > Java.
> >
> > If you do see an improvement using the NUMA flag with Java 11, please
> > let us know exactly what options Solr was started with.
> >
> > Thanks,
> > Shawn
> >


Metric Trigger not being recognised & picked up

2020-10-22 Thread Jonathan Tan
Hi All

I've been trying to get a metric trigger set up in SolrCloud 8.4.1, but
it's not working, and was hoping for some help.

I've created a metric trigger using this:

```
POST /solr/admin/autoscaling {
  "set-trigger": {
    "name": "metric_trigger",
    "event": "metric",
    "waitFor": "10s",
    "metric": "metrics:solr.jvm:os.systemCpuLoad",
    "above": 0.7,
    "preferredOperation": "MOVEREPLICA",
    "enabled": true
  }
}
```

And I get a successful response.

I can also see the new trigger in the `files -> tree -> autoscaling.json`.

However, I don't see any difference in the logs (I had the autoscaling
logging set to debug), and it's definitely not moving any replicas around
when under load, and the node is consistently in the > 85% overall
systemCpuLoad. (I can see this as well when I use the `/metrics` endpoint
with the above key.)


I then restarted all the nodes, and saw this error on startup, saying it
couldn't set the state during a restore, with the worrying part saying that
it is discarding the trigger...

I'd really like some help with this.

We've been seeing that out of the 3 nodes, there's always one - seemingly
chosen at random - that is massively utilised on CPU (maxed out 8 cores, and
it's not always the one with overseer), so we were hoping that we could let
the Metric Trigger sort it out in the short term.

```
2020-10-22 23:03:19.905 ERROR (ScheduledTrigger-7-thread-3) [   ] o.a.s.c.a.ScheduledTriggers Error restoring trigger state jvm_cpu_trigger => java.lang.NullPointerException
    at org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94)
java.lang.NullPointerException: null
    at org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94) ~[?:?]
    at org.apache.solr.cloud.autoscaling.TriggerBase.restoreState(TriggerBase.java:279) ~[?:?]
    at org.apache.solr.cloud.autoscaling.ScheduledTriggers$TriggerWrapper.run(ScheduledTriggers.java:638) ~[?:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:834) [?:?]
2020-10-22 23:03:19.912 ERROR (ScheduledTrigger-7-thread-1) [   ] o.a.s.c.a.ScheduledTriggers Failed to re-play event, discarding: {
  "id":"dd2ebf3d56bTboddkoovyjxdvy1hauq2zskpt",
  "source":"metric_trigger",
  "eventTime":15199552918891,
  "eventType":"METRIC",
  "properties":{
    "node":{"mycoll-solr-solr-service-1.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr":0.7322834645669292},
    "_dequeue_time_":261690991035,
    "metric":"metrics:solr.jvm:os.systemCpuLoad",
    "preferredOperation":"MOVEREPLICA",
    "_enqueue_time_":15479182216601,
    "requestedOps":[{
        "action":"MOVEREPLICA",
        "hints":{"SRC_NODE":["mycoll-solr-solr-service-1.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr"]}}],
    "replaying":true}}
2020-10-22 23:03:19.913 INFO  (OverseerStateUpdate-144115201265369088-mycoll-solr-solr-service-0.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr-n_000199) [   ] o.a.s.c.o.SliceMutator createReplica() {
  "operation":"addreplica",
  "collection":"mycoll-2",
  "shard":"shard5",
  "core":"mycoll-2_shard5_replica_n122",
  "state":"down",
  "base_url":"http://mycoll-solr-solr-service-0.mycoll-solr-solr-service-headless.mycoll-solr-test:8983/solr",
  "node_name":"mycoll-solr-solr-service-0.mycoll-solr-solr-service-headless.mycoll-solr-test:8983_solr",
  "type":"NRT"}
2020-10-22 23:03:19.921 ERROR (ScheduledTrigger-7-thread-1) [   ] o.a.s.c.a.ScheduledTriggers Error restoring trigger state metric_trigger => java.lang.NullPointerException
    at org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94)
java.lang.NullPointerException: null
    at org.apache.solr.cloud.autoscaling.MetricTrigger.setState(MetricTrigger.java:94) ~[?:?]
    at org.apache.solr.cloud.autoscaling.TriggerBase.restoreState(TriggerBase.java:279) ~[?:?]
    at org.apache.solr.cloud.autoscaling.ScheduledTriggers$TriggerWrapper.run(ScheduledTriggers.java:638) ~[?:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
    at java.lang.Thread.run(Thread.java:834) [?:?]

```


Any help please?
Thank you
Jonathan


Massively unbalanced CPU by different SOLR Nodes

2020-10-22 Thread Jonathan Tan
Hi,

We've got a 3 node SolrCloud cluster running on GKE, each on their own kube
node (which is in itself, relatively empty of other things).

Our collection has ~18m documents of 36gb in size, split into 6 shards with
2 replicas each, and they are evenly distributed across the 3 nodes. Our
JVMs are currently sized to ~14gb min & max , and they are running on SSDs.


[image: Screen Shot 2020-10-23 at 2.15.48 pm.png]

Graph also available here: https://pasteboard.co/JwUQ98M.png

Under perf testing of ~30 requests per second, we start seeing really bad
response times (around 3s in the 90th percentile), and *one* of the nodes
would be fully maxed out on CPU. At about 15 requests per second, our
response times are reasonable enough for our purposes (~0.8-1.1s), but as
is visible in the graph, it's definitely *not* an even distribution of the
CPU load. One of the nodes is running at around 13 cores, whilst the other 2
are running at ~8 cores and 6 cores respectively.

We've tracked in our monitoring tools that the 3 nodes *are* getting an
even distribution of requests, and we're using a Kube service which is in
itself a fairly well known tool for load balancing pods. We've also used
kube services heaps for load balancing of other apps and haven't seen such
a problem, so we doubt it's the load balancer that is the problem.

All 3 nodes are built from the same kubernetes statefulset deployment so
they'd all have the same configuration & setup. Additionally, over the
course of the day, it may suddenly change so that an entirely different
node is the one that is majorly overloaded on CPU.

All this is happening only under queries, and we are doing no indexing at
that time.

We'd initially thought it might be the overseer that is being majorly
overloaded when under queries (although we were surprised) until we did
more testing and found that even the nodes that weren't overseer would
sometimes have that disparity. We'd also tried using the `ADDROLE` API to
force an overseer change in the middle of a test, and whilst the tree
updated to show that the overseer had changed, it made no difference to the
highest CPU load.

Sending queries directly to the non-busy nodes does actually give us
decent response times.

We're quite puzzled by this and would really like some help figuring out
*why* the CPU on one is so much higher. I did try to get the jaeger tracing
working (we already have jaeger in our cluster), but we just kept getting
errors on startup with solr not being able to load the main function...


Thank you in advance!
Cheers
Jonathan


Re: json.facet floods the filterCache

2020-10-22 Thread damienk
I'm doing a nested facet (
https://lucene.apache.org/solr/guide/8_6/json-facet-api.html#nested-facets),
also known as sub-facets, and am using the 'terms' facet.

Digging around more, it looks like I can set 'cacheDf=-1' to disable the use
of the cache.
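
For readers following along, here is a rough sketch of what such a request
might look like, built in Python. The field names (category, brand) and the
facet labels are hypothetical, and the script only assembles the query
string without contacting Solr. The cacheDf=-1 setting is the one mentioned
above; whether it takes effect may depend on the facet method in use, so
treat that as an assumption to verify against your Solr version.

```python
import json
from urllib.parse import urlencode, parse_qs

# Hypothetical field names; only the 'terms' type, the nesting and
# cacheDf=-1 come from the thread.
facet = {
    "categories": {
        "type": "terms",
        "field": "category",
        "cacheDf": -1,
        "facet": {
            "brands": {
                "type": "terms",
                "field": "brand",
                "cacheDf": -1,
            }
        },
    }
}

params = {
    "q": "*:*",
    "rows": 0,
    "json.facet": json.dumps(facet),
}

# URL-encode the parameters as they would be sent to a /select handler.
query_string = urlencode(params)

# Round-trip check: decoding the query string gives back the same facet.
decoded = json.loads(parse_qs(query_string)["json.facet"][0])

print(query_string)
```

Watching the filterCache statistics before and after such a change would
show whether the setting has any effect in a given setup.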

On Fri, 23 Oct 2020 at 00:14, Michael Gibney 
wrote:

> Damien,
> Are you able to share the actual json.facet request that you're using
> (at least just the json.facet part)? I'm having a hard time being
> confident that I'm correctly interpreting when you say "a json.facet
> query on nested facets terms".
> Michael
>
> On Thu, Oct 22, 2020 at 3:52 AM Christine Poerschke (BLOOMBERG/
> LONDON)  wrote:
> >
> > Hi Damien,
> >
> > You mention about JSON term facets, I haven't explored w.r.t. that but
> > we have observed what you describe for JSON range facets and I've started
> > https://issues.apache.org/jira/browse/SOLR-14939 about it.
> >
> > Hope that helps.
> >
> > Regards,
> > Christine
> >
> > From: solr-user@lucene.apache.org At: 10/22/20 01:07:59 To: solr-user@lucene.apache.org
> > Subject: json.facet floods the filterCache
> >
> > Hi,
> >
> > I'm using a json.facet query on nested facets terms and am seeing very high
> > filterCache usage. Is it possible to somehow control this? With a fq it's
> > possible to specify fq={!cache=false}... but I don't see a similar thing
> > json.facet.
> >
> > Kind regards,
> > Damien
> >
> >
>


Re: When are the score values evaluated?

2020-10-22 Thread Taisuke Miyazaki
Thanks.

I analyzed it with explain=true and this is what I found.
Why does it behave this way?

fq=foo:1
bq=foo:(1)^1
bf=sum(200)

If you do this, the score is boosted by bq.
However, if you remove fq, the score is not boosted by bq.
If you then change the boost value of bq to 2, the bq boost is applied
regardless of whether fq is present.

This behavior seems very strange to me. (I'm not familiar with the
internals of Solr or Lucene.)

By the way, this doesn't happen if you change the sum value to a number
that doesn't need to be expressed with an exponent. (20,000,000 is shown as
2.0E7 in the explain output.)
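
A note on the arithmetic, as a minimal sketch rather than a statement about
Solr internals: Lucene scores are 32-bit Java floats, and the Python below
emulates that rounding with the standard struct module. The helper name f32
and the rescaling by 100 are assumptions made for illustration; only the
value 2.0E7 and the boosts 1 and 2 come from this thread. Near 2.0E7
adjacent 32-bit floats are 2 apart, so an added 1.0 is rounded away while an
added 2.0 survives, which would be consistent with the boost of 2 taking
effect when the boost of 1 does not.

```python
import struct


def f32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# The large function value reported in the explain output.
base = f32(2.0e7)

# 20,000,001 lies exactly halfway between two 32-bit floats and rounds
# back to 20,000,000, so a boost of 1.0 disappears.
with_boost_1 = f32(base + 1.0)

# 20,000,002 is the next representable 32-bit float, so a boost of 2.0 stays.
with_boost_2 = f32(base + 2.0)

# Hypothetical rescaling: shrink the large value first, then add the boost.
scaled = f32(base / 100)
scaled_with_boost_1 = f32(scaled + 1.0)

print(with_boost_1 == base)
print(with_boost_2 - base)
print(scaled_with_boost_1 - scaled)
```

If this is indeed the cause, keeping the function value well below 2^24
(16,777,216) should let an added 1.0 survive in the final score.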

Regards,
Taisuke

On Thu, 22 Oct 2020 at 21:41, Erick Erickson wrote:

> You’d get a much better idea of what goes on
> if you added &explain=true and analyzed the
> output. That’d show you exactly what is
> calculated when.
>
> Best,
> Erick
>
> > On Oct 22, 2020, at 4:05 AM, Taisuke Miyazaki <
> miyazakitais...@lifull.com> wrote:
> >
> > Hi,
> >
> > If you use a high value for the score, the values on the smaller scale
> > are ignored.
> >
> > Example :
> > bq = foo:(1.0)^1.0
> > bf = sum(200)
> >
> > When I do this, the additional score for "foo" at 1.0 does not affect the
> > sort order.
> >
> > I'm assuming this is an issue with the precision of the score floating
> > point, is that correct?
> >
> > As a test, if we change the query as follows, the order will change as
> > you would expect, reflecting the additional score of "foo" when it is 1.0
> > bq = foo:(1.0)^10
> > bf = sum(200)
> >
> > How can I avoid this?
> > The idea I'm thinking of at the moment is to divide the whole thing by an
> > appropriate number, such as bf= div(sum(200),100).
> > However, this may or may not work as expected depending on when the
> > floating point operations are done and rounded off.
> >
> > At what point are score's floats rounded?
> >
> > 1. when sorting
> > 2. when calculating the score
> > 3. when evaluating each function for each bq and bf
> >
> > Regards,
> > Taisuke
>
>