Solaris Install Package

2019-10-07 Thread Andrew Corbett
I have been researching the possibility of installing Solr on servers 
running the Solaris 10 and 11 operating systems. Solaris isn't mentioned in the 
documentation. Would running Solr on these servers be possible, or would I need to 
make a feature request?


Re: How to block expensive solr queries

2019-10-07 Thread Wei
Hi Mikhail,

Yes, I have the timeAllowed parameter configured; still, in this case it
doesn't seem to prevent the stats request from blocking other normal
queries.  Is it possible to drop the request before Solr executes it, maybe
at the Jetty request filter?
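As a rough illustration of that idea: a standard servlet filter mapped in front
of SolrDispatchFilter could reject matching requests before Solr ever sees them.
This is only a sketch under assumptions -- the class name and the blocked
pattern are hypothetical, and it would still need to be registered in the
webapp's web.xml or the Jetty configuration for your deployment.

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical filter: rejects requests whose query string contains a
    // blocked pattern (here stats.calcdistinct) with HTTP 403 before they
    // reach Solr's own dispatch filter.
    public class BlockExpensiveQueriesFilter implements Filter {

        @Override
        public void init(FilterConfig config) {
            // nothing to initialize
        }

        @Override
        public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
                throws IOException, ServletException {
            if (req instanceof HttpServletRequest) {
                String qs = ((HttpServletRequest) req).getQueryString();
                if (qs != null && qs.contains("stats.calcdistinct=true")) {
                    ((HttpServletResponse) resp).sendError(
                        HttpServletResponse.SC_FORBIDDEN,
                        "stats.calcdistinct queries are blocked on this node");
                    return;
                }
            }
            // everything else passes through untouched
            chain.doFilter(req, resp);
        }

        @Override
        public void destroy() {
            // nothing to clean up
        }
    }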

Thanks,
Wei

On Mon, Oct 7, 2019 at 1:39 PM Mikhail Khludnev  wrote:

> Hello, Wei.
>
> Have you tried abandoning heavy queries with the timeAllowed parameter?
>
> https://lucene.apache.org/solr/guide/8_1/common-query-parameters.html#CommonQueryParameters-ThetimeAllowedParameter
>
> It may or may not be able to stop the stats component;
>
> https://github.com/apache/lucene-solr/blob/25eda17c66f0091dbf6570121e38012749c07d72/solr/core/src/test/org/apache/solr/cloud/CloudExitableDirectoryReaderTest.java#L223
> can clarify that.
>
> On Mon, Oct 7, 2019 at 8:19 PM Wei  wrote:
>
> > Hi,
> >
> > Recently we encountered a problem where Solr Cloud query latency suddenly
> > increased and many simple queries with small recall timed out. After
> > digging a bit I found that the root cause is some stats queries that happen
> > at the same time, such as
> >
> >
> > /solr/mycollection/select?stats=true&stats.field=unique_ids&stats.calcdistinct=true
> >
> >
> >
> > I see unique_ids is a high-cardinality field, so this query is quite
> > expensive. But why does a small volume of such queries block other queries
> > and make simple queries time out?  I checked the Solr thread pool and see
> > there are plenty of idle threads available.  We are using Solr 7.6.2 with a
> > 10 shard cloud set up.
> >
> > Is there a way to block certain Solr queries based on URL pattern, i.e.
> > ignore the stats.calcdistinct request in this case?
> >
> >
> > Thanks,
> >
> > Wei
> >
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>


investigating high heap memory usage particularly on overseer / collection leaders

2019-10-07 Thread dshih
3-node SOLR 7.4.0
24gb max heap memory
13 collections, each with 500mb-2gb index (on disk)

We are investigating high heap memory usage/spikes with our SOLR cluster
(details above).  After rebooting the cluster, all three instances stay
under 2gb for about a day.  Then suddenly, one instance (srch01 in the below
graph) spikes to about 7.5gb and begins a cycle of 3gb-7.5gb ups-and-downs. 
On this cluster, srch01 is both the overseer and the leader for all
collections.  A few days later, the same trend begins occurring for another
node (srch02).

Are there known usage patterns that would cause this kind of memory usage
with SOLR?  In particular, it seems odd that for days it would only affect the
overseer/leader node.  Also, any tips on investigation?  We
haven't been able to deduce much from visualvm profiling.

Additional context: for years we set max heap memory to 4gb, but our SOLR
instances recently began to OOM.  Increasing to 8gb helped, but the OOMs
still eventually occurred.  That is how we eventually ended up at 24gb
(following SOLR documentation saying 10-20gb is not uncommon for production
instances).  But the recent change is what makes us suspect that some
client usage pattern is the root cause.

 



--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: How to block expensive solr queries

2019-10-07 Thread Mikhail Khludnev
Hello, Wei.

Have you tried abandoning heavy queries with the timeAllowed parameter?
https://lucene.apache.org/solr/guide/8_1/common-query-parameters.html#CommonQueryParameters-ThetimeAllowedParameter
It may or may not be able to stop the stats component;
https://github.com/apache/lucene-solr/blob/25eda17c66f0091dbf6570121e38012749c07d72/solr/core/src/test/org/apache/solr/cloud/CloudExitableDirectoryReaderTest.java#L223
can clarify that.
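For what it's worth, the cap can also be set per request from SolrJ. A minimal
sketch, assuming a collection reachable at the URL below (the URL, query and
the 1000 ms limit are just placeholders); note that timeAllowed is specified in
milliseconds and yields partial results rather than rejecting the request:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class TimeAllowedExample {
        public static void main(String[] args) throws Exception {
            try (HttpSolrClient client = new HttpSolrClient.Builder(
                    "http://localhost:8983/solr/mycollection").build()) {
                SolrQuery q = new SolrQuery("*:*");
                // Ask Solr to stop collecting results after roughly one second
                // of search time; exceeding it marks the response as partial.
                q.setTimeAllowed(1000);
                QueryResponse rsp = client.query(q);
                System.out.println("numFound=" + rsp.getResults().getNumFound());
            }
        }
    }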

On Mon, Oct 7, 2019 at 8:19 PM Wei  wrote:

> Hi,
>
> Recently we encountered a problem where Solr Cloud query latency suddenly
> increased and many simple queries with small recall timed out. After
> digging a bit I found that the root cause is some stats queries that happen
> at the same time, such as
>
>
> /solr/mycollection/select?stats=true&stats.field=unique_ids&stats.calcdistinct=true
>
>
>
> I see unique_ids is a high-cardinality field, so this query is quite
> expensive. But why does a small volume of such queries block other queries
> and make simple queries time out?  I checked the Solr thread pool and see
> there are plenty of idle threads available.  We are using Solr 7.6.2 with a
> 10 shard cloud set up.
>
> Is there a way to block certain Solr queries based on URL pattern, i.e.
> ignore the stats.calcdistinct request in this case?
>
>
> Thanks,
>
> Wei
>


-- 
Sincerely yours
Mikhail Khludnev


How to block expensive solr queries

2019-10-07 Thread Wei
Hi,

Recently we encountered a problem where Solr Cloud query latency suddenly
increased and many simple queries with small recall timed out. After
digging a bit I found that the root cause is some stats queries that happen
at the same time, such as

/solr/mycollection/select?stats=true&stats.field=unique_ids&stats.calcdistinct=true



I see unique_ids is a high-cardinality field, so this query is quite
expensive. But why does a small volume of such queries block other queries
and make simple queries time out?  I checked the Solr thread pool and see
there are plenty of idle threads available.  We are using Solr 7.6.2 with a
10 shard cloud set up.

Is there a way to block certain Solr queries based on URL pattern, i.e.
ignore the stats.calcdistinct request in this case?


Thanks,

Wei


Re: json.facet throws ClassCastException

2019-10-07 Thread Mikhail Khludnev
Note: it seems this is already addressed in
https://issues.apache.org/jira/browse/SOLR-12330
https://gitbox.apache.org/repos/asf?p=lucene-solr.git;a=commitdiff;h=bf69a40#patch2
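For reference, a minimal SolrJ sketch of the request shape Andrea describes
below: the json.facet value must parse as a JSON object (a Map), so the outer
braces matter. The collection name, field name and facet limits here are only
placeholders, not taken from the original report.

    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.params.ModifiableSolrParams;

    public class JsonFacetExample {
        public static void main(String[] args) throws Exception {
            try (HttpSolrClient client = new HttpSolrClient.Builder(
                    "http://localhost:8983/solr/mycollection").build()) {
                ModifiableSolrParams params = new ModifiableSolrParams();
                params.set("q", "*:*");
                params.set("rows", 0);
                // Without the outer { } the parameter arrives as a plain String
                // and FacetModule fails with the ClassCastException quoted below.
                params.set("json.facet",
                    "{prod:{type:terms,field:product,mincount:1,limit:8}}");
                QueryResponse rsp = client.query(params);
                System.out.println(rsp.getResponse().get("facets"));
            }
        }
    }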



On Sat, Oct 5, 2019 at 10:43 AM Andrea Gazzarini 
wrote:

> Hi, the problem should be caused by the missing surrounding curly brackets.
> That is, your query is
>
> json.facet=prod:{type:terms,field:product,mincount:1,limit:8}
>
> instead it should be
>
> json.facet={prod:{type:terms,field:product,mincount:1,limit:8}}
>
> which causes the wrong interpretation of the "json.facet" parameter
> (String instead of Map).
>
> Cheers,
> Andrea
>
> On 04/10/2019 22:55, Mikhail Khludnev wrote:
> > Gosh, obviously. See the clue:
> >
> https://github.com/apache/lucene-solr/blob/7d3dcd220f92f25a997cf1559a91b6d9e1b57c6d/solr/core/src/java/org/apache/solr/search/facet/FacetModule.java#L78
> >
> > On Fri, Oct 4, 2019 at 10:47 PM Webster Homer <
> > webster.ho...@milliporesigma.com> wrote:
> >
> >> Sometimes it comes back in the reply
> >> "java.lang.ClassCastException: java.lang.String cannot be cast to
> >> java.util.Map\n\tat
> >>
> org.apache.solr.search.facet.FacetModule.prepare(FacetModule.java:78)\n\tat
> >>
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:269)\n\tat
> >>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)\n\tat
> >> org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)\n\tat
> >>
> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:710)\n\tat
> >> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:516)\n\tat
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)\n\tat
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)\n\tat
> >>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1751)\n\tat
> >>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)\n\tat
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat
> >>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
> >>
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\n\tat
> >>
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)\n\tat
> >>
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)\n\tat
> >>
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\n\tat
> >>
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)\n\tat
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat
> >>
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)\n\tat
> >>
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)\n\tat
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat
> >>
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)\n\tat
> >>
> org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:169)\n\tat
> >>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\n\tat
> >> org.eclipse.jetty.server.Server.handle(Server.java:534)\n\tat
> >> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)\n\tat
> >>
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)\n\tat
> >> org.eclipse.jetty.io
> .AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)\n\tat
> >> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)\n\tat
> >> org.eclipse.jetty.io
> .SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\n\tat
> >>
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)\n\tat
> >>
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)\n\tat
> >>
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)\n\tat
> >>
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)\n\tat
> >>
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)\n\tat
> >> java.lang.Thread.run(Thread.java:748)\n",
> >>
> >> -Original Message-
> >> From: Mikhail Khludnev 
> >> Sent: Friday, October 04, 2019 2:28 PM
> >> To: solr-user 
> >> Subject: Re: json.facet throws ClassCastException
> >>
> >> Hello, Webster.
> >>
> >> Have you managed to capture stacktrace?
> >>
> >> On Fri, Oct 4, 2019 at 8:24 PM Webster Homer <
> >> webster.ho...@milliporesigma.com> wrote:
> >>
> >>> I'm trying to understand what is wrong with my query or collection.
> >>>
> >>> I have a functioning solr schema and collection. I'm running Solr 7.2
> >>>
> >>> When I run with a facet.field it works, but 

Re: Metrics API - Documentation

2019-10-07 Thread Emir Arnautović
Hi Richard,
We do not use the API to collect metrics but JMX; I believe those expose the
same values (I did not verify it in the code). You can see how we turned those
metrics into reports/charts, or even use our agent to send data to Prometheus:
https://github.com/sematext/sematext-agent-integrations/tree/master/solr


You can also find links to Solr metrics-related blog posts in that repo. If you
find that managing your own monitoring stack is overwhelming, you can try our
Solr integration.

HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 7 Oct 2019, at 12:40, Richard Goodman  wrote:
> 
> Hi there,
> 
> I'm currently working on using the prometheus exporter to provide some 
> detailed insights for our Solr Cloud clusters.
> 
> Using the provided template killed our prometheus server, as well as the 
> exporter, due to the size of our clusters (each cluster is around 96 nodes, 
> ~300 collections with 3-way replication and 16 shards), so you can imagine the 
> amount of data that comes through /admin/metrics when it isn't filtered down 
> first.
> 
> I've begun working on writing my own template to reduce the amount of data 
> being requested; it's working fine, and I'm starting to build some nice 
> graphs in Grafana.
> 
> The only difficulty I'm having with this is that I'm struggling to find decent 
> documentation on the metrics themselves. I was using the resources "metrics 
> reporting - metrics-api" and "monitoring solr with prometheus and grafana", 
> but there is a lack of information on most metrics.
> 
> For example:
> "ADMIN./admin/collections.totalTime":6715327903,
> I understand this is a counter; however, I'm not sure what unit it is 
> represented in when displayed, for example:
>
> [image: image.png]
>
> A latency of 1mil; not sure if this means milliseconds, million, etc. 
> Another example would be the GC metrics:
>   "gc.ConcurrentMarkSweep.count":7,
>   "gc.ConcurrentMarkSweep.time":1247,
>   "gc.ParNew.count":16759,
>   "gc.ParNew.time":884173,
> Which, when displayed, doesn't give the clearest insight into what the unit 
> is:
>
> [image: image.png]
>
> If anyone has any advice / guidance, that would be greatly appreciated. If 
> there isn't documentation for the API, then that is also something I'll look 
> into helping to contribute.
> 
> Thanks,
> -- 
> Richard Goodman



Metrics API - Documentation

2019-10-07 Thread Richard Goodman
Hi there,

I'm currently working on using the prometheus exporter to provide some
detailed insights for our Solr Cloud clusters.

Using the provided template killed our prometheus server, as well as the
exporter, due to the size of our clusters (each cluster is around 96 nodes,
~300 collections with 3-way replication and 16 shards), so you can imagine
the amount of data that comes through /admin/metrics when it isn't filtered
down first.
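As an aside, the metrics API itself can be narrowed before an exporter ever sees
the data, using its group / prefix parameters. A rough SolrJ sketch under
assumptions -- the node URL and the chosen prefixes are placeholders, and I have
not checked this against any particular 7.x release:

    import org.apache.solr.client.solrj.SolrRequest;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.request.GenericSolrRequest;
    import org.apache.solr.common.params.ModifiableSolrParams;
    import org.apache.solr.common.util.NamedList;

    public class MetricsFilterExample {
        public static void main(String[] args) throws Exception {
            // /admin/metrics is a node-level API, so the base URL points at the
            // node rather than at a collection.
            try (HttpSolrClient client = new HttpSolrClient.Builder(
                    "http://localhost:8983/solr").build()) {
                ModifiableSolrParams params = new ModifiableSolrParams();
                // Only JVM metrics, and only the gc.* and memory.heap.* families,
                // instead of everything the node knows about.
                params.set("group", "jvm");
                params.set("prefix", "gc.,memory.heap");
                GenericSolrRequest req = new GenericSolrRequest(
                    SolrRequest.METHOD.GET, "/admin/metrics", params);
                NamedList<Object> rsp = client.request(req);
                System.out.println(rsp.get("metrics"));
            }
        }
    }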

I've begun working on writing my own template to reduce the amount of data
being requested; it's working fine, and I'm starting to build some nice
graphs in Grafana.

The only difficulty I'm having with this is that I'm struggling to find decent
documentation on the metrics themselves. I was using the resources "metrics
reporting - metrics-api" and "monitoring solr with prometheus and grafana",
but there is a lack of information on most metrics.

For example:

"ADMIN./admin/collections.totalTime":6715327903,

I understand this is a counter; however, I'm not sure what unit it is
represented in when displayed, for example:

[image: image.png]

A latency of 1mil; not sure if this means milliseconds, million, etc.
Another example would be the GC metrics:

  "gc.ConcurrentMarkSweep.count":7,
  "gc.ConcurrentMarkSweep.time":1247,
  "gc.ParNew.count":16759,
  "gc.ParNew.time":884173,

Which, when displayed, doesn't give the clearest insight into what the unit is:

[image: image.png]
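For what it's worth, the JVM's own GC MXBeans expose equivalent counters whose
units are documented: getCollectionCount() is a plain count and
getCollectionTime() is accumulated collection time in milliseconds. A small
sketch for comparison, on the assumption (not verified here) that Solr's gc.*
metrics mirror these beans:

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class GcUnitsCheck {
        public static void main(String[] args) {
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                // Per the JDK javadoc: getCollectionCount() is a counter and
                // getCollectionTime() is the approximate accumulated collection
                // elapsed time in milliseconds.
                System.out.printf("%s: count=%d, time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
            }
        }
    }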

If anyone has any advice / guidance, that would be greatly
appreciated. If there isn't documentation for the API, then that is
also something I'll look into helping to contribute.

Thanks,

-- 

Richard Goodman