On 10/21/2018 11:10 PM, Clemens Wyss DEV wrote:
> For the UpdateRequests it is the "commitWithinMs"-parameter? To me
> this parameter sounds like telling the solr-server I need to see this
> data within "x ms". As we have autoCommit and autoSoftCommit
The commitWithin parameter is effectively equivalent to autoSoftCommit,
except that it applies only to the update request that carries it.
If you wanted to have different timeframes for visibility on some
updates, you could achieve that using commitWithin with a shorter
interval than what's in autoSoftCommit.
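As a rough sketch of that idea, assuming the updates reach Solr's /update handler over HTTP: the helper below adds commitWithin only for updates that need faster visibility and leaves everything else to autoSoftCommit. The base URL, collection name, and class and method names are made up for illustration. In SolrJ the same value can be set with UpdateRequest.setCommitWithin(int) or the SolrClient.add(doc, commitWithinMs) overload.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of SolrJ.
class CommitWithinUrls {

    // Builds an /update URL. A null commitWithinMs omits the parameter,
    // so visibility is left to the server's autoSoftCommit interval.
    static URI updateUri(String baseUrl, String collection, Integer commitWithinMs) {
        String uri = baseUrl + "/" + collection + "/update";
        if (commitWithinMs != null) {
            if (commitWithinMs <= 0) {
                throw new IllegalArgumentException("commitWithinMs must be positive");
            }
            uri += "?commitWithin=" + commitWithinMs;
        }
        return URI.create(uri);
    }

    public static void main(String[] args) {
        String base = "http://localhost:8983/solr"; // assumed local Solr
        // Urgent update: ask for visibility within 2 seconds.
        System.out.println(updateUri(base, "mycollection", 2_000));
        // Ordinary update: rely on autoSoftCommit.
        System.out.println(updateUri(base, "mycollection", null));
    }
}
```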
The ten seconds you have on autoSoftCommit is pretty aggressive. If
your commits are taking 1-2 seconds or less, an interval that small
might be OK ... but Solr will be spending a LOT of resources doing
commits, which can become a performance problem.
The 3 minutes on autoCommit is quite long. I'd probably go with 60
seconds, but a longer value isn't going to hurt anything and will result
in fewer resources being used for that operation. The default in Solr's
example configs is 15 seconds ... which I personally feel is a little
too frequent, but it works very well for a lot of people.
> What about when doing a normal query/search, i.e.
> solrClient.query( solrQuery );
> Where can I reduce the max-search-time I am willing to wait? Or shouldn't I?
In general, this is not something you want to do. But there is a
parameter along those lines, timeAllowed. It's not guaranteed to always
work, depending on which phase of the query is taking a long time, but it
is sometimes very effective:
https://lucene.apache.org/solr/guide/7_5/common-query-parameters.html#CommonQueryParameters-ThetimeAllowedParameter
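For illustration, here is a minimal sketch of passing timeAllowed (in milliseconds) as an ordinary request parameter on a /select URL. The host, collection name, and helper names are assumptions. With SolrJ, SolrQuery.setTimeAllowed(Integer) sets the same parameter. When the limit is hit, Solr may return whatever it has collected so far and mark the response header with partialResults, so it is worth checking for that flag on the client.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of SolrJ.
class TimeAllowedUrls {

    // Builds a /select URL that caps the search phase at timeAllowedMs.
    // Uses the URLEncoder.encode(String, Charset) overload (Java 10+).
    static URI selectUri(String baseUrl, String collection, String q, int timeAllowedMs) {
        String encodedQ = URLEncoder.encode(q, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/" + collection
                + "/select?q=" + encodedQ
                + "&timeAllowed=" + timeAllowedMs);
    }

    public static void main(String[] args) {
        // Assumed local Solr and collection name.
        System.out.println(selectUri("http://localhost:8983/solr", "mycollection", "title:solr", 2_000));
    }
}
```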
Configuring these timeouts is generally done at the client level, not
the request level.
> Does this also mean I should NOT be setting any timeouts (neither connect nor
> so) when creating a SolrClient?
The connect timeout is not a bad thing to have. I'd personally set it
to something around (or less than) five seconds. If it takes longer
than that to establish the connection, it's probably never going to
happen. I've seen fifteen seconds here, which is REALLY long for that
timeout.
Socket timeouts are something that either you don't want, or you want to
be quite long, like two minutes. If you issue a query that takes 30
seconds to run, and you set the socket timeout to 15 seconds, you're
never going to see the result. The client will disconnect before the
server has a chance to respond. Setting the socket timeout just to make
sure the client doesn't stay connected forever is a good idea, but the
timeout must be much longer than you expect a query to ever take.
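To make the two timeouts concrete, here is a small sketch using only the JDK's HttpURLConnection, with the values suggested above: five seconds to connect and two minutes on the socket. The URL and the class and method names are assumptions. In SolrJ 7.x the equivalent knobs are, as far as I know, withConnectionTimeout(int) and withSocketTimeout(int) on HttpSolrClient.Builder, set once when the client is built.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper for illustration only; not part of SolrJ.
class SolrTimeouts {
    // Short: if the connection is not established by now, it probably never will be.
    static final int CONNECT_TIMEOUT_MS = 5_000;
    // Long: must comfortably exceed the slowest query you ever expect.
    static final int SOCKET_TIMEOUT_MS = 120_000;

    // Prepares (but does not open) a connection with both timeouts applied.
    static HttpURLConnection prepare(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(CONNECT_TIMEOUT_MS);
            conn.setReadTimeout(SOCKET_TIMEOUT_MS);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed local Solr URL; no network traffic happens here.
        HttpURLConnection conn = prepare("http://localhost:8983/solr/mycollection/select");
        System.out.println("connect timeout ms: " + conn.getConnectTimeout());
        System.out.println("socket timeout ms: " + conn.getReadTimeout());
    }
}
```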
>> stacktrace for the threads that worry you. Do you have that?
> All the same:
> Thread.sleep(long) line: not available [native method]
> IdleConnectionEvictor$1.run() line: 66
> Thread.run() line: 748
The IdleConnectionEvictor class was added in 5.5.3 and 6.2.0 by this issue:
https://issues.apache.org/jira/browse/SOLR-9290
Shalin worked on that issue; maybe they can shed some light on it and
indicate whether there should be many threads running that code. I
won't discount the possibility that the thread count is excessive. It
does seem like you wouldn't need more than one evictor thread per
client, but I didn't design it, so I can't say for sure.
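If it helps to put a number on it, a quick diagnostic along these lines could be called from inside the application to count live threads whose stack currently includes an IdleConnectionEvictor frame. This is only a sketch: the class and method names are made up, and it matches on a fragment of the class name because I'm not assuming a particular package. Comparing the count with the number of SolrClient instances you have created should show whether it is one evictor per client or something more.

```java
import java.util.Arrays;
import java.util.Map;

// Hypothetical diagnostic for illustration only.
class EvictorThreadCount {

    // Counts live threads with at least one stack frame whose class name
    // contains the given fragment (for example "IdleConnectionEvictor").
    static long countThreadsRunning(String classNameFragment) {
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
        return traces.values().stream()
                .filter(stack -> Arrays.stream(stack)
                        .anyMatch(frame -> frame.getClassName().contains(classNameFragment)))
                .count();
    }

    public static void main(String[] args) {
        System.out.println("evictor threads: " + countThreadsRunning("IdleConnectionEvictor"));
    }
}
```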
Thanks,
Shawn