On 10/16/2018 6:04 AM, zoolette wrote:
We are currently running SOLR 6.6 in our production environment.
At the end of August, I planned to upgrade SOLR to 7.4 (now 7.5),
but I have run into some trouble.
Our master SOLR is replicated to a slave SOLR. I tried to upgrade the
replica first, and this is the one that is giving me trouble.
This server is shared: it hosts both a MySQL replication server and the
SOLR replica.
The server runs Debian 7 (wheezy) and Java 1.8.0u45.
The SOLR Java heap is configured with a 12G Xmx value.

On this SOLR instance there are 6 cores.
- 2 cores are dedicated to the main search on 2 different websites (they are
20GB each)
- 2 cores are dedicated to the autocompletion feature of these 2 websites
(~2GB each)
- 2 other very small cores are occasionally used by one of the websites

With about 45GB of index data and a 12GB heap, ideal performance is going to require 64GB of total memory -- and that's if Solr is the only software on the machine.  You might be able to get good performance with less memory, but your query rates are very high, so that's probably not a good idea.
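The arithmetic behind that estimate can be sketched roughly as follows. This is only an illustration of the rule of thumb (heap plus enough OS disk cache to hold the entire index), not a precise sizing tool. The core names and the list of typical RAM sizes are assumptions made up for the example, and the two very small cores are treated as negligible.

```python
# Rough sizing sketch: ideal RAM = JVM heap + OS disk cache large enough
# to hold the whole index. Core names are hypothetical labels.
index_sizes_gb = {
    "search_site_1": 20,
    "search_site_2": 20,
    "autocomplete_site_1": 2,
    "autocomplete_site_2": 2,
}
heap_gb = 12

index_total_gb = sum(index_sizes_gb.values())   # roughly the "about 45GB" above
minimum_ideal_gb = index_total_gb + heap_gb     # before the OS and other software


def next_common_ram_size(required_gb, sizes=(8, 16, 32, 64, 128, 256)):
    """Return the smallest entry in `sizes` (an assumed list of typical
    server configurations) that is at least `required_gb`."""
    for size in sizes:
        if size >= required_gb:
            return size
    raise ValueError("requirement exceeds the largest listed size")


recommended_gb = next_common_ram_size(minimum_ideal_gb)
print(index_total_gb, minimum_ideal_gb, recommended_gb)
```

Anything else on the machine, such as MySQL, has to be added on top of `minimum_ideal_gb` before rounding up.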

Adding MySQL, if the databases are of any significant size, could require significantly more memory.

The SOLR 7.5 instance is up and ready, but no traffic is sent to it.
Of the 2 websites, one generates approximately 5000 to 8000
requests / minute on SOLR across 2 handlers.
One search handler is dedicated to complex searches from the search bar, and
the other handler handles back-end searches, such as returning the document
for a specified id or returning the chained documents, that kind of thing.

The second website uses the same handlers as the first one; the only
difference is that it generates fewer SOLR requests: 1000 to 2000 requests
/ minute.

As I mentioned above, this is a significant query rate. Handling that with only two servers will require a LOT of memory for caching purposes, and you'll want the servers to be dedicated to Solr -- not running MySQL as well.
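To put the quoted per-minute figures in per-second terms, here is a small sketch. The site labels are hypothetical; the ranges are the ones given in the question.

```python
# Convert the per-minute request ranges quoted above into per-second rates.
site_rates_per_minute = {
    "larger_site": (5000, 8000),
    "smaller_site": (1000, 2000),
}

low_total = sum(low for low, _ in site_rates_per_minute.values())
high_total = sum(high for _, high in site_rates_per_minute.values())

low_qps = low_total / 60
high_qps = high_total / 60
print(f"{low_qps:.0f} to {high_qps:.0f} queries per second")
```

With both sites directed at a single instance, that instance would see the combined rate.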

To upgrade the master, I need to send all the SOLR traffic to this instance.
I first redirected the bigger website. The response time grew a lot, but SOLR
stabilized quickly. After 10 minutes, as all was OK, I redirected the
website with the lower traffic rate. Immediately, the number of Java
processes increased quickly, in munin the device busy metric rose to 100%
(read operations), and the load average of the server grew drastically. It
reached 120, and SOLR stopped responding.

A high load average often means that there's a lot of disk I/O, and processes are spending a lot of time waiting for that I/O. On Linux, run the "top" program and look for the iowait percentage, sometimes abbreviated "wa".  This should be as close to zero as you can get it.  Even a small number in iowait can cause major performance issues.  For Solr, whenever Solr must actually read the disk (instead of reading index data from memory -- the OS disk cache) performance is going to be terrible.

https://wiki.apache.org/solr/SolrPerformanceProblems#RAM
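If you would rather log iowait over time than watch "top", the same figure can be derived from /proc/stat on Linux. The following is a minimal sketch, not a monitoring tool: the function names are made up for illustration, and it assumes the standard field order of the aggregate "cpu" line (user, nice, system, idle, iowait, irq, softirq, steal).

```python
import time


def parse_cpu_line(stat_text):
    """Return (iowait_ticks, total_ticks) from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # Only the first eight counters are summed; the guest counters
            # that follow are already included in user and nice.
            ticks = [int(value) for value in fields[1:9]]
            return ticks[4], sum(ticks)
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before_text, after_text):
    """Percentage of CPU time spent in iowait between two /proc/stat samples."""
    io_before, total_before = parse_cpu_line(before_text)
    io_after, total_after = parse_cpu_line(after_text)
    elapsed = total_after - total_before
    if elapsed <= 0:
        return 0.0
    return 100.0 * (io_after - io_before) / elapsed


def sample_iowait(interval_seconds=5.0, path="/proc/stat"):
    """Take two samples of /proc/stat and return the iowait percentage."""
    with open(path) as handle:
        before = handle.read()
    time.sleep(interval_seconds)
    with open(path) as handle:
        after = handle.read()
    return iowait_percent(before, after)
```

Calling `sample_iowait()` while the second website's traffic is being redirected would show whether the load spike coincides with a jump in iowait.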

For this upgrade, I also changed the basic field types from Trie fields to
Point fields, but I don't think that makes a difference.

Trie fields have really good performance for both range queries and single value lookups.  Point fields have better performance for range queries, but absolutely terrible performance for field:value (single value lookup) queries.

And the most incomprehensible part is that everything works fine in SOLR 6.6.
I can switch all the traffic without any issue.

Does anybody have an idea of what could be going wrong? Debian version? Java
version? Configuration problem?

Best guess is one (or both) of these problems:
1) Limitations of Point field types
2) Not enough memory.

Another possibility, which I think is less likely but I can't rule out with the info I have, is that a 12GB heap is big enough for 6.6, but not quite big enough for the same indexes on 7.x.  Making the heap larger would answer that question.

Thanks,
Shawn
