Re: SOLR cache tuning

2020-06-01 Thread Tarun Jain
Hi,
Thanks for the replies so far.
Walter: We have a few more Solr cores, so the JVM is sized accordingly. I know 
we can separate the cores, but for easier maintainability we have only one core. 
Also, only one core is used the majority of the time. 
Jörn: I don't have a particular performance number in mind. I am exploring what 
kind of tuning can be done on a read-only slave on a server with tons of RAM.
Earlier today, while reading the SOLR documentation, I saw that 
CaffeineCache is the preferred caching implementation. So I switched my Solr 
core to use CaffeineCache, and the benchmarking results are very good. The 
time to read 1.8 million documents has gone down from 210+ secs to ~130 
secs just by using CaffeineCache! So a gain of roughly 40%. 
I would recommend switching to CaffeineCache ASAP, as it seems to be a simple 
change to get a very good speedup. 
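
As a rough check of the figure above, the two timings can be turned into a time saving and a throughput gain. This is only a sketch that takes "210+ secs" as 210 and "~130 secs" as 130; the function names are made up for illustration.

```python
# Timings reported above for reading 1.8 million documents.
DOCS = 1_800_000
BEFORE_SECS = 210.0  # "210+ secs" before the switch, taken at face value
AFTER_SECS = 130.0   # "~130 secs" with CaffeineCache, taken at face value


def time_reduction(before: float, after: float) -> float:
    """Fraction of wall-clock time saved."""
    return (before - after) / before


def throughput_gain(before: float, after: float) -> float:
    """Relative increase in documents read per second."""
    return before / after - 1.0


if __name__ == "__main__":
    print(f"time saved:      {time_reduction(BEFORE_SECS, AFTER_SECS):.0%}")
    print(f"throughput gain: {throughput_gain(BEFORE_SECS, AFTER_SECS):.0%}")
    print(f"docs/sec before: {DOCS / BEFORE_SECS:,.0f}")
    print(f"docs/sec after:  {DOCS / AFTER_SECS:,.0f}")
```

With these inputs the run takes about 38% less time, which is the same as reading about 62% more documents per second.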
I tried various sizes, and it looks like the default size of 512 is fine for 
filterCache & queryResultCache. The documentCache in my case gives slightly 
better results with size=8192.
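
For anyone who wants to try the same settings, here is a small sketch that prints the three cache entries as they might appear in solrconfig.xml. The sizes are the ones mentioned above; initialSize equal to size and autowarmCount="0" are assumptions carried over from the original configuration, and the helper names are only illustrative.

```python
import xml.etree.ElementTree as ET
from typing import Mapping

# Sizes follow the message above: the default 512 for filterCache and
# queryResultCache, 8192 for documentCache.
CACHES = {
    "filterCache": 512,
    "queryResultCache": 512,
    "documentCache": 8192,
}


def cache_element(name: str, size: int) -> ET.Element:
    """Build one cache entry that uses solr.CaffeineCache."""
    return ET.Element(
        name,
        {
            "class": "solr.CaffeineCache",
            "size": str(size),
            "initialSize": str(size),  # assumption: same as size
            "autowarmCount": "0",      # assumption: no autowarming
        },
    )


def render_caches(caches: Mapping[str, int]) -> str:
    """Return the cache entries as XML text, one element per line."""
    return "\n".join(
        ET.tostring(cache_element(name, size), encoding="unicode")
        for name, size in caches.items()
    )


if __name__ == "__main__":
    print(render_caches(CACHES))
```

The printed elements would go inside the query section of solrconfig.xml, replacing the existing entries of the same name.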
If anyone else has other tips on improving performance by changing 
parameters, please let me know. Thanks for the replies so far.
Tarun Jain-=-

On Monday, June 1, 2020, 01:55:56 PM EDT, Jörn Franke wrote:
 
You should not have other processes/containers running on the same node. They 
can disturb your OS cache and make things slow: e.g., if the other 
processes also read files, they can evict Solr's data from the OS 
cache, and then the OS cache needs to be filled again.

What performance do you have now and what performance do you expect?

For full queries, I would try to export all the data daily and offer it as a 
simple HTTPS download or on an object store. Maybe when you process the 
documents for indexing you can already put them on an object store or 
similar, so you don't need Solr at all to export all of the documents.


See also Walter's message.

> On 01.06.2020 at 17:29, Tarun Jain wrote:
> 
> Hi, I have a SOLR installation in a master-slave configuration. The slave is 
> used only for reads and the master for writes.
> I wanted to know if there is anything I can do to improve the performance of 
> the read-only slave instance.
> I am running SOLR 8.5 and Java 14. The JVM has 24 GB of RAM allocated. The 
> server has 256 GB of RAM with about 50 GB free (the rest is used by other 
> services on the server). The index is 15 GB in size with about 2 million 
> documents.
> We do a lot of queries where documents are fetched using filter queries, and 
> a few times all 2 million documents are read. My initial idea to speed up 
> SOLR is that, given the amount of memory available, SOLR should be able to 
> keep the entire index on the heap (I know the OS will also cache the disk 
> blocks). 
> My solrconfig has the following:
>  20
>  class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0" />
>  autowarmCount="0" />
>  initialSize="8192" autowarmCount="0" />
>  class="solr.search.LRUCache" size="10" initialSize="0" autowarmCount="10" regenerator="solr.NoOpRegenerator" />
>  true
>  20
>  200
>  false
>  2
> I have modified the documentCache size to 8192 from 512 but it has not helped 
> much. 
> I know this question has probably been asked a few times and I have read 
> everything I could find out about SOLR cache tuning. I am looking for some 
> more ideas.
> 
> Any ideas?
> Tarun Jain-=-  

Re: SOLR cache tuning

2020-06-01 Thread Jörn Franke
You should not have other processes/containers running on the same node. They 
can disturb your OS cache and make things slow: e.g., if the other 
processes also read files, they can evict Solr's data from the OS 
cache, and then the OS cache needs to be filled again.

What performance do you have now and what performance do you expect?

For full queries, I would try to export all the data daily and offer it as a 
simple HTTPS download or on an object store. Maybe when you process the 
documents for indexing you can already put them on an object store or 
similar, so you don't need Solr at all to export all of the documents.
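
If the daily export does have to come out of Solr itself, one possible way is to page through the whole index with a cursor and write the documents to a file that can then be uploaded or served. The sketch below is only an illustration of that idea: the URL and core name are placeholders, the uniqueKey field is assumed to be "id", and the function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator

# Hypothetical endpoint: host, port and core name are placeholders.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def http_fetch(params: Dict[str, str]) -> dict:
    """Run one /select request and return the decoded JSON response."""
    url = SOLR_SELECT_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def iter_all_docs(
    fetch: Callable[[Dict[str, str]], dict] = http_fetch,
    rows: int = 1000,
    unique_key: str = "id",  # assumption: the schema's uniqueKey is "id"
) -> Iterator[dict]:
    """Yield every document, one page at a time, using cursorMark."""
    cursor = "*"
    while True:
        page = fetch({
            "q": "*:*",
            "sort": f"{unique_key} asc",  # cursor paging needs a sort on the uniqueKey
            "rows": str(rows),
            "cursorMark": cursor,
            "wt": "json",
        })
        yield from page["response"]["docs"]
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged cursor means no more results
            break
        cursor = next_cursor


def export_jsonl(path: str, fetch: Callable[[Dict[str, str]], dict] = http_fetch) -> int:
    """Write all documents to a JSON-lines file; return the number written."""
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for doc in iter_all_docs(fetch):
            out.write(json.dumps(doc) + "\n")
            count += 1
    return count
```

Calling export_jsonl("export.jsonl") once a day would produce the file to publish; the fetch parameter exists only so the paging logic can be exercised without a running Solr.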


See also Walter's message.



Re: SOLR cache tuning

2020-06-01 Thread Walter Underwood
Reading all the documents is going to be slow. If you want to do that, use a 
database.

You do NOT keep all of the index in heap. Solr doesn’t work like that.

Your JVM heap is probably way too big for 2 million documents, but I doubt that 
is the performance issue. We use an 8 GB heap for all of our Solr instances, 
including one with about 5 million docs per shard.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)
