Re: SOLR cache tuning

2020-06-01 Thread Tarun Jain
Hi,
Thanks for the replies so far.
Walter: We have a few more Solr cores, so the JVM is sized accordingly. I know we can 
separate the cores into their own instances, but for easier maintainability we keep them 
in one. Also, only one core is in use the majority of the time.
Jörn: I don't have a particular performance number in mind. I am exploring what kind of 
tuning can be done on a read-only slave on a server with plenty of RAM.
Earlier today, while reading the SOLR documentation, I saw that CaffeineCache is the 
preferred caching implementation. So I switched my Solr core to use CaffeineCache, and 
the benchmarking results are very good. The read time for 1.8 million documents has gone 
down from 210+ seconds to ~130 seconds just by using CaffeineCache, roughly a 40% gain.
I would recommend switching to CaffeineCache ASAP, as it is a simple change that gives a 
very good speed-up.
I tried various sizes: the default size of 512 looks right for filterCache & 
queryResultCache, while the documentCache in my case gives slightly better results with 
size=8192 (a sketch of the resulting cache config is below).
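For reference, a minimal sketch of what the cache section can look like after the switch 
(assuming Solr 8.5, where solr.CaffeineCache is available; the sizes are simply the ones 
discussed above, not a recommendation):

<!-- solrconfig.xml, inside the <query> section: same settings, CaffeineCache implementation -->
<filterCache class="solr.CaffeineCache" size="512" initialSize="512" autowarmCount="0"/>
<queryResultCache class="solr.CaffeineCache" size="512" initialSize="512" autowarmCount="0"/>
<documentCache class="solr.CaffeineCache" size="8192" initialSize="8192" autowarmCount="0"/>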
If anyone else has other tips on improving performance by changing parameters, please let 
me know.
Tarun Jain
-=-

On Monday, June 1, 2020, 01:55:56 PM EDT, Jörn Franke wrote:
 
You should not have other processes/containers running on the same node. They can degrade 
your OS cache and make things slow; e.g., if the other processes also read files, they can 
evict Solr's data from the OS cache, and then the OS cache needs to be filled again.

What performance do you have now and what performance do you expect?

For the full-result queries I would try to export all the data daily and offer it as a 
simple HTTPS download or on an object store. Maybe when you process the documents for 
indexing you can already put them on an object store or similar, so you don't need Solr 
at all to export all of the documents.
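If the full export does have to come straight from Solr, a minimal sketch of the /export 
handler (an assumption on my part that it fits here; it streams the whole result set but 
requires a sort and an fl list of docValues fields, and "slave-host"/"core1" are 
placeholders):

http://slave-host:8983/solr/core1/export?q=*:*&sort=id+asc&fl=id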


See also Walter's message.

> On 01.06.2020 at 17:29, Tarun Jain wrote:
> 
> Hi,
> I have a SOLR installation in master-slave configuration. The slave is used 
> only for reads and the master for writes.
> I wanted to know if there is anything I can do to improve the performance of 
> the read-only slave instance?
> I am running SOLR 8.5 and Java 14. The JVM has 24 GB of RAM allocated. The 
> server has 256 GB of RAM with about 50 GB free (the rest being used by other 
> services on the server). The index is 15 GB in size with about 2 million 
> documents.
> We do a lot of queries where documents are fetched using filter queries, and 
> a few times all 2 million documents are read. My initial idea to speed up 
> SOLR is that, given the amount of memory available, SOLR should be able to 
> keep the entire index on the heap (I know the OS will also cache the disk 
> blocks).
> My solrconfig has the following:
> 20  <!-- element name stripped by the mail archive -->
> <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
> <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
> <documentCache class="solr.LRUCache" size="8192" initialSize="8192" autowarmCount="0"/>
> <cache name="perSegFilter" class="solr.search.LRUCache" size="10" initialSize="0" 
> autowarmCount="10" regenerator="solr.NoOpRegenerator"/>
> <enableLazyFieldLoading>true</enableLazyFieldLoading>
> <queryResultWindowSize>20</queryResultWindowSize>
> <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
> <useColdSearcher>false</useColdSearcher>
> <maxWarmingSearchers>2</maxWarmingSearchers>
> I have modified the documentCache size to 8192 from 512 but it has not helped 
> much. 
> I know this question has probably been asked a few times and I have read 
> everything I could find out about SOLR cache tuning. I am looking for some 
> more ideas.
> 
> Any ideas?
> Tarun Jain
> -=-

SOLR cache tuning

2020-06-01 Thread Tarun Jain
Hi,
I have a SOLR installation in master-slave configuration. The slave is used only for 
reads and the master for writes.
I wanted to know if there is anything I can do to improve the performance of the 
read-only slave instance?
I am running SOLR 8.5 and Java 14. The JVM has 24 GB of RAM allocated. The server has 
256 GB of RAM with about 50 GB free (the rest being used by other services on the 
server). The index is 15 GB in size with about 2 million documents.
We do a lot of queries where documents are fetched using filter queries, and a few times 
all 2 million documents are read. My initial idea to speed up SOLR is that, given the 
amount of memory available, SOLR should be able to keep the entire index on the heap 
(I know the OS will also cache the disk blocks).
My solrconfig has the following:
20  <!-- element name stripped by the mail archive -->
<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
<documentCache class="solr.LRUCache" size="8192" initialSize="8192" autowarmCount="0"/>
<cache name="perSegFilter" class="solr.search.LRUCache" size="10" initialSize="0" 
autowarmCount="10" regenerator="solr.NoOpRegenerator"/>
<enableLazyFieldLoading>true</enableLazyFieldLoading>
<queryResultWindowSize>20</queryResultWindowSize>
<queryResultMaxDocsCached>200</queryResultMaxDocsCached>
<useColdSearcher>false</useColdSearcher>
<maxWarmingSearchers>2</maxWarmingSearchers>
I have modified the documentCache size to 8192 from 512 but it has not helped 
much. 
I know this question has probably been asked a few times and I have read 
everything I could find out about SOLR cache tuning. I am looking for some more 
ideas.

Any ideas?
Tarun Jain
-=-

Re: CRUD on solr Index while replicating between master/slave

2011-12-14 Thread Tarun Jain
Hi,
We do optimize the whole index because we index our entire content every 4 hours. From an 
application/business point of view the replication time is acceptable. 

Thanks for the information though. We will try to change this behaviour in the future so 
that the replication time is reduced.

Tarun Jain
-=-


From: Erick Erickson 
To: solr-user@lucene.apache.org; Otis Gospodnetic  
Sent: Wednesday, December 14, 2011 1:52 PM
Subject: Re: CRUD on solr Index while replicating between master/slave

Whoa! Replicating takes 15 mins? That's a really long time. Are you including
the polling interval here? Or is this just raw replication time?

Because this is really suspicious. Are you optimizing your index all the time
or something? Replication should pull down ONLY the changed segments.
But optimizing changes *all* the segments (really, collapses them into one)
and you'd be copying the full index each replication.

Or are you committing after every few documents? Or?

You need to understand why replication takes so long before going
any further IMO. It may be perfectly legitimate, but on the surface it sure
doesn't seem right.
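For reference, a minimal sketch of the classic master/slave replication config where the 
replication trigger and the polling interval live (host name, core name and interval are 
placeholders, not the poster's actual settings):

<!-- master solrconfig.xml: publish a new index version after each commit -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
  </lst>
</requestHandler>

<!-- slave solrconfig.xml: poll the master every 60 seconds and copy only changed segments -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/core1/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>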

Best
Erick

On Wed, Dec 14, 2011 at 10:52 AM, Otis Gospodnetic
 wrote:
> Hi,
>
> The slave will get the changes next time it polls the master and master tells 
> it the index has changed.
> Note that master doesn't replicate to slave, but rather the slave copies 
> changes from the master.
>
> Otis
> 
> Performance Monitoring SaaS for Solr - 
> http://sematext.com/spm/solr-performance-monitoring/index.html
>
>
>
>>
>> From: Tarun Jain 
>>To: "solr-user@lucene.apache.org" 
>>Sent: Wednesday, December 14, 2011 10:43 AM
>>Subject: Re: CRUD on solr Index while replicating between master/slave
>>
>>Hi,
>>We have an index which needs constant updates in the master.
>>
>>One more question..
>>The scenario is
>>1) Master starts replicating to slave (takes approx 15 mins)
>>
>>2) We do some changes to index on master while it is replicating
>>
>>So question is what happens to the changes in master index while it is 
>>replicating.
>>Will the slave get it or not?
>>
>>
>>Tarun Jain
>>-=-
>>
>>
>>
>>
>>- Original Message -
>>From: Erick Erickson 
>>To: solr-user@lucene.apache.org; Tarun Jain 
>>Cc:
>>Sent: Tuesday, December 13, 2011 4:18 PM
>>Subject: Re: CRUD on solr Index while replicating between master/slave
>>
>>No, you can search on the master when replicating, no
>>problem.
>>
>>But why do you want to? The whole point of master/slave
>>setups is to separate indexing from searching machines.
>>
>>Best
>>Erick
>>
>>On Tue, Dec 13, 2011 at 4:10 PM, Tarun Jain  wrote:
>>> Hi,
>>> Thanks.
>>> So just to clarify here again while replicating we cannot search on master 
>>> index ?
>>>
>>> Tarun Jain
>>> -=-
>>>
>>>
>>>
>>> - Original Message -
>>> From: Otis Gospodnetic 
>>> To: "solr-user@lucene.apache.org" 
>>> Cc:
>>> Sent: Tuesday, December 13, 2011 3:03 PM
>>> Subject: Re: CRUD on solr Index while replicating between master/slave
>>>
>>> Hi,
>>>
>>> Master: Update/insert/delete docs    -->    Yes
>>> Slaves: Search                              -->   Yes
>>>
>>> Otis
>>> 
>>>
>>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>>> Lucene ecosystem search :: http://search-lucene.com/
>>>
>>>
>>>>
>>>> From: Tarun Jain 
>>>>To: "solr-user@lucene.apache.org" 
>>>>Sent: Tuesday, December 13, 2011 11:15 AM
>>>>Subject: CRUD on solr Index while replicating between master/slave
>>>>
>>>>Hi,
>>>>When replication is happening between master to slave what operations can 
>>>>we do on the master & what operations are possible on the slave?
>>>>I know it is not advisable to do DML on the slave index but I wanted to 
>>>>know this anyway. Also I understand that doing DML on a slave will make the 
>>>>slave index incompatible with the master.
>>>>
>>>>Master
>>>>
>>>>Search                              -->   Yes/No
>>>>Update/insert/delete docs    -->    Yes/No
>>>>
>>>>Slave
>>>>=
>>>>Search                              -->    Yes/No
>>>>Update/insert/delete docs    -->    Yes/No
>>>>
>>>>Please share any other caveats that you have discovered regarding the above 
>>>>scenario that might be helpful.
>>>>
>>>>Thanks
>>>>-=-
>>>>
>>>>
>>>>
>>
>>
>>
>>


Re: CRUD on solr Index while replicating between master/slave

2011-12-14 Thread Tarun Jain
Hi,
We have an index which needs constant updates in the master.

One more question..
The scenario is
1) Master starts replicating to slave (takes approx 15 mins)

2) We do some changes to index on master while it is replicating

So the question is: what happens to the changes to the master index while it is 
replicating? Will the slave get them or not?


Tarun Jain
-=-




- Original Message -
From: Erick Erickson 
To: solr-user@lucene.apache.org; Tarun Jain 
Cc: 
Sent: Tuesday, December 13, 2011 4:18 PM
Subject: Re: CRUD on solr Index while replicating between master/slave

No, you can search on the master when replicating, no
problem.

But why do you want to? The whole point of master/slave
setups is to separate indexing from searching machines.

Best
Erick

On Tue, Dec 13, 2011 at 4:10 PM, Tarun Jain  wrote:
> Hi,
> Thanks.
> So just to clarify here again while replicating we cannot search on master 
> index ?
>
> Tarun Jain
> -=-
>
>
>
> - Original Message -
> From: Otis Gospodnetic 
> To: "solr-user@lucene.apache.org" 
> Cc:
> Sent: Tuesday, December 13, 2011 3:03 PM
> Subject: Re: CRUD on solr Index while replicating between master/slave
>
> Hi,
>
> Master: Update/insert/delete docs    -->    Yes
> Slaves: Search                              -->   Yes
>
> Otis
> 
>
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>>
>> From: Tarun Jain 
>>To: "solr-user@lucene.apache.org" 
>>Sent: Tuesday, December 13, 2011 11:15 AM
>>Subject: CRUD on solr Index while replicating between master/slave
>>
>>Hi,
>>When replication is happening between master to slave what operations can we 
>>do on the master & what operations are possible on the slave?
>>I know it is not advisable to do DML on the slave index but I wanted to know 
>>this anyway. Also I understand that doing DML on a slave will make the slave 
>>index incompatible with the master.
>>
>>Master
>>
>>Search                              -->   Yes/No
>>Update/insert/delete docs    -->    Yes/No
>>
>>Slave
>>=
>>Search                              -->    Yes/No
>>Update/insert/delete docs    -->    Yes/No
>>
>>Please share any other caveats that you have discovered regarding the above 
>>scenario that might be helpful.
>>
>>Thanks
>>-=-
>>
>>
>>



Re: CRUD on solr Index while replicating between master/slave

2011-12-13 Thread Tarun Jain
Hi,
Thanks.
So just to clarify here again: while replicating, we cannot search on the master index?

Tarun Jain
-=-



- Original Message -
From: Otis Gospodnetic 
To: "solr-user@lucene.apache.org" 
Cc: 
Sent: Tuesday, December 13, 2011 3:03 PM
Subject: Re: CRUD on solr Index while replicating between master/slave

Hi,

Master: Update/insert/delete docs    -->    Yes
Slaves: Search                              -->   Yes

Otis


Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/


>____
> From: Tarun Jain 
>To: "solr-user@lucene.apache.org"  
>Sent: Tuesday, December 13, 2011 11:15 AM
>Subject: CRUD on solr Index while replicating between master/slave
> 
>Hi,
>When replication is happening between master to slave what operations can we 
>do on the master & what operations are possible on the slave?
>I know it is not advisable to do DML on the slave index but I wanted to know 
>this anyway. Also I understand that doing DML on a slave will make the slave 
>index incompatible with the master.
>
>Master
>
>Search                              -->   Yes/No
>Update/insert/delete docs    -->    Yes/No
>
>Slave
>=
>Search                              -->    Yes/No
>Update/insert/delete docs    -->    Yes/No
>
>Please share any other caveats that you have discovered regarding the above 
>scenario that might be helpful.
>
>Thanks
>-=-
>
>
>


CRUD on solr Index while replicating between master/slave

2011-12-13 Thread Tarun Jain
Hi,
When replication is happening from master to slave, what operations can we do on the 
master, and what operations are possible on the slave?
I know it is not advisable to do DML on the slave index, but I wanted to know this 
anyway. Also, I understand that doing DML on a slave will make the slave index 
incompatible with the master.

Master

Search                              -->   Yes/No
Update/insert/delete docs    -->    Yes/No

Slave
=
Search                              -->    Yes/No
Update/insert/delete docs    -->    Yes/No

Please share any other caveats that you have discovered regarding the above 
scenario that might be helpful.

Thanks
-=-


alphanumeric queries using LuceneQParser

2009-09-28 Thread Tarun Jain
Hi,
I have created an index where the fields have been indexed with 
omitNorms="true" omitTermFreqAndPositions="true" 
to improve indexing performance. One of the side effects of this is that some 
of the searches with alphanumeric words are not working correctly.
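For context, a sketch of how such a field can be declared in a classic schema.xml (the 
field name text_ar is taken from the debug output below; the type name "text" is an 
assumption):

<field name="text_ar" type="text" indexed="true" stored="true"
       omitNorms="true" omitTermFreqAndPositions="true"/>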
Example..
Below is the debugQuery part of a query response
===

<str name="rawquerystring">text_ar:1SAM55R1009</str>
<str name="querystring">text_ar:1SAM55R1009</str>
<str name="parsedquery">PhraseQuery(text_ar:"1 sam 55 r 1009")</str>
<str name="parsedquery_toString">text_ar:"1 sam 55 r 1009"</str>

<str name="QParser">LuceneQParser</str>

===

Also, I have changed the definition of the text fieldType in the schema.xml to this 
(removed the WordDelimiterFilterFactory):
===
[fieldType definition stripped by the mail archive]
===

I would like the query parser to not break up alphanumeric query parameters.
How do I do this?
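For illustration, a minimal sketch of a fieldType whose analyzer keeps alphanumeric 
tokens intact (no WordDelimiterFilterFactory, so "1SAM55R1009" stays a single token; the 
name text_nosplit is hypothetical and not the schema actually in use):

<fieldType name="text_nosplit" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- split only on whitespace; do not break on letter/digit transitions -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>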

Tarun
-=-


Issues with facet.prefix & multiple facet.field

2008-12-26 Thread Tarun Jain
Hi,
I am trying to facet on multiple fields and I want to limit the values to facet 
on using facet.prefix

The wiki says that the facet.prefix parameter can be specified on a per field 
basis

Example:

http://localhost:8983/solr/select?q=*:*&indent=on&facet=on&rows=0&facet.field=cidsWithParent_CountySpecific&facet.prefix=ROOT_&facet.field=cidsWithParent&facet.prefix=9AAC1001

However, Solr only takes the first facet.prefix into consideration and applies it to all 
facet.field parameters; the second facet.prefix is ignored.
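For reference, a sketch of the per-field form the wiki presumably means, using the 
f.<fieldname>.facet.prefix override (assuming that override is honored for facet.prefix):

http://localhost:8983/solr/select?q=*:*&indent=on&facet=on&rows=0&facet.field=cidsWithParent_CountySpecific&f.cidsWithParent_CountySpecific.facet.prefix=ROOT_&facet.field=cidsWithParent&f.cidsWithParent.facet.prefix=9AAC1001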

Any suggestions or workarounds?

Thanks
Tarun Jain
-=-