Re: clearing document cache || solr 6.6

2019-01-30 Thread Shawn Heisey

On 1/30/2019 2:27 AM, sachin gk wrote:

To support an existing functionality we have set openSearcher to
false. Is there a way to flush the cache programmatically?


Executing a commit with openSearcher=true is the only way I know of 
without custom code.
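For reference, such a commit can be issued explicitly through the update handler. A minimal sketch - the core name "mycore" is a placeholder, not something from this thread:

```xml
<!-- POST this to http://localhost:8983/solr/mycore/update
     ("mycore" is a hypothetical core name) -->
<commit openSearcher="true" waitSearcher="true"/>
```

The same effect is available as URL parameters on the update handler (commit=true&openSearcher=true).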


When you commit with openSearcher set to false, it's generally much 
faster than with openSearcher set to true ... but any changes made to 
the index will not be visible to people making queries, because those 
will continue using the existing searcher that has no idea anything has 
changed.


If you're willing to write your own code, you can usually do just about 
anything you want.  There are no guarantees that what you want will 
actually be beneficial ... I agree with Erick on wondering why you would 
try to cripple Solr in this way.  Solr's caches are almost always a good 
thing when used appropriately.


Thanks,
Shawn


Re: clearing document cache || solr 6.6

2019-01-30 Thread Erick Erickson
I'd also ask why you care? What benefit do you think you'd get
if you did explicitly flush the document cache?

You seem to think there's some benefit to programmatically
flushing the cache, but you haven't stated what that benefit is.

I suspect that you are making some assumptions that are not true
and that this is a waste of effort.

Best,
Erick

On Wed, Jan 30, 2019 at 9:46 AM Walter Underwood  wrote:
>
> You don’t need to do that. When there is a commit, Solr creates a new Searcher
> with an empty document cache.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> > On Jan 29, 2019, at 10:27 PM, sachin gk  wrote:
> >
> > Hi All,
> >
> > Is there a way to clear the *document cache* after we commit to the indexer.
> >
> > --
> > Regards,
> > Sachin
>


Re: clearing document cache || solr 6.6

2019-01-30 Thread Walter Underwood
You don’t need to do that. When there is a commit, Solr creates a new Searcher
with an empty document cache.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jan 29, 2019, at 10:27 PM, sachin gk  wrote:
> 
> Hi All,
> 
> Is there a way to clear the *document cache* after we commit to the indexer.
> 
> -- 
> Regards,
> Sachin



Re: clearing document cache || solr 6.6

2019-01-30 Thread sachin gk
Thanks Shawn,

To support an existing functionality we have set openSearcher to
false. Is there a way to flush the cache programmatically?

Regards,
Sachin

On Wed, Jan 30, 2019, 12:58 PM Shawn Heisey wrote:
> On 1/29/2019 11:27 PM, sachin gk wrote:
> > Is there a way to clear the *document cache* after we commit to the
> > indexer.
>
> All Solr caches are invalidated when you issue a commit with
> openSearcher set to true.  The default setting is true, and normally it
> doesn't get set to false unless you explicitly set it.  Most of the
> time, autoCommit has openSearcher set to false.
>
> The documentCache cannot be warmed directly, but it does get items added
> to it if there are any warming queries, which may come from autowarming
> queryResultCache.
>
> Thanks,
> Shawn
>


Re: clearing document cache || solr 6.6

2019-01-29 Thread Shawn Heisey

On 1/29/2019 11:27 PM, sachin gk wrote:

Is there a way to clear the *document cache* after we commit to the indexer.


All Solr caches are invalidated when you issue a commit with 
openSearcher set to true.  The default setting is true, and normally it 
doesn't get set to false unless you explicitly set it.  Most of the 
time, autoCommit has openSearcher set to false.


The documentCache cannot be warmed directly, but it does get items added 
to it if there are any warming queries, which may come from autowarming 
queryResultCache.
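As a sketch of the indirect warming just described: warming queries are configured as newSearcher event listeners in solrconfig.xml, and the documents those queries return land in the new searcher's documentCache. The query shown is a placeholder:

```xml
<!-- solrconfig.xml: run a warming query whenever a new searcher opens -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <!-- placeholder warming query; real deployments use popular queries -->
      <str name="q">*:*</str>
      <str name="rows">10</str>
    </lst>
  </arr>
</listener>
```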


Thanks,
Shawn


clearing document cache || solr 6.6

2019-01-29 Thread sachin gk
Hi All,

Is there a way to clear the *document cache* after we commit to the indexer.

-- 
Regards,
Sachin


Re: Document Cache

2016-03-19 Thread Emir Arnautovic

Hi,
Your cache will be cleared on soft commits - every two minutes. It seems 
that it is either configured to be huge, or you have big documents and are 
retrieving all fields, or you don't have lazy field loading set to true.


Can you please share your document cache config and heap settings?

Thanks,
Emir

On 17.03.2016 22:24, Rallavagu wrote:

comments in line...

On 3/17/16 2:16 PM, Erick Erickson wrote:

First, I want to make sure when you say "TTL", you're talking about
documents being evicted from the documentCache and not the "Time To Live"
option whereby documents are removed completely from the index.


Maybe TTL was not the right word to use here. I wanted to learn the 
criteria for an entry to be evicted.




The time varies with the number of new documents fetched. This is an LRU
cache whose size is configured in solrconfig.xml. It's pretty much
unpredictable. If for some odd reason every request gets the same document,
it'll never be aged out. If no two queries return the same document, an
entry is evicted once "cache size" newer documents have been fetched by
subsequent requests.

The entire thing is thrown out whenever a new searcher is opened (i.e.
softCommit or hardCommit with openSearcher=true)




But maybe this is an XY problem. Why do you care? Is there something you're
seeing that you're trying to understand or is this just a general interest
question?

I have following configuration,

<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:12}</maxTime>
</autoSoftCommit>



As you can see, openSearcher is set to "false". What I am seeing is 
(from heap dump due to OutOfMemory error) that the LRUCache pertaining 
"Document Cache" occupies around 85% of available heap and that is 
causing OOM errors. So, trying to understand the behavior to address 
the OOM issues.


Thanks



Best,
Erick

On Thu, Mar 17, 2016 at 1:40 PM, Rallavagu  wrote:


Solr 5.4 embedded Jetty

Is it right to assume that any document returned as a response to a
query is cached in the "Document Cache"?

Essentially, if I request any entry like /select?q=id:
will it be cached in "Document Cache"? If yes, what is the TTL?

Thanks in advance





--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: Document Cache

2016-03-19 Thread Emir Arnautovic
The problem starts with autowarmCount="5000" - that executes 5000 queries 
when a new searcher is created, and as those queries are executed the 
document cache is filled. If you have a large queryResultWindowSize and the 
queries return a big number of documents, that will eat up memory before 
the new searcher is put into use. It probably takes some time as well.
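A hedged sketch of the corresponding fix in solrconfig.xml - the class and sizes below are illustrative, not the poster's actual values:

```xml
<!-- far fewer warming queries per new searcher than autowarmCount="5000" -->
<queryResultCache class="solr.LRUCache"
                  size="512"
                  initialSize="512"
                  autowarmCount="32"/>
```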


This is also combined with filter cache. How big is your index?

Thanks,
Emir

On 18.03.2016 15:43, Rallavagu wrote:
Thanks for the recommendations Shawn. Those are the lines I am 
thinking as well. I am reviewing application also.


Going with the note on cache invalidation every two minutes due to 
soft commits, I wonder how it would go OOM in just two minutes, or is it 
likely that a thread holding the searcher because of a long-running query 
is causing the OOM? I was trying to reproduce it but 
could not so far.


Here is the filter cache config

<filterCache … autowarmCount="1000"/>


Query Results cache

<queryResultCache … initialSize="2" autowarmCount="5000"/>


On 3/18/16 7:31 AM, Shawn Heisey wrote:

On 3/18/2016 8:22 AM, Rallavagu wrote:

So, each soft commit would create a new searcher that would invalidate
the old cache?

Here is the configuration for Document Cache



true


In an earlier message, you indicated you're running into OOM.  I think
we can see why with this cache definition.

There are exactly two ways to deal with OOM.  One is to increase the
heap size.  The other is to reduce the amount of memory that the program
requires by changing something -- that might be the code, the config, or
how you're using it.

Start by reducing that cache size to 4096 or 1024.

https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap

If you've also got a very large filterCache, reduce that size too.  The
filterCache typically eats up a LOT of memory, because each entry in the
cache is very large.

Thanks,
Shawn



--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: Document Cache

2016-03-19 Thread Rallavagu

comments in line...

On 3/17/16 2:16 PM, Erick Erickson wrote:

First, I want to make sure when you say "TTL", you're talking about
documents being evicted from the documentCache and not the "Time To Live"
option whereby documents are removed completely from the index.


Maybe TTL was not the right word to use here. I wanted to learn the 
criteria for an entry to be evicted.




The time varies with the number of new documents fetched. This is an LRU
cache whose size is configured in solrconfig.xml. It's pretty much
unpredictable. If for some odd reason every request gets the same document,
it'll never be aged out. If no two queries return the same document, an
entry is evicted once "cache size" newer documents have been fetched by
subsequent requests.

The entire thing is thrown out whenever a new searcher is opened (i.e.
softCommit or hardCommit with openSearcher=true)




But maybe this is an XY problem. Why do you care? Is there something you're
seeing that you're trying to understand or is this just a general interest
question?

I have following configuration,

<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:12}</maxTime>
</autoSoftCommit>

As you can see, openSearcher is set to "false". What I am seeing is 
(from heap dump due to OutOfMemory error) that the LRUCache pertaining 
"Document Cache" occupies around 85% of available heap and that is 
causing OOM errors. So, trying to understand the behavior to address the 
OOM issues.


Thanks



Best,
Erick

On Thu, Mar 17, 2016 at 1:40 PM, Rallavagu  wrote:


Solr 5.4 embedded Jetty

Is it right to assume that any document returned as a response to a
query is cached in the "Document Cache"?

Essentially, if I request any entry like /select?q=id:
will it be cached in "Document Cache"? If yes, what is the TTL?

Thanks in advance





Re: Document Cache

2016-03-19 Thread Emir Arnautovic
Running a single query that returns all docs and all fields will actually 
load only as many documents as queryResultWindowSize.
What you need to do is run multiple queries that return different 
documents. If your id is numeric, you can run something like id:[1 
TO 100] and then id:[100 TO 200], etc. Make sure it is done within 
that two-minute period if there is any indexing activity.


Your index is relatively small, so a filter cache with an initial size of 1000 
entries should take around 20MB (assuming a single shard).
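That estimate is easy to verify: in the worst case each filterCache entry is a bitset with one bit per document (maxDoc) in the index. A quick sketch using the numbers posted in this thread:

```python
# Worst-case filterCache sizing: one bit per document (maxDoc) per entry.
max_doc = 161115   # maxDoc reported earlier in this thread
entries = 1000     # initial filterCache size

bytes_per_entry = max_doc / 8                  # bits -> bytes
total_mb = entries * bytes_per_entry / 2**20   # bytes -> MiB

print(round(bytes_per_entry))  # ~20139 bytes, about 20 KB per entry
print(round(total_mb, 1))      # ~19.2 -- i.e. "around 20MB"
```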


Thanks,
Emir

On 18.03.2016 17:02, Rallavagu wrote:



On 3/18/16 8:56 AM, Emir Arnautovic wrote:

The problem starts with autowarmCount="5000" - that executes 5000 queries
when a new searcher is created, and as those queries are executed the
document cache is filled. If you have a large queryResultWindowSize and the
queries return a big number of documents, that will eat up memory before
the new searcher is put into use. It probably takes some time as well.

This is also combined with filter cache. How big is your index?


Index is not very large.


numDocs:
85933

maxDoc:
161115

deletedDocs:
75182

Size
1.08 GB

I have run a query to return all documents with all fields. I could 
not reproduce OOM. I understand that I need to reduce cache sizes but 
wondering what conditions could have caused OOM so I can keep a watch.


Thanks



Thanks,
Emir

On 18.03.2016 15:43, Rallavagu wrote:

Thanks for the recommendations Shawn. Those are the lines I am
thinking as well. I am reviewing application also.

Going with the note on cache invalidation every two minutes due to
soft commits, I wonder how it would go OOM in just two minutes, or is it
likely that a thread holding the searcher because of a long-running query
is causing the OOM? I was trying to reproduce it but
could not so far.

Here is the filter cache config



Query Results cache



On 3/18/16 7:31 AM, Shawn Heisey wrote:

On 3/18/2016 8:22 AM, Rallavagu wrote:
So, each soft commit would create a new searcher that would invalidate
the old cache?

Here is the configuration for Document Cache



true


In an earlier message, you indicated you're running into OOM.  I think
we can see why with this cache definition.

There are exactly two ways to deal with OOM.  One is to increase the
heap size.  The other is to reduce the amount of memory that the 
program
requires by changing something -- that might be the code, the 
config, or

how you're using it.

Start by reducing that cache size to 4096 or 1024.

https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap

If you've also got a very large filterCache, reduce that size too.  The
filterCache typically eats up a LOT of memory, because each entry in the
cache is very large.

Thanks,
Shawn





--
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/



Re: Document Cache

2016-03-19 Thread Rallavagu



On 3/18/16 9:27 AM, Emir Arnautovic wrote:

Running a single query that returns all docs and all fields will actually
load only as many documents as queryResultWindowSize.
What you need to do is run multiple queries that return different
documents. If your id is numeric, you can run something like id:[1
TO 100] and then id:[100 TO 200], etc. Make sure it is done within
that two-minute period if there is any indexing activity.

Would the existing cache be cleared while an active thread is
performing/receiving a query?




Your index is relatively small, so a filter cache with an initial size of 1000
entries should take around 20MB (assuming a single shard).

Thanks,
Emir

On 18.03.2016 17:02, Rallavagu wrote:



On 3/18/16 8:56 AM, Emir Arnautovic wrote:

The problem starts with autowarmCount="5000" - that executes 5000 queries
when a new searcher is created, and as those queries are executed the
document cache is filled. If you have a large queryResultWindowSize and the
queries return a big number of documents, that will eat up memory before
the new searcher is put into use. It probably takes some time as well.

This is also combined with filter cache. How big is your index?


Index is not very large.


numDocs:
85933

maxDoc:
161115

deletedDocs:
75182

Size
1.08 GB

I have run a query to return all documents with all fields. I could
not reproduce OOM. I understand that I need to reduce cache sizes but
wondering what conditions could have caused OOM so I can keep a watch.

Thanks



Thanks,
Emir

On 18.03.2016 15:43, Rallavagu wrote:

Thanks for the recommendations Shawn. Those are the lines I am
thinking as well. I am reviewing application also.

Going with the note on cache invalidation every two minutes due to
soft commits, I wonder how it would go OOM in just two minutes, or is it
likely that a thread holding the searcher because of a long-running query
is causing the OOM? I was trying to reproduce it but
could not so far.

Here is the filter cache config



Query Results cache



On 3/18/16 7:31 AM, Shawn Heisey wrote:

On 3/18/2016 8:22 AM, Rallavagu wrote:

So, each soft commit would create a new searcher that would invalidate
the old cache?

Here is the configuration for Document Cache



true


In an earlier message, you indicated you're running into OOM.  I think
we can see why with this cache definition.

There are exactly two ways to deal with OOM.  One is to increase the
heap size.  The other is to reduce the amount of memory that the
program
requires by changing something -- that might be the code, the
config, or
how you're using it.

Start by reducing that cache size to 4096 or 1024.

https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap

If you've also got a very large filterCache, reduce that size too. The
filterCache typically eats up a LOT of memory, because each entry in the
cache is very large.

Thanks,
Shawn







Re: Document Cache

2016-03-19 Thread Rallavagu



On 3/18/16 8:56 AM, Emir Arnautovic wrote:

The problem starts with autowarmCount="5000" - that executes 5000 queries
when a new searcher is created, and as those queries are executed the
document cache is filled. If you have a large queryResultWindowSize and the
queries return a big number of documents, that will eat up memory before
the new searcher is put into use. It probably takes some time as well.

This is also combined with filter cache. How big is your index?


Index is not very large.


numDocs:
85933

maxDoc:
161115

deletedDocs:
75182

Size
1.08 GB

I have run a query to return all documents with all fields. I could not 
reproduce OOM. I understand that I need to reduce cache sizes but 
wondering what conditions could have caused OOM so I can keep a watch.


Thanks



Thanks,
Emir

On 18.03.2016 15:43, Rallavagu wrote:

Thanks for the recommendations Shawn. Those are the lines I am
thinking as well. I am reviewing application also.

Going with the note on cache invalidation every two minutes due to
soft commits, I wonder how it would go OOM in just two minutes, or is it
likely that a thread holding the searcher because of a long-running query
is causing the OOM? I was trying to reproduce it but
could not so far.

Here is the filter cache config



Query Results cache



On 3/18/16 7:31 AM, Shawn Heisey wrote:

On 3/18/2016 8:22 AM, Rallavagu wrote:

So, each soft commit would create a new searcher that would invalidate
the old cache?

Here is the configuration for Document Cache



true


In an earlier message, you indicated you're running into OOM.  I think
we can see why with this cache definition.

There are exactly two ways to deal with OOM.  One is to increase the
heap size.  The other is to reduce the amount of memory that the program
requires by changing something -- that might be the code, the config, or
how you're using it.

Start by reducing that cache size to 4096 or 1024.

https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap

If you've also got a very large filterCache, reduce that size too.  The
filterCache typically eats up a LOT of memory, because each entry in the
cache is very large.

Thanks,
Shawn





Document Cache

2016-03-19 Thread Rallavagu

Solr 5.4 embedded Jetty

Is it right to assume that any document returned as a response to a
query is cached in the "Document Cache"?

Essentially, if I request any entry like /select?q=id:
will it be cached in "Document Cache"? If yes, what is the TTL?


Thanks in advance


Re: Document Cache

2016-03-19 Thread Rallavagu
Thanks for the recommendations Shawn. Those are the lines I am thinking 
as well. I am reviewing application also.


Going with the note on cache invalidation every two minutes due to 
soft commits, I wonder how it would go OOM in just two minutes, or is it 
likely that a thread holding the searcher because of a long-running query 
is causing the OOM? I was trying to reproduce it but 
could not so far.


Here is the filter cache config

<filterCache … autowarmCount="1000"/>


Query Results cache

<queryResultCache … autowarmCount="5000"/>


On 3/18/16 7:31 AM, Shawn Heisey wrote:

On 3/18/2016 8:22 AM, Rallavagu wrote:

So, each soft commit would create a new searcher that would invalidate
the old cache?

Here is the configuration for Document Cache



true


In an earlier message, you indicated you're running into OOM.  I think
we can see why with this cache definition.

There are exactly two ways to deal with OOM.  One is to increase the
heap size.  The other is to reduce the amount of memory that the program
requires by changing something -- that might be the code, the config, or
how you're using it.

Start by reducing that cache size to 4096 or 1024.

https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap

If you've also got a very large filterCache, reduce that size too.  The
filterCache typically eats up a LOT of memory, because each entry in the
cache is very large.

Thanks,
Shawn



Re: Document Cache

2016-03-19 Thread Erick Erickson
First, I want to make sure when you say "TTL", you're talking about
documents being evicted from the documentCache and not the "Time To Live"
option whereby documents are removed completely from the index.

The time varies with the number of new documents fetched. This is an LRU
cache whose size is configured in solrconfig.xml. It's pretty much
unpredictable. If for some odd reason every request gets the same document,
it'll never be aged out. If no two queries return the same document, an
entry is evicted once "cache size" newer documents have been fetched by
subsequent requests.

The entire thing is thrown out whenever a new searcher is opened (i.e.
softCommit or hardCommit with openSearcher=true)
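The eviction behaviour Erick describes can be illustrated with a toy LRU cache. This is a simplified model for intuition only, not Solr's actual LRUCache implementation:

```python
from collections import OrderedDict

class TinyLRU:
    """Toy LRU map: evicts the least recently used entry when full."""
    def __init__(self, size):
        self.size = size
        self.data = OrderedDict()

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)  # touch: now most recently used
            return self.data[key]
        return None

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.size:
            self.data.popitem(last=False)  # drop least recently used

cache = TinyLRU(3)
for doc_id in ("a", "b", "c"):
    cache.put(doc_id, {"id": doc_id})

cache.get("a")               # "a" is requested again, so it survives
cache.put("d", {"id": "d"})  # cache is full: "b" (least recent) is evicted

print(sorted(cache.data))    # ['a', 'c', 'd']
```

A document hit by every request keeps getting moved to the "recent" end and never ages out; one that is never requested again is gone once "cache size" newer documents have been fetched.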

But maybe this is an XY problem. Why do you care? Is there something you're
seeing that you're trying to understand or is this just a general interest
question?

Best,
Erick

On Thu, Mar 17, 2016 at 1:40 PM, Rallavagu  wrote:

> Solr 5.4 embedded Jetty
>
> Is it right to assume that any document returned as a response to a
> query is cached in the "Document Cache"?
>
> Essentially, if I request any entry like /select?q=id:
> will it be cached in "Document Cache"? If yes, what is the TTL?
>
> Thanks in advance
>


Re: Document Cache

2016-03-18 Thread Shawn Heisey
On 3/18/2016 8:22 AM, Rallavagu wrote:
> So, each soft commit would create a new searcher that would invalidate
> the old cache?
>
> Here is the configuration for Document Cache
>
> <documentCache … initialSize="10" autowarmCount="0"/>
>
> true

In an earlier message, you indicated you're running into OOM.  I think
we can see why with this cache definition.

There are exactly two ways to deal with OOM.  One is to increase the
heap size.  The other is to reduce the amount of memory that the program
requires by changing something -- that might be the code, the config, or
how you're using it.

Start by reducing that cache size to 4096 or 1024.

https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap
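A reduced configuration along those lines might look like the following - the class and size are illustrative defaults, not values from this thread:

```xml
<!-- solrconfig.xml: a modest documentCache. autowarmCount stays 0
     because the documentCache cannot be autowarmed. -->
<documentCache class="solr.LRUCache"
               size="1024"
               initialSize="1024"
               autowarmCount="0"/>
```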

If you've also got a very large filterCache, reduce that size too.  The
filterCache typically eats up a LOT of memory, because each entry in the
cache is very large.

Thanks,
Shawn



Re: Document Cache

2016-03-18 Thread Rallavagu
So, each soft commit would create a new searcher that would invalidate 
the old cache?


Here is the configuration for Document Cache

<documentCache … autowarmCount="0"/>


true

Thanks

On 3/18/16 12:45 AM, Emir Arnautovic wrote:

Hi,
Your cache will be cleared on soft commits - every two minutes. It seems
that it is either configured to be huge, or you have big documents and
are retrieving all fields, or you don't have lazy field loading set to true.

Can you please share your document cache config and heap settings?

Thanks,
Emir

On 17.03.2016 22:24, Rallavagu wrote:

comments in line...

On 3/17/16 2:16 PM, Erick Erickson wrote:

First, I want to make sure when you say "TTL", you're talking about
documents being evicted from the documentCache and not the "Time To Live"
option whereby documents are removed completely from the index.


Maybe TTL was not the right word to use here. I wanted to learn the
criteria for an entry to be evicted.



The time varies with the number of new documents fetched. This is an LRU
cache whose size is configured in solrconfig.xml. It's pretty much
unpredictable. If for some odd reason every request gets the same document,
it'll never be aged out. If no two queries return the same document, an
entry is evicted once "cache size" newer documents have been fetched by
subsequent requests.

The entire thing is thrown out whenever a new searcher is opened (i.e.
softCommit or hardCommit with openSearcher=true)




But maybe this is an XY problem. Why do you care? Is there something you're
seeing that you're trying to understand or is this just a general interest
question?

I have following configuration,

<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:12}</maxTime>
</autoSoftCommit>


As you can see, openSearcher is set to "false". What I am seeing is
(from heap dump due to OutOfMemory error) that the LRUCache pertaining
"Document Cache" occupies around 85% of available heap and that is
causing OOM errors. So, trying to understand the behavior to address
the OOM issues.

Thanks



Best,
Erick

On Thu, Mar 17, 2016 at 1:40 PM, Rallavagu  wrote:


Solr 5.4 embedded Jetty

Is it right to assume that any document returned as a response to a
query is cached in the "Document Cache"?

Essentially, if I request any entry like /select?q=id:
will it be cached in "Document Cache"? If yes, what is the TTL?

Thanks in advance







Re: Ignoring the Document Cache per query

2015-05-29 Thread Erick Erickson
This is totally weird. The document cache should really have nothing
to do with whether MLT returns documents or not AFAIK. So either I'm
totally misunderstanding MLT, you're leaving out a step or there's
some bug in Solr. The fact that setting the document cache to 0
changes the behavior, or restarting Solr and submitting the exact same
request gives different behavior is strong evidence it's a problem
with Solr.

Could I ask you to open a JIRA and add all the relevant details you
can? Especially if you could get it to work (well actually fail) with
the techproducts data. But barring that, the (perhaps sanitized)
queries you send to get diff results before and after.

Best,
Erick

On Fri, May 29, 2015 at 7:10 AM, Bryan Bende  wrote:
> Thanks Erik. I realize this really makes no sense, but I was looking to
> work around a problem. Here is the scenario...
>
> Using Solr 5.1 we have a service that utilizes the new mlt query parser to
> get recommendations. So we start up the application,
> ask for recommendations for a document, and everything works.
>
> Another feature is to "dislike" a document, and once it is "disliked" it
> shouldn't show up as a recommended document. It
> does this by looking up the disliked documents for a user and adding a
> filter query to the recommendation call which excludes
> the disliked documents.
>
> So now we dislike a document that was in the original list of
> recommendations above, then ask for the recommendations again,
> and now we get nothing back. If we restart Solr, or reload the collection,
> then we can get it to work, but as soon as we dislike another
> document we get back into a weird state.
>
> Through trial and error I narrowed down that if we set the documentCache
> size to 0, then this problem doesn't happen. Since we can't
> really figure out why this is happening in Solr, we were hoping there was
> some way to not use the document cache on the call where
> we use the mlt query parser.
>
> On Thu, May 28, 2015 at 5:44 PM, Erick Erickson 
> wrote:
>
>> First, there isn't that I know of. But why would you want to do this?
>>
>> On the face of it, it makes no sense to ignore the doc cache. One of its
>> purposes is to hold the document (read off disk) for successive
>> search components _in the same query_. Otherwise, each component
>> might have to do a disk seek.
>>
>> So I must be missing why you want to do this.
>>
>> Best,
>> Erick
>>
>> On Thu, May 28, 2015 at 1:23 PM, Bryan Bende  wrote:
>> > Is there a way to ignore the document cache on a per-query basis?
>> >
>> > It looks like there's {!cache=false} for preventing the filter cache from
>> > being used for a given query; I'm looking for the same thing for the document
>> > cache.
>> >
>> > Thanks,
>> >
>> > Bryan
>>


Re: Ignoring the Document Cache per query

2015-05-29 Thread Bryan Bende
Thanks Erik. I realize this really makes no sense, but I was looking to
work around a problem. Here is the scenario...

Using Solr 5.1 we have a service that utilizes the new mlt query parser to
get recommendations. So we start up the application,
ask for recommendations for a document, and everything works.

Another feature is to "dislike" a document, and once it is "disliked" it
shouldn't show up as a recommended document. It
does this by looking up the disliked documents for a user and adding a
filter query to the recommendation call which excludes
the disliked documents.

So now we dislike a document that was in the original list of
recommendations above, then ask for the recommendations again,
and now we get nothing back. If we restart Solr, or reload the collection,
then we can get it to work, but as soon as we dislike another
document we get back into a weird state.

Through trial and error I narrowed down that if we set the documentCache
size to 0, then this problem doesn't happen. Since we can't
really figure out why this is happening in Solr, we were hoping there was
some way to not use the document cache on the call where
we use the mlt query parser.

On Thu, May 28, 2015 at 5:44 PM, Erick Erickson 
wrote:

> First, there isn't that I know of. But why would you want to do this?
>
> On the face of it, it makes no sense to ignore the doc cache. One of its
> purposes is to hold the document (read off disk) for successive
> search components _in the same query_. Otherwise, each component
> might have to do a disk seek.
>
> So I must be missing why you want to do this.
>
> Best,
> Erick
>
> On Thu, May 28, 2015 at 1:23 PM, Bryan Bende  wrote:
> > Is there a way to ignore the document cache on a per-query basis?
> >
> > It looks like there's {!cache=false} for preventing the filter cache from
> > being used for a given query; I'm looking for the same thing for the document
> > cache.
> >
> > Thanks,
> >
> > Bryan
>


Re: Ignoring the Document Cache per query

2015-05-28 Thread Erick Erickson
First, there isn't that I know of. But why would you want to do this?

On the face of it, it makes no sense to ignore the doc cache. One of its
purposes is to hold the document (read off disk) for successive
search components _in the same query_. Otherwise, each component
might have to do a disk seek.

So I must be missing why you want to do this.

Best,
Erick

On Thu, May 28, 2015 at 1:23 PM, Bryan Bende  wrote:
> Is there a way to ignore the document cache on a per-query basis?
>
> It looks like there's {!cache=false} for preventing the filter cache from
> being used for a given query; I'm looking for the same thing for the document
> cache.
>
> Thanks,
>
> Bryan


Ignoring the Document Cache per query

2015-05-28 Thread Bryan Bende
Is there a way to ignore the document cache on a per-query basis?

It looks like there's {!cache=false} for preventing the filter cache from
being used for a given query; I'm looking for the same thing for the document
cache.
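For comparison, the per-query bypass that does exist applies to filter queries via a local param; as the replies note, there is no equivalent for the documentCache:

```
fq={!cache=false}inStock:true
```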

Thanks,

Bryan


Re: How to size document cache

2013-10-25 Thread Shawn Heisey
On 10/25/2013 7:48 AM, Erick Erickson wrote:
> I hadn't thought about it before, but now I'm curious how
> MMapDirectoryFactory plays into documentCache. Uwe,
> are you listening? :) My _guess_ is that if you're using
> MMapDirectoryFactory, the usefulness of the document
> cache is lessened, kinda.
> 
> Since the documents are coming from essentially
> random places in the files, you're probably going to chew
> up op system blocks keeping these around. But that's
> probably no worse than chewing up Java memory and
> avoids some GC churn.

Solr's caches save CPU cycles as well as disk access.  If results can be
returned from a Solr cache, then Solr (and ultimately, Lucene) don't
have to go rifling through index data to figure out what the results
are.  Although this process is greatly sped up when the data is in the
OS disk cache, it still isn't free.

For large-scale caching, the OS is better at the job than Solr and Java.
 IMHO, the Solr caches are still important (but can be smaller) because
data that is accessed a LOT will be very readily available.

Thanks,
Shawn



Re: How to size document cache

2013-10-25 Thread Erick Erickson
I hadn't thought about it before, but now I'm curious how
MMapDirectoryFactory plays into documentCache. Uwe,
are you listening? :) My _guess_ is that if you're using
MMapDirectoryFactory, the usefulness of the document
cache is lessened, kinda.

Since the documents are coming from essentially
random places in the files, you're probably going to chew
up operating-system blocks keeping these around. But that's
probably no worse than chewing up Java memory and
avoids some GC churn.

OTOH, the raw disk data must be decompressed, perhaps
every time it's read, no matter whether it comes
from the MMap I/O buffers or has to be read from disk.

Then again, unless the docs are really big, this shouldn't matter
much.

Hmmm, I guess "measure and find out" is about all I can
really offer...

Best,
Erick


On Fri, Oct 25, 2013 at 6:28 AM, Matteo Grolla wrote:

> Hi,
> I'd really appreciate if you could give me some help understanding
> how to tune the document cache.
> My thoughts:
>
> min values: max_results * max_concurrent_queries, as stated by
> http://wiki.apache.org/solr/SolrCaching
> how can I estimate max_concurrent_queries?
>
> size:  I think there's a tension between dedicating memory to this
> cache and reducing the java heap size so the OS can buffer more of the
> index on disk
> probably I could try increasing this value if I see strong
> benefits on the hit ratio (the documents returned are a small subset of all
> docs)
>
> If I have enough RAM that the whole index fits in memory
> can I just ignore this cache? (maybe just keep it just above the
> recommended min values)
>
>
> Matteo


How to size document cache

2013-10-25 Thread Matteo Grolla
Hi,
I'd really appreciate if you could give me some help understanding how 
to tune the document cache.
My thoughts:

min values: max_results * max_concurrent_queries, as stated by 
http://wiki.apache.org/solr/SolrCaching
how can I estimate max_concurrent_queries?

size:  I think there's a tension between dedicating memory to this 
cache and reducing the java heap size so the OS can buffer more of the index on 
disk
probably I could try increasing this value if I see strong 
benefits on the hit ratio (the documents returned are a small subset of all 
docs)

If I have enough RAM that the whole index fits in memory can I 
just ignore this cache? (maybe just keep it just above the recommended min 
values)


Matteo
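Matteo's minimum-size formula can be turned into a quick back-of-the-envelope calculation. A small sketch (the results/concurrency/doc-size numbers are made-up assumptions, not recommendations):

```python
def min_document_cache_size(max_results, max_concurrent_queries):
    # Lower bound from http://wiki.apache.org/solr/SolrCaching:
    # every concurrently executing query must be able to hold its
    # page of documents without evicting another query's.
    return max_results * max_concurrent_queries

def heap_estimate_bytes(cache_size, avg_doc_bytes):
    # Rough upper bound on heap consumed by cached documents alone
    # (ignores the per-entry overhead of the cache map itself).
    return cache_size * avg_doc_bytes

size = min_document_cache_size(max_results=100, max_concurrent_queries=50)
print(size)                                                # 5000 entries
print(heap_estimate_bytes(size, avg_doc_bytes=10 * 1024))  # 51200000 bytes, ~49 MB
```

This only estimates the document payload; weigh it against leaving heap free so the OS can cache more of the index, as the thread suggests.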

Re: Soft Commit and Document Cache

2013-04-22 Thread Niran Fajemisin
Thanks Shawn and Mark! That was very helpful.

-Niran



>
> From: Shawn Heisey 
>To: solr-user@lucene.apache.org 
>Sent: Monday, April 22, 2013 5:30 PM
>Subject: Re: Soft Commit and Document Cache
> 
>
>On 4/22/2013 4:16 PM, Niran Fajemisin wrote:
>> A quick (and hopefully simple) question: Does the document cache (or any of 
>> the other caches for that matter), get invalidated after a soft commit has 
>> been performed?
>
>All Solr caches are invalidated when you issue a commit with 
>openSearcher set to true.  There would be no reason to do a soft commit 
>with openSearcher set to false.  That setting only makes sense with hard 
>commits.
>
>If you have queries defined for the newSearcher event, then they will be 
>run, which can pre-populate caches.
>
>The filterCache and queryResultCache can be autowarmed on commit - the 
>most relevant autowarmCount queries in the cache from the old searcher 
>are re-run against the new searcher.  The queryResultWindowSize 
>parameter helps control exactly what gets cached with the queryResultCache.
>
>The documentCache cannot be autowarmed, although I *think* that when 
>entries from the queryResultCache are run, it will also populate the 
>documentCache, though I could be wrong about that.
>
>I do not know whether autowarming is done before or after newSearcher 
>queries.
>
>http://wiki.apache.org/solr/SolrCaching
>
>Thanks,
>Shawn
>
>
>
>

Re: Soft Commit and Document Cache

2013-04-22 Thread Shawn Heisey

On 4/22/2013 4:16 PM, Niran Fajemisin wrote:

A quick (and hopefully simple) question: Does the document cache (or any of the 
other caches for that matter), get invalidated after a soft commit has been 
performed?


All Solr caches are invalidated when you issue a commit with 
openSearcher set to true.  There would be no reason to do a soft commit 
with openSearcher set to false.  That setting only makes sense with hard 
commits.


If you have queries defined for the newSearcher event, then they will be 
run, which can pre-populate caches.


The filterCache and queryResultCache can be autowarmed on commit - the 
most relevant autowarmCount queries in the cache from the old searcher 
are re-run against the new searcher.  The queryResultWindowSize 
parameter helps control exactly what gets cached with the queryResultCache.


The documentCache cannot be autowarmed, although I *think* that when 
entries from the queryResultCache are run, it will also populate the 
documentCache, though I could be wrong about that.


I do not know whether autowarming is done before or after newSearcher 
queries.


http://wiki.apache.org/solr/SolrCaching

Thanks,
Shawn
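Shawn's points map directly onto solrconfig.xml: autowarmCount is honored for filterCache and queryResultCache but not for documentCache, and newSearcher queries can pre-populate caches. A sketch (sizes and the warming query are placeholder values, not recommendations):

```xml
<!-- filterCache and queryResultCache can replay their hottest
     entries against the new searcher on commit -->
<filterCache class="solr.FastLRUCache" size="512"
             initialSize="512" autowarmCount="128"/>
<queryResultCache class="solr.LRUCache" size="512"
                  initialSize="512" autowarmCount="128"/>

<!-- documentCache cannot be autowarmed -->
<documentCache class="solr.LRUCache" size="512" initialSize="512"/>

<!-- newSearcher queries run when a new searcher opens and can
     pre-populate caches -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">*:*</str><str name="rows">10</str></lst>
  </arr>
</listener>
```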



Re: Soft Commit and Document Cache

2013-04-22 Thread Mark Miller
Yup - all of the top level caches are. It's a trade off - don't NRT more than 
you need to.

- Mark

On Apr 22, 2013, at 6:16 PM, Niran Fajemisin  wrote:

> Hi all,
> 
> A quick (and hopefully simple) question: Does the document cache (or any of 
> the other caches for that matter), get invalidated after a soft commit has 
> been performed?
> 
> Thanks,
> Niran



Soft Commit and Document Cache

2013-04-22 Thread Niran Fajemisin
Hi all,

A quick (and hopefully simple) question: Does the document cache (or any of the 
other caches for that matter), get invalidated after a soft commit has been 
performed?

Thanks,
Niran

RE: Disabling document cache usage

2013-01-15 Thread Markus Jelsma
Hi,

Commenting them out works fine. We don't use documentCaches either, as they eat 
too much memory and return only so little benefit.

Cheers

 
 
-Original message-
> From:Otis Gospodnetic 
> Sent: Tue 15-Jan-2013 17:29
> To: solr-user@lucene.apache.org
> Subject: Re: Disabling document cache usage
> 
> Hi,
> 
> Thanks Markus.
> How are caches disabled these days... in Solr 4.0 that is?  I remember
> trying to comment them out in the past, but seeing them still enabled and
> used with some custom size and other settings.
> 
> Thanks,
> Otis
> --
> Solr & ElasticSearch Support
> http://sematext.com/
> 
> 
> 
> 
> 
> On Tue, Jan 15, 2013 at 11:00 AM, Markus Jelsma
> wrote:
> 
> > No, SolrIndexSearcher has no mechanism to do that. The only way is to
> > disable the cache altogether or patch it up :)
> >
> >
> >
> > -Original message-
> > > From:Otis Gospodnetic 
> > > Sent: Tue 15-Jan-2013 16:57
> > > To: solr-user@lucene.apache.org
> > > Subject: Disabling document cache usage
> > >
> > > Hi,
> > >
> > > https://issues.apache.org/jira/browse/SOLR-2429 added the ability to
> > > disable filter and query caches on a request by request basis.
> > >
> > > Is there anything one can use to disable usage of (lookups and insertion
> > > into) document cache?
> > >
> > > Thanks,
> > > Otis
> > > --
> > > Solr & ElasticSearch Support
> > > http://sematext.com/
> > >
> >
> 
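As Markus says above, the only ways to turn the document cache off are to disable it in the config or to patch Solr; there is no per-request switch. A sketch of the config approach in solrconfig.xml (placeholder sizes):

```xml
<!-- Removing or commenting out the element disables the document
     cache entirely; it cannot be turned off per request.
<documentCache class="solr.LRUCache" size="512" initialSize="512"/>
-->
```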


Re: Disabling document cache usage

2013-01-15 Thread Otis Gospodnetic
Hi,

Thanks Markus.
How are caches disabled these days... in Solr 4.0 that is?  I remember
trying to comment them out in the past, but seeing them still enabled and
used with some custom size and other settings.

Thanks,
Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Tue, Jan 15, 2013 at 11:00 AM, Markus Jelsma
wrote:

> No, SolrIndexSearcher has no mechanism to do that. The only way is to
> disable the cache altogether or patch it up :)
>
>
>
> -Original message-
> > From:Otis Gospodnetic 
> > Sent: Tue 15-Jan-2013 16:57
> > To: solr-user@lucene.apache.org
> > Subject: Disabling document cache usage
> >
> > Hi,
> >
> > https://issues.apache.org/jira/browse/SOLR-2429 added the ability to
> > disable filter and query caches on a request by request basis.
> >
> > Is there anything one can use to disable usage of (lookups and insertion
> > into) document cache?
> >
> > Thanks,
> > Otis
> > --
> > Solr & ElasticSearch Support
> > http://sematext.com/
> >
>


RE: Disabling document cache usage

2013-01-15 Thread Markus Jelsma
No, SolrIndexSearcher has no mechanism to do that. The only way is to disable 
the cache altogether or patch it up :)


 
-Original message-
> From:Otis Gospodnetic 
> Sent: Tue 15-Jan-2013 16:57
> To: solr-user@lucene.apache.org
> Subject: Disabling document cache usage
> 
> Hi,
> 
> https://issues.apache.org/jira/browse/SOLR-2429 added the ability to
> disable filter and query caches on a request by request basis.
> 
> Is there anything one can use to disable usage of (lookups and insertion
> into) document cache?
> 
> Thanks,
> Otis
> --
> Solr & ElasticSearch Support
> http://sematext.com/
> 


Re: document cache

2012-05-15 Thread Erick Erickson
Yes. In fact, all the caches get flushed on every commit/replication cycle.

Some of the caches get autowarmed when a new searcher is opened,
which happens...you guessed it...every time a commit/replication happens.

Best
Erick

On Tue, May 15, 2012 at 1:32 AM, shinkanze  wrote:
>  hi,
>
> I want to know the internal mechanism of how the document cache works,
>
> specifically its flushing cycle ...
>
> i.e. does it get flushed on every commit/replication?
>
> regards
>
> Rajat Rastogi
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/document-cache-tp3983796.html
> Sent from the Solr - User mailing list archive at Nabble.com.


document cache

2012-05-14 Thread shinkanze
 hi,

I want to know the internal mechanism of how the document cache works,

specifically its flushing cycle ...

i.e. does it get flushed on every commit/replication?

regards 

Rajat Rastogi


--
View this message in context: 
http://lucene.472066.n3.nabble.com/document-cache-tp3983796.html
Sent from the Solr - User mailing list archive at Nabble.com.
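Erick's answer above follows from how searchers work: any commit that opens a new searcher starts with fresh caches, and only the autowarmable caches get their hottest entries replayed. A hedged sketch of triggering such a commit against a hypothetical local core:

```
curl 'http://localhost:8983/solr/mycore/update?commit=true&openSearcher=true'
```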


Re: maximum recommended document cache size

2010-05-15 Thread Lance Norskog
The general recommendation is to watch the caches during normal user
searches and keep increasing the size until evictions start happening.
This may or may not work for your situation.

The problem is that the eviction rate does not show "lifetime in
cache". So if 90% of the cache sits there indefinitely and the
remaining 10% churns, the cache is fine but you'll show zillions of
evictions.

On Thu, May 13, 2010 at 10:38 AM, Nagelberg, Kallin
 wrote:
> I am trying to tune my Solr setup so that the caches are well warmed after 
> the index is updated. My documents are quite small, usually under 10k. I 
> currently have a document cache size of about 15,000, and am warming up 5,000 
> with a query after each indexing. Autocommit is set at 30 seconds, and my 
> caches are warming up easily in just a couple of seconds. I've read of 
> concerns regarding garbage collection when your cache is too large. Does 
> anyone have experience with this? Ideally I would like to get 90% of all 
> documents from the last month in memory after each index, which would be 
> around 25,000. I'm doing extensive load testing, but if someone has 
> recommendations I'd love to hear them.
>
> Thanks,
> -Kallin Nagelberg
>



-- 
Lance Norskog
goks...@gmail.com
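Lance's point -- that steady evictions can coexist with a perfectly healthy resident hot set -- is easy to demonstrate with a toy LRU cache (a self-contained simulation, not Solr's actual cache implementation; the sizes are made up):

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache that counts evictions."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()
        self.evictions = 0

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)   # hit: mark most recently used
            return True
        if len(self.data) >= self.capacity:
            self.data.popitem(last=False)  # miss at capacity: evict LRU entry
            self.evictions += 1
        self.data[key] = None
        return False

cache = LRUCache(capacity=1000)
hot = [f"hot-{i}" for i in range(900)]   # 90% of capacity, requested every round
for round_ in range(100):
    for doc in hot:
        cache.get(doc)
    for i in range(50):                  # churning tail: unique docs each round
        cache.get(f"tail-{round_}-{i}")

hot_still_cached = sum(1 for d in hot if d in cache.data)
print(cache.evictions, hot_still_cached)  # 4900 900
```

Thousands of evictions accumulate, yet every one of the 900 hot documents is still cached: the eviction counter alone doesn't tell you the cache is undersized.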


maximum recommended document cache size

2010-05-13 Thread Nagelberg, Kallin
I am trying to tune my Solr setup so that the caches are well warmed after the 
index is updated. My documents are quite small, usually under 10k. I currently 
have a document cache size of about 15,000, and am warming up 5,000 with a 
query after each indexing. Autocommit is set at 30 seconds, and my caches are 
warming up easily in just a couple of seconds. I've read of concerns regarding 
garbage collection when your cache is too large. Does anyone have experience 
with this? Ideally I would like to get 90% of all documents from the last month 
in memory after each index, which would be around 25,000. I'm doing extensive 
load testing, but if someone has recommendations I'd love to hear them.

Thanks,
-Kallin Nagelberg