Re: clearing document cache || solr 6.6
On 1/30/2019 2:27 AM, sachin gk wrote: To support an existing functionality we have turned openSearcher to false. Is there a way to flush the cache programmatically? Executing a commit with openSearcher=true is the only way I know of without custom code. When you commit with openSearcher set to false, it's generally much faster than with openSearcher set to true ... but any changes made to the index will not be visible to people making queries, because those will continue using the existing searcher that has no idea anything has changed. If you're willing to write your own code, you can usually do just about anything you want. There are no guarantees that what you want will actually be beneficial ... I agree with Erick on wondering why you would try to cripple Solr in this way. Solr's caches are almost always a good thing when used appropriately. Thanks, Shawn
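For reference, the explicit commit Shawn describes can be issued as a plain update request; the core name below is a placeholder, and the same effect can be had by posting the XML update message <commit openSearcher="true"/>:

```
curl 'http://localhost:8983/solr/mycore/update?commit=true&openSearcher=true'
```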
Re: clearing document cache || solr 6.6
I'd also ask why you care? What benefit do you think you'd get if you did explicitly flush the document cache? You seem to think there's some benefit to programmatically flushing the cache, but you haven't stated what that benefit is. I suspect that you are making some assumptions that are not true and that this is a waste of effort. Best, Erick On Wed, Jan 30, 2019 at 9:46 AM Walter Underwood wrote: > > You don’t need to do that. When there is a commit, Solr creates a new Searcher > with an empty document cache. > > wunder > Walter Underwood > wun...@wunderwood.org > http://observer.wunderwood.org/ (my blog) > > > On Jan 29, 2019, at 10:27 PM, sachin gk wrote: > > > > Hi All, > > > > Is there a way to clear the *document cache* after we commit to the indexer. > > > > -- > > Regards, > > Sachin >
Re: clearing document cache || solr 6.6
You don’t need to do that. When there is a commit, Solr creates a new Searcher with an empty document cache. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Jan 29, 2019, at 10:27 PM, sachin gk wrote: > > Hi All, > > Is there a way to clear the *document cache* after we commit to the indexer. > > -- > Regards, > Sachin
Re: clearing document cache || solr 6.6
Thanks Shawn, To support an existing functionality we have turned openSearcher to false. Is there a way to flush the cache programmatically? Regards, Sachin On Wed, Jan 30, 2019, 12:58 PM Shawn Heisey On 1/29/2019 11:27 PM, sachin gk wrote: > > Is there a way to clear the *document cache* after we commit to the > indexer. > > All Solr caches are invalidated when you issue a commit with > openSearcher set to true. The default setting is true, and normally it > doesn't get set to false unless you explicitly set it. Most of the > time, autoCommit has openSearcher set to false. > > The documentCache cannot be warmed directly, but it does get items added > to it if there are any warming queries, which may come from autowarming > queryResultCache. > > Thanks, > Shawn >
Re: clearing document cache || solr 6.6
On 1/29/2019 11:27 PM, sachin gk wrote: Is there a way to clear the *document cache* after we commit to the indexer. All Solr caches are invalidated when you issue a commit with openSearcher set to true. The default setting is true, and normally it doesn't get set to false unless you explicitly set it. Most of the time, autoCommit has openSearcher set to false. The documentCache cannot be warmed directly, but it does get items added to it if there are any warming queries, which may come from autowarming queryResultCache. Thanks, Shawn
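The autoCommit case Shawn mentions usually looks like this in solrconfig.xml (the interval here is illustrative):

```
<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
```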
clearing document cache || solr 6.6
Hi All, Is there a way to clear the *document cache* after we commit to the indexer. -- Regards, Sachin
Re: Document Cache
Hi, Your cache will be cleared on soft commits - every two minutes. It seems that it is either configured to be huge, or you have big documents and are retrieving all fields, or don't have lazy field loading set to true. Can you please share your document cache config and heap settings? Thanks, Emir On 17.03.2016 22:24, Rallavagu wrote: comments in line... On 3/17/16 2:16 PM, Erick Erickson wrote: First, I want to make sure when you say "TTL", you're talking about documents being evicted from the documentCache and not the "Time To Live" option whereby documents are removed completely from the index. Maybe TTL was not the right word to use here. I wanted to learn the criteria for an entry to be ejected. The time varies with the number of new documents fetched. This is an LRU cache whose size is configured in solrconfig.xml. It's pretty much unpredictable. If for some odd reason every request gets the same document, it'll never be aged out. If no two queries return the same document, an entry is aged out once "cache size" docs have been fetched by subsequent requests. The entire thing is thrown out whenever a new searcher is opened (i.e. softCommit or hardCommit with openSearcher=true). But maybe this is an XY problem. Why do you care? Is there something you're seeing that you're trying to understand or is this just a general interest question? I have the following configuration: <autoCommit> <maxTime>${solr.autoCommit.maxTime:15000}</maxTime> <openSearcher>false</openSearcher> </autoCommit> <autoSoftCommit> <maxTime>${solr.autoSoftCommit.maxTime:120000}</maxTime> </autoSoftCommit> As you can see, openSearcher is set to "false". What I am seeing (from a heap dump taken after an OutOfMemory error) is that the LRUCache for the "Document Cache" occupies around 85% of the available heap, and that is causing OOM errors. So I'm trying to understand the behavior to address the OOM issues. Thanks Best, Erick On Thu, Mar 17, 2016 at 1:40 PM, Rallavagu wrote: Solr 5.4, embedded Jetty. Is it the right assumption that any document returned as a response to a query is cached in "Document Cache"?
Essentially, if I request any entry like /select?q=id: will it be cached in "Document Cache"? If yes, what is the TTL? Thanks in advance -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/
Re: Document Cache
Problem starts with autowarmCount="5000" - that executes 5000 queries when a new searcher is created, and as queries are executed, the document cache is filled. If you have a large queryResultWindowSize and queries return a big number of documents, that will eat up memory before a new searcher is in use. It probably takes some time as well. This is also combined with the filter cache. How big is your index? Thanks, Emir On 18.03.2016 15:43, Rallavagu wrote: Thanks for the recommendations Shawn. Those are the lines I am thinking along as well. I am reviewing the application also. Going with the note on cache invalidation every two minutes due to soft commit, I wonder how it would go OOM in simply two minutes - or is it likely that a thread holding the searcher due to a long-running query might potentially be causing the OOM? Was trying to reproduce but could not so far. Here is the filter cache config <filterCache ... autowarmCount="1000"/> Query Results cache <queryResultCache ... initialSize="2" autowarmCount="5000"/> On 3/18/16 7:31 AM, Shawn Heisey wrote: On 3/18/2016 8:22 AM, Rallavagu wrote: So, each soft commit would create a new searcher that would invalidate the old cache? Here is the configuration for Document Cache <documentCache ... autowarmCount="0"/> <enableLazyFieldLoading>true</enableLazyFieldLoading> In an earlier message, you indicated you're running into OOM. I think we can see why with this cache definition. There are exactly two ways to deal with OOM. One is to increase the heap size. The other is to reduce the amount of memory that the program requires by changing something -- that might be the code, the config, or how you're using it. Start by reducing that cache size to 4096 or 1024. https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap If you've also got a very large filterCache, reduce that size too. The filterCache typically eats up a LOT of memory, because each entry in the cache is very large. Thanks, Shawn -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/
Re: Document Cache
comments in line... On 3/17/16 2:16 PM, Erick Erickson wrote: First, I want to make sure when you say "TTL", you're talking about documents being evicted from the documentCache and not the "Time To Live" option whereby documents are removed completely from the index. Maybe TTL was not the right word to use here. I wanted to learn the criteria for an entry to be ejected. The time varies with the number of new documents fetched. This is an LRU cache whose size is configured in solrconfig.xml. It's pretty much unpredictable. If for some odd reason every request gets the same document, it'll never be aged out. If no two queries return the same document, an entry is aged out once "cache size" docs have been fetched by subsequent requests. The entire thing is thrown out whenever a new searcher is opened (i.e. softCommit or hardCommit with openSearcher=true). But maybe this is an XY problem. Why do you care? Is there something you're seeing that you're trying to understand or is this just a general interest question? I have the following configuration: <autoCommit> <maxTime>${solr.autoCommit.maxTime:15000}</maxTime> <openSearcher>false</openSearcher> </autoCommit> <autoSoftCommit> <maxTime>${solr.autoSoftCommit.maxTime:120000}</maxTime> </autoSoftCommit> As you can see, openSearcher is set to "false". What I am seeing (from a heap dump taken after an OutOfMemory error) is that the LRUCache for the "Document Cache" occupies around 85% of the available heap, and that is causing OOM errors. So I'm trying to understand the behavior to address the OOM issues. Thanks Best, Erick On Thu, Mar 17, 2016 at 1:40 PM, Rallavagu wrote: Solr 5.4, embedded Jetty. Is it the right assumption that any document returned as a response to a query is cached in "Document Cache"? Essentially, if I request any entry like /select?q=id: will it be cached in "Document Cache"? If yes, what is the TTL? Thanks in advance
Re: Document Cache
Running a single query that returns all docs and all fields will actually only load as many documents as queryResultWindowSize. What you need to do is run multiple queries that will return different documents. In case your id is numeric, you can run something like id:[1 TO 100] and then id:[100 TO 200] etc. Make sure that it is done within that two-minute period if there is any indexing activity. Your index is relatively small, so a filter cache with an initial size of 1000 entries should take around 20MB (assuming a single shard). Thanks, Emir On 18.03.2016 17:02, Rallavagu wrote: On 3/18/16 8:56 AM, Emir Arnautovic wrote: Problem starts with autowarmCount="5000" - that executes 5000 queries when a new searcher is created, and as queries are executed, the document cache is filled. If you have a large queryResultWindowSize and queries return a big number of documents, that will eat up memory before a new searcher is in use. It probably takes some time as well. This is also combined with the filter cache. How big is your index? Index is not very large. numDocs: 85933 maxDoc: 161115 deletedDocs: 75182 Size 1.08 GB I have run a query to return all documents with all fields. I could not reproduce OOM. I understand that I need to reduce cache sizes but am wondering what conditions could have caused the OOM so I can keep a watch. Thanks Thanks, Emir On 18.03.2016 15:43, Rallavagu wrote: Thanks for the recommendations Shawn. Those are the lines I am thinking along as well. I am reviewing the application also. Going with the note on cache invalidation every two minutes due to soft commit, I wonder how it would go OOM in simply two minutes - or is it likely that a thread holding the searcher due to a long-running query might potentially be causing the OOM? Was trying to reproduce but could not so far. Here is the filter cache config Query Results cache On 3/18/16 7:31 AM, Shawn Heisey wrote: On 3/18/2016 8:22 AM, Rallavagu wrote: So, each soft commit would create a new searcher that would invalidate the old cache?
Here is the configuration for Document Cache <documentCache ... autowarmCount="0"/> <enableLazyFieldLoading>true</enableLazyFieldLoading> In an earlier message, you indicated you're running into OOM. I think we can see why with this cache definition. There are exactly two ways to deal with OOM. One is to increase the heap size. The other is to reduce the amount of memory that the program requires by changing something -- that might be the code, the config, or how you're using it. Start by reducing that cache size to 4096 or 1024. https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap If you've also got a very large filterCache, reduce that size too. The filterCache typically eats up a LOT of memory, because each entry in the cache is very large. Thanks, Shawn -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/
Re: Document Cache
On 3/18/16 9:27 AM, Emir Arnautovic wrote: Running a single query that returns all docs and all fields will actually only load as many documents as queryResultWindowSize. What you need to do is run multiple queries that will return different documents. In case your id is numeric, you can run something like id:[1 TO 100] and then id:[100 TO 200] etc. Make sure that it is done within that two-minute period if there is any indexing activity. Would the existing cache be cleared while an active thread is performing/receiving a query? Your index is relatively small, so a filter cache with an initial size of 1000 entries should take around 20MB (assuming a single shard). Thanks, Emir On 18.03.2016 17:02, Rallavagu wrote: On 3/18/16 8:56 AM, Emir Arnautovic wrote: Problem starts with autowarmCount="5000" - that executes 5000 queries when a new searcher is created, and as queries are executed, the document cache is filled. If you have a large queryResultWindowSize and queries return a big number of documents, that will eat up memory before a new searcher is in use. It probably takes some time as well. This is also combined with the filter cache. How big is your index? Index is not very large. numDocs: 85933 maxDoc: 161115 deletedDocs: 75182 Size 1.08 GB I have run a query to return all documents with all fields. I could not reproduce OOM. I understand that I need to reduce cache sizes but am wondering what conditions could have caused the OOM so I can keep a watch. Thanks Thanks, Emir On 18.03.2016 15:43, Rallavagu wrote: Thanks for the recommendations Shawn. Those are the lines I am thinking along as well. I am reviewing the application also. Going with the note on cache invalidation every two minutes due to soft commit, I wonder how it would go OOM in simply two minutes - or is it likely that a thread holding the searcher due to a long-running query might potentially be causing the OOM? Was trying to reproduce but could not so far.
Here is the filter cache config Query Results cache On 3/18/16 7:31 AM, Shawn Heisey wrote: On 3/18/2016 8:22 AM, Rallavagu wrote: So, each soft commit would create a new searcher that would invalidate the old cache? Here is the configuration for Document Cache <documentCache ... autowarmCount="0"/> <enableLazyFieldLoading>true</enableLazyFieldLoading> In an earlier message, you indicated you're running into OOM. I think we can see why with this cache definition. There are exactly two ways to deal with OOM. One is to increase the heap size. The other is to reduce the amount of memory that the program requires by changing something -- that might be the code, the config, or how you're using it. Start by reducing that cache size to 4096 or 1024. https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap If you've also got a very large filterCache, reduce that size too. The filterCache typically eats up a LOT of memory, because each entry in the cache is very large. Thanks, Shawn
Re: Document Cache
On 3/18/16 8:56 AM, Emir Arnautovic wrote: Problem starts with autowarmCount="5000" - that executes 5000 queries when a new searcher is created, and as queries are executed, the document cache is filled. If you have a large queryResultWindowSize and queries return a big number of documents, that will eat up memory before a new searcher is in use. It probably takes some time as well. This is also combined with the filter cache. How big is your index? Index is not very large. numDocs: 85933 maxDoc: 161115 deletedDocs: 75182 Size 1.08 GB I have run a query to return all documents with all fields. I could not reproduce OOM. I understand that I need to reduce cache sizes but am wondering what conditions could have caused the OOM so I can keep a watch. Thanks Thanks, Emir On 18.03.2016 15:43, Rallavagu wrote: Thanks for the recommendations Shawn. Those are the lines I am thinking along as well. I am reviewing the application also. Going with the note on cache invalidation every two minutes due to soft commit, I wonder how it would go OOM in simply two minutes - or is it likely that a thread holding the searcher due to a long-running query might potentially be causing the OOM? Was trying to reproduce but could not so far. Here is the filter cache config Query Results cache On 3/18/16 7:31 AM, Shawn Heisey wrote: On 3/18/2016 8:22 AM, Rallavagu wrote: So, each soft commit would create a new searcher that would invalidate the old cache? Here is the configuration for Document Cache <documentCache ... autowarmCount="0"/> <enableLazyFieldLoading>true</enableLazyFieldLoading> In an earlier message, you indicated you're running into OOM. I think we can see why with this cache definition. There are exactly two ways to deal with OOM. One is to increase the heap size. The other is to reduce the amount of memory that the program requires by changing something -- that might be the code, the config, or how you're using it. Start by reducing that cache size to 4096 or 1024. https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap If you've also got a very large filterCache, reduce that size too.
The filterCache typically eats up a LOT of memory, because each entry in the cache is very large. Thanks, Shawn
Document Cache
Solr 5.4, embedded Jetty. Is it the right assumption that any document returned as a response to a query is cached in "Document Cache"? Essentially, if I request any entry like /select?q=id: will it be cached in "Document Cache"? If yes, what is the TTL? Thanks in advance
Re: Document Cache
Thanks for the recommendations Shawn. Those are the lines I am thinking along as well. I am reviewing the application also. Going with the note on cache invalidation every two minutes due to soft commit, I wonder how it would go OOM in simply two minutes - or is it likely that a thread holding the searcher due to a long-running query might potentially be causing the OOM? Was trying to reproduce but could not so far. Here is the filter cache config <filterCache ... autowarmCount="1000"/> Query Results cache <queryResultCache ... autowarmCount="5000"/> On 3/18/16 7:31 AM, Shawn Heisey wrote: On 3/18/2016 8:22 AM, Rallavagu wrote: So, each soft commit would create a new searcher that would invalidate the old cache? Here is the configuration for Document Cache <documentCache ... autowarmCount="0"/> <enableLazyFieldLoading>true</enableLazyFieldLoading> In an earlier message, you indicated you're running into OOM. I think we can see why with this cache definition. There are exactly two ways to deal with OOM. One is to increase the heap size. The other is to reduce the amount of memory that the program requires by changing something -- that might be the code, the config, or how you're using it. Start by reducing that cache size to 4096 or 1024. https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap If you've also got a very large filterCache, reduce that size too. The filterCache typically eats up a LOT of memory, because each entry in the cache is very large. Thanks, Shawn
Re: Document Cache
First, I want to make sure when you say "TTL", you're talking about documents being evicted from the documentCache and not the "Time To Live" option whereby documents are removed completely from the index. The time varies with the number of new documents fetched. This is an LRU cache whose size is configured in solrconfig.xml. It's pretty much unpredictable. If for some odd reason every request gets the same document, it'll never be aged out. If no two queries return the same document, an entry is aged out once "cache size" docs have been fetched by subsequent requests. The entire thing is thrown out whenever a new searcher is opened (i.e. softCommit or hardCommit with openSearcher=true). But maybe this is an XY problem. Why do you care? Is there something you're seeing that you're trying to understand or is this just a general interest question? Best, Erick On Thu, Mar 17, 2016 at 1:40 PM, Rallavagu wrote: > Solr 5.4, embedded Jetty. > > Is it the right assumption that any document returned as a > response to a query is cached in "Document Cache"? > > Essentially, if I request any entry like /select?q=id: > will it be cached in "Document Cache"? If yes, what is the TTL? > > Thanks in advance >
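Erick's description of LRU aging can be illustrated with a toy sketch. This is only a model of the eviction policy, not Solr's actual documentCache implementation:

```python
from collections import OrderedDict

class LRUCache:
    """Toy LRU cache modelling the eviction behavior described above."""
    def __init__(self, size):
        self.size = size
        self.data = OrderedDict()

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)  # a hit refreshes the entry
            return self.data[key]
        return None

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.size:
            self.data.popitem(last=False)  # evict least recently used

cache = LRUCache(3)
for doc_id in [1, 2, 3]:
    cache.put(doc_id, f"doc{doc_id}")
cache.get(1)          # doc 1 is now most recently used
cache.put(4, "doc4")  # cache is full, so this evicts doc 2, not doc 1
assert cache.get(1) is not None
assert cache.get(2) is None
```

A document that is requested on every query keeps getting refreshed and never ages out; a document that is never requested again is pushed out once "size" other documents have been fetched.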
Re: Document Cache
On 3/18/2016 8:22 AM, Rallavagu wrote: > So, each soft commit would create a new searcher that would invalidate > the old cache? > > Here is the configuration for Document Cache > > <documentCache ... initialSize="10" autowarmCount="0"/> > > <enableLazyFieldLoading>true</enableLazyFieldLoading> In an earlier message, you indicated you're running into OOM. I think we can see why with this cache definition. There are exactly two ways to deal with OOM. One is to increase the heap size. The other is to reduce the amount of memory that the program requires by changing something -- that might be the code, the config, or how you're using it. Start by reducing that cache size to 4096 or 1024. https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap If you've also got a very large filterCache, reduce that size too. The filterCache typically eats up a LOT of memory, because each entry in the cache is very large. Thanks, Shawn
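A more conservative documentCache definition along the lines Shawn suggests might look like this in solrconfig.xml (the sizes here are illustrative, not a recommendation for every index):

```
<documentCache class="solr.LRUCache"
               size="1024"
               initialSize="1024"
               autowarmCount="0"/>
```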
Re: Document Cache
So, each soft commit would create a new searcher that would invalidate the old cache? Here is the configuration for Document Cache <documentCache ... autowarmCount="0"/> <enableLazyFieldLoading>true</enableLazyFieldLoading> Thanks On 3/18/16 12:45 AM, Emir Arnautovic wrote: Hi, Your cache will be cleared on soft commits - every two minutes. It seems that it is either configured to be huge, or you have big documents and are retrieving all fields, or don't have lazy field loading set to true. Can you please share your document cache config and heap settings? Thanks, Emir On 17.03.2016 22:24, Rallavagu wrote: comments in line... On 3/17/16 2:16 PM, Erick Erickson wrote: First, I want to make sure when you say "TTL", you're talking about documents being evicted from the documentCache and not the "Time To Live" option whereby documents are removed completely from the index. Maybe TTL was not the right word to use here. I wanted to learn the criteria for an entry to be ejected. The time varies with the number of new documents fetched. This is an LRU cache whose size is configured in solrconfig.xml. It's pretty much unpredictable. If for some odd reason every request gets the same document, it'll never be aged out. If no two queries return the same document, an entry is aged out once "cache size" docs have been fetched by subsequent requests. The entire thing is thrown out whenever a new searcher is opened (i.e. softCommit or hardCommit with openSearcher=true). But maybe this is an XY problem. Why do you care? Is there something you're seeing that you're trying to understand or is this just a general interest question? I have the following configuration: <autoCommit> <maxTime>${solr.autoCommit.maxTime:15000}</maxTime> <openSearcher>false</openSearcher> </autoCommit> <autoSoftCommit> <maxTime>${solr.autoSoftCommit.maxTime:120000}</maxTime> </autoSoftCommit> As you can see, openSearcher is set to "false". What I am seeing (from a heap dump taken after an OutOfMemory error) is that the LRUCache for the "Document Cache" occupies around 85% of the available heap, and that is causing OOM errors. So I'm trying to understand the behavior to address the OOM issues.
Thanks Best, Erick On Thu, Mar 17, 2016 at 1:40 PM, Rallavagu wrote: Solr 5.4, embedded Jetty. Is it the right assumption that any document returned as a response to a query is cached in "Document Cache"? Essentially, if I request any entry like /select?q=id: will it be cached in "Document Cache"? If yes, what is the TTL? Thanks in advance
Re: Ignoring the Document Cache per query
This is totally weird. The document cache should really have nothing to do with whether MLT returns documents or not AFAIK. So either I'm totally misunderstanding MLT, you're leaving out a step or there's some bug in Solr. The fact that setting the document cache to 0 changes the behavior, or restarting Solr and submitting the exact same request gives different behavior is strong evidence it's a problem with Solr. Could I ask you to open a JIRA and add all the relevant details you can? Especially if you could get it to work (well actually fail) with the techproducts data. But barring that, the (perhaps sanitized) queries you send to get diff results before and after. Best, Erick On Fri, May 29, 2015 at 7:10 AM, Bryan Bende wrote: > Thanks Erik. I realize this really makes no sense, but I was looking to > work around a problem. Here is the scenario... > > Using Solr 5.1 we have a service that utilizes the new mlt query parser to > get recommendations. So we start up the application, > ask for recommendations for a document, and everything works. > > Another feature is to "dislike" a document, and once it is "disliked" it > shouldn't show up as a recommended document. It > does this by looking up the disliked documents for a user and adding a > filter query to the recommendation call which excludes > the disliked documents. > > So now we dislike a document that was in the original list of > recommendations above, then ask for the recommendations again, > and now we get nothing back. If we restart Solr, or reload the collection, > then we can get it to work, but as soon as we dislike another > document we get back into a weird state. > > Through trial and error I narrowed down that if we set the documentCache > size to 0, then this problem doesn't happen. Since we can't > really figure out why this is happening in Solr, we were hoping there was > some way to not use the document cache on the call where > we use the mlt query parser. 
> > On Thu, May 28, 2015 at 5:44 PM, Erick Erickson > wrote: > >> First, there isn't that I know of. But why would you want to do this? >> >> On the face of it, it makes no sense to ignore the doc cache. One of its >> purposes is to hold the document (read off disk) for successive >> search components _in the same query_. Otherwise, each component >> might have to do a disk seek. >> >> So I must be missing why you want to do this. >> >> Best, >> Erick >> >> On Thu, May 28, 2015 at 1:23 PM, Bryan Bende wrote: >> > Is there a way to ignore the document cache on a per-query basis? >> > >> > It looks like there's {!cache=false} for preventing the filter cache from >> > being used for a given query; looking for the same thing for the document >> > cache. >> > >> > Thanks, >> > >> > Bryan >>
Re: Ignoring the Document Cache per query
Thanks Erick. I realize this really makes no sense, but I was looking to work around a problem. Here is the scenario... Using Solr 5.1 we have a service that utilizes the new mlt query parser to get recommendations. So we start up the application, ask for recommendations for a document, and everything works. Another feature is to "dislike" a document, and once it is "disliked" it shouldn't show up as a recommended document. It does this by looking up the disliked documents for a user and adding a filter query to the recommendation call which excludes the disliked documents. So now we dislike a document that was in the original list of recommendations above, then ask for the recommendations again, and now we get nothing back. If we restart Solr, or reload the collection, then we can get it to work, but as soon as we dislike another document we get back into a weird state. Through trial and error I narrowed down that if we set the documentCache size to 0, then this problem doesn't happen. Since we can't really figure out why this is happening in Solr, we were hoping there was some way to not use the document cache on the call where we use the mlt query parser. On Thu, May 28, 2015 at 5:44 PM, Erick Erickson wrote: > First, there isn't that I know of. But why would you want to do this? > > On the face of it, it makes no sense to ignore the doc cache. One of its > purposes is to hold the document (read off disk) for successive > search components _in the same query_. Otherwise, each component > might have to do a disk seek. > > So I must be missing why you want to do this. > > Best, > Erick > > On Thu, May 28, 2015 at 1:23 PM, Bryan Bende wrote: > > Is there a way to ignore the document cache on a per-query basis? > > > > It looks like there's {!cache=false} for preventing the filter cache from > > being used for a given query; looking for the same thing for the document > > cache. > > > > Thanks, > > > > Bryan >
Re: Ignoring the Document Cache per query
First, there isn't that I know of. But why would you want to do this? On the face of it, it makes no sense to ignore the doc cache. One of its purposes is to hold the document (read off disk) for successive search components _in the same query_. Otherwise, each component might have to do a disk seek. So I must be missing why you want to do this. Best, Erick On Thu, May 28, 2015 at 1:23 PM, Bryan Bende wrote: > Is there a way to ignore the document cache on a per-query basis? > > It looks like there's {!cache=false} for preventing the filter cache from > being used for a given query; looking for the same thing for the document > cache. > > Thanks, > > Bryan
Ignoring the Document Cache per query
Is there a way to ignore the document cache on a per-query basis? It looks like there's {!cache=false} for preventing the filter cache from being used for a given query; looking for the same thing for the document cache. Thanks, Bryan
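For comparison, the filterCache bypass mentioned above is a local param on a filter query (the field and value here are made up); per the thread, no equivalent exists for the documentCache:

```
fq={!cache=false}category:books
```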
Re: How to size document cache
On 10/25/2013 7:48 AM, Erick Erickson wrote: > I hadn't thought about it before, but now I'm curious how > MMapDirectoryFactory plays into documentCache. Uwe, > are you listening? :) My _guess_ is that if you're using > MMapDirectoryFactory, the usefulness of the document > cache is lessened, kinda. > > Since the documents are coming from essentially > random places in the files, you're probably going to chew > up op system blocks keeping these around. But that's > probably no worse than chewing up Java memory and > avoids some GC churn. Solr's caches save CPU cycles as well as disk access. If results can be returned from a Solr cache, then Solr (and ultimately, Lucene) don't have to go rifling through index data to figure out what the results are. Although this process is greatly sped up when the data is in the OS disk cache, it still isn't free. For large-scale caching, the OS is better at the job than Solr and Java. IMHO, the Solr caches are still important (but can be smaller) because data that is accessed a LOT will be very readily available. Thanks, Shawn
Re: How to size document cache
I hadn't thought about it before, but now I'm curious how MMapDirectoryFactory plays into documentCache. Uwe, are you listening? :) My _guess_ is that if you're using MMapDirectoryFactory, the usefulness of the document cache is lessened, kinda. Since the documents are coming from essentially random places in the files, you're probably going to chew up op system blocks keeping these around. But that's probably no worse than chewing up Java memory and avoids some GC churn. OTOH, the raw disk data must be decompressed, perhaps every time it's read, no matter whether the data comes from the MMap IO buffers or has to be read from disk. OTOH, unless the docs are really big, this shouldn't matter much. Hmmm, I guess "measure and find out" is about all I can really offer... Best, Erick On Fri, Oct 25, 2013 at 6:28 AM, Matteo Grolla wrote: > Hi, > I'd really appreciate it if you could give me some help understanding > how to tune the document cache. > My thoughts: > > min values: max_results * max_concurrent_queries, as stated by > http://wiki.apache.org/solr/SolrCaching > how can I estimate max_concurrent_queries? > > size: I think there's a tension between dedicating memory to this > cache and reducing the java heap size so the OS can buffer more of the > index on disk > probably I could try increasing this value if I see strong > benefits on the hit ratio (the documents returned are a small subset of all > docs) > > If I have enough RAM that the whole index fits in memory > can I just ignore this cache? (maybe just keep it just above the > recommended min values) > > > Matteo
How to size document cache
Hi,

I'd really appreciate it if you could give me some help understanding how to tune the document cache. My thoughts:

- min values: max_results * max_concurrent_queries, as stated by http://wiki.apache.org/solr/SolrCaching. How can I estimate max_concurrent_queries?
- size: I think there's a tension between dedicating memory to this cache and reducing the Java heap size so the OS can buffer more of the index on disk. I could probably try increasing this value if I see strong benefits in the hit ratio (the documents returned are a small subset of all docs).
- If I have enough RAM that the whole index fits in memory, can I just ignore this cache (maybe keep it just above the recommended min values)?

Matteo
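To make the wiki's lower bound concrete, here is a small sketch of the "max_results * max_concurrent_queries" arithmetic raised above. Concurrency is approximated with Little's law (concurrency ~= arrival rate * time in system); the traffic numbers are made-up placeholders, not recommendations.

```python
# Rough sizing sketch for Solr's documentCache, following the
# "max_results * max_concurrent_queries" floor from the Solr wiki.

def min_document_cache_size(max_rows: int, peak_qps: float, avg_query_secs: float) -> int:
    """Estimate a minimum documentCache size.

    max_concurrent_queries is approximated with Little's law:
    concurrency ~= peak query rate * average query latency.
    """
    max_concurrent_queries = peak_qps * avg_query_secs
    return int(max_rows * max_concurrent_queries)

# Example: 50 rows per page, 40 queries/sec at peak, ~0.25 s per query.
size = min_document_cache_size(max_rows=50, peak_qps=40, avg_query_secs=0.25)
print(size)  # 500 entries as a floor; round up generously
```

This only gives a floor; whether a larger cache pays off is what the hit ratio should tell you.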
Re: Soft Commit and Document Cache
Thanks Shawn and Mark! That was very helpful.

-Niran

> From: Shawn Heisey
> To: solr-user@lucene.apache.org
> Sent: Monday, April 22, 2013 5:30 PM
> Subject: Re: Soft Commit and Document Cache
> ...
Re: Soft Commit and Document Cache
On 4/22/2013 4:16 PM, Niran Fajemisin wrote:
> A quick (and hopefully simply) question: Does the document cache (or any of the other caches for that matter), get invalidated after a soft commit has been performed?

All Solr caches are invalidated when you issue a commit with openSearcher set to true. There would be no reason to do a soft commit with openSearcher set to false. That setting only makes sense with hard commits.

If you have queries defined for the newSearcher event, then they will be run, which can pre-populate caches.

The filterCache and queryResultCache can be autowarmed on commit - the most relevant autowarmCount queries in the cache from the old searcher are re-run against the new searcher. The queryResultWindowSize parameter helps control exactly what gets cached with the queryResultCache.

The documentCache cannot be autowarmed, although I *think* that when entries from the queryResultCache are run, it will also populate the documentCache, though I could be wrong about that.

I do not know whether autowarming is done before or after newSearcher queries.

http://wiki.apache.org/solr/SolrCaching

Thanks,
Shawn
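For reference, the autowarming knobs described in this reply live in solrconfig.xml. A minimal sketch (sizes and counts are illustrative, not recommendations):

```xml
<!-- Illustrative cache settings; tune sizes for your own index. -->
<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="128"/>

<!-- documentCache has no useful autowarmCount: its entries are keyed by
     internal Lucene doc ids, which change from searcher to searcher. -->
<documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>

<queryResultWindowSize>20</queryResultWindowSize>

<!-- newSearcher queries run against each new searcher and can pre-populate
     caches; the query below is a hypothetical warming example. -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">*:*</str><str name="rows">10</str></lst>
  </arr>
</listener>
```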
Re: Soft Commit and Document Cache
Yup - all of the top level caches are. It's a trade off - don't NRT more than you need to.

- Mark

On Apr 22, 2013, at 6:16 PM, Niran Fajemisin wrote:
> ...
Soft Commit and Document Cache
Hi all,

A quick (and hopefully simple) question: Does the document cache (or any of the other caches, for that matter) get invalidated after a soft commit has been performed?

Thanks,
Niran
RE: Disabling document cache usage
Hi,

Commenting them out works fine. We don't use documentCaches either as they eat too much and return only so little.

Cheers

-Original message-
> From: Otis Gospodnetic
> Sent: Tue 15-Jan-2013 17:29
> To: solr-user@lucene.apache.org
> Subject: Re: Disabling document cache usage
> ...
Re: Disabling document cache usage
Hi,

Thanks Markus. How are caches disabled these days... in Solr 4.0, that is? I remember trying to comment them out in the past, but seeing them still enabled and used with some custom size and other settings.

Thanks,
Otis
--
Solr & ElasticSearch Support
http://sematext.com/

On Tue, Jan 15, 2013 at 11:00 AM, Markus Jelsma wrote:
> ...
RE: Disabling document cache usage
No, SolrIndexSearcher has no mechanism to do that. The only way is to disable the cache altogether or patch it up :)

-Original message-
> From: Otis Gospodnetic
> Sent: Tue 15-Jan-2013 16:57
> To: solr-user@lucene.apache.org
> Subject: Disabling document cache usage
>
> Hi,
>
> https://issues.apache.org/jira/browse/SOLR-2429 added the ability to disable filter and query caches on a request by request basis.
>
> Is there anything one can use to disable usage of (lookups and insertion into) document cache?
>
> Thanks,
> Otis
> --
> Solr & ElasticSearch Support
> http://sematext.com/
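To make the two options in this thread concrete: disabling the document cache globally means removing (or commenting out) its entry in solrconfig.xml, while SOLR-2429's per-request switch exists only for filter/query caches, not the document cache. A sketch (cache sizes and the example filter query are illustrative):

```xml
<!-- Option 1: comment out documentCache so it is never created. -->
<!--
<documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
-->

<!-- Option 2 (SOLR-2429) applies per request, but only to the filter and
     query result caches, e.g. a filter query with the cache local param:

     fq={!cache=false}category:books

     There is no equivalent local param for the document cache. -->
```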
Re: document cache
Yes. In fact, all the caches get flushed on every commit/replication cycle. Some of the caches get autowarmed when a new searcher is opened, which happens... you guessed it... every time a commit/replication happens.

Best,
Erick

On Tue, May 15, 2012 at 1:32 AM, shinkanze wrote:
> ...
document cache
Hi,

I want to know the internal mechanism of how the document cache works, specifically its flushing cycle: does it get flushed on every commit/replication?

Regards,
Rajat Rastogi

--
View this message in context: http://lucene.472066.n3.nabble.com/document-cache-tp3983796.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: maximum recommended document cache size
The general recommendation is to watch the caches during normal user searches and keep increasing the size until evictions start happening. This may or may not work for your situation. The problem is that the eviction rate does not show "lifetime in cache". So if 90% of the cache sits there indefinitely and the remaining 10% churns, the cache is fine but you'll show zillions of evictions.

On Thu, May 13, 2010 at 10:38 AM, Nagelberg, Kallin wrote:
> ...

--
Lance Norskog
goks...@gmail.com
maximum recommended document cache size
I am trying to tune my Solr setup so that the caches are well warmed after the index is updated. My documents are quite small, usually under 10k. I currently have a document cache size of about 15,000, and am warming up 5,000 with a query after each indexing. Autocommit is set at 30 seconds, and my caches are warming up easily in just a couple of seconds. I've read of concerns regarding garbage collection when your cache is too large. Does anyone have experience with this? Ideally I would like to get 90% of all documents from the last month in memory after each index, which would be around 25,000. I'm doing extensive load testing, but if someone has recommendations I'd love to hear them. Thanks, -Kallin Nagelberg
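The GC concern above comes down to simple arithmetic on cache entries times stored-document size. A back-of-the-envelope sketch using the numbers from this thread (the per-entry overhead is an assumed placeholder; real overhead varies with the cache implementation and document structure):

```python
# Heap estimate for a documentCache holding stored documents.
# overhead_bytes is an assumed per-entry object/bookkeeping cost, not a
# measured Solr figure.

def document_cache_heap_mb(entries: int, avg_doc_bytes: int, overhead_bytes: int = 200) -> float:
    """Approximate heap consumed by cached stored documents, in MiB."""
    return entries * (avg_doc_bytes + overhead_bytes) / (1024 * 1024)

# 25,000 docs at ~10 KB each: roughly a quarter gigabyte of heap.
print(round(document_cache_heap_mb(25_000, 10 * 1024), 1))  # 248.9
```

At that scale the cache itself is modest; GC pressure is more likely to come from churn (frequent invalidation on each autocommit) than from the steady-state size.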