RE: Facets and running out of Heap Space
It looks now like I can't use facets the way I was hoping to, because the memory requirements are impractical. So, as an alternative, I was thinking I could get counts by doing rows=0 and using filter queries. Is there a reason to think that this might perform better? Or am I simply moving the problem to another step in the process?

DW
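To make that concrete, I was picturing one request per value and reading numFound out of each response. Something like the following, where the field comes from our schema but the value is just a made-up example:

    http://localhost:8983/solr/select?q=*:*&fq=media_type:blog&rows=0

With rows=0 no documents are returned, so the only work left is matching the filter and reporting the count.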
Re: Facets and running out of Heap Space
On 10-Oct-07, at 12:19 PM, David Whalen wrote:

> It looks now like I can't use facets the way I was hoping to, because
> the memory requirements are impractical.

I can't remember if this has been mentioned, but upping the HashDocSet size is one way to reduce memory consumption. Whether this will work well depends greatly on the cardinality of your facet sets. Setting facet.enum.cache.minDf high is another option (it will not generate a bitset for any value whose facet set is smaller than this value). Both options have performance implications.

> So, as an alternative, I was thinking I could get counts by doing
> rows=0 and using filter queries. Is there a reason to think that this
> might perform better? Or am I simply moving the problem to another
> step in the process?

Running one query per unique facet value seems impractical, if that is what you are suggesting. Setting minDf to a very high value should always outperform such an approach.

-Mike
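For reference, the two knobs live in different places. Roughly (the values here are placeholders, not recommendations):

    In solrconfig.xml:
        <HashDocSet maxSize="20000" loadFactor="0.75"/>

    On the facet request:
        &facet=true&facet.field=yourfield&facet.enum.cache.minDf=100

Raising the HashDocSet maxSize lets more of the cached facet filters use the compact hash representation instead of a full bitset; minDf skips caching filters for rare values entirely.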
RE: Facets and running out of Heap Space
According to Yonik, I can't use minDf because I'm faceting on a string field. I'm thinking of changing it to a tokenized type so that I can utilize this setting, but then I'll have to rebuild my entire index. Unless there's some way around that?
Re: Facets and running out of Heap Space
On 10-Oct-07, at 2:40 PM, David Whalen wrote:

> According to Yonik, I can't use minDf because I'm faceting on a string
> field. I'm thinking of changing it to a tokenized type so that I can
> utilize this setting, but then I'll have to rebuild my entire index.
> Unless there's some way around that?

For the fields that matter (many unique values), this is likely to result in a performance regression. It might be better to try storing less unique data. For instance, faceting on the blog_url field or create_date in your schema would cause problems (they probably have millions of unique values).

It would be helpful to know which field is causing the problem. One way would be to do a sorted query on a quiescent index for each field, and see if there are any suspiciously large jumps in memory usage.

-Mike
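For example, a request along these lines for each candidate field (the field name here is just one of the candidates, and host/port are the example defaults) will force the FieldCache entry for that field to be built:

    http://localhost:8983/solr/select?q=*:*&rows=1&sort=site_id+asc

Watching heap usage after each such query on an otherwise idle index should make the expensive field stand out.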
RE: Facets and running out of Heap Space
I'll see what I can do about that. Truthfully, the most important facet we need is the one on media_type, which has only 4 unique values. The second most important one to us is location, which has about 30 unique values.

So, it would seem like we actually need a counter-intuitive solution. That's why I thought filter queries might be the solution.

Is there some reason to avoid setting multiValued to true here? It sounds like it would be the true cure-all.

Thanks again!

dave
Re: Facets and running out of Heap Space
On 10-Oct-07, at 3:46 PM, David Whalen wrote:

> I'll see what I can do about that. Truthfully, the most important
> facet we need is the one on media_type, which has only 4 unique
> values. The second most important one to us is location, which has
> about 30 unique values.
>
> So, it would seem like we actually need a counter-intuitive solution.
> That's why I thought filter queries might be the solution.
>
> Is there some reason to avoid setting multiValued to true here? It
> sounds like it would be the true cure-all.

Should work. It would cost about 100 MB on a 25m corpus for those two fields.

Have you tried setting multivalued=true without reindexing? I'm not sure, but I think it will work.

-Mike
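If it does work, the schema change is a single attribute on the existing field definition, e.g. (media_type used as the example here):

    <field name="media_type" type="string" indexed="true" stored="true" multiValued="true" />

Once the field is multi-valued, faceting on it uses the term-enumeration method, so facet.enum.cache.minDf applies to it as well.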
Re: Facets and running out of Heap Space
On 10/10/07, Mike Klaas [EMAIL PROTECTED] wrote:

> Have you tried setting multivalued=true without reindexing? I'm not
> sure, but I think it will work.

Yes, that will work fine. One thing that will change is the response format for stored fields:

    <arr name="foo"><str>val1</str></arr>

instead of

    <str name="foo">val1</str>

Hopefully in the future we can specify a faceting method w/o having to change the schema.

-Yonik
Facets and running out of Heap Space
Hi All.

I run a faceted query against a very large index on a regular schedule. Every now and then the query throws an out of heap space error, and we're sunk.

So, naturally, we increased the heap size, and things worked well for a while until the errors would happen again. We've increased the initial heap size to 2.5GB and it's still happening.

Is there anything we can do about this?

Thanks in advance,

Dave W
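For reference, the heap settings in question are the standard JVM options; the startup line looks something like the following (launcher and exact values are approximate):

    java -Xms2560m -Xmx2560m -jar start.jar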
Re: Facets and running out of Heap Space
On 10/9/07, David Whalen [EMAIL PROTECTED] wrote:

> I run a faceted query against a very large index on a regular
> schedule. Every now and then the query throws an out of heap space
> error, and we're sunk.
>
> So, naturally, we increased the heap size, and things worked well for
> a while until the errors would happen again. We've increased the
> initial heap size to 2.5GB and it's still happening.
>
> Is there anything we can do about this?

Try the facet.enum.cache.minDf param:
http://wiki.apache.org/solr/SimpleFacetParameters

-Yonik
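For example, added to an existing facet request (the field name and threshold below are placeholders only):

    &facet=true&facet.field=yourfield&facet.enum.cache.minDf=20

Values whose document frequency falls below the threshold are counted by iterating the term's documents directly instead of building and caching a filter for them.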
RE: Facets and running out of Heap Space
Hi Yonik.

According to the doc:

> This is only used during the term enumeration method of faceting
> (facet.field type faceting on multi-valued or full-text fields).

What if I'm faceting on just a plain String field? It's not full-text, and I don't have multiValued set for it.

Dave
Re: Facets and running out of Heap Space
On 10/9/07, David Whalen [EMAIL PROTECTED] wrote:

> > This is only used during the term enumeration method of faceting
> > (facet.field type faceting on multi-valued or full-text fields).
>
> What if I'm faceting on just a plain String field? It's not
> full-text, and I don't have multiValued set for it.

Then you will be using the FieldCache counting method, and this param is not applicable :-)

Are all the fields that you facet on like this? The FieldCache entry might be taking up too much room, especially if the number of entries is high and the entries are big. The requests themselves can take up a good chunk of memory temporarily (4 bytes * nValuesInField).

You could also try a memory profiling tool and see where all the memory is being taken up.

-Yonik
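To put a rough number on that: for a field with close to 100,000 unique values (the journalist_id count mentioned elsewhere in this thread), each faceting request would temporarily allocate on the order of 4 bytes * 100,000 ≈ 400 KB of counters, in addition to the FieldCache entry for the field, which stays resident.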
Re: Facets and running out of Heap Space
: So, naturally we increased the heap size and things worked
: well for a while and then the errors would happen again.
: We've increased the initial heap size to 2.5GB and it's
: still happening.

is this the same 25,000,000 document index you mentioned before? 2.5GB of heap doesn't seem like much if you are also doing faceting ...

even if you were faceting on an int field, there would be 95MB of FieldCache for that field. you said this was a string field, so it's going to be 95MB plus however much space is needed for all the terms (presumably, if you are faceting on this field, every doc doesn't have a unique value, but even assuming a conservative 10% unique values of 10 characters each, that's another ~50MB). so we're up to about 150MB of FieldCache to facet that one field -- and we haven't even started talking about how big the index itself is (or how big the filterCache gets, or how many other fields you are faceting on).

how big is your index on disk? are you faceting or sorting on other fields as well? what does the LukeRequest Handler tell you about the # of distinct terms in each field that you facet on?

-Hoss
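Spelled out, that estimate works out roughly as follows (treating a Java char as 2 bytes; all numbers approximate):

    ord array:  4 bytes/doc * 25,000,000 docs          ≈  95 MB
    term data:  10% * 25,000,000 * 10 chars * 2 bytes  ≈  50 MB
    FieldCache for one faceted string field            ≈ 150 MB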
RE: Facets and running out of Heap Space
> Make sure you have:
>
>     <requestHandler name="/admin/luke" class="org.apache.solr.handler.admin.LukeRequestHandler" />
>
> defined in solrconfig.xml

What's the consequence of me changing the solrconfig.xml file? Doesn't that cause a restart of solr?

> for a large index, this can be very slow but the results are valuable.

In what way? I'm still not clear on what this does for me.

-----Original Message-----
From: Ryan McKinley [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, October 09, 2007 4:01 PM
To: solr-user@lucene.apache.org
Subject: Re: Facets and running out of Heap Space

> > what does the LukeRequest Handler tell you about the # of distinct
> > terms in each field that you facet on?
>
> Where would I find that?

check:
http://wiki.apache.org/solr/LukeRequestHandler

Make sure you have:

    <requestHandler name="/admin/luke" class="org.apache.solr.handler.admin.LukeRequestHandler" />

defined in solrconfig.xml

for a large index, this can be very slow but the results are valuable.

ryan
Re: Facets and running out of Heap Space
David Whalen wrote:

> > Make sure you have:
> >
> >     <requestHandler name="/admin/luke" class="org.apache.solr.handler.admin.LukeRequestHandler" />
> >
> > defined in solrconfig.xml
>
> What's the consequence of me changing the solrconfig.xml file?
> Doesn't that cause a restart of solr?

editing solrconfig.xml does *not* restart solr. But you need to restart solr to see any changes to solrconfig.

> > for a large index, this can be very slow but the results are
> > valuable.
>
> In what way? I'm still not clear on what this does for me.

It gives you all kinds of index statistics - that may or may not be useful in figuring out how big the field caches will need to be. It is just a diagnostics tool, not a fix.

ryan
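Once the handler is registered, a request like the following returns per-field index statistics (the wiki page above describes the parameters and output in detail; host/port here are just the example defaults):

    http://localhost:8983/solr/admin/luke?numTerms=10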
Re: Facets and running out of Heap Space
On 9-Oct-07, at 12:36 PM, David Whalen wrote:

>     <field name="id" type="string" indexed="true" stored="true" />
>     <field name="content_date" type="date" indexed="true" stored="true" />
>     <field name="media_type" type="string" indexed="true" stored="true" />
>     <field name="location" type="string" indexed="true" stored="true" />
>     <field name="country_code" type="string" indexed="true" stored="true" />
>     <field name="text" type="text" indexed="true" stored="true" multiValued="true" />
>     <field name="content_source" type="string" indexed="true" stored="true" />
>     <field name="title" type="string" indexed="true" stored="true" />
>     <field name="site_id" type="string" indexed="true" stored="true" />
>     <field name="journalist_id" type="string" indexed="true" stored="true" />
>     <field name="blog_url" type="string" indexed="true" stored="true" />
>     <field name="created_date" type="date" indexed="true" stored="true" />
>
> I'm sure we could stop storing many of these columns, especially if
> someone told me that would make a big difference.

I don't think that it would make a difference in memory consumption, but storage is certainly not necessary for faceting. Extra stored fields can slow down search if they are large (in terms of bytes), but don't really occupy extra memory, unless they are polluting the doc cache. Does 'text' need to be stored?

> > what does the LukeRequest Handler tell you about the # of distinct
> > terms in each field that you facet on?
>
> Where would I find that? I could probably estimate that myself on a
> per-column basis. It ranges from 4 distinct values for media_type, to
> 30-ish for location, to 200-ish for country_code, to almost 10,000
> for site_id, to almost 100,000 for journalist_id.

Using the filter cache method on things like media type and location will occupy ~2.3MB of memory _per unique value_, so it should be a net win for those (although quite close in space requirements for a 30-ary field on your index size).

-Mike
Re: Facets and running out of Heap Space
> Using the filter cache method on things like media type and location
> will occupy ~2.3MB of memory _per unique value_

Mike, how did you calculate that value? I'm trying to tune my caches, and any equations that could be used to determine some balanced settings would be extremely helpful. I'm in a memory-limited environment, so I can't afford to throw a ton of cache at the problem.

(I don't want to thread-jack, but I'm also wondering whether anyone has any notes on how to tune cache sizes for the filterCache, queryResultCache and documentCache.)

Thanks,
Stu
Re: Facets and running out of Heap Space
On 9-Oct-07, at 7:53 PM, Stu Hood wrote:

> > Using the filter cache method on things like media type and location
> > will occupy ~2.3MB of memory _per unique value_
>
> Mike, how did you calculate that value? I'm trying to tune my caches,
> and any equations that could be used to determine some balanced
> settings would be extremely helpful. I'm in a memory-limited
> environment, so I can't afford to throw a ton of cache at the problem.

8 bits * 25m docs. Note that HashDocSet filters will be smaller (cardinality < 3000).

> (I don't want to thread-jack, but I'm also wondering whether anyone
> has any notes on how to tune cache sizes for the filterCache,
> queryResultCache and documentCache.)

I'll give the usual Solr answer: it depends <g>. For me:

The filterCache is the most important. I want my faceting filters to be there at all times, as well as the common fq's I throw at Solr. I have this bumped up to 4096 or so.

The queryResultCache isn't too important. I'm mostly interested in keeping around a few recent queries, since they tend to be re-executed. There is generally not a whole lot of overlap, though, and I never page very far into the results (10 results over 100 slaves is more than I would typically ever need). Memory usage is quite low, though, so you might have success going nuts with this cache.

documentCache? Make sure this is set to at least maxResults * max concurrent queries, since query processing sometimes assumes that fetching a document earlier in the request will let us retrieve it for free later in the request from the cache. Other than that, it depends on your document usage overlap. If you have a set of documents needed for meta-data storage, it behooves you to make sure these are always cached.

cheers,
-Mike
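For reference, these are the solrconfig.xml entries being discussed; the sizes below are placeholders only, not recommendations:

    <filterCache class="solr.LRUCache" size="4096" initialSize="1024" autowarmCount="512"/>
    <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="256"/>
    <documentCache class="solr.LRUCache" size="1024" initialSize="1024"/>

Per the advice above, the filterCache size is the one to grow first when faceting with the filter-cache method.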
Cache Memory Usage (was: Facets and running out of Heap Space)
Sorry... where do the unique values come into the equation?

Also, you say that the queryResultCache memory usage is very low... how could this be when it is storing the same information as the filterCache, but with the addition of sorting?

Your answers are very helpful, thanks!

Stu Hood
Webmail.us
Re: Cache Memory Usage (was: Facets and running out of Heap Space)
On 9-Oct-07, at 8:28 PM, Stu Hood wrote:

> Sorry... where do the unique values come into the equation?

Faceting. You should have a filterCache >= # unique values in all fields faceted on (using the fieldCache method).

> Also, you say that the queryResultCache memory usage is very low...
> how could this be when it is storing the same information as the
> filterCache, but with the addition of sorting?

Solr caches only the top N documents in the queryResultCache (boosted by queryResultWindowSize), which amounts to 40-odd ints, 40-odd floats, and change.

-Mike
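For context, queryResultWindowSize is also set in solrconfig.xml, e.g. (the value here is illustrative):

    <queryResultWindowSize>50</queryResultWindowSize>

so each cached entry holds only a small window of document ids and scores, which is why the per-entry footprint stays small.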