Re: FilterCache - maximum size of document set

2012-06-15 Thread Erick Erickson
Test first, of course, but a slave on 3.6 and a master on 3.5 should be fine.
If you're getting evictions with the cache settings that high, you really
want to look at why.

Note in particular that using NOW in your filter queries virtually guarantees
that they won't be re-used, as per the link I sent yesterday.

Best
Erick


Re: FilterCache - maximum size of document set

2012-06-15 Thread Pawel Rog
Thanks
I don't use NOW in queries. All my timestamp filters are rounded to hundreds
of seconds to increase the hit rate. The only problem could be the price
filters, which can vary a lot (users are unpredictable :P). But moving those
filters out of fq, or setting cache=false, is also a bad idea ... I checked
it :) Load rose three times :)
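
To be concrete, the two kinds of filters look roughly like this (field names
and values here are made up for illustration):

  fq=timestamp:[1339700000 TO 1339700100]   <- endpoints rounded to 100 s
  fq={!cache=false}price:[* TO 10000]       <- the variant where load tripled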

--
Pawel


Re: FilterCache - maximum size of document set

2012-06-14 Thread Erick Erickson
Hmmm, your maxSize is pretty high; it may just be that you've set this much
higher than is wise. The maxSize setting governs the number of entries. I'd
start with a much lower number here, and monitor the solr/admin page for both
hit ratio and evictions. Well, and size too. 16,000 entries puts a ceiling
of, what, 48G on it? Ouch! It sounds like what's happening here is that
you're just accumulating more and more fqs over the course of the evening
and blowing memory.
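
That figure is simple arithmetic, assuming a full bitset per entry on your
25M-doc index:

  25,000,000 docs / 8 bits per byte = ~3 MB per cached filter
  16,000 entries x ~3 MB            = ~48 GB ceiling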

Not all FQs will be that big; there are some heuristics in there to just
store the document numbers for sparse filters, but maxDocs/8 is pretty much
the upper bound.

Evictions are not necessarily a bad thing; the hit ratio is what's important
here. And if you're using a bare NOW in your filter queries, you're probably
never re-using them anyway, see:
http://www.lucidimagination.com/blog/2012/02/23/date-math-now-and-filter-queries/
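
A quick sketch with a hypothetical timestamp field; date math rounds the
endpoints so the fq string (and therefore the cache key) repeats:

  fq=timestamp:[NOW-1DAY TO NOW]           <- bare NOW, new entry per request
  fq=timestamp:[NOW/DAY-1DAY TO NOW/DAY]   <- rounded, re-used all day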

I really question whether this limit is reasonable, but you know your
situation best.

Best
Erick

On Wed, Jun 13, 2012 at 5:40 PM, Pawel Rog pawelro...@gmail.com wrote:
 Thanks for your response
 Yes, maybe you are right. I thought that filters can be larger than 3M. All
 kinds of filters uses BitSet?
 Moreover maxSize of filterCache is set to 16000 in my case. There are
 evictions during day traffic
 but not during night traffic.

 Version of Solr which I use is 3.5

 I haven't used Memory Anayzer yet. Could you write more details about it?

 --
 Regards,
 Pawel

 On Wed, Jun 13, 2012 at 10:55 PM, Erick Erickson 
 erickerick...@gmail.comwrote:

 Hmmm, I think you may be looking at the wrong thing here. Generally, a
 filterCache
 entry will be maxDocs/8 (plus some overhead), so in your case they really
 shouldn't be all that large, on the order of 3M/filter. That shouldn't
 vary based
 on the number of docs that match the fq, it's just a bitset. To see if
 that makes any
 sense, take a look at the admin page and the number of evictions in
 your filterCache. If
 that is  0, you're probably using all the memory you're going to in
 the filterCache during
 the day..

 But you haven't indicated what version of Solr you're using, I'm going
 from a
 relatively recent 3x knowledge-base.

 Have you put a memory analyzer against your Solr instance to see where
 the memory
 is being used?

 Best
 Erick

 On Wed, Jun 13, 2012 at 1:05 PM, Pawel pawelmis...@gmail.com wrote:
  Hi,
  I have solr index with about 25M documents. I optimized FilterCache size
 to
  reach the best performance (considering traffic characteristic that my
 Solr
  handles). I see that the only way to limit size of a Filter Cace is to
 set
  number of document sets that Solr can cache. There is no way to set
 memory
  limit (eg. 2GB, 4GB or something like that). When I process a standard
  trafiic (during day) everything is fine. But when Solr handle night
 traffic
  (and the charateristic of requests change) some problems appear. There is
  JVM out of memory error. I know what is the reason. Some filters on some
  fields are quite poor filters. They returns 15M of documents or even
 more.
  You could say 'Just put that into q'. I tried to put that filters into
  Query part but then, the statistics of request processing time (during
  day) become much worse. Reduction of Filter Cache maxSize is also not
 good
  solution because during day cache filters are very very helpful.
  You could be interested in type of filters that I use. These are range
  filters (I tried standard range filters and frange) - eg. price:[* TO
  1]. Some fq with price can return few thousands of results (eg.
  price:[40 TO 50]), but some (eg. price:[* TO 1]) can return milions
 of
  documents. I'd also like to avoid solution which will introduce strict
  ranges that user can choose.
  Have you any suggestions what can I do? Is there any way to limit for
  example maximum size of docSet which is cached in FilterCache?
 
  --
  Pawel



Re: FilterCache - maximum size of document set

2012-06-14 Thread Pawel Rog
It may be true that the filter cache max size is set too high. It is also
true that we looked at evictions and hit rate earlier. Maybe you are right
that evictions are not always unwanted. Some time ago we ran tests: there is
not much difference in hit rate between a filters maxSize of 4000 (hit rate
about 85%) and 16000 (hit rate about 91%). I think that using an LFU cache
could also be helpful, but that forces me to migrate to 3.6. Do you think it
is reasonable to run the slave on version 3.6 and the master on 3.5?
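
If we migrate, I suppose the 3.6 config would look something like this
(solr.LFUCache being the cache class added in 3.6; the numbers are just
illustrative):

  <filterCache class="solr.LFUCache"
               size="16000"
               initialSize="16000"
               autowarmCount="512"/>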

Once again, thanks for your help

--
Pawel


Re: FilterCache - maximum size of document set

2012-06-13 Thread Erick Erickson
Hmmm, I think you may be looking at the wrong thing here. Generally, a
filterCache entry will be maxDocs/8 bytes (plus some overhead), so in your
case they really shouldn't be all that large, on the order of 3M per filter.
That shouldn't vary based on the number of docs that match the fq; it's just
a bitset. To see if that makes any sense, take a look at the admin page and
the number of evictions in your filterCache. If that is > 0, you're probably
using all the memory you're going to in the filterCache during the day.
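
For reference, the knob in question lives in solrconfig.xml; the class and
numbers below are in the style of the stock example, not a recommendation:

  <filterCache class="solr.FastLRUCache"
               size="512"
               initialSize="512"
               autowarmCount="128"/>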

But you haven't indicated what version of Solr you're using; I'm going from
a relatively recent 3.x knowledge base.

Have you put a memory analyzer against your Solr instance to see where the
memory is being used?
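
If not, a cheap first step (assuming a HotSpot JVM) is to take a heap dump
and open it in something like the Eclipse Memory Analyzer:

  jmap -dump:format=b,file=solr-heap.hprof <solr-pid>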

Best
Erick

On Wed, Jun 13, 2012 at 1:05 PM, Pawel pawelmis...@gmail.com wrote:
 Hi,
 I have a Solr index with about 25M documents. I optimized the FilterCache
 size to reach the best performance (considering the traffic characteristics
 that my Solr handles). I see that the only way to limit the size of a
 filter cache is to set the number of document sets that Solr can cache.
 There is no way to set a memory limit (e.g. 2GB, 4GB, or something like
 that). When I process standard traffic (during the day) everything is fine.
 But when Solr handles night traffic (and the characteristics of the
 requests change) some problems appear: a JVM out-of-memory error. I know
 the reason. Some filters on some fields are quite poor filters. They return
 15M documents or even more. You could say 'Just put that into q'. I tried
 to put those filters into the query part, but then the statistics of
 request processing time (during the day) became much worse. Reducing the
 filterCache maxSize is also not a good solution, because during the day
 cached filters are very, very helpful.
 You could be interested in the type of filters that I use. These are range
 filters (I tried standard range filters and frange), e.g. price:[* TO 1].
 Some fq with price can return a few thousand results (e.g. price:[40 TO
 50]), but some (e.g. price:[* TO 1]) can return millions of documents. I'd
 also like to avoid a solution which introduces fixed ranges that the user
 can choose from.
 Do you have any suggestions about what I can do? Is there any way to limit,
 for example, the maximum size of a docSet which is cached in the
 FilterCache?

 --
 Pawel


Re: FilterCache - maximum size of document set

2012-06-13 Thread Pawel Rog
Thanks for your response.
Yes, maybe you are right. I thought that filters could be larger than 3M. Do
all kinds of filters use a BitSet?
Moreover, the maxSize of the filterCache is set to 16000 in my case. There
are evictions during day traffic but not during night traffic.

The version of Solr which I use is 3.5.

I haven't used a memory analyzer yet. Could you write more details about it?

--
Regards,
Pawel
