Re: Filter cache pollution during sharded edismax queries

2014-10-08 Thread Charlie Hull
On 01/10/2014 09:55, jim ferenczi wrote: I think you should test with facet.shard.limit=-1; this will disable the facet limit on the shards and remove the need for facet refinements. I bet that returning every facet with a count greater than 0 on internal queries is cheaper than using

Re: Filter cache pollution during sharded edismax queries

2014-10-01 Thread Charlie Hull
On 30/09/2014 22:25, Erick Erickson wrote: Just from a 20,000 ft. view, using the filterCache this way seems...odd. +1 for using a different cache, but that's being quite unfamiliar with the code. Here's a quick update: 1. LFUCache performs worse so we returned to LRUCache 2. Making the

Re: Filter cache pollution during sharded edismax queries

2014-10-01 Thread jim ferenczi
I think you should test with facet.shard.limit=-1; this will disable the facet limit on the shards and remove the need for facet refinements. I bet that returning every facet with a count greater than 0 on internal queries is cheaper than using the filter cache to handle a lot of

Re: Filter cache pollution during sharded edismax queries

2014-10-01 Thread Chris Hostetter
: +1 for using a different cache, but that's being quite unfamiliar with the : code. in (a) common case, people tend to drill down and filter on facet constraints -- so using a special purpose cache for the refinements would result in redundant caching of the same info in multiple places. :

RE: Filter cache pollution during sharded edismax queries

2014-10-01 Thread Toke Eskildsen
From: Charlie Hull [char...@flax.co.uk]: We've just found a very similar issue at a client installation. They have around 27 million documents and are faceting on fields with high cardinality, and are unhappy with query performance and the server hardware necessary to make this performance

Re: Filter cache pollution during sharded edismax queries

2014-10-01 Thread Mikhail Khludnev
Hoss, Nice to hear from you! I wonder if there is a sequence chart, or maybe a deck, which explains the whole picture of distributed search, especially these ones? If it hasn't been presented to the community so far, I'm aware of one conference which can accept such a talk. WDYT? On Wed, Oct 1, 2014 at

Re: Filter cache pollution during sharded edismax queries

2014-09-30 Thread Charlie Hull
Hi, We've just found a very similar issue at a client installation. They have around 27 million documents and are faceting on fields with high cardinality, and are unhappy with query performance and the server hardware necessary to make this performance acceptable. Last night we noticed the

Re: Filter cache pollution during sharded edismax queries

2014-09-30 Thread Alan Woodward
A bit of digging shows that the extra entries in the filter cache are added when getting facets from a distributed search. Once all the facets have been gathered, the co-ordinating node then asks the subnodes for an exact count for the final top-N facets, and the path for executing this goes
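
The effect described in this message can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Solr code: the `LRUCache` class, the key names, the cache capacity, and the idea that each refinement term becomes one cache entry are all hypothetical stand-ins chosen to show how one-off refinement entries can evict the reusable filter entries and pull the hit ratio down.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache that counts hits and lookups (a stand-in, not Solr's LRUCache)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.lookups = 0

    def get_or_compute(self, key):
        """Look up a key; on a miss, insert it and evict the least recently used entry."""
        self.lookups += 1
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)  # mark as most recently used
            return
        self.entries[key] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used entry

    def hit_ratio(self):
        return self.hits / self.lookups if self.lookups else 0.0


def run(num_requests, refinement_terms_per_request, capacity=8):
    """Replay requests that reuse four filters and add unique refinement entries."""
    cache = LRUCache(capacity)
    # Hypothetical user filters that recur on every request.
    user_filters = ["fq:type=book", "fq:lang=en", "fq:instock=true", "fq:year=2014"]
    for i in range(num_requests):
        for fq in user_filters:
            cache.get_or_compute(fq)
        # Hypothetical refinement lookups: one unique key per refined term.
        for t in range(refinement_terms_per_request):
            cache.get_or_compute(f"refine:request{i}:term{t}")
    return cache.hit_ratio()


if __name__ == "__main__":
    for terms in (0, 2, 10):
        print(terms, "refinement terms per request ->", round(run(100, terms), 2))
```

In this toy setup the hit ratio is high when no refinement entries are added, drops once each request adds a couple of unique entries, and collapses when the refinement entries per request exceed the cache capacity, because the reusable filters are evicted before the next request arrives.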

Re: Filter cache pollution during sharded edismax queries

2014-09-30 Thread Shawn Heisey
On 9/30/2014 4:38 AM, Charlie Hull wrote: We've just found a very similar issue at a client installation. They have around 27 million documents and are faceting on fields with high cardinality, and are unhappy with query performance and the server hardware necessary to make this performance

Re: Filter cache pollution during sharded edismax queries

2014-09-30 Thread Mikhail Khludnev
Hello, I already saw such discussion, but want to confirm. On Tue, Sep 30, 2014 at 2:59 PM, Alan Woodward a...@flax.co.uk wrote: Once all the facets have been gathered, the co-ordinating node then asks the subnodes for an exact count for the final top-N facets, What's the point to refine

Re: Filter cache pollution during sharded edismax queries

2014-09-30 Thread Alan Woodward
Once all the facets have been gathered, the co-ordinating node then asks the subnodes for an exact count for the final top-N facets, What's the point to refine these counts? I thought that it makes sense only for facet.limit-ed requests. Is that a correct statement? can those who suffer

Re: Filter cache pollution during sharded edismax queries

2014-09-30 Thread Erick Erickson
Just from a 20,000 ft. view, using the filterCache this way seems...odd. +1 for using a different cache, but that's being quite unfamiliar with the code. On Tue, Sep 30, 2014 at 1:53 PM, Alan Woodward a...@flax.co.uk wrote: Once all the facets have been gathered, the co-ordinating node

Re: Filter cache pollution during sharded edismax queries

2013-10-18 Thread Anca Kopetz
Hi Ken, Have you managed to find out why these entries were stored in the filterCache and whether they have an impact on the hit ratio? We noticed the same problem; there are entries of this type: item_+(+(title:western^10.0 | ... in our filterCache. Thanks, Anca On 07/02/2013 09:01 PM, Ken

Re: Filter cache pollution during sharded edismax queries

2013-08-28 Thread Chris Hostetter
Ken ... i'm not really sure i'm understanding what you're trying to describe. can you give the full details of a concrete example of what you are seeing? * full requestHandler config * example of query issued by client * every request logged on each shard * contents of filterCache and

Re: Filter cache pollution during sharded edismax queries

2013-08-27 Thread Otis Gospodnetic
Hi Ken, JIRA is kind of stuffed. I'd imagine showing more proof on the ML may be more effective. Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Tue, Aug 27, 2013 at 4:32 AM, Ken Krugler kkrugler_li...@transpac.com wrote: Hi

Re: Filter cache pollution during sharded edismax queries

2013-08-26 Thread Ken Krugler
Hi Otis, Sorry I missed your reply, and thanks for trying to find a similar report. Wondering if I should file a Jira issue? That might get more attention :) -- Ken On Jul 5, 2013, at 1:05pm, Otis Gospodnetic wrote: Hi Ken, Uh, I left this email until now hoping I could find you a

Re: Filter cache pollution during sharded edismax queries

2013-07-05 Thread Otis Gospodnetic
Hi Ken, Uh, I left this email until now hoping I could find you a reference to similar reports, but I can't find them now. I am quite sure I saw somebody with a similar report within the last month. Plus, several people have reported issues with performance dropping when they went from 3.x to

Filter cache pollution during sharded edismax queries

2013-07-02 Thread Ken Krugler
Hi all, After upgrading from Solr 3.5 to 4.2.1, I noticed our filterCache hit ratio had dropped significantly. Previously it was at 95+%, but now it's 50%. I enabled recording 100 entries for debugging, and in looking at them it seems that edismax (and faceting) is creating entries for me.