Re: Is anyone using proxy caching in front of solr?
Multiple caches can have the same hit rate as a single cache if the same query is always sent to the same replica. This works great until a replica goes down. If all the queries are then redistributed, every cache has the wrong content, which is very expensive. Instead, only the queries that went to the down replica need to be redistributed among the up replicas. We learned this the hard way at Infoseek in the late 1990s.

Overall, it is much easier to use a single HTTP cache in front of the cluster.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Feb 25, 2019, at 8:43 AM, Michael Gibney wrote:
>
> Tangentially related, possibly of interest regarding solr-internal cache
> hit ratio (esp. with a lot of replicas):
> https://issues.apache.org/jira/browse/SOLR-13257
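The point about sending each query to the same replica, and moving only the affected queries when a replica fails, can be sketched with rendezvous (highest-random-weight) hashing. This is one possible technique, chosen here for illustration; the thread does not say which scheme was actually used. The replica names and the `pick_replica` helper are hypothetical.

```python
import hashlib


def pick_replica(query: str, replicas: list[str]) -> str:
    """Pick the replica with the highest hash score for this query.

    Each (replica, query) pair gets a stable pseudo-random score, so a
    query always lands on the same replica while that replica is up.
    """
    def score(replica: str) -> int:
        digest = hashlib.sha256(f"{replica}|{query}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return max(replicas, key=score)


# Hypothetical replica names and synthetic queries, for illustration only.
replicas = ["replica-a", "replica-b", "replica-c"]
queries = [f"q{i}" for i in range(1000)]

before = {q: pick_replica(q, replicas) for q in queries}

# Simulate replica-b going down: route over the survivors only.
survivors = [r for r in replicas if r != "replica-b"]
after = {q: pick_replica(q, survivors) for q in queries}

moved = [q for q in queries if before[q] != after[q]]
print(f"{len(moved)} of {len(queries)} queries changed replica")
```

Only the queries that were assigned to the failed replica change destination, so the caches on the surviving replicas stay warm for everything they were already serving.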
Re: Is anyone using proxy caching in front of solr?
Tangentially related, possibly of interest regarding solr-internal cache hit ratio (esp. with a lot of replicas): https://issues.apache.org/jira/browse/SOLR-13257

On Mon, Feb 25, 2019 at 11:33 AM Walter Underwood wrote:

> Don’t worry about one and two character queries, because they will almost
> always be served from cache.
>
> There are only 26 one-letter queries (36 if you use numbers). Almost all
> of those will be in the query results cache and will be very fast with very
> little server load. The common two-letter queries will also be cached.
>
> An external HTTP cache can be effective, especially if you have a lot of
> replicas. The single cache will have a higher hit rate than the individual
> servers.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
Re: Is anyone using proxy caching in front of solr?
Don’t worry about one and two character queries, because they will almost always be served from cache.

There are only 26 one-letter queries (36 if you use numbers). Almost all of those will be in the query results cache and will be very fast with very little server load. The common two-letter queries will also be cached.

An external HTTP cache can be effective, especially if you have a lot of replicas. The single cache will have a higher hit rate than the individual servers.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Feb 25, 2019, at 7:57 AM, Edward Ribeiro wrote:
>
> Maybe you could add a length filter factory to filter out queries with 2 or
> 3 characters using
> https://lucene.apache.org/solr/guide/7_4/filter-descriptions.html#FilterDescriptions-LengthFilter
> ?
>
> PS: this filter requires a max length too.
>
> Edward
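The two claims above (a tiny key space for short queries, and a single cache beating per-server caches) can be checked with a rough simulation. This is a sketch under loud assumptions: caches are unbounded, queries are uniformly random one- and two-letter strings, and requests are spread randomly over the replicas. Real type-ahead traffic is skewed and real caches evict, so the numbers are illustrative only. The `simulate` function and its parameters are hypothetical.

```python
import random
import string

# Size of the key space for very short queries.
one_letter = len(string.ascii_lowercase)                    # letters only
one_alnum = len(string.ascii_lowercase + string.digits)     # letters and digits
two_letter = one_letter ** 2                                # all two-letter strings


def simulate(num_replicas: int, num_requests: int, seed: int = 0):
    """Return (shared_hit_rate, per_replica_hit_rate) for the same traffic."""
    rng = random.Random(seed)
    shared_cache = set()                                  # one cache in front
    replica_caches = [set() for _ in range(num_replicas)] # one cache per replica
    shared_hits = 0
    replica_hits = 0
    for _ in range(num_requests):
        length = rng.choice([1, 2])
        query = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        # Single cache: every request sees the same cache.
        if query in shared_cache:
            shared_hits += 1
        else:
            shared_cache.add(query)
        # Per-replica caches: the load balancer picks a replica at random,
        # so each replica has to warm up on its own.
        cache = replica_caches[rng.randrange(num_replicas)]
        if query in cache:
            replica_hits += 1
        else:
            cache.add(query)
    return shared_hits / num_requests, replica_hits / num_requests


shared_rate, per_replica_rate = simulate(num_replicas=6, num_requests=5000)
print(f"key space: {one_letter} / {one_alnum} one-char, {two_letter} two-letter")
print(f"shared cache hit rate:      {shared_rate:.3f}")
print(f"per-replica cache hit rate: {per_replica_rate:.3f}")
```

With unbounded caches the shared cache misses once per distinct query, while the per-replica setup misses once per distinct (query, replica) pair, so the shared hit rate can never be lower in this model.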
Re: Is anyone using proxy caching in front of solr?
Maybe you could add a length filter factory to filter out queries with 2 or 3 characters, using https://lucene.apache.org/solr/guide/7_4/filter-descriptions.html#FilterDescriptions-LengthFilter?

PS: this filter requires a max length too.

Edward

On Thu, Feb 21, 2019 at 04:52, Furkan KAMACI wrote:

> Hi Joakim,
>
> I suggest you read these resources:
>
> http://lucene.472066.n3.nabble.com/Varnish-td4072057.html
> http://lucene.472066.n3.nabble.com/SolrJ-HTTP-caching-td490063.html
> https://wiki.apache.org/solr/SolrAndHTTPCaches
>
> which give information about HTTP caching, including Varnish Cache and the
> Last-Modified, ETag, Expires, and Cache-Control headers.
>
> Kind Regards,
> Furkan KAMACI
Re: Is anyone using proxy caching in front of solr?
Hi Joakim,

I suggest you read these resources:

http://lucene.472066.n3.nabble.com/Varnish-td4072057.html
http://lucene.472066.n3.nabble.com/SolrJ-HTTP-caching-td490063.html
https://wiki.apache.org/solr/SolrAndHTTPCaches

which give information about HTTP caching, including Varnish Cache and the Last-Modified, ETag, Expires, and Cache-Control headers.

Kind Regards,
Furkan KAMACI

On Wed, Feb 20, 2019 at 11:18 PM Joakim Hansson wrote:

> Hello dear user list!
> I work at a company in retail where we use Solr to perform searches as you
> type.
> As soon as you type more than 1 character in the search field, Solr starts
> serving hits.
> Of course this generates a lot of "unnecessary" queries (in the sense that
> they are never shown to the user), which is why I started thinking about
> using something like Squid or Varnish to cache a bunch of these 2-4
> character queries.
>
> It seems most stuff I find about it is from pretty old sources, but as far
> as I know SolrCloud doesn't have distributed cache support.
>
> Our indexes aren't updated that frequently, about 4-6 times a day. We
> don't use a lot of shards and replicas (biggest index is split into 3 shards
> with 2 replicas). The shards/replicas are not all on the same Solr host.
> Our Solr setup handles around 80-200 queries per second during the day, with
> peaks at >1500 before holiday season and sales.
>
> I haven't really read up on the details yet, but it seems like I could use
> ETags and Expires headers to work around having to do some of that
> "unnecessary" work.
>
> Is anyone doing this? Why? Why not?
>
> - peace!
Is anyone using proxy caching in front of solr?
Hello dear user list!

I work at a company in retail where we use Solr to perform searches as you type. As soon as you type more than 1 character in the search field, Solr starts serving hits. Of course this generates a lot of "unnecessary" queries (in the sense that they are never shown to the user), which is why I started thinking about using something like Squid or Varnish to cache a bunch of these 2-4 character queries.

It seems most stuff I find about it is from pretty old sources, but as far as I know SolrCloud doesn't have distributed cache support.

Our indexes aren't updated that frequently, about 4-6 times a day. We don't use a lot of shards and replicas (biggest index is split into 3 shards with 2 replicas). The shards/replicas are not all on the same Solr host. Our Solr setup handles around 80-200 queries per second during the day, with peaks at >1500 before holiday season and sales.

I haven't really read up on the details yet, but it seems like I could use ETags and Expires headers to work around having to do some of that "unnecessary" work.

Is anyone doing this? Why? Why not?

- peace!
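For readers unfamiliar with the ETag idea mentioned above, here is a minimal sketch of how a validator tied to the index version lets a cache or browser skip repeated work. It models only the header logic, not a real server or Solr's own HTTP caching support. The `make_etag` and `respond` helpers, the `index_version` strings, and the 300-second lifetime are all assumptions made up for this example; `Cache-Control: max-age` is used here in the role the post assigns to `Expires`.

```python
import hashlib

# Assumed freshness lifetime; with an index updated a few times a day,
# the right value is a judgment call, not something the thread specifies.
MAX_AGE_SECONDS = 300


def make_etag(index_version: str, query_string: str) -> str:
    """Build a validator that changes whenever the index or the query changes."""
    raw = f"{index_version}:{query_string}".encode("utf-8")
    return '"' + hashlib.sha256(raw).hexdigest()[:16] + '"'


def respond(request_headers: dict, index_version: str, query_string: str, run_search):
    """Return (status, headers, body), answering 304 when the client's copy is current."""
    etag = make_etag(index_version, query_string)
    headers = {
        "ETag": etag,
        "Cache-Control": f"public, max-age={MAX_AGE_SECONDS}",
    }
    if request_headers.get("If-None-Match") == etag:
        # The cached copy is still valid: no search is executed.
        return 304, headers, b""
    return 200, headers, run_search(query_string)


searches_run = []


def fake_search(query_string: str) -> bytes:
    """Stand-in for the real search backend; records each call."""
    searches_run.append(query_string)
    return b'{"docs": []}'


# First request: full response. Repeat with the validator: 304, no search.
status_first, headers_first, _ = respond({}, "index-v1", "q=sh", fake_search)
revalidate = {"If-None-Match": headers_first["ETag"]}
status_repeat, _, _ = respond(revalidate, "index-v1", "q=sh", fake_search)

# After an index update the old validator no longer matches.
status_after_update, _, _ = respond(revalidate, "index-v2", "q=sh", fake_search)
print(status_first, status_repeat, status_after_update, len(searches_run))
```

Tying the validator to an index version means cached responses are revalidated cheaply between index updates and refreshed automatically after one.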