Below is one of the sample slow queries; it takes minutes!

((stock or share*) w/10 (sale or sell* or sold or bought or buy* or purchase* or repurchase*)) w/10 (executive or director)

If a filter is used it comes in fq, but what can be done about plain keyword search?

On Sun, Mar 16, 2014 at 4:37 AM, Erick Erickson <erickerick...@gmail.com> wrote:

> What are your complex queries? You say that your app will very rarely
> see the same query, thus you aren't using caches... But, if you can
> move some of your clauses to fq clauses, then the filterCache might
> well be used to good effect.
>
> On Thu, Mar 13, 2014 at 7:22 AM, Salman Akram
> <salman.ak...@northbaysolutions.net> wrote:
>
> > 1- SOLR 4.6
> > 2- We do, but right now I am talking about plain keyword queries just
> > sorted by date. Once this is better we will start looking into caches,
> > which we already changed a little.
> > 3- As I said, the contents are not stored in this index. Some other
> > metadata fields are, but with normal queries it's super fast, so I
> > guess even if I change that it will be a minor difference. We have
> > SSDs, quite fast ones too.
> > 4- That's something we need to do, but even under low workload those
> > queries take a lot of time.
> > 5- Every 10 mins, and currently no autowarming, as user queries are
> > rarely the same and also once it's fully warmed those queries are
> > still slow.
> > 6- Nope.
> >
> > On Thu, Mar 13, 2014 at 5:38 PM, Dmitry Kan <solrexp...@gmail.com> wrote:
> >
> >> 1. What is your Solr version? In the 4.x family the proximity
> >> searches have been optimized, among other query types.
> >> 2. Do you use filter queries? What is the situation with the cache
> >> utilization ratios? Optimize (i.e. bump up the respective cache
> >> sizes) if you have low hit ratios and many evictions.
> >> 3. Can you avoid storing some fields and only index them? When a
> >> field is stored and it is retrieved in the result, there are a couple
> >> of disk seeks per field => search slows down. Consider SSD disks.
> >> 4. Do you monitor your system in terms of RAM / cache stats / GC? Do
> >> you observe STW GC pauses?
> >> 5. How often do you commit, and do you have autowarming / external
> >> warming configured?
> >> 6. If you use faceting, consider storing DocValues for facet fields.
> >>
> >> Some Solr wiki docs:
> >>
> >> https://wiki.apache.org/solr/SolrPerformanceProblems?highlight=%28%28SolrPerformanceFactors%29%29
> >>
> >> On Thu, Mar 13, 2014 at 8:52 AM, Salman Akram
> >> <salman.ak...@northbaysolutions.net> wrote:
> >>
> >> > Well, some of the searches take minutes.
> >> >
> >> > Below are some stats about the particular index that I am talking
> >> > about:
> >> >
> >> > Index size = 400GB (using CommonGrams; without that the index is
> >> > around 180GB)
> >> > Position file = 280GB
> >> > Total docs = 170 million (just indexed for searching - for
> >> > highlighting, contents are stored in another index)
> >> > Avg doc size = a few hundred KBs
> >> > RAM = 384GB (it has other indexes too, but the OS cache can still
> >> > hold 60-80% of the total index)
> >> >
> >> > Phrase queries run pretty fast with CG, but complex versions of
> >> > wildcard and proximity queries can be really slow. I know using CG
> >> > will make them slow, but they just take too long. By default sorting
> >> > is on date, but users have a few other parameters on which they can
> >> > sort.
> >> >
> >> > I wanted to avoid creating multiple indexes (maybe based on years),
> >> > but it seems that to search on partial data that's the only feasible
> >> > way.
> >> >
> >> > On Wed, Mar 12, 2014 at 2:47 PM, Dmitry Kan <solrexp...@gmail.com>
> >> > wrote:
> >> >
> >> > > As Hoss pointed out above, different projects have different
> >> > > requirements. Some want to sort by date of ingestion in reverse,
> >> > > which means that having posting lists organized in reverse order
> >> > > with early termination is the way to go (no such feature in Solr
> >> > > directly). Some other projects want to collect all docs matching a
> >> > > query and then sort by rank, but you cannot guarantee that the most
> >> > > recently inserted document is the most relevant in terms of your
> >> > > ranking.
> >> > >
> >> > > Do your current searches take too long?
> >> > >
> >> > > On Tue, Mar 11, 2014 at 11:51 AM, Salman Akram
> >> > > <salman.ak...@northbaysolutions.net> wrote:
> >> > >
> >> > > > It's a long video and I will definitely go through it, but it
> >> > > > seems this is not possible with SOLR as it is?
> >> > > >
> >> > > > I just thought it would be quite a common issue; I mean,
> >> > > > generally for search engines it's more important to show the
> >> > > > first page of results, rather than using timeAllowed, which might
> >> > > > not even return a single result.
> >> > > >
> >> > > > Thanks!
> >> > > >
> >> > > > --
> >> > > > Regards,
> >> > > >
> >> > > > Salman Akram
> >> > >
> >> > > --
> >> > > Dmitry
> >> > > Blog: http://dmitrykan.blogspot.com
> >> > > Twitter: http://twitter.com/dmitrykan
> >> >
> >> > --
> >> > Regards,
> >> >
> >> > Salman Akram
> >>
> >> --
> >> Dmitry
> >> Blog: http://dmitrykan.blogspot.com
> >> Twitter: http://twitter.com/dmitrykan
> >
> > --
> > Regards,
> >
> > Salman Akram

--
Regards,

Salman Akram
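
For reference, a minimal sketch of what Erick's fq suggestion looks like at the request level: the keyword expression stays in q, while any clause that is likely to repeat across users (here a date range) goes into fq so the filterCache can reuse it. The host, core name, `date` field, and the range value are assumptions for illustration only, not details from this thread, and the keyword expression is passed through as an opaque string since the parser behind the `w/10` syntax is not stated. `q`, `fq`, `sort`, `rows` and `timeAllowed` are standard Solr request parameters.

```python
from urllib.parse import urlencode, parse_qs, urlparse

# The slow query from this thread, treated as an opaque string.
SLOW_QUERY = (
    "((stock or share*) w/10 (sale or sell* or sold or bought or buy* "
    "or purchase* or repurchase*)) w/10 (executive or director)"
)


def build_select_url(base_url, keyword_query, filters,
                     sort="date desc", rows=10, time_allowed_ms=None):
    """Build a Solr select URL with the keyword part in q and reusable clauses in fq."""
    params = [("q", keyword_query)]
    # Each fq is cached independently, so repeated filters can hit the filterCache
    # even when the q part is different for every user.
    for clause in filters:
        params.append(("fq", clause))
    params.append(("sort", sort))
    params.append(("rows", str(rows)))
    if time_allowed_ms is not None:
        # Optional: as noted in the thread, this may return partial or no results.
        params.append(("timeAllowed", str(time_allowed_ms)))
    return base_url + "?" + urlencode(params)


url = build_select_url(
    "http://localhost:8983/solr/collection1/select",  # hypothetical host and core
    SLOW_QUERY,
    filters=["date:[2013-01-01T00:00:00Z TO *]"],      # hypothetical field and range
    time_allowed_ms=5000,
)

# Round-trip the URL to confirm the parameters survive encoding.
parsed = parse_qs(urlparse(url).query)
print(parsed["q"][0])
print(parsed["fq"])
```

This only changes how the request is shaped. Whether it helps the plain keyword case depends on how often the same filter clauses recur between commits, which would need to be measured on the real index.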