What are your complex queries? You
say that your app will very rarely see the
same query, so you aren't using caches...
But if you can move some of your
clauses to fq clauses, then the filterCache
might well be used to good effect.
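
As a rough sketch of what I mean (not your actual schema: the field names
year, type and date, and the helper build_select_params, are made up for
illustration), the user-specific keyword or proximity part stays in q while
the clauses that repeat across users move into separate fq parameters. Each
fq is cached on its own in the filterCache, so it can be reused even when
the q part is never the same twice:

```python
from urllib.parse import urlencode


def build_select_params(main_query, filters, sort="date desc", rows=10):
    """Build Solr /select parameters, keeping restricting clauses in fq.

    main_query: the part that is unique per user (scored, rarely repeated).
    filters:    clauses that recur across users (unscored, cacheable).
    """
    params = [("q", main_query)]
    # One fq per clause, so each clause gets its own filterCache entry.
    params += [("fq", clause) for clause in filters]
    params += [("sort", sort), ("rows", str(rows))]
    return params


# Hypothetical example: a proximity query restricted by year and type.
params = build_select_params(
    '"apache solr"~5',
    ["year:[2010 TO 2014]", "type:report"],
)
query_string = urlencode(params)
print(query_string)
```

Whether this helps depends on how often the same restricting clauses
actually recur in your traffic.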



On Thu, Mar 13, 2014 at 7:22 AM, Salman Akram
<salman.ak...@northbaysolutions.net> wrote:
> 1- SOLR 4.6
> 2- We do, but right now I am talking about plain keyword queries just sorted
> by date. Once this is better we will start looking into the caches, which we
> have already changed a little.
> 3- As I said, the contents are not stored in this index. Some other metadata
> fields are, but with normal queries it's super fast, so I guess even if I
> change that it will make only a minor difference. We have SSDs, quite fast ones too.
> 4- That's something we need to do but even in low workload those queries
> take a lot of time
> 5- Every 10 mins, and currently no auto warming as user queries are rarely
> the same, and also once it's fully warmed those queries are still slow.
> 6- Nops.
>
> On Thu, Mar 13, 2014 at 5:38 PM, Dmitry Kan <solrexp...@gmail.com> wrote:
>
>> 1. What is your solr version? In 4.x family the proximity searches have
>> been optimized among other query types.
>> 2. Do you use filter queries? What is the situation with the cache
>> utilization ratios? Optimize (i.e. bump up the respective cache sizes) if
>> you have low hit ratios and many evictions.
>> 3. Can you avoid storing some fields and only index them? When a field is
>> stored and it is retrieved in the result, there are a couple of disk seeks
>> per field, so search slows down. Consider SSD disks.
>> 4. Do you monitor your system in terms of RAM / cache stats / GC? Do you
>> observe STW GC pauses?
>> 5. How often do you commit & do you have the autowarming / external warming
>> configured?
>> 6. If you use faceting, consider storing DocValues for facet fields.
>>
>> some solr wiki docs:
>>
>> https://wiki.apache.org/solr/SolrPerformanceProblems?highlight=%28%28SolrPerformanceFactors%29%29
>>
>>
>>
>>
>>
>> On Thu, Mar 13, 2014 at 8:52 AM, Salman Akram <
>> salman.ak...@northbaysolutions.net> wrote:
>>
>> > Well some of the searches take minutes.
>> >
>> > Below are some stats about this particular index that I am talking about:
>> >
>> > Index size = 400GB (Using CommonGrams so without that the index is around
>> > 180GB)
>> > Position File = 280GB
>> > Total Docs = 170 million (just indexed for searching - for highlighting
>> > contents are stored in another index)
>> > Avg Doc Size = Few hundred KBs
>> > RAM = 384GB (it has other indexes too but still OS cache can have 60-80% of
>> > the total index cached)
>> >
>> > Phrase queries run pretty fast with CG but complex versions of wildcard and
>> > proximity queries can be really slow. I know using CG will make them slow
>> > but they just take too long. By default sorting is on date but users have
>> > a few other parameters too on which they can sort.
>> >
>> > I wanted to avoid creating multiple indexes (maybe based on years) but
>> > seems that to search on partial data that's the only feasible way.
>> >
>> >
>> >
>> >
>> > On Wed, Mar 12, 2014 at 2:47 PM, Dmitry Kan <solrexp...@gmail.com> wrote:
>> >
>> > > As Hoss pointed out above, different projects have different requirements.
>> > > Some want to sort by date of ingestion in reverse, which means that having
>> > > posting lists organized in reverse order with early termination is
>> > > the way to go (no such feature in Solr directly). Some other projects want
>> > > to collect all docs matching a query and then sort by rank, but you cannot
>> > > guarantee that the most recently inserted document is the most relevant in
>> > > terms of your ranking.
>> > >
>> > >
>> > > Do your current searches take too long?
>> > >
>> > >
>> > > On Tue, Mar 11, 2014 at 11:51 AM, Salman Akram <
>> > > salman.ak...@northbaysolutions.net> wrote:
>> > >
>> > > > It's a long video and I will definitely go through it, but it seems this is
>> > > > not possible with SOLR as it is?
>> > > >
>> > > > I just thought it would be quite a common issue; I mean generally for
>> > > > search engines it's more important to show the first page of results, rather
>> > > > than using timeAllowed, which might not even return a single result.
>> > > >
>> > > > Thanks!
>> > > >
>> > > >
>> > > > --
>> > > > Regards,
>> > > >
>> > > > Salman Akram
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Dmitry
>> > > Blog: http://dmitrykan.blogspot.com
>> > > Twitter: http://twitter.com/dmitrykan
>> > >
>> >
>> >
>> >
>> > --
>> > Regards,
>> >
>> > Salman Akram
>> >
>>
>>
>>
>> --
>> Dmitry
>> Blog: http://dmitrykan.blogspot.com
>> Twitter: http://twitter.com/dmitrykan
>>
>
>
>
> --
> Regards,
>
> Salman Akram
