Hi Dmitry,

I have Solr 4.3, and every query is distributed and merged back for ranking purposes.
What do you mean by frontend solr?

On Mon, Sep 9, 2013 at 2:12 PM, Dmitry Kan <solrexp...@gmail.com> wrote:

> are you querying your shards via a frontend solr? We have noticed, that
> querying becomes much faster if results merging can be avoided.
>
> Dmitry
>
>
> On Sun, Sep 8, 2013 at 6:56 PM, Manuel Le Normand <
> manuel.lenorm...@gmail.com> wrote:
>
> > Hello all
> > Looking at the 10% slowest queries, I get very bad performance (~60 sec
> > per query).
> > These queries have lots of conditions on my main field (more than a
> > hundred), including phrase queries and rows=1000. I do return only ids
> > though.
> > I can quite firmly say that this bad performance is due to a slow
> > storage issue (that is beyond my control for now). Despite this I want
> > to improve my performance.
> >
> > As taught in school, I started profiling these queries and the data of
> > a ~1 minute profile is located here:
> > http://picpaste.com/pics/IMG_20130908_132441-ZyrfXeTY.1378637843.jpg
> >
> > Main observation: most of the time I wait for readVInt, whose stack
> > trace (2 out of 2 thread dumps) is:
> >
> > catalina-exec-3870 - Thread t@6615
> > java.lang.Thread.State: RUNNABLE
> > at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
> > at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock(BlockTreeTermsReader.java:2357)
> > at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.seekExact(BlockTreeTermsReader.java:1745)
> > at org.apache.lucene.index.TermContext.build(TermContext.java:95)
> > at org.apache.lucene.search.PhraseQuery$PhraseWeight.<init>(PhraseQuery.java:221)
> > at org.apache.lucene.search.PhraseQuery.createWeight(PhraseQuery.java:326)
> > at org.apache.lucene.search.BooleanQuery$BooleanWeight.<init>(BooleanQuery.java:183)
> > at org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:384)
> > at org.apache.lucene.search.BooleanQuery$BooleanWeight.<init>(BooleanQuery.java:183)
> > at org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:384)
> > at org.apache.lucene.search.BooleanQuery$BooleanWeight.<init>(BooleanQuery.java:183)
> > at org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:384)
> > at org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:675)
> > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
> >
> >
> > So I do actually wait for IO as expected, but I might be page faulting
> > too many times while looking for the term blocks (tim file), i.e.
> > locating the term.
> > As I reindex now, would it be useful to lower the termInterval
> > (default 128)? As the FSTs (tip files) are that small (a few 10-100 MB),
> > so there is no memory contention, could I lower this param to 8, for
> > example? The benefit of lowering the term interval would be to force
> > the FST into memory (JVM - thanks to the NRTCachingDirectory), as I do
> > not control the term dictionary file (OS caching loads an average of
> > 6% of it).
> >
> >
> > General configs:
> > solr 4.3
> > 36 shards, each has a few million docs
> > These 36 servers (each server has 2 replicas) run virtualized, with
> > 16GB memory each (4GB for the JVM, 12GB remain for OS caching),
> > consuming 260GB of disk mounted for the index files.
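
As a rough sanity check on the memory figures quoted above, here is a small sketch of how much of the index the OS page cache could hold at best. It assumes the 260GB of index files is per server, which the quoted message does not state, and the function name is only illustrative.

```python
# Back-of-the-envelope page cache coverage, using figures from the quoted message.
# Assumption: the 260GB of index files is per server; the message does not say so.


def cache_coverage(os_cache_gb: float, index_gb: float) -> float:
    """Return the fraction of the on-disk index that fits in the OS page cache."""
    if index_gb <= 0:
        raise ValueError("index_gb must be positive")
    # Coverage cannot exceed 100% even if the cache is larger than the index.
    return min(1.0, os_cache_gb / index_gb)


total_ram_gb = 16.0  # memory per virtual server
jvm_heap_gb = 4.0    # reserved for the JVM
index_gb = 260.0     # index files on disk (assumed per server)

os_cache_gb = total_ram_gb - jvm_heap_gb  # what is left for OS caching
coverage = cache_coverage(os_cache_gb, index_gb)
print(f"OS cache can hold about {coverage:.1%} of the index")
```

Under that assumption the cache covers under 5% of the index, the same order of magnitude as the ~6% figure mentioned for the term dictionary file, so most term lookups would be expected to hit disk.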