I tested with more threads/processes. Indeed, this is completely CPU-bound: running 1 thread gives the same per-query latency as running 4 threads (my box has 4 cores).
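A minimal sketch of this kind of thread-count comparison, not the actual test code: it runs the same CPU-only task from 1 and then 4 threads and prints the mean per-query latency. `LatencyProbe`, `cpuWork`, and `meanLatencyMicros` are hypothetical names, and `cpuWork` is only a stand-in for a real `IndexSearcher.search` call. If the work is CPU-bound and each thread gets its own core, the two numbers should come out roughly equal; a shared bottleneck such as disk I/O or lock contention would typically push the 4-thread latency up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LatencyProbe {
    // Written after each task so the JIT cannot discard the work as dead code.
    static volatile long sink;

    /** Hypothetical stand-in for one search: pure CPU work, no I/O. */
    public static long cpuWork(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc += (acc * 31 + i) % 7;
        }
        return acc;
    }

    /** Runs queriesPerThread tasks on each of `threads` threads; returns mean latency per task in microseconds. */
    public static double meanLatencyMicros(int threads, int queriesPerThread, Runnable query) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Each worker times its own tasks and returns the summed latency in nanoseconds.
            Callable<Long> worker = () -> {
                long total = 0;
                for (int q = 0; q < queriesPerThread; q++) {
                    long start = System.nanoTime();
                    query.run();
                    total += System.nanoTime() - start;
                }
                return total;
            };
            List<Future<Long>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                futures.add(pool.submit(worker));
            }
            long totalNanos = 0;
            for (Future<Long> f : futures) {
                totalNanos += f.get();
            }
            return totalNanos / 1000.0 / ((double) threads * queriesPerThread);
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Runnable query = () -> { sink = cpuWork(2_000_000); };
        for (int threads : new int[] {1, 4}) {
            System.out.printf("%d thread(s): %.0f us per query%n",
                    threads, meanLatencyMicros(threads, 50, query));
        }
    }
}
```

The first iterations include JIT warm-up, so a more careful run would discard an initial batch before timing.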
Given this, is there any way to simplify the scoring computation (I'm only using Lucene as a first-level "rough" search, so search quality is not a huge issue here), so that, for example, fewer fields are evaluated or a simpler scoring function is used?

Thanks,
Yang

On Fri, May 25, 2012 at 5:47 PM, Yang <[email protected]> wrote:

> thanks a lot guys
>
> On Tue, May 22, 2012 at 1:34 AM, Ian Lea <[email protected]> wrote:
>
>> Lots of good tips in
>> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, linked from
>> the FAQ.
>>
>> --
>> Ian.
>>
>> On Tue, May 22, 2012 at 2:08 AM, Li Li <[email protected]> wrote:
>> > something went wrong when writing in my android client.
>> > if RAMDirectory does not help, i think the bottleneck is cpu. you may try to
>> > tune the jvm but i do not expect much improvement.
>> > the best option is splitting your index into 2 or more smaller ones.
>> > you can then use Solr's distributed searching.
>> > if the cpu is not fully used, you can do this on one physical machine
>> >
>> > On 2012-5-22 at 8:50 AM, "Li Li" <[email protected]> wrote:
>> >>
>> >> On 2012-5-22 at 4:59 AM, "Yang" <[email protected]> wrote:
>> >> >
>> >> > I'm trying to make my search faster. right now a query like
>> >> >
>> >> > name:Joe Moe Pizza address:77 main street city:San Francisco
>> >> >is this a conjunction query or a disjunction query?
>> >>
>> >> > in an index with 20mil such short business descriptions (total size
>> >> > about 3GB) takes about 100--200ms.
>> >> >20m is not a small size, how many results for a query on average?
>> >>
>> >> > I profiled the query, most time is spent in TermScorer.score(), as is
>> >> > shown by the attached yourkit screenshot.
>> >> >that's true, for a query, matching and scoring are very time consuming
>> >> >and cpu intensive. another one is io for reading postings.
>> >>
>> >> >
>> >> > I tried loading the index onto tmpfs (in-memory block device), and also
>> >> > tried RAMDirectory, neither helps much.
>> >> >if that is true. it seems that io is not the
>> >> > I am reading
>> >> > http://www.cnlp.org/presentations/slides/AdvancedLuceneEU.pdf
>> >> > it mentions
>> >> > Size
>> >> >   – Stopword removal
>> >> >   – Stemming
>> >> >     • Lucene has a number of stemmers available
>> >> >     • Light versus Aggressive
>> >> >     • May prevent fine-grained matches in some cases
>> >> >   – Not a linear factor (usually) due to index compression
>> >> >
>> >> > so for "stopword removal", I'm already using the standard analyzer, so
>> >> > stop word removal is already included, right?
>> >> >
>> >> > also generally any other tricks to try for reducing the search latency?
>> >> >
>> >> > Thanks!
>> >> > Yang
>> >> >
>> >> > ---------------------------------------------------------------------
>> >> > To unsubscribe, e-mail: [email protected]
>> >> > For additional commands, e-mail: [email protected]
