Superslow search on a single 600MB index segment

Igor Shalyminov Mon, 14 Oct 2013 09:17:20 -0700

Hello!

I'm trying to figure out how to improve search performance for my task.


The index is as follows:
- 29 segments, each about 600 MB;
- in the complete setup, there's one thread per segment searcher;
- the index contains TermVectors with positions and payloads for word-level
fields, and SortedDocValues for document-level fields.
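
For illustration, the thread-per-segment setup described above could be
sketched roughly as follows. This is a minimal sketch using only
java.util.concurrent, not the actual code from this setup: the class name
PerSegmentSearchSketch, the method searchAll, and the placeholder tasks are
all hypothetical, and each Callable stands in for whatever search is run
against one segment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerSegmentSearchSketch {

    /**
     * Runs one task per segment, each on its own thread, and sums the
     * per-segment hit counts. Assumes segmentTasks is non-empty.
     */
    public static long searchAll(List<Callable<Long>> segmentTasks) {
        ExecutorService pool = Executors.newFixedThreadPool(segmentTasks.size());
        try {
            long total = 0;
            // invokeAll blocks until every segment task has finished
            for (Future<Long> f : pool.invokeAll(segmentTasks)) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("segment search failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Callable<Long>> tasks = new ArrayList<>();
        for (int i = 0; i < 29; i++) {
            final long fakeHits = i; // placeholder for a real per-segment search
            tasks.add(() -> fakeHits);
        }
        System.out.println("total hits: " + searchAll(tasks));
    }
}
```

With this layout the wall-clock time of a query is bounded by the slowest
segment, which is why the time for a single segment matters so much.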

I run a SpanNearQuery that I customized to check payloads (the payloads are
just single ints).
I've reduced the whole search logic to iterating through spanQuery.getSpans()
and counting the exact numbers of matched documents and spans, all on a single
645 MB segment (I launched java with -Xmx4G for this particular task).
It takes 25 seconds to complete!
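
To make the reduced test concrete, the counting loop has roughly the shape
below. This is only a sketch under stated assumptions: SpanCursor is a
hypothetical stand-in interface, not the Lucene Spans class, and ArrayCursor
exists purely to make the example runnable without an index. It also assumes
that matches arrive grouped by document id in increasing order.

```java
import java.util.concurrent.TimeUnit;

public class SpanCountSketch {

    /** Hypothetical stand-in for a span enumerator; not the Lucene class. */
    public interface SpanCursor {
        boolean next(); // advance to the next match; false when exhausted
        int doc();      // document id of the current match
    }

    /** Array-backed cursor, used only to make the sketch runnable. */
    public static final class ArrayCursor implements SpanCursor {
        private final int[] docs; // one doc id per match, non-decreasing
        private int pos = -1;

        public ArrayCursor(int[] docs) { this.docs = docs; }

        @Override public boolean next() { return ++pos < docs.length; }
        @Override public int doc() { return docs[pos]; }
    }

    /** Returns {matchedDocs, matchedSpans}; assumes matches are in doc id order. */
    public static long[] count(SpanCursor spans) {
        long docCount = 0;
        long spanCount = 0;
        int lastDoc = -1; // doc ids are non-negative, so -1 means "none seen yet"
        while (spans.next()) {
            spanCount++;
            if (spans.doc() != lastDoc) {
                lastDoc = spans.doc();
                docCount++;
            }
        }
        return new long[] { docCount, spanCount };
    }

    public static void main(String[] args) {
        SpanCursor spans = new ArrayCursor(new int[] {0, 0, 3, 7, 7, 7});
        long t0 = System.nanoTime();
        long[] c = count(spans);
        long ms = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - t0);
        System.out.println("docs=" + c[0] + " spans=" + c[1] + " in " + ms + " ms");
    }
}
```

Since the loop does nothing beyond advancing the cursor and incrementing two
counters, nearly all of the measured time is spent inside the span
enumeration itself.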

I tried loading this index into a RAMDirectory for testing purposes, but the
results are the same (I haven't tried the tmpfs approach yet, though).

Does anyone have ideas on how to speed this up (possibly by delving into
Lucene's internals) while keeping the query results complete?

-- 
Best Regards,
Igor Shalyminov
