Piotr Kosiorowski wrote:

Hi,
Some time ago I started to think about implementing a special kind of Lucene
Query (if I remember correctly, I would have to write my own Scorer and
probably a few other classes) optimized for Nutch. I assumed that with a
specialized query I would be able to avoid accessing some of the Lucene index
structures multiple times, since the same term appears many times in the query
generated by Nutch for multi-token queries. I am not a Lucene expert, but
maybe it is worth checking whether it might give some performance boost. Does
anyone have any ideas why it might help or not?

That's a very good comment. Looking at the profile traces I can see that a lot of time is spent just juggling the sub-query scorers inside the BooleanScorer, and handling the complex query structure; if this part could be optimized by the use of a special scorer, it could be a big win.
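
To make the idea concrete, here is a minimal sketch in plain Java. It does not use the Lucene API, and it is not how Nutch or BooleanScorer currently work; the class name SharedPostingsSketch, the readPostings helper, and the "field:term" string keys are all hypothetical stand-ins. It only illustrates the assumption behind the proposal: if the same term shows up in several clauses of the generated query, a specialized scorer could look up its postings once and share them.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SharedPostingsSketch {
    // Counts how often the simulated index is actually read.
    static int indexReads = 0;

    // Hypothetical stand-in for an index lookup: returns the document ids
    // stored under a "field:term" key, or an empty array if there are none.
    static int[] readPostings(Map<String, int[]> index, String key) {
        indexReads++;
        return index.getOrDefault(key, new int[0]);
    }

    // Resolves every clause of a query, reading each distinct key only once.
    // Repeated clauses reuse the cached postings instead of hitting the index.
    static Map<String, int[]> resolve(Map<String, int[]> index, List<String> clauses) {
        Map<String, int[]> cache = new HashMap<>();
        for (String clause : clauses) {
            cache.computeIfAbsent(clause, k -> readPostings(index, k));
        }
        return cache;
    }

    public static void main(String[] args) {
        // Toy index with made-up contents (assumption, for illustration only).
        Map<String, int[]> index = new HashMap<>();
        index.put("content:apache", new int[] {1, 4, 7});
        index.put("content:lucene", new int[] {4, 9});
        index.put("title:apache", new int[] {4});

        // Five clauses, but only three distinct field:term pairs.
        List<String> clauses = Arrays.asList(
            "content:apache", "content:lucene", "content:apache",
            "title:apache", "content:lucene");

        Map<String, int[]> postings = resolve(index, clauses);
        System.out.println("clauses: " + clauses.size()
            + ", distinct terms: " + postings.size()
            + ", index reads: " + indexReads);
    }
}
```

Whether this kind of sharing pays off in practice would depend on how much of the time seen in the profile goes to repeated term lookups as opposed to the scorer bookkeeping itself, so it would need to be measured.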

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

