Dag Lem <[email protected]> writes: [...]
> Some observations:
>
> * Lucy::Search::IndexSearcher::top_docs (used by SearchServer) is
>   about twice as slow as Lucy::Search::Searcher::hits (used by
>   IndexSearcher).
> * The time used for object serialization with sharding surpasses the
>   time spent in Lucy::Search::Searcher::hits without sharding, and
>   scales with query complexity.
> * Quite a lot of time is spent on (local) network communication, and
>   this also seems to scale with query complexity.

Having run strace, one culprit with the current implementation seems to
be that the client requests "doc_freq" from the server once for every
single term in the query. This seems to be the fundamental cause of the
last two observations above.

Fixing this issue would help a bit, I think, but the performance would
still be severely limited by the network roundtrips caused by hit
fetching (which is not part of my test) and the overhead of Storable.

If I may be so bold as to suggest how to make sharding really fly, I
believe what's called for is the following:

1. Get rid of as many network roundtrips as possible.
2. Design a (simple) custom application protocol, to get rid of the
   overhead of Storable.

As far as I can tell, the current protocol covers the following actions:

handshake
terminate
doc_max
doc_freq
top_docs
fetch_doc
fetch_doc_vec

Here, doc_freq and top_docs should be replaced with something like
docs_freq_and_top_docs, i.e. only one request / response per query.
Furthermore, fetch_doc and fetch_doc_vec should be replaced with
something like fetch_docs and fetch_docs_vec, facilitating the fetching
of several documents with a single request / response.

With this in place, Storable serialization could be replaced by a custom
application protocol to make things *really* fast. However, note that
this is not a requirement to fix the fundamental issue - network
roundtrips.

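To make the roundtrip argument concrete, here is a rough sketch in Python
rather than Perl. It is not Lucy's API: FakeShard, handle, search_per_term,
search_combined, encode_request and decode_request are all made-up names,
the scoring is a toy, and the wire layout is only one possible assumption
about what a "simple custom application protocol" could look like. The
action name docs_freq_and_top_docs is the one suggested above.

```python
import struct


class FakeShard:
    """Hypothetical stand-in for a remote shard; counts round trips."""

    def __init__(self, postings):
        self.postings = postings  # term -> list of doc ids
        self.round_trips = 0

    def handle(self, action, args):
        self.round_trips += 1  # every call models one request / response
        if action == "doc_freq":
            return len(self.postings.get(args["term"], []))
        if action == "top_docs":
            return self._top_docs(args["terms"], args["num_wanted"])
        if action == "docs_freq_and_top_docs":
            freqs = {t: len(self.postings.get(t, [])) for t in args["terms"]}
            return freqs, self._top_docs(args["terms"], args["num_wanted"])
        raise ValueError(action)

    def _top_docs(self, terms, num_wanted):
        # Toy scoring: number of query terms that a document contains.
        scores = {}
        for t in terms:
            for doc_id in self.postings.get(t, []):
                scores[doc_id] = scores.get(doc_id, 0) + 1
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:num_wanted]


def search_per_term(shard, terms, num_wanted):
    # Current behaviour as seen with strace: one doc_freq per term.
    freqs = {t: shard.handle("doc_freq", {"term": t}) for t in terms}
    hits = shard.handle("top_docs", {"terms": terms, "num_wanted": num_wanted})
    return freqs, hits


def search_combined(shard, terms, num_wanted):
    # Proposed behaviour: a single request / response per query.
    args = {"terms": terms, "num_wanted": num_wanted}
    return shard.handle("docs_freq_and_top_docs", args)


def encode_request(terms, num_wanted):
    # Assumed layout: u32 body length, u32 num_wanted, u16 term count,
    # then each term as a u16 byte length followed by UTF-8 bytes.
    body = struct.pack("!IH", num_wanted, len(terms))
    for t in terms:
        raw = t.encode("utf-8")
        body += struct.pack("!H", len(raw)) + raw
    return struct.pack("!I", len(body)) + body


def decode_request(frame):
    (length,) = struct.unpack_from("!I", frame, 0)
    body = frame[4:4 + length]
    num_wanted, n_terms = struct.unpack_from("!IH", body, 0)
    offset, terms = 6, []
    for _ in range(n_terms):
        (n,) = struct.unpack_from("!H", body, offset)
        offset += 2
        terms.append(body[offset:offset + n].decode("utf-8"))
        offset += n
    return terms, num_wanted
```

In this toy model a query with N terms costs N + 1 round trips with
search_per_term and exactly one with search_combined, while both return
the same frequencies and hits. The fixed binary framing only hints at how
a purpose-built protocol could avoid general-purpose serialization; whether
it would actually beat Storable would have to be measured.
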
The only remaining question would be whether it is possible to optimize
Lucy::Search::IndexSearcher::top_docs to perform nearly as well as
Lucy::Search::Searcher::hits.

-- 
Best regards,

Dag Lem
