On 5/17/06, Robert Engels <[EMAIL PROTECTED]> wrote:
Since reading a document is a relatively expensive operation
Expensive relative to search operations that cover 1 million documents (i.e. you don't want to call doc() a million times). Solr has a document cache, and I've found that it doesn't help max throughput that much (I just needed more concurrent searchers to reach the max), but it does help the latency of individual requests.
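To make the caching idea concrete, here is a minimal sketch of an LRU document cache. It is not Solr's actual implementation: the class name DocCache, the loader callback standing in for the expensive doc() call, and the loads counter are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical LRU cache keyed by document id (illustrative only, not Solr's code).
class DocCache<D> {
    private final Map<Integer, D> cache;
    private final IntFunction<D> loader; // stands in for the expensive doc() call
    private int loads = 0;               // counts how often the loader was invoked

    DocCache(final int maxSize, IntFunction<D> loader) {
        this.loader = loader;
        // accessOrder=true keeps entries ordered from least to most recently used
        this.cache = new LinkedHashMap<Integer, D>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, D> eldest) {
                return size() > maxSize; // evict the least recently used entry
            }
        };
    }

    // Returns the cached document, loading it only on a miss.
    // Simplification: the lock is held during the load, so misses are serialized.
    synchronized D doc(int id) {
        D d = cache.get(id);
        if (d == null) {
            d = loader.apply(id);
            loads++;
            cache.put(id, d);
        }
        return d;
    }

    synchronized int loads() {
        return loads;
    }
}
```

A repeated request for the same id is served from memory, which is where the latency gain for individual requests would come from; peak throughput is still bounded by how fast misses can be loaded.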
[...] Since the isDeleted() method uses the same synchronized lock as document(), all query scorers that filter out deleted documents will also be impacted, as they will block while the document is being read.
Interesting observation... that lock need not be shared. Using a different lock would mean acquiring two locks per doc() call, but it may be worth it to unblock the scorers waiting on isDeleted().

It might also be worth it to make a ReadOnlyIndexReader that didn't have to deal with issues of synchronizing access to the deleted docs vector.

If only Sun had done their APIs correctly, we could easily make non-synchronizing implementations of IndexInput and friends w/o resorting to ThreadLocals... Ah well.

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
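As a rough sketch of the separate-lock idea, assuming a toy reader rather than Lucene's real classes: SplitLockReader, storeLock, deletedLock and the String array standing in for stored documents are all hypothetical names. The point is only that isDeleted() never waits on the lock held during a slow document read.

```java
import java.util.BitSet;

// Hypothetical reader (illustrative only, not Lucene's actual IndexReader code).
class SplitLockReader {
    private final Object storeLock = new Object();   // guards the slow stored-document read
    private final Object deletedLock = new Object(); // guards only the deleted-docs bits
    private final BitSet deleted = new BitSet();
    private final String[] stored;                   // stands in for on-disk documents

    SplitLockReader(String[] stored) {
        this.stored = stored;
    }

    // Scorers call this; it contends only with other deleted-bit accesses.
    boolean isDeleted(int n) {
        synchronized (deletedLock) {
            return deleted.get(n);
        }
    }

    void delete(int n) {
        synchronized (deletedLock) {
            deleted.set(n);
        }
    }

    // Acquires two locks per call, one after the other (never nested).
    String document(int n) {
        if (isDeleted(n)) { // first lock: short check of the deleted bits
            throw new IllegalArgumentException("attempt to access a deleted document");
        }
        synchronized (storeLock) { // second lock: the expensive read
            return stored[n];
        }
    }
}
```

Because the two locks are taken sequentially rather than nested, there is no lock-ordering hazard here, at the cost of the extra acquisition per document() call. A document deleted between the check and the read would still be returned in this sketch.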