Re: thread safe shared IndexSearcher

Jay Yu Tue, 25 Sep 2007 10:47:43 -0700

Mark,

Looking at your implementation of the DefaultIndexAccessor regarding the writer, I think there could be a problem: you have only one cached writer, but getWriter(boolean, boolean) takes 2 booleans, so ideally you need 4 cached writers. Otherwise, if one starts with a writer that overwrites the existing index, then later he cannot append docs to the index. Am I missing something here, or have you not finished the implementation of getWriter yet?
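
To make the concern concrete, here is a rough sketch of one way a single cached writer could still honor both flags: remember the flags the cached writer was opened with, and replace it when a request asks for a different combination and nobody is holding the old one. This is only an illustration, not the DefaultIndexAccessor code. The class FlaggedWriterCache, the nested Writer stand-in, and the flag names create and autoCommit are all hypothetical; the thread does not say what the two booleans mean.

```java
// Hypothetical sketch: none of these names come from the IndexAccessor package.
class FlaggedWriterCache {

    // Stand-in for an open index writer; it only records its opening flags.
    static final class Writer {
        final boolean create;      // assumed meaning: overwrite the existing index
        final boolean autoCommit;  // assumed meaning of the second flag
        Writer(boolean create, boolean autoCommit) {
            this.create = create;
            this.autoCommit = autoCommit;
        }
    }

    private Writer cached;  // the single cached writer
    private int refs;       // callers currently holding it

    // Reuse the cached writer only if it was opened with the same flags.
    synchronized Writer getWriter(boolean create, boolean autoCommit) {
        boolean mismatch = cached != null
                && (cached.create != create || cached.autoCommit != autoCommit);
        if (mismatch) {
            if (refs > 0) {
                throw new IllegalStateException("writer with other flags still in use");
            }
            cached = null; // a real implementation would close the old writer here
        }
        if (cached == null) {
            cached = new Writer(create, autoCommit);
        }
        refs++;
        return cached;
    }

    // Drop one reference taken by getWriter.
    synchronized void release() {
        if (refs == 0) {
            throw new IllegalStateException("release without matching getWriter");
        }
        refs--;
    }
}
```

With something like this, a caller that first asks for an overwriting writer and releases it can later ask for an appending writer and get a fresh one instead of the stale cached instance.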


Thanks!

Jay

Mark Miller wrote:

Ah, thanks for catching that. One of the pieces I did not finish... the keyword analyzer was placeholder code.
I will take your comments into account and update the code.
I have some other pieces to polish as well. Previously, I extended and built upon the original code, but I can't give it away, so this is my attempt at something lesser, but cleaner.
Jay Yu wrote:
Thanks for the tip.
One small improvement on the IndexAccessorFactory might be to allow the user to specify the Analyzer instead of using a default KeywordAnalyzer, which of course will make your static init of the cached accessors difficult unless you add more interfaces to the accessor to allow resetting the analyzer/Dir, as in my own version.
Jay

Mark Miller wrote:
One final note... if you are using the IndexAccessor and you are only accessing the index from one JVM, you can use the NoLockFactory and save some sync cost there.
Jay Yu wrote:
Mark,
Great effort getting the original Lucene index accessor package into this shape. I am sure this will benefit a lot of people using Lucene in a multithreaded environment.
I have a quick question to ask you:
Do you have to use the core Lucene 2.3-dev in order to use the accessor?
I will take a look at your code to see if I could help. I used a slightly modified version of the original package in my project, but it breaks some of my tests. I hope your version works better.
Thanks a lot!

Jay


Mark Miller wrote:
I sat down and rewrote IndexAccessor from scratch. I copied in the same reference counting logic, pruned some things, and tried to make the whole package a bit simpler to use. I have a few things to do, but it's pretty solid already. The only major thing I'd still like to do is add an option to warm searchers before putting them in the Searcher cache. I'd like to write some more tests as well. Any help greatly appreciated if you're interested in using the thing.
http://myhardshadow.com/indexaccessor/trunk/src/test/com/mhs/indexaccessor/SimpleSearchServer.java
Here is an example of a class that can be instantiated in one of multiple threads and read/modify a single index without worrying about what any of the other threads are doing to the index at any given time. This is a very simple example of how to use the IndexAccessor and not necessarily an example of best practices. The main idea is that you get your Writer, Searcher, or Reader, and then be sure to release it in a finally block as soon as you're done with it. For loading, you will want to load many docs with a Writer (batch them) before releasing it, but remember that Readers will not get a new view of the index until you release all of the Writers. So beware of hogging a Writer unless that's what you're intending.
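
The get/release discipline described above can be sketched in a few lines. This is a toy model under stated assumptions, not the IndexAccessor API: SketchAccessor and SketchLoader are hypothetical names, the "index" is just a list of strings, and the "writer" is a synchronized list. It only shows the shape of the pattern: acquire, work, release in a finally block, with searchers seeing new documents only after the last writer is released.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical toy accessor; the real package works on a Lucene index.
class SketchAccessor {
    // Documents added by writers; not visible to searchers until published.
    private final List<String> pending =
            Collections.synchronizedList(new ArrayList<String>());
    // Snapshot that searchers see.
    private List<String> visible = Collections.emptyList();
    private int writerRefs = 0;

    // Hand out the shared writer and count the caller as a holder.
    synchronized List<String> getWriter() {
        writerRefs++;
        return pending;
    }

    // Drop one writer reference; the last release publishes a new snapshot.
    synchronized void releaseWriter() {
        if (writerRefs == 0) {
            throw new IllegalStateException("release without matching getWriter");
        }
        writerRefs--;
        if (writerRefs == 0) {
            visible = Collections.unmodifiableList(new ArrayList<String>(pending));
        }
    }

    // Searchers get the last published snapshot.
    synchronized List<String> getSearcher() {
        return visible;
    }
}

class SketchLoader {
    // Batch several documents under one writer, releasing it in finally.
    static int loadAndCount(SketchAccessor accessor, String... docs) {
        List<String> writer = accessor.getWriter();
        try {
            for (String doc : docs) {
                writer.add(doc);
            }
        } finally {
            accessor.releaseWriter();
        }
        return accessor.getSearcher().size();
    }
}
```

In this model, a thread that holds a writer for a long time delays the new view for every searcher, which is the "hogging" effect mentioned above.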
JavaDoc:
http://myhardshadow.com/indexaccessorapi/

Code:
http://myhardshadow.com/indexaccessor/trunk/

Jar:
http://myhardshadow.com/indexaccessorreleases/indexaccessor.jar


Your synchronized block concerns:
The synchronized blocks that control access to the IndexAccessor do not have a huge impact on performance. Keep in mind that the work itself is not done in a synchronized block, just the retrieval of the Searcher, Writer, or Reader. Even if the synchronization makes the method twice as expensive, it is still overpowered by the cost of parsing queries and searching the index. This applies with or without contention. I wrote a simple test and included the output below. You might use the IBM Lock Analyzer for Java to further analyze these costs. Trust me, this thing is speedy. It's many times better than using IndexModifier.
Without Contention
Just retrieve and release Searcher 100000 times
----
avg time:6.3E-4 ms
total time:63 ms

Parse query and search on 1 doc 100000 times
----
avg time:0.03107 ms
total time:3107 ms


With Contention (40 other threads running 80000 searches)
Just retrieve and release Searcher 100000 times
----
avg time:0.04643 ms
total time:4643 ms

Parse query and search on 1 doc 100000 times
----
avg time:0.64337 ms
total time:64337 ms
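
For anyone who wants to try a similar measurement, a timing loop of roughly this kind would do. It is a sketch only, not the test that produced the numbers above: RetrieveReleaseTimer and its methods are hypothetical, the "searcher" is a plain Object, and no query parsing or searching is performed, so it measures only the synchronized get/release pair.

```java
// Hypothetical micro-benchmark sketch; not the test used for the figures above.
class RetrieveReleaseTimer {
    private final Object lock = new Object();
    private final Object searcher = new Object(); // stand-in for a cached Searcher
    private int refs = 0;

    // Only the hand-out and the reference count are inside the lock.
    Object getSearcher() {
        synchronized (lock) {
            refs++;
            return searcher;
        }
    }

    void release() {
        synchronized (lock) {
            refs--;
        }
    }

    int outstanding() {
        synchronized (lock) {
            return refs;
        }
    }

    // Average milliseconds per get/release pair on the calling thread.
    double time(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            Object s = getSearcher();
            try {
                // a real test would parse a query and search with s here
            } finally {
                release();
            }
        }
        return (System.nanoTime() - start) / 1e6 / iterations;
    }

    // Same measurement while other threads run the same loop concurrently.
    double timeWithContention(final int iterations, int otherThreads) {
        Thread[] others = new Thread[otherThreads];
        for (int i = 0; i < otherThreads; i++) {
            others[i] = new Thread(() -> time(iterations));
            others[i].start();
        }
        double avg = time(iterations);
        try {
            for (Thread t : others) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return avg;
    }
}
```

Numbers from a loop like this vary a lot with JVM warm-up and hardware, so they are best read as relative costs rather than absolute ones.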


- Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
