Mark,
Looking at how your DefaultIndexAccessor handles the writer, I think
there could be a problem: you have only one cached writer, but
getWriter(boolean, boolean) takes 2 booleans, so ideally you need 4
cached writers. Otherwise, if one starts with a writer that overwrites
the existing index, one cannot later append docs to the index.
Am I missing something here, or have you not finished the implementation
of getWriter yet?
Thanks!
Jay
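To illustrate the point about two booleans giving four combinations, here is a minimal sketch of a cache keyed on both flags. Everything in it is hypothetical: FakeWriter and WriterCache are stand-ins, not classes from the IndexAccessor package, and the flag names are placeholders because the thread does not say what the two booleans mean. Note also that Lucene normally allows only one open IndexWriter per index directory, so a real version would likely have to close the current writer before opening one with different flags; the sketch only shows the keying.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a writer; it only records the flags it was opened with.
final class FakeWriter {
    final boolean flagA;
    final boolean flagB;

    FakeWriter(boolean flagA, boolean flagB) {
        this.flagA = flagA;
        this.flagB = flagB;
    }
}

// Hypothetical cache: one slot per combination of the two flags (at most 4 entries).
final class WriterCache {
    private final Map<Integer, FakeWriter> cache = new HashMap<>();

    // Encode the two flags as a key in the range 0..3.
    private static int key(boolean flagA, boolean flagB) {
        return (flagA ? 2 : 0) | (flagB ? 1 : 0);
    }

    // Return the cached writer for this flag combination, creating it on first use.
    synchronized FakeWriter getWriter(boolean flagA, boolean flagB) {
        return cache.computeIfAbsent(key(flagA, flagB), k -> new FakeWriter(flagA, flagB));
    }

    synchronized int size() {
        return cache.size();
    }
}
```

With a single cached slot, a second call with different flags would get back the first writer; with the keyed cache, each combination gets its own entry.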
Mark Miller wrote:
Ah, thanks for catching that. One of the pieces I did not finish...the
keyword analyzer was placeholder code.
I will take your comments into account and update the code.
I have some other pieces to polish as well. Previously, I extended and
built upon the original code, but I can't give it away, so this is my
attempt at something lesser, but cleaner.
Jay Yu wrote:
Thanks for the tip.
One small improvement to the IndexAccessorFactory might be to allow
the user to specify the Analyzer instead of using a default
KeywordAnalyzer. Of course, that will make your static init of the
cached accessors difficult unless you add more interfaces to the
accessor to allow resetting the analyzer/Dir, as in my own version.
Jay
Mark Miller wrote:
One final note: if you are using the IndexAccessor and you are only
accessing the index from one JVM, you can use the NoLockFactory and
save some sync cost there.
Jay Yu wrote:
Mark,
Great effort getting the original Lucene index accessor package into
this shape. I am sure this will benefit a lot of people using Lucene
in a multithreaded environment.
I have a quick question for you:
Do you have to use the core Lucene 2.3-dev in order to use the
accessor?
I will take a look at your code to see if I can help. I used a
slightly modified version of the original package in my project, but
it breaks some of my tests. I hope your version works better.
Thanks a lot!
Jay
Mark Miller wrote:
I sat down and rewrote IndexAccessor from scratch. I copied in
the same reference counting logic, pruned some things, and tried to
make the whole package a bit simpler to use. I have a few things left
to do, but it's pretty solid already. The only major thing I'd still
like to do is add an option to warm searchers before putting them
in the Searcher cache. I'd like to write some more tests as well.
Any help is greatly appreciated if you're interested in using the thing.
http://myhardshadow.com/indexaccessor/trunk/src/test/com/mhs/indexaccessor/SimpleSearchServer.java
Here is an example of a class that can be instantiated in one of
multiple threads and read/modify a single index without worrying
about what any of the other threads are doing to the index at any
given time. This is a very simple example of how to use the
IndexAccessor and not necessarily an example of best practices. The
main idea is that you get your Writer, Searcher, or Reader, and then
be sure to release it in a finally block as soon as you're done with
it. For loading, you will want to load many docs with a Writer (batch
them) before releasing it, but remember that Readers will not get a
new view of the index until you release all of the Writers. So beware
of hogging a Writer unless that's what you intend.
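The get/release pattern described above can be sketched as follows. This is not the real IndexAccessor API: RefCountedAccessor and BatchLoader are hypothetical stand-ins with no Lucene dependency, and the "documents" are just a counter. The sketch assumes only what the text says: a writer is released in a finally block, and readers see new documents only once every outstanding writer has been released.

```java
// Hypothetical stand-in for an accessor; NOT the real IndexAccessor API.
final class RefCountedAccessor {
    private int writersInUse = 0;  // writers handed out and not yet released
    private int indexedDocs = 0;   // documents added by writers so far
    private int visibleDocs = 0;   // documents that readers can currently see

    synchronized void getWriter() {
        writersInUse++;
    }

    synchronized void addDocument() {
        if (writersInUse == 0) {
            throw new IllegalStateException("no writer is held");
        }
        indexedDocs++;
    }

    synchronized void releaseWriter() {
        if (writersInUse == 0) {
            throw new IllegalStateException("release without matching get");
        }
        writersInUse--;
        // Readers get a new view only when the last writer has been released.
        if (writersInUse == 0) {
            visibleDocs = indexedDocs;
        }
    }

    synchronized int visibleDocs() {
        return visibleDocs;
    }
}

final class BatchLoader {
    // Add a whole batch with one writer, then release it in a finally block.
    static void loadBatch(RefCountedAccessor accessor, int docs) {
        accessor.getWriter();
        try {
            for (int i = 0; i < docs; i++) {
                accessor.addDocument();
            }
        } finally {
            accessor.releaseWriter();
        }
    }

    public static void main(String[] args) {
        RefCountedAccessor accessor = new RefCountedAccessor();
        loadBatch(accessor, 3);
        System.out.println("visible docs: " + accessor.visibleDocs());
    }
}
```

Batching inside one get/release pair keeps the number of acquisitions low, while the finally block guarantees the release even if adding a document throws.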
JavaDoc:
http://myhardshadow.com/indexaccessorapi/
Code:
http://myhardshadow.com/indexaccessor/trunk/
Jar:
http://myhardshadow.com/indexaccessorreleases/indexaccessor.jar
Your synchronized block concerns:
The synchronized blocks that control access to the IndexAccessor
do not have a huge impact on performance. Keep in mind that the work
itself is not done in a synchronized block, just the retrieval of
the Searcher, Writer, or Reader. Even if the synchronization makes the
method twice as expensive, it is still dwarfed by the cost of
parsing queries and searching the index. This applies with or
without contention. I wrote a simple test and included the output
below. You might use the IBM Lock Analyzer for Java to further
analyze these costs. Trust me, this thing is speedy. It's many times
better than using IndexModifier.
Without Contention
Just retrieve and release Searcher 100000 times
----
avg time:6.3E-4 ms
total time:63 ms
Parse query and search on 1 doc 100000 times
----
avg time:0.03107 ms
total time:3107 ms
With Contention (40 other threads running 80000 searches)
Just retrieve and release Searcher 100000 times
----
avg time:0.04643 ms
total time:4643 ms
Parse query and search on 1 doc 100000 times
----
avg time:0.64337 ms
total time:64337 ms
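For anyone who wants to try a similar measurement, here is a rough sketch of how an uncontended retrieve-and-release timing loop might look. It is not the test that produced the numbers above: AcquireReleaseTimer is a hypothetical class that only increments and decrements a counter inside synchronized methods, and a naive loop like this is subject to JIT warm-up and other measurement noise.

```java
// Hypothetical timing harness; it does not touch Lucene or the real IndexAccessor.
final class AcquireReleaseTimer {
    private int refCount = 0;

    synchronized void acquire() {
        refCount++;
    }

    synchronized void release() {
        refCount--;
    }

    synchronized int refCount() {
        return refCount;
    }

    // Returns the total elapsed nanoseconds for the given number of acquire/release pairs.
    static long timePairs(AcquireReleaseTimer timer, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            timer.acquire();
            try {
                // The real work (parsing, searching) would happen here, outside the lock.
            } finally {
                timer.release();
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int iterations = 100000;
        AcquireReleaseTimer timer = new AcquireReleaseTimer();
        double totalMs = timePairs(timer, iterations) / 1_000_000.0;
        System.out.println("avg time:" + (totalMs / iterations) + " ms");
        System.out.println("total time:" + totalMs + " ms");
    }
}
```

To approximate the contended case, one could run the same loop from several threads sharing one timer instance and compare the totals.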
- Mark
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]