Lock obtain timed out

2003-12-16 Thread Hohwiller, Joerg
Hi there,

I have not yet received any response to my problem.

While debugging deep into Lucene (the internals are really hard to read), I
discovered that it is possible to disable the locks using a system property.

When I start my application with -DdisableLuceneLocks=true, 
I do not get the error anymore.
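
For readers unfamiliar with -D flags, here is a minimal sketch of how such a flag becomes visible to Java code. Only the property name is taken from the mail; the class name LockSwitchDemo is hypothetical, and whether Lucene reads the flag in exactly this way is an assumption.

```java
// Sketch only: shows how a JVM flag such as -DdisableLuceneLocks=true
// can be read from Java code. The class name is hypothetical and this
// is not Lucene's own source.
class LockSwitchDemo {

    /** Name of the system property mentioned in the mail. */
    static final String PROPERTY = "disableLuceneLocks";

    /** Returns true only if the property is set to "true" (case-insensitive). */
    static boolean locksDisabled() {
        // Boolean.getBoolean looks up a *system property*, not a string value
        return Boolean.getBoolean(PROPERTY);
    }

    public static void main(String[] args) {
        System.out.println("locks disabled: " + locksDisabled());
    }
}
```

Started as java -DdisableLuceneLocks=true LockSwitchDemo, the check sees the flag; without the flag the property is absent and the check is false.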

I just wonder whether this is legitimate and won't cause other trouble.
As far as I could understand the source, proper thread
synchronization is done using locks on Java objects, and
the index-store locks seem to be required only if multiple
Lucene instances (in different VMs) work on the same index.
In my situation there is only one Java VM running and only one
Lucene instance is working on one index.

Am I safe disabling the locking???
Can anybody tell me where to get documentation about the Locking
strategy (I still would like to know why I have that problem) ???

Or does anybody know where to get an official example of how to
handle concurrent index modification and searches?

Thank you so much
  Jörg

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



RE: Lock obtain timed out

2003-12-16 Thread Hohwiller, Joerg
Hi there,

thanks for your responses, guys!

From the answers I learned that I must not have an IndexWriter
and an IndexReader open at the same time if both want to modify
the index, even sequentially.

What I have is the following:

1 thread processes events such as "resource (file or folder)
  was added/removed/deleted/etc.". All index modifications are
  synchronized against a write-lock object.

1 thread does the index switching, which means that it synchronizes on
  the write lock and then closes the modifying index-reader and index-writer.
  Next it copies that index completely and reopens the index-reader and
  -writer on the copied index.
  Then it syncs on the read lock, closes the index searcher and
  reopens it on the index that was previously copied.

N threads perform search requests and sync against the read lock.

Since I can guarantee that there is only one thread processing the
change events sequentially, the index-writer and index-reader will never
do any concurrent modifications.
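
The scheme described above can be sketched without Lucene, with plain lists standing in for the indexes. This is only an illustration of the two-lock design under that assumption; every name in it (IndexSwitchSketch, writeLock, readLock, switchIndex) is hypothetical and not taken from the attached source.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the locking scheme described above. Plain lists stand in for
// the Lucene indexes; all names are hypothetical.
class IndexSwitchSketch {

    private final Object writeLock = new Object();
    private final Object readLock = new Object();

    // "index" that receives modifications (guarded by writeLock)
    private List<String> writeIndex = new ArrayList<>();
    // "index" used by the search threads (guarded by readLock)
    private List<String> searchIndex = new ArrayList<>();

    /** Called by the single event thread when a resource is added. */
    void add(String uri) {
        synchronized (writeLock) {
            writeIndex.add(uri);
        }
    }

    /** Called by the single event thread when a resource is deleted. */
    void remove(String uri) {
        synchronized (writeLock) {
            writeIndex.remove(uri);
        }
    }

    /** Called by the switching thread. */
    void switchIndex() {
        List<String> previous;
        synchronized (writeLock) {
            // "close" the writer, copy the index, "reopen" on the copy
            previous = writeIndex;
            writeIndex = new ArrayList<>(previous);
        }
        synchronized (readLock) {
            // "reopen" the searcher on the index that was copied
            searchIndex = previous;
        }
    }

    /** Called by any of the N search threads. */
    boolean search(String uri) {
        synchronized (readLock) {
            return searchIndex.contains(uri);
        }
    }
}
```

In this sketch a change only becomes visible to searches after the next call to switchIndex(), which mirrors the delayed freshness of results that the copy-and-switch approach accepts.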

This time I will attach my source as text in this mail to be sure.
For those who do not know Avalon/Excalibur: it is a framework that
will be the only caller of the configure/start/stop methods.
No one can access the instance until it is properly created, configured
and started, so synchronization is not necessary in the start method.

Thanks again
  Jörg

/**
 * This is the implementation of the ISearchManager using Lucene as the
 * underlying search engine.<br/>
 * Everything would be so simple if Lucene were thread-safe for concurrently
 * modifying and searching the same index, but it is not.<br/>
 * My first idea was to have a single index that is continuously modified and a
 * background thread that continuously closes and reopens the index searcher.
 * This should give the most recent search results, but it did not work properly
 * with Lucene.<br/>
 * My strategy now is to have multiple indexes and to cycle over all of them
 * in a background thread, copying the most recent one to the next (least recent)
 * one. Index modifications are always performed on the most recent index,
 * while searching is always performed on the second most recent (copy of the)
 * index. This strategy results in less up-to-date (but still very acceptable)
 * search results. Furthermore, it produces a lot more disk space overhead, but
 * with the advantage of having backups of the index.<br/>
 * Because the search must filter out the results the user does not have
 * read access to, it can also filter out the results that no longer exist
 * at no further cost.
 * 
 * @author Joerg Hohwiller (jhohwill)
 */
public class SearchManager
extends AbstractManager
implements
ISearchManager,
IDataEventListener,
Startable,
Serviceable,
Disposable,
Configurable,
Runnable,
ThreadSafe {

/** 
 * A background thread switches/updates the index used for indexing
 * and/or searching. The thread sleeps for the number of milliseconds given
 * by this constant until the next switch is done.<br/>
 * The shorter the delay, the more up to date the search results, but also
 * the more performance overhead is produced.<br/>
 * Be aware that the delay does not determine the index switching frequency,
 * because after sleeping for the delay, the index is copied and then switched.
 * The time required for this operation depends on the size of the
 * index. This also means that the bigger the index, the less up to date
 * the search results.<br/>
 * A value of 60 seconds (60 * 1000L) should be OK.
 */
private static final long INDEX_SWITCH_DELAY = 30 * 1000L;

/** the URI field name */
public static final String FIELD_URI = "uri";

/** the title field name */
public static final String FIELD_TITLE = "dc_title";

/** the text field name */
public static final String FIELD_TEXT = "text";

/** the read action */
private static final String READ_ACTION_URI = "/actions/read";

/** the name of the configuration tag for the index settings */
private static final String CONFIGURATION_TAG_INDEXER = "indexer";

/** the name of the configuration attribute for the index path */
private static final String CONFIGURATION_ATTRIBUTE_INDEX_PATH = "index-path";

/** the user used to access resources for indexing (global read access) */
private static final String SEARCH_INDEX_USER = "indexer";

/** the maximum number of search hits */
private static final int MAX_SEARCH_HITS = 100;

/** the default analyzer used for the search index */
private static final Analyzer ANALYZER = new StandardAnalyzer();

/** 
 * the number of indexes used, must be at least 3:
 * <ul>
 *   <li>one for writing/updating</li>
 *   <li>one for read/search</li>
 *   <li>one temporary where the index is copied to</li>
 * </ul>
 * All further indexes will act as extra backups of the index but will
 * also 

Problems deleting documents from the index (Lock obtain timed out)

2003-12-15 Thread Hohwiller, Joerg
Hi there,

I just subscribed to this list and have a little Problem:

I am using Lucene for incremental indexing (yes, I read the FAQ! Don't try to convince
me to rebuild the index periodically from scratch :) ).

Now the problem seems to be that Lucene is not able to perform index modifications
and search requests in parallel.
After my simple approaches failed, I finally implemented the recommended way: have an
index that is modified and create a copy of that index for searches. I do all this
with proper thread synchronization (at least I hope so).
Before I copy the index, I close the index-writer and index-reader working
on that index, then copy it and reopen the index-writer and -reader on the new copy. Next
I close the index-searcher and reopen it on the index that has been copied before.

Now my problem is that when I receive a delete event and want to remove a document
from the index by a special field (in my case the URI), I get an IOException with the
message "Lock obtain timed out".
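
To illustrate how a message of this kind can arise in general, the following sketch polls a lock file with a timeout. It is not Lucene's code: the class LockFileSketch and its methods are hypothetical, and it only assumes a file-based lock of the sort an index directory might use. If one holder has not released the lock, a second attempt gives up once the timeout expires.

```java
import java.io.File;
import java.io.IOException;

// Illustration only: a lock file polled with a timeout. This is NOT
// Lucene's implementation; all names are hypothetical.
class LockFileSketch {

    /** Tries to create the lock file until the timeout expires. */
    static boolean obtain(File lockFile, long timeoutMillis)
            throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        // createNewFile() atomically creates the file only if it is absent
        while (!lockFile.createNewFile()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // a caller would report "lock obtain timed out"
            }
            Thread.sleep(10);
        }
        return true;
    }

    /** Releases the lock by deleting the lock file. */
    static void release(File lockFile) {
        lockFile.delete();
    }

    public static void main(String[] args) throws Exception {
        File lock = File.createTempFile("sketch-write", ".lock");
        lock.delete(); // start without a lock file
        System.out.println("first holder:  " + obtain(lock, 100));
        System.out.println("second holder: " + obtain(lock, 100));
        release(lock);
        System.out.println("after release: " + obtain(lock, 100));
        release(lock);
    }
}
```

In this sketch the second attempt fails because the first holder never released the lock, which is the situation to look for when a modifying reader and a writer are open on the same index.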

I tried lucene 1.3-rc1, 1.3-rc2 and 1.3-rc3 all with the same result.

Any suggestions would be very welcome :)

Thank you so far
  Jörg Hohwiller

BTW: I attached the relevant source code (but removed imports, etc. so that it does
not contain any confidential information). Maybe this answers the first of your
questions.


RE: Word Documents

2003-12-15 Thread Hohwiller, Joerg
Hi there,

a little comment from me on the topic as well.

I am using OpenOffice, which produces the best HTML results for me.

If you only want to index the documents, this might be overkill, unless
you have no other idea how to waste the performance of your machine.

In case someone is interested: you can start OpenOffice as a
server (e.g. soffice -headless "-accept=socket,port=8100;urp;") and then
use the Java UNO binding API to access that OpenOffice server.
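
As a rough sketch, the server start above could also be done from Java with ProcessBuilder, which passes each argument as-is so that no shell quoting of the semicolons is needed. The class OfficeServerLauncher is hypothetical, and the sketch assumes that soffice is on the PATH; the UNO connection itself is not shown.

```java
import java.util.Arrays;
import java.util.List;

// Sketch only: builds and starts the soffice command line from the mail.
// The class name is hypothetical; soffice is assumed to be on the PATH.
class OfficeServerLauncher {

    /** Builds the argument list for a headless server on the given port. */
    static List<String> command(int port) {
        return Arrays.asList(
                "soffice",
                "-headless",
                "-accept=socket,port=" + port + ";urp;");
    }

    public static void main(String[] args) throws Exception {
        // Starts the process and returns without waiting for it to exit
        new ProcessBuilder(command(8100)).inheritIO().start();
    }
}
```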

Regards 
  Jörg
