Re: Lucene : avoiding locking (incremental indexing)
I am interested in pursuing experienced people's understanding, as I have half the queue approach developed already. I am not following why you don't like the queue approach, Sergiu. From what I gathered from this board, if you do lots of updates, opening the IndexWriter is very intensive and should be used in a batch orientation rather than in a one-at-a-time incremental approach. In some cases on this board they talk about it being so overwhelming that people are putting in forced delays so the Java engine can catch up. Using a queueing approach, you may get a hit every 30 seconds or a minute or whatever you choose as your timeframe, but it should be enough of a delay to keep the Java engine from being overwhelmed. I would like this not to happen with Lucene and would like to be able to update every time an update occurs, but this does not seem to be the right approach right now. As I said before, this seems like a wish item for Lucene. I don't really know if the wish is feasible. So far the biggest problem I was facing with this approach, however, was getting feedback from the archiving process to the main database that the archiving change actually happened, and correctly, even if the server goes down.

JohnE

Personally I don't like the queue approach... because I already implemented multithreading in our application to improve its performance. In our application indexing is not a high priority, but it happens quite often. Search is a priority. Lucene allows more than one search at a time. When you have a big index and many users, the queue approach can slow down your application too much. I think it will be a bottleneck. I know that the lock problem is annoying, but I also think that the right way is to identify the source of the locking. Our application is a web-based application based on Turbine, and when we want to restart Tomcat, we just kill the process (otherwise we need to restart twice because of a log4j initialization problem), so ...
the index is locked after the Tomcat restart. In my case it makes sense to check once at startup whether the index is locked. I'm also logging all errors that I get in the system; this helps me find their source more easily. All the best, Sergiu - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Lucene : avoiding locking (incremental indexing)
[EMAIL PROTECTED] wrote:

I am interested in pursuing experienced people's understanding, as I have half the queue approach developed already.

Well, I think that experienced people developed Lucene :) They offered us the possibility to use multithreading and concurrent searching. Of course, it depends on your requirements whether to use them or not. I chose to use them because I'm developing a web application.

I am not following why you don't like the queue approach, Sergiu. From what I gathered from this board, if you do lots of updates, opening the IndexWriter is very intensive and should be used in a batch orientation rather than in a one-at-a-time incremental approach.

That's not my case. I have to reindex the information that is changed in our system. We are developing a knowledge management platform and reindex the objects each time they are changed.

In some cases on this board they talk about it being so overwhelming that people are putting in forced delays so the Java engine can catch up.

I haven't had this kind of problem, and I use multithreading when I reindex the whole index... and the searches still work correctly without any locking problems. I think that the locking problems come from outside, and those locking sources should be identified. But again, this is just my case...

Using a queueing approach, you may get a hit every 30 seconds or a minute or whatever you choose as your timeframe, but it should be enough of a delay to keep the Java engine from being overwhelmed.

No, I cannot accept this, because our users should be able to change information in the system and to make searches at the same time, without having to wait too much for a server response...

I would like this not to happen with Lucene and would like to be able to update every time an update occurs, but this does not seem to be the right approach right now. As I said before, this seems like a wish item for Lucene. I don't really know if the wish is feasible.
I agree that maybe a built-in function for identifying false locking would be very useful... but it might also be a little bit bad for users, because they would be tempted to unlock the index instead of closing readers/writers correctly.

So far the biggest problem I was facing with this approach, however, was getting feedback from the archiving process to the main database that the archiving change actually happened, and correctly, even if the server goes down.

... so it may work correctly if we use Lucene (and the servers and the OS) correctly :) Maybe it would be a good idea to create some JUnit/JMeter tests to identify the source of unexpected locks. This also depends on your availability, but I think it will be worth the effort. Sergiu
Re: Lucene : avoiding locking
I am new to Lucene, but have a large project in production on the web using other Apache software including Tomcat, Struts, OJB, and others. The database I need to support will hopefully grow to millions of records. Right now it only has thousands, but it is growing. These documents get updated by users regularly, but not frequently. When you have 100k users, though, "infrequently" means you still have to deal with lock-type issues. When they update their record, their search criteria will have to be updated, and they will expect to see results somewhat immediately. In moving from exact matching, which is very poor for searches, to Lucene, this locking is the only thing that has me nervous. I would really like a well-thought-out scheme for incremental changes, as I won't generally need batch unless I have to delete/recreate the database for some reason. Thinking about most online forums, I think incremental is the way they would like to be able to go for searching. I have lots to learn about this project, but I really like what I see besides that locking issue. If I get into this more and understand the details, maybe I will have something to offer later. Lots to learn first, though. Thank you for your hard work, JohnE

I am curious, though, how many people on this list are using Lucene in the incremental update case. Most examples I've seen all assume batch indexing. Regards, Luke Francl
Re: Lucene : avoiding locking (incremental indexing)
This is how I implemented incremental indexing. If anyone sees anything wrong, please let me know. Our motivation is similar to John Eichel's. We have a digital asset management system, and when users update, delete or create a new asset, they need to see their results immediately.

The most important thing to know about incremental indexing is that multiple threads cannot share the same IndexWriter, and only one IndexWriter can be open on an index at a time. Therefore, what I did was control access to the IndexWriter through a singleton wrapper class that synchronizes access to the IndexWriter and IndexReader (for deletes). After finishing writing to the index, you must close the IndexWriter to flush the changes to the index. If you do this you will be fine.

However, opening and closing the index takes time, so we had to look for some ways to speed up the indexing. The most obvious thing is that you should do as much work as possible outside of the synchronized block. For example, in my application, the creation of Lucene Document objects is not synchronized. Only the part of the code that is between your IndexWriter.open() and IndexWriter.close() needs to be synchronized. The other easy thing I did to improve performance was to batch the changes in a transaction together for indexing. If a user changes 50 assets, they will all be indexed using one Lucene IndexWriter. So far, we haven't had to explore further performance enhancements, but if we do, the next thing I will do is create a thread that gathers assets that need to be indexed and performs a batch job every five minutes or so. Hope this is helpful, Luke
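[Editor's sketch] The singleton wrapper Luke describes can be sketched in plain Java. This is a minimal illustration under stated assumptions, not Luke's actual class: the Lucene index is mocked by an in-memory list so the sketch stays self-contained, the commented lines mark where the real IndexWriter/IndexReader calls would sit, and all names here (IndexGate, addDocument, etc.) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Singleton gate: every write and delete funnels through one object, so
// only one IndexWriter is ever open on the index at a time.
final class IndexGate {
    private static final IndexGate INSTANCE = new IndexGate();
    private final List<String> index = new ArrayList<String>(); // stand-in for the Lucene index

    private IndexGate() {}

    public static IndexGate getInstance() {
        return INSTANCE;
    }

    // Document construction happens outside the lock; only the
    // open-write-close window needs to be synchronized.
    public void addDocument(String doc) {
        // ... build the Lucene Document here, unsynchronized ...
        synchronized (this) {
            // IndexWriter writer = new IndexWriter(dir, analyzer, false);
            index.add(doc);
            // writer.close(); // flushes the changes to the index
        }
    }

    public synchronized void deleteDocument(String doc) {
        // IndexReader reader = IndexReader.open(dir);
        // reader.delete(new Term("uid", ...)); reader.close();
        index.remove(doc);
    }

    public synchronized int size() {
        return index.size();
    }
}
```

The point of the design is that callers never touch the writer directly, so no second writer can be opened by accident.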
Re: Lucene : avoiding locking (incremental indexing)
It really seems like I am not the only person having this issue. So far I am seeing two solutions, and honestly I don't totally love either. I am thinking that without changes to Lucene itself, the best general way to implement this might be to have a queue of changes and have Lucene work off this queue in a single thread using a time-settable batch method. This is similar to what you are using below, but I don't like that you forcibly unlock Lucene if it shows itself locked. Using the queue approach, only that one thread could be accessing Lucene for writes/deletes anyway, so there should be no unknown locking. I can imagine this being a very good addition to Lucene - creating a high-level interface to Lucene that manages incremental updates in such a manner. If anybody has such a general piece of code, please post it!!! I would use it tonight rather than create my own. I am not sure if there is anything that can be done to Lucene itself to help with this need people seem to be having. I realize the likely reasons why Lucene might need to have only one index writer, and the additional load that might be caused by locking off pieces of the database rather than the whole database. I think I need to look in the developer archives. JohnE

- Original Message - From: Luke Shannon [EMAIL PROTECTED] Date: Monday, November 15, 2004 5:14 pm Subject: Re: Lucene : avoiding locking (incremental indexing)

Hi Luke; I have a similar system (except people don't need to see results immediately). The approach I took is a little different. I made my Indexer a thread, with the indexing operations occurring in the run method. When the IndexWriter is to be created or the IndexReader needs to execute a delete, I call the following method:

private void manageIndexLock() {
    try {
        // check if the index is locked and deal with it if it is
        if (index.exists() && IndexReader.isLocked(indexFileLocation)) {
            System.out.println("INDEXING INFO: There is more than one process trying to write to the index folder. Will wait for index to become available.");
            // perform this loop until the lock is released or 3 mins have expired
            int indexChecks = 0;
            while (IndexReader.isLocked(indexFileLocation) && indexChecks < 6) {
                // increment the number of times we check the index files
                indexChecks++;
                try {
                    // sleep for 30 seconds
                    Thread.sleep(30000L);
                } catch (InterruptedException e2) {
                    System.out.println("INDEX ERROR: There was a problem waiting for the lock to release. " + e2.getMessage());
                }
            } // closes the while loop for checking on the index directory
            // if we are still locked we need to do something about it
            if (IndexReader.isLocked(indexFileLocation)) {
                System.out.println("INDEXING INFO: Index locked after 3 minutes of waiting. Forcefully releasing lock.");
                IndexReader.unlock(FSDirectory.getDirectory(index, false));
                System.out.println("INDEXING INFO: Index lock released");
            } // closes the if that actually releases the lock
        } // closes the if ensuring the file exists
    } // closes the try for all the above operations
    catch (IOException e1) {
        System.out.println("INDEX ERROR: There was a problem waiting for the lock to release. " + e1.getMessage());
    }
} // closes the manageIndexLock method

Do you think this is a bad approach? Luke

- Original Message - From: Luke Francl [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Monday, November 15, 2004 5:01 PM Subject: Re: Lucene : avoiding locking (incremental indexing)

This is how I implemented incremental indexing. If anyone sees anything wrong, please let me know. Our motivation is similar to John Eichel's. We have a digital asset management system, and when users update, delete or create a new asset, they need to see their results immediately. The most important thing to know about incremental indexing is that multiple threads cannot share the same IndexWriter, and only one IndexWriter can be open on an index at a time.
Therefore, what I did was control access to the IndexWriter through a singleton wrapper class that synchronizes access to the IndexWriter and IndexReader (for deletes). After finishing writing to the index, you must close the IndexWriter to flush the changes to the index. If you do this you will be fine. However, opening and closing the index takes time so we had to look for some ways to speed up the indexing. The most obvious thing is that you should do as much work as possible outside of the synchronized block. For example, in my application, the creation of Lucene Document objects is not synchronized. Only the part of the code that is between your IndexWriter.open() and IndexWriter.close() needs to be synchronized. The other easy thing I did to improve performance was batch changes in a transaction together
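[Editor's sketch] JohnE's queue idea above can be sketched as follows. This is a hypothetical illustration, not code posted in the thread: callers enqueue change requests and a single worker thread drains them in batches, so only one thread ever touches the one allowed writer, and there is one open/close per batch rather than per document. The writer is mocked with a list, and the commented lines show where the real Lucene calls would go. (java.util.concurrent is JDK 5+; on the 1.4-era JVMs of this thread one would use a synchronized list with wait/notify instead.)

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Single-writer queue: submit() never blocks on the index; the worker
// thread is the only code that would ever open an IndexWriter.
class QueuedIndexer implements Runnable {
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<String>();
    final List<String> indexed = new ArrayList<String>(); // stand-in for the index
    private volatile boolean running = true;

    public void submit(String doc) {
        pending.add(doc); // callers return immediately
    }

    public void shutdown() {
        running = false; // worker drains what is left, then exits
    }

    public void run() {
        while (running || !pending.isEmpty()) {
            List<String> batch = new ArrayList<String>();
            pending.drainTo(batch); // take everything queued so far
            if (!batch.isEmpty()) {
                // IndexWriter writer = new IndexWriter(dir, analyzer, false);
                indexed.addAll(batch); // writer.addDocument(...) per entry
                // writer.close(); // one open/close per batch, not per document
            }
            try {
                Thread.sleep(50); // the "time-settable" batch interval
            } catch (InterruptedException e) {
                return;
            }
        }
    }
}
```

Because only this worker writes, the forced-unlock path in manageIndexLock above becomes unnecessary within a single JVM.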
Re: Lucene : avoiding locking (incremental indexing)
Luke Shannon wrote:

I like the sound of the queue approach. I also don't like that I have to forcefully unlock the index.

Personally I don't like the queue approach... because I already implemented multithreading in our application to improve its performance. In our application indexing is not a high priority, but it happens quite often. Search is a priority. Lucene allows more than one search at a time. When you have a big index and many users, the queue approach can slow down your application too much. I think it will be a bottleneck. I know that the lock problem is annoying, but I also think that the right way is to identify the source of the locking. Our application is a web-based application based on Turbine, and when we want to restart Tomcat, we just kill the process (otherwise we need to restart twice because of a log4j initialization problem), so... the index is locked after the Tomcat restart. In my case it makes sense to check once at startup whether the index is locked. I'm also logging all errors that I get in the system; this helps me find their source more easily. All the best, Sergiu

I'm not the most experienced programmer and am on a tight deadline. The approach I ended up with was the best I could do with the experience I've got and the time I had. My indexer works so far and doesn't have to forcefully release the lock on the index too often (the case is most likely to occur when someone removes a content file(s) and the reader needs to delete from the existing index for the first time). We will see what happens as more people use the system with large content directories. As I learn more I plan to expand the functionality of my class. Luke S

- Original Message - From: [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Monday, November 15, 2004 5:50 PM Subject: Re: Lucene : avoiding locking (incremental indexing)

It really seems like I am not the only person having this issue.
So far I am seeing two solutions, and honestly I don't totally love either. I am thinking that without changes to Lucene itself, the best general way to implement this might be to have a queue of changes and have Lucene work off this queue in a single thread using a time-settable batch method. This is similar to what you are using below, but I don't like that you forcibly unlock Lucene if it shows itself locked. Using the queue approach, only that one thread could be accessing Lucene for writes/deletes anyway, so there should be no unknown locking. I can imagine this being a very good addition to Lucene - creating a high-level interface to Lucene that manages incremental updates in such a manner. If anybody has such a general piece of code, please post it!!! I would use it tonight rather than create my own. I am not sure if there is anything that can be done to Lucene itself to help with this need people seem to be having. I realize the likely reasons why Lucene might need to have only one index writer, and the additional load that might be caused by locking off pieces of the database rather than the whole database. I think I need to look in the developer archives. JohnE

- Original Message - From: Luke Shannon [EMAIL PROTECTED] Date: Monday, November 15, 2004 5:14 pm Subject: Re: Lucene : avoiding locking (incremental indexing)

Hi Luke; I have a similar system (except people don't need to see results immediately). The approach I took is a little different. I made my Indexer a thread, with the indexing operations occurring in the run method. When the IndexWriter is to be created or the IndexReader needs to execute a delete, I call the following method:

private void manageIndexLock() {
    try {
        // check if the index is locked and deal with it if it is
        if (index.exists() && IndexReader.isLocked(indexFileLocation)) {
            System.out.println("INDEXING INFO: There is more than one process trying to write to the index folder. Will wait for index to become available.");
            // perform this loop until the lock is released or 3 mins have expired
            int indexChecks = 0;
            while (IndexReader.isLocked(indexFileLocation) && indexChecks < 6) {
                // increment the number of times we check the index files
                indexChecks++;
                try {
                    // sleep for 30 seconds
                    Thread.sleep(30000L);
                } catch (InterruptedException e2) {
                    System.out.println("INDEX ERROR: There was a problem waiting for the lock to release. " + e2.getMessage());
                }
            } // closes the while loop for checking on the index directory
            // if we are still locked we need to do something about it
            if (IndexReader.isLocked(indexFileLocation)) {
                System.out.println("INDEXING INFO: Index locked after 3 minutes of waiting. Forcefully releasing lock.");
                IndexReader.unlock(FSDirectory.getDirectory(index, false));
                System.out.println("INDEXING INFO: Index lock released");
            } // closes the if that actually releases the lock
        } // closes the if ensuring the file exists
    } // closes
Re: Lucene : avoiding locking
Luke, I also integrated Lucene into a content management application with incremental updates and ran into the same problem you did. You need to make sure only one process (which means no multiple copies of the application writing to the index simultaneously) or thread ever writes to the index. That includes deletes, as in your code below, so make sure that is synchronized, too. Also, you will find that opening and closing the index for writing is very costly, especially on a large index, so it pays to batch up all the changes in a transaction (inserts and deletes) together in one go at the Lucene index. If this still isn't enough, you can batch up 5 minutes' worth of changes and apply them at once. We haven't got to that point yet. I am curious, though, how many people on this list are using Lucene in the incremental update case. Most examples I've seen all assume batch indexing. Regards, Luke Francl

On Thu, 2004-11-11 at 18:33, Luke Shannon wrote:

Synchronizing the method didn't seem to help. The lock is being detected right here in the code:

while (uidIter.term() != null && uidIter.term().field() == "uid"
        && uidIter.term().text().compareTo(uid) < 0) {
    // delete stale docs
    if (deleting) {
        reader.delete(uidIter.term());
    }
    uidIter.next();
}

This runs fine on my own site, so I am confused. For now I think I am going to remove the deleting of stale files etc. and just rebuild the index each time to see what happens.

- Original Message - From: [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Thursday, November 11, 2004 6:56 PM Subject: Re: Lucene : avoiding locking

I'm working on a similar project... Make sure that only one call to the index method is occurring at a time. Synchronizing that method should do it.

--- Luke Shannon [EMAIL PROTECTED] wrote:

Hi All; I have hit a snag in my Lucene integration and don't know what to do. My company has a content management product.
Each time someone changes the directory structure or a file within it, that portion of the site needs to be re-indexed so the changes are reflected in future searches (indexing must happen at run time). I have written an Indexer class with a static Index() method. The idea is to call the method every time something changes and the index needs to be re-examined. I am hoping the logic put in by Doug Cutting surrounding the UID will make indexing efficient enough to be called so frequently. This class works great when I tested it on my own little site (I have about 2000 files). But when I drop the functionality into the QA environment I get a locking error. I can't access the stack trace; all I can get at is a log file the application writes to. Here is the section my class wrote. It was right in the middle of indexing and, bang, a lock issue. I don't know if the problem is in my code or something in the existing application. Error Message: ENTER|SearchEventProcessor.visit(ContentNodeDeleteEvent) |INFO|INDEXING INFO: Start Indexing new content. |INFO|INDEXING INFO: Index Folder Did Not Exist.
Start Creation Of New Index |INFO|INDEXING INFO: Beginnging Incremental update comparisions [this log line repeated 17 times] |INFO|INDEXING ERROR: Unable to index new content Lock obtain timed out: Lock@/usr/tomcat/jakarta-tomcat-5.0.19/temp/lucene-398fbd170a5457d05e2f4d43210f7fe8-write.lock |ENTER|UpdateCacheEventProcessor.visit(ContentNodeDeleteEvent)

Here is my code. You will recognize it pretty much as the IndexHTML class from the Lucene demo written by Doug Cutting. I have put a ton of comments in an attempt to understand what is going on. Any help would
Re: Lucene : avoiding locking
Hi Luke; Currently I am experimenting with checking whether the index is locked using IndexReader.isLocked before creating a writer. If this turns out to be the case, I was thinking of just unlocking the file. Do you think this is a good strategy? Thanks, Luke

- Original Message - From: Luke Francl [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Friday, November 12, 2004 10:38 AM Subject: Re: Lucene : avoiding locking

Luke, I also integrated Lucene into a content management application with incremental updates and ran into the same problem you did. You need to make sure only one process (which means no multiple copies of the application writing to the index simultaneously) or thread ever writes to the index. That includes deletes, as in your code below, so make sure that is synchronized, too. Also, you will find that opening and closing the index for writing is very costly, especially on a large index, so it pays to batch up all the changes in a transaction (inserts and deletes) together in one go at the Lucene index. If this still isn't enough, you can batch up 5 minutes' worth of changes and apply them at once. We haven't got to that point yet. I am curious, though, how many people on this list are using Lucene in the incremental update case. Most examples I've seen all assume batch indexing. Regards, Luke Francl

On Thu, 2004-11-11 at 18:33, Luke Shannon wrote:

Synchronizing the method didn't seem to help. The lock is being detected right here in the code:

while (uidIter.term() != null && uidIter.term().field() == "uid"
        && uidIter.term().text().compareTo(uid) < 0) {
    // delete stale docs
    if (deleting) {
        reader.delete(uidIter.term());
    }
    uidIter.next();
}

This runs fine on my own site, so I am confused. For now I think I am going to remove the deleting of stale files etc. and just rebuild the index each time to see what happens.
- Original Message - From: [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Thursday, November 11, 2004 6:56 PM Subject: Re: Lucene : avoiding locking

I'm working on a similar project... Make sure that only one call to the index method is occurring at a time. Synchronizing that method should do it.

--- Luke Shannon [EMAIL PROTECTED] wrote:

Hi All; I have hit a snag in my Lucene integration and don't know what to do. My company has a content management product. Each time someone changes the directory structure or a file within it, that portion of the site needs to be re-indexed so the changes are reflected in future searches (indexing must happen at run time). I have written an Indexer class with a static Index() method. The idea is to call the method every time something changes and the index needs to be re-examined. I am hoping the logic put in by Doug Cutting surrounding the UID will make indexing efficient enough to be called so frequently. This class works great when I tested it on my own little site (I have about 2000 files). But when I drop the functionality into the QA environment I get a locking error. I can't access the stack trace; all I can get at is a log file the application writes to. Here is the section my class wrote. It was right in the middle of indexing and, bang, a lock issue. I don't know if the problem is in my code or something in the existing application. Error Message: ENTER|SearchEventProcessor.visit(ContentNodeDeleteEvent) |INFO|INDEXING INFO: Start Indexing new content. |INFO|INDEXING INFO: Index Folder Did Not Exist.
Start Creation Of New Index |INFO|INDEXING INFO: Beginnging Incremental update comparisions [this log line repeated 16 times]
Re: Lucene : avoiding locking
Hello, --- Luke Shannon [EMAIL PROTECTED] wrote:

Currently I am experimenting with checking whether the index is locked using IndexReader.isLocked before creating a writer. If this turns out to be the case, I was thinking of just unlocking the file. Do you think this is a good strategy?

Only if you synchronize well, and only if all index-modifying accesses are contained in the same JVM. Alternatively, you could add 'sleep and retry' logic around the lock check, and perhaps give up or force the unlock if you got too much sleep. Otis

- Original Message - From: Luke Francl [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Friday, November 12, 2004 10:38 AM Subject: Re: Lucene : avoiding locking

Luke, I also integrated Lucene into a content management application with incremental updates and ran into the same problem you did. You need to make sure only one process (which means no multiple copies of the application writing to the index simultaneously) or thread ever writes to the index. That includes deletes, as in your code below, so make sure that is synchronized, too. Also, you will find that opening and closing the index for writing is very costly, especially on a large index, so it pays to batch up all the changes in a transaction (inserts and deletes) together in one go at the Lucene index. If this still isn't enough, you can batch up 5 minutes' worth of changes and apply them at once. We haven't got to that point yet. I am curious, though, how many people on this list are using Lucene in the incremental update case. Most examples I've seen all assume batch indexing. Regards, Luke Francl

On Thu, 2004-11-11 at 18:33, Luke Shannon wrote:

Synchronizing the method didn't seem to help. The lock is being detected right here in the code:

while (uidIter.term() != null && uidIter.term().field() == "uid"
        && uidIter.term().text().compareTo(uid) < 0) {
    // delete stale docs
    if (deleting) {
        reader.delete(uidIter.term());
    }
    uidIter.next();
}

This runs fine on my own site, so I am confused.
For now I think I am going to remove the deleting of stale files etc. and just rebuild the index each time to see what happens.

- Original Message - From: [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Thursday, November 11, 2004 6:56 PM Subject: Re: Lucene : avoiding locking

I'm working on a similar project... Make sure that only one call to the index method is occurring at a time. Synchronizing that method should do it.

--- Luke Shannon [EMAIL PROTECTED] wrote:

Hi All; I have hit a snag in my Lucene integration and don't know what to do. My company has a content management product. Each time someone changes the directory structure or a file within it, that portion of the site needs to be re-indexed so the changes are reflected in future searches (indexing must happen at run time). I have written an Indexer class with a static Index() method. The idea is to call the method every time something changes and the index needs to be re-examined. I am hoping the logic put in by Doug Cutting surrounding the UID will make indexing efficient enough to be called so frequently. This class works great when I tested it on my own little site (I have about 2000 files). But when I drop the functionality into the QA environment I get a locking error. I can't access the stack trace; all I can get at is a log file the application writes to. Here is the section my class wrote. It was right in the middle of indexing and, bang, a lock issue. I don't know if the problem is in my code or something in the existing application. Error Message: ENTER|SearchEventProcessor.visit(ContentNodeDeleteEvent) |INFO|INDEXING INFO: Start Indexing new content. |INFO|INDEXING INFO: Index Folder Did Not Exist.
Start Creation Of New Index |INFO|INDEXING INFO: Beginnging Incremental update comparisions |INFO|INDEXING INFO: Beginnging Incremental update comparisions |INFO|INDEXING INFO: Beginnging Incremental update comparisions |INFO|INDEXING INFO: Beginnging Incremental update comparisions |INFO|INDEXING INFO: Beginnging Incremental update comparisions |INFO|INDEXING INFO: Beginnging Incremental update comparisions |INFO|INDEXING INFO: Beginnging Incremental update comparisions |INFO|INDEXING INFO: Beginnging Incremental update comparisions |INFO|INDEXING INFO: Beginnging Incremental update comparisions |INFO|INDEXING INFO: Beginnging Incremental update comparisions
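Otis's "sleep and retry, then give up or force unlock" suggestion can be sketched roughly as below. This is a plain-JDK sketch, not actual Lucene API: the isIndexLocked supplier stands in for a call such as IndexReader.isLocked(indexPath), and the budget/retry values are illustrative.

```java
import java.util.function.BooleanSupplier;

class RetryLock {
    /**
     * Poll the lock check every retryMs until it reports unlocked, or until
     * budgetMs has elapsed. Returns true once the index looks unlocked,
     * false if the budget ran out (caller then gives up or force-unlocks).
     */
    static boolean waitForUnlock(BooleanSupplier isIndexLocked,
                                 long budgetMs, long retryMs) {
        long deadline = System.currentTimeMillis() + budgetMs;
        while (isIndexLocked.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // slept too much: give up or force unlock
            }
            try {
                Thread.sleep(retryMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

In the thread's setting, a false return is where you would decide between failing the update and calling IndexReader.unlock(...) — and, as Otis notes, force-unlocking is only safe if every index-modifying access lives in the same JVM.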
Re: Lucene : avoiding locking
On Fri, 2004-11-12 at 09:51, Luke Shannon wrote: Hi Luke; Currently I am experimenting with checking if the index is locked using IndexReader.isLocked() before creating a writer. If this turns out to be the case I was thinking of just unlocking the file. Do you think this is a good strategy? No, because if the index is locked, that means another thread or process is writing to it. If you're getting spurious locks, stop your application and clean out the /tmp/ directory (you should see files named *lucene* -- these are the lock files). Luke - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
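The single-writer discipline the replies keep recommending — every index mutation, deletes included, funneled through one synchronized gate — can be sketched like this. The doDelete/doAdd runnables are placeholders for the actual Lucene calls (e.g. reader.delete(term) and writer.addDocument(doc)); the class name is hypothetical.

```java
class SingleWriterGate {
    // One JVM-wide monitor guarding all index mutation.
    private static final Object INDEX_LOCK = new Object();

    /**
     * Apply one logical change as a unit: the delete pass and the add pass
     * run back-to-back while no other thread can touch the index.
     */
    static void applyChange(Runnable doDelete, Runnable doAdd) {
        synchronized (INDEX_LOCK) {
            doDelete.run(); // e.g. open IndexReader, delete stale docs, close
            doAdd.run();    // e.g. open IndexWriter, add new docs, close
        }
    }
}
```

Note this only serializes writers inside one JVM, which is exactly the caveat Otis raises: if two copies of the application share an index directory, a monitor like this cannot help, and Lucene's file-based lock is what you hit.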
Re: Lucene : avoiding locking
I am curious, though, how many people on this list are using Lucene in the incremental update case. Most examples I've seen all assume batch indexing. I do both for Simpy (simpy.com). To ensure no duplicates, I try to delete (by some unique ID) before I add a new Document. Otis [Remainder of quoted thread trimmed.]
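Otis's delete-before-add pattern is the standard way to make incremental updates idempotent: remove any existing document with the same unique ID, then add the fresh one, so re-indexing a changed file can never leave duplicates behind. The sketch below models the idea with a plain map standing in for the index; with Lucene 1.x the two steps would be reader.delete(new Term("uid", uid)) followed by writer.addDocument(doc).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DedupIndex {
    // Stand-in for the index: at most one entry per unique ID.
    private final Map<String, String> docsByUid = new LinkedHashMap<>();

    /** Delete-then-add: safe to call repeatedly for the same document. */
    void update(String uid, String content) {
        docsByUid.remove(uid);       // delete by unique ID first...
        docsByUid.put(uid, content); // ...then add the current version
    }

    int size() {
        return docsByUid.size();
    }
}
```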
Re: Lucene : avoiding locking
[...]
                // look out for reader/writer conflicts
                if (IndexReader.isLocked(index.getPath())) {
                    try {
                        System.out.println("Waiting 1 minute for the reader to release the lock on the index.");
                        Thread.sleep(60000L); // one minute
                        // if we are still locked we need to do something about it
                        if (IndexReader.isLocked(index.getPath())) {
                            System.out.println("Index Locked After 1 minute waiting. Forcefully releasing lock.");
                            IndexReader.unlock(FSDirectory.getDirectory(index, false));
                            System.out.println("Index lock released");
                        }
                    } catch (InterruptedException e2) {
                        System.out.println("INDEX ERROR: There was a problem waiting for the lock to release. " + e2.getMessage());
                    }
                }
                System.out.println("INDEX INFO: Data has been deleted from the index.");
                reader.delete(uidIter.term());
            }
            uidIter.next();
        }
        // if the terms are equal there is no change with this document;
        // we keep it as is
        if (uidIter.term() != null && uidIter.term().field() == "uid"
                && uidIter.term().text().compareTo(uid) == 0) {
            uidIter.next();
        }
        // if we are not deleting and the document was not there, it means we
        // didn't have this document in the last index and we should add it
        else if (!deleting) {
            System.out.println("INDEXING INFO: Adding a new Document to the existing index: " + file.getPath());
            // pdf files
            if (file.getPath().endsWith(".pdf")) {
                try {
                    Document doc = LucenePDFDocument.getDocument(file);
                    writer.addDocument(doc);
                } catch (Exception e) {
                    System.out.println("INDEXING ERROR: Unable to index pdf document: " + file.getPath() + " " + e.getMessage());
                }
            }
            // xml documents
            else if (file.getPath().endsWith(".xml")) {
                try {
                    Document doc = XMLDocument.Document(file);
                    writer.addDocument(doc);
                } catch (Exception e) {
                    System.out.println("INDEXING ERROR: Was unable to index XML document: " + file.getPath() + " " + e.getMessage());
                }
            }
            // html and txt documents
            else {
                try {
                    Document doc = HTMLDocument.Document(file);
                    writer.addDocument(doc);
                } catch (Exception e) {
                    System.out.println("INDEXING ERROR: Was unable to index HTML/TXT file: " + file.getPath() + " " + e.getMessage());
                }
            }
        } // end the if for an incremental update
        // we are creating a new index, add all document types
        else {
            System.out.println("INDEXING INFO: Adding a new Document to a new index: " + file.getPath());
            // pdf documents
            if (file.getPath().endsWith(".pdf")) {
                try {
                    Document doc = LucenePDFDocument.getDocument(file);
                    writer.addDocument(doc);
                } catch (Exception e) {
                    System.out.println("INDEXING ERROR: Unable to index pdf document: " + file.getPath() + " " + e.getMessage());
                }
            }
            // xml documents
            else if (file.getPath().endsWith(".xml")) {
                try {
                    Document doc = XMLDocument.Document(file);
                    writer.addDocument(doc);
                } catch (Exception e) {
                    System.out.println("INDEXING ERROR: Was unable to index XML document: " + file.getPath() + " " + e.getMessage());
                }
            }
            // html and txt documents
            else {
                try {
                    Document doc = HTMLDocument.Document(file);
                    writer.addDocument(doc);
                } catch (Exception e) {
                    System.out.println("INDEXING ERROR: Was unable to index HTML/TXT file: " + file.getPath() + " " + e.getMessage());
                }
            } // close the else
        } // close the else for a new index
    } // close the else if to handle file types
} // close the indexDocs method

/*
 * Close any open objects.
 */
protected void finalize() throws Throwable {
    if (reader != null) {
        reader.close();
    }
    if (writer != null) {
        writer.close();
    }
}
}

- Original Message - From: Otis Gospodnetic [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Friday, November 12, 2004 11:03 AM Subject: Re: Lucene : avoiding locking [Quoted reply trimmed.]
Lucene : avoiding locking
Hi All;

I have hit a snag in my Lucene integration and don't know what to do. My company has a content management product. Each time someone changes the directory structure or a file within it, that portion of the site needs to be re-indexed so the changes are reflected in future searches (indexing must happen during run time). I have written an Indexer class with a static Index() method. The idea is to call the method every time something changes and the index needs to be re-examined. I am hoping the logic put in by Doug Cutting surrounding the UID will make indexing efficient enough to be called so frequently.

This class works great when I tested it on my own little site (I have about 2000 files). But when I drop the functionality into the QA environment I get a locking error. I can't access the stack trace; all I can get at is a log file the application writes to. Here is the section my class wrote. It was right in the middle of indexing and bang, lock issue. I don't know if the problem is in my code or something in the existing application.

Error Message:

    ENTER|SearchEventProcessor.visit(ContentNodeDeleteEvent)
    |INFO|INDEXING INFO: Start Indexing new content.
    |INFO|INDEXING INFO: Index Folder Did Not Exist. Start Creation Of New Index
    |INFO|INDEXING INFO: Beginnging Incremental update comparisions
    [... line repeated many more times ...]
    |INFO|INDEXING ERROR: Unable to index new content Lock obtain timed out: Lock@/usr/tomcat/jakarta-tomcat-5.0.19/temp/lucene-398fbd170a5457d05e2f4d432 10f7fe8-write.lock
    |ENTER|UpdateCacheEventProcessor.visit(ContentNodeDeleteEvent)

Here is my code. You will recognize it pretty much as the IndexHTML class from the Lucene demo written by Doug Cutting. I have put a ton of comments in an attempt to understand what is going on. Any help would be appreciated.

Luke

package com.fbhm.bolt.search;

/*
 * Created on Nov 11, 2004
 *
 * This class will create a single index file for the Content
 * Management System (CMS). It contains logic to ensure
 * indexing is done intelligently. Based on IndexHTML.java
 * from the demo folder that ships with Lucene
 */
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.Date;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;
import org.pdfbox.searchengine.lucene.LucenePDFDocument;
import org.apache.lucene.demo.HTMLDocument;

import com.alaia.common.debug.Trace;
import com.alaia.common.util.AppProperties;

/**
 * @author lshannon
 * Description:<br>
 * This class is used to index a content folder. It contains logic to
 * ensure only new documents, or documents that have been modified since
 * the last index run, are indexed.<br>
 * Based on code written by Doug Cutting in the IndexHTML class found in
 * the Lucene demo
 */
public class Indexer {
    // true during the deletion pass; this is when the index already exists
    private static boolean deleting = false;
    // object to read existing indexes
    private static IndexReader reader;
    // object to write to the index folder
    private static IndexWriter writer;
    // this will be used to walk the index terms
    private static TermEnum uidIter;

    /*
     * This static method does all the work; the end result is an
     * up-to-date index folder
     */
    public static void Index() {
        // we will assume to start that the index has been created
        boolean create = true;
        // set the name of the index file
        String indexFileLocation = AppProperties.getPropertyAsString("bolt.search.siteIndex.index.root");
        // set the name of the content folder
        String contentFolderLocation = AppProperties.getPropertyAsString("site.root");
        // manage whether the index needs to be created or not
        File index = new File(indexFileLocation);
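Since the thread's consensus is that calling Index() on every CMS change is what makes the write lock contended (opening and closing an IndexWriter per change is expensive), one mitigation discussed here is to queue changes and flush them as one batch on a timer. A minimal JDK-only sketch, with the Lucene work left as a placeholder (the class and method names are illustrative, not from the thread's code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

class BatchingIndexer {
    private final LinkedBlockingQueue<String> pending = new LinkedBlockingQueue<>();

    /** Called by the CMS on every change; cheap, never touches the index. */
    void noteChange(String path) {
        pending.add(path);
    }

    /**
     * Called periodically (say every 30-60 s, the interval JohnE discusses).
     * Drains everything queued since the last flush so it can be applied
     * with a single IndexWriter open/close.
     */
    List<String> drainBatch() {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch);
        // placeholder: open one IndexWriter, apply all deletes/adds, close it
        return batch;
    }
}
```

The trade-off raised in the thread applies: documents become searchable only after the next flush, which is the latency cost Sergiu objects to when indexing must keep up with a busy application.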
Re: Lucene : avoiding locking
I'm working on a similar project... Make sure that only one call to the index method is occurring at a time. Synchronizing that method should do it. --- Luke Shannon [EMAIL PROTECTED] wrote: Hi All; I have hit a snag in my Lucene integration and don't know what to do. [Remainder of quoted message trimmed.]
Re: Lucene : avoiding locking
I will try that now. Thank you. - Original Message - From: [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Thursday, November 11, 2004 6:56 PM Subject: Re: Lucene : avoiding locking I'm working on a similar project... Make sure that only one call to the index method is occurring at a time. Synchronizing that method should do it. [Remainder of quoted thread trimmed.]
Re: Lucene : avoiding locking
Synchronizing the method didn't seem to help. The lock is being detected right here in the code:

    while (uidIter.term() != null
            && uidIter.term().field() == "uid"
            && uidIter.term().text().compareTo(uid) < 0) {
        // delete stale docs
        if (deleting) {
            reader.delete(uidIter.term());
        }
        uidIter.next();
    }

This runs fine on my own site so I am confused. For now I think I am going to remove the deleting of stale files and just rebuild the index each time to see what happens. - Original Message - From: [EMAIL PROTECTED] To: Lucene Users List [EMAIL PROTECTED] Sent: Thursday, November 11, 2004 6:56 PM Subject: Re: Lucene : avoiding locking [Remainder of quoted thread trimmed.]