Re: Lock files in a read-only application
"Chris Hostetter" <[EMAIL PROTECTED]> wrote: > > : locks without upgrading to 2.1. Our application uses its own custom > : locking mechanism, so that lucene locking is actually redundant. We > : are currently using Lucene version 2.0. > > since before the 2.0.0 release there has been a static > FSDirectory.setDisableLocks that can be called before opening any indexes > to prevent locking -- it's only intended to be used on indexes on read > only disk -- which is not the case in your situation, since a seperate > process is in fact modifying the index, but if you are confident in your > own locking mechanism you can use it. You need to be really certain your own locking protects Lucene properly. Specifically, no IndexReader can be created (restarted) while a writer is open against the index, and, only one writer can be open on the index at once (it sounds like you already have that). If you're sure about that then disabling the locks as Hoss describes above is OK. > : The application has multiple threads (different web requests) reading > : the same index simultaneously (say 20 concurrent threads). Can that be > : a reason of this problem. Sometimes the lockfiles remain there for > : long periods of time (more than a few minutes, which is bad). > > mutliple reader threads should not cause the commit lock to stay arround > that long, even if each thread is opening it's on IndexReader (which they > should not do, it's better to open one and reuse it among many threads) This part (commit lock staying around for so long) is definitely odd and I'd like to get to the root cause. Multiple threads are fine (though, you should share one IndexReader). The only way I know of for this to happen is if JVM crashes while IndexReader or IndexWriter is being initialized. Even then it's quite unlikely because JVM has to crash right when segments file is being read or written. > : Yes, JVM sometimes crashes when it runs out of memory. There should be > : someway that the lock files are removed after such crash (any fixes is > : 2.1?). > > As Michael said, in 2.1 the commit lock doesn't even exist, and in general > there is a much more robust lock management system that lets you decide > what type of lock mechanism to use. In fact with 2.1 we have a new optional locking implementation called NativeFSLockFactory. One of its big benefits over the default Lucene locking (SimpleFSLockFactory) is that if the JVM crashes then the lock file(s) are correctly released (ie, no more "stale lock files" left in the filesystem). This way if the JVM of the writer crashes then the write.lock that it held is properly freed by the OS. Mike - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Emulating Pages Search
Mohsen,

In order to support pagination, I wrapped the Hits in a class, just like java.sql.ResultSet. You can create a wrapper class, put the Hits in it, and implement some methods like next() and prev() to move forward and backward through the documents.

Hope this helps you.

--
Regards,
Mohammad
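A minimal sketch of such a wrapper might look like this (the class and method names are made up for illustration; only the Hits API calls are real):

import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.search.Hits;

// ResultSet-style cursor over Hits: starts before the first hit,
// like java.sql.ResultSet.
public class HitsCursor {
    private final Hits hits;
    private int pos = -1;

    public HitsCursor(Hits hits) {
        this.hits = hits;
    }

    public boolean next() {
        if (pos + 1 >= hits.length()) return false;
        pos++;
        return true;
    }

    public boolean prev() {
        if (pos <= 0) return false;
        pos--;
        return true;
    }

    // Hits fetches each Document lazily, so only rows actually
    // visited are read from the index.
    public Document current() throws IOException {
        return hits.doc(pos);
    }
}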
Re: Emulating Pages Search
This is possible, but the problem here is performance. Why is it not possible to support pagination in a more efficient way? Suppose a Searcher looks through documents and finds the matching ones. Theoretically, it could stop searching once the number of hits passes a threshold. The Searcher could save its state (a reference to the last matched document) within the searcher instance, making incremental search possible. What is the restriction in the structure of Lucene indices that prevents us from having this kind of search?

is_maximum wrote:
> Mohsen,
>
> In order to support pagination, I wrapped the Hits in a class, just like
> java.sql.ResultSet. You can create a wrapper class, put the Hits in it,
> and implement some methods like next() and prev() to move forward and
> backward through the documents.
>
> Hope this helps you.
>
> --
> Regards,
> Mohammad
Re: Emulating Pages Search
It has no performance problem and works fine. Whenever you access a document, the searcher loads that document from the index on demand.

On 4/1/07, Mohsen Saboorian <[EMAIL PROTECTED]> wrote:

> This is possible, but the problem here is performance. Why is it not
> possible to support pagination in a more efficient way? Suppose a
> Searcher looks through documents and finds the matching ones.
> Theoretically, it could stop searching once the number of hits passes a
> threshold. The Searcher could save its state (a reference to the last
> matched document) within the searcher instance, making incremental
> search possible. What is the restriction in the structure of Lucene
> indices that prevents us from having this kind of search?

--
Regards,
Mohammad
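In code, the lazy loading means a windowed loop over Hits only pays for the documents it actually touches. A sketch, with a made-up helper name:

import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

public class PagePrinter {
    // Hits computes the ranked match list up front, but fetches each
    // Document only on first access, so just the requested page is read
    // from the index.
    public static void printPage(IndexSearcher searcher, Query query,
                                 int page, int pageSize) throws Exception {
        Hits hits = searcher.search(query);   // matching and scoring only
        int start = page * pageSize;
        int end = Math.min(start + pageSize, hits.length());
        for (int i = start; i < end; i++) {
            System.out.println(hits.doc(i));  // stored fields loaded here
        }
    }
}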
Re: Emulating Pages Search
Efficient in your situation, maybe. Good for everybody? Probably not.

The key is exactly your use of the word "state". Personally, I do NOT want the core search engine to be stateful; that brings a whole raft of problems with it. And Lucene is a search engine, not a search application. I really don't want my underlying search engine to keep track of the states of many thousands of, say, web users. This, without even asking the question of how to keep track of the state for each user in a complex web application. Not to mention the added requirement that I somehow indicate to Lucene which user's state to use. And I'm not even going to go down the path of how to accomplish the bookkeeping for dropped sessions, timeouts, coordinating underlying index changes with all these states, etc., etc., etc.

I think that if you consider the larger community, asking Lucene to "save its state" is much more complex than you think.

That said, I can certainly imagine that there are situations where making the search process stateful is a good thing. But do you have any evidence that the current architecture actually is hurting you, other than "theoretically"? I certainly wouldn't go down the stateful path until I'd demonstrated this in my situation.

If, however, you'd like to make a stateful way to do things and submit it to the contrib section, I'm sure the guys would be thrilled.

Erick

On 4/1/07, Mohsen Saboorian <[EMAIL PROTECTED]> wrote:

> This is possible, but the problem here is performance. Why is it not
> possible to support pagination in a more efficient way? Suppose a
> Searcher looks through documents and finds the matching ones.
> Theoretically, it could stop searching once the number of hits passes a
> threshold. The Searcher could save its state (a reference to the last
> matched document) within the searcher instance, making incremental
> search possible. What is the restriction in the structure of Lucene
> indices that prevents us from having this kind of search?
Re: Emulating Pages Search
Just to add to the thoughtful responses from the others: it isn't really that bad to do a new search each time. First, the later searches will likely be "warm" searches and thus won't take as long as the first search; second, it's the searcher.doc(docId) part that will likely hurt the most, but hopefully you'll only need to do that for one "page" at a time (for each request).

Thanks,
Xiaocheng

Mohsen Saboorian <[EMAIL PROTECTED]> wrote:

> Hi,
>
> Is there a way to emulate paged search in Lucene? I can use the following
> piece of code for returning the first page (10 items in each page), but I
> don't know how to navigate to the next page :-)
>
> IndexSearcher is = new ...
> ...
> TopFieldDocs tops = is.search(query, null /*filter*/, 10, Sort.RELEVANCE);
> for (int i = 0; i < tops.scoreDocs.length; i++) {
>     ScoreDoc scoreDoc = tops.scoreDocs[i];
>     System.out.println(is.doc(scoreDoc.doc));
> }
>
> I can see that tops.totalHits returns the count of all matched documents.
> So is this really "paged search", or am I just doing a complete search
> and putting a window on the returned result each time?
>
> Thanks.
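Building on Mohsen's snippet, "navigating to the next page" with this API usually means asking for enough top docs to cover the requested page and skipping the earlier ones. A sketch (the helper name and paging scheme are made up; the search call is the same one from the snippet above):

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.TopFieldDocs;

public class PagedSearch {
    // Re-runs the search per request, collecting (page + 1) * pageSize top
    // docs, then prints only the docs on the requested page. Collecting the
    // extra top docs is cheap; loading stored fields is the expensive part,
    // and that happens only for the visible page.
    public static void showPage(IndexSearcher is, Query query,
                                int page, int pageSize) throws Exception {
        TopFieldDocs tops = is.search(query, null /*filter*/,
                (page + 1) * pageSize, Sort.RELEVANCE);
        int start = page * pageSize;
        int end = Math.min((page + 1) * pageSize, tops.scoreDocs.length);
        for (int i = start; i < end; i++) {
            ScoreDoc scoreDoc = tops.scoreDocs[i];
            System.out.println(is.doc(scoreDoc.doc)); // loaded only for this page
        }
    }
}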
Re: Help - FileNotFoundException during IndexWriter.init()
Michael McCandless wrote:

>> Yes, I've disabled it currently while the new test runs. Let's see.
>> I'll re-run the test a few more times and see if I can re-create the
>> problem.
>
> OK, let's see if that makes it go away! Hopefully :)

I ran the tests several times over the weekend with no virus checker in the DB directory and haven't managed to reproduce the problem.

Thanks for the help Mike. Nothing like an exception never seen before, two days before the product is due to go live, to induce mild panic ;)

Antony
Re: Help - FileNotFoundException during IndexWriter.init()
"Antony Bowesman" <[EMAIL PROTECTED]> wrote: > Michael McCandless wrote: > > >> Yes, I've disabled it currently while the new test runs. Let's see. > >> I'll re-run the test a few more times and see if I can re-create the > >> problem. > > > > OK let's see if that makes it go away! Hopefully :) > > I ran the tests several times over the weekend with no virus checker in the > DB > directory and haven't managed to reproduce the problem. > > Thanks for the help Mike. Nothing like an exception never seen before, two > days > before the product is due to go live, to induce mild panic ;) Phew, I'm glad to hear that! Thanks for bringing closure to this. Mike - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Lock files in a read-only application
Thanks for your replies. I have two more questions.

> You need to be really certain your own locking protects Lucene properly.
> Specifically, no IndexReader can be created (or reopened) while a writer
> is open against the index, and only one writer can be open on the index
> at once (it sounds like you already have that). If you're sure about
> that, then disabling the locks as Hoss describes above is OK.

1. If our locking fails, what will happen in the worst case, i.e., an IndexSearcher tries to read while an IndexWriter is updating the index? Can it lead to index corruption, or will the searcher just give garbage results (or fail with an exception) for that query?

2. Currently we are not using any IndexReader directly. When a request arrives, we create a new IndexSearcher and destroy it when it finishes searching. Is it more efficient to create just one IndexSearcher and share it among all threads? Or to create one IndexReader and use it for creating all IndexSearchers?

Thanks again,
Nilesh

--
Nilesh Bansal.
http://queens.db.toronto.edu/~nilesh/
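On question 2, the pattern suggested earlier in the thread (open one searcher and reuse it among many threads) could be sketched like this; the holder class is hypothetical, and the reopen step is an assumption about how the application picks up index updates:

import org.apache.lucene.search.IndexSearcher;

// One IndexSearcher shared by all request threads; IndexSearcher is
// thread-safe for concurrent searches.
public class SharedSearcher {
    private static IndexSearcher searcher;

    public static synchronized IndexSearcher get(String indexPath) throws Exception {
        if (searcher == null) {
            searcher = new IndexSearcher(indexPath); // opens an IndexReader internally
        }
        return searcher;
    }

    // Call after the index has changed so searches see the new documents.
    public static synchronized void reopen(String indexPath) throws Exception {
        if (searcher != null) {
            searcher.close();
        }
        searcher = new IndexSearcher(indexPath);
    }
}

Note that closing the old searcher while other threads may still be using it is unsafe as written; a real implementation would need reference counting or a grace period before the close.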