IndexMergeTool - Close indexes before merge?
Hi, The IndexMergeTool (see the URL below) creates a new index, mergedIndex. Do the other indexes (index1, index2, etc.) need to be closed before performing the merge? This is the same as asking whether the indexes passed to IndexWriter.addIndexes need to be closed before they are added to the new index. http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/miscellaneous/src/java/org/apache/lucene/misc/IndexMergeTool.java Thanks for your help, Patrick - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
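For context, the merge the tool performs boils down to a few IndexWriter calls. A minimal sketch against the Lucene 2.2 API (the directory paths are illustrative, and this needs the Lucene 2.2 jar on the classpath; whether the source indexes must be "closed" first is exactly the question above):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Merge several existing indexes into a fresh one, as IndexMergeTool does.
public class MergeSketch {
    public static void main(String[] args) throws Exception {
        Directory merged = FSDirectory.getDirectory("/tmp/mergedIndex");
        Directory[] sources = {
            FSDirectory.getDirectory("/tmp/index1"),
            FSDirectory.getDirectory("/tmp/index2"),
        };
        // true = create a new, empty index in the merged directory
        IndexWriter writer = new IndexWriter(merged, new StandardAnalyzer(), true);
        writer.addIndexes(sources); // reads the source directories
        writer.optimize();
        writer.close();
    }
}
```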
Re: Lucene 2.2, NFS, Lock obtain timed out
pkimber [EMAIL PROTECTED] wrote: We are still getting various issues on our Lucene indexes running on an NFS share. It has taken me some time to find some useful information to report to the mailing list. Bummer! Can you zip up your test application that shows the issue, as well as the full logs from both servers? I can look at them and try to reproduce the error. Mike Yeah, I know! I cannot send you the source code without speaking to my manager first. I guess he would want me to change the code before sending it to you. You could have the log files now, but I expect you want to wait until the test application is ready to send? Thanks for your help, Patrick
Re: About multi-threads in Lucene
Hi Kai No, I have no problem returning hits. When I do have problems like this, I usually find I have something more to learn about Lucene indexing. Try looking at the data and query in Luke. I usually find this is the best way to understand what is going on. Here is the link to Luke: http://www.getopt.org/luke/ I hope you get it sorted. Patrick On 07/08/07, Kai Hu [EMAIL PROTECTED] wrote: By the way, Patrick, did you have a problem where IndexSearcher.search(Query query) can't get all the matched hits? It only returns a part of the matched hits. My test code is:
String key = "title:good";
Directory directory = FSDirectory.getDirectory("d:\\index\\");
IndexSearcher searcher = new IndexSearcher(directory);
QueryParser queryParser = new QueryParser("", analyzer);
Query query = queryParser.parse(key);
hits = searcher.search(query, sort);
There are two documents indexed whose title value is "good", but when I searched by key "title:good" it returned only one document. Is it a bug? kai Hi Kai We keep a synchronized map of LuceneIndexAccessor instances, one instance per Directory. The map is keyed on the directory path. We then re-use the accessor rather than creating a new one each time. Patrick On 06/08/07, Kai Hu [EMAIL PROTECTED] wrote: Thanks, Patrick, it is useful. But I found a problem when I use new LuceneIndexAccessor(accessProvider): when a request comes in (B/S), LuceneIndexAccessor.getWriter() loses its effect and creates a new IndexWriter.
public IndexWriter getWriter() throws IOException {
  IndexWriter result;
  synchronized (this) { // here the synchronized loses its effect
    checkClosed();
    ...
    if (cachedWriter != null) {
      log.debug("returning cached writer");
      result = cachedWriter;
      writerUseCount++;
    } else {
      log.debug("opening new writer and caching it");
      result = accessProvider.getWriter(); // constructing a new LuceneIndexAccessor creates a new IndexWriter
      cachedWriter = result;
      writerUseCount = 1;
    }
  }
}
It will also throw an exception, "can't obtain the Lock". Should I use a single instance of LuceneIndexAccessor? Supposing I use a single instance of LuceneIndexAccessor, how do I set a different Directory or Analyzer at the same time? kai /// /// Hi Kai We use the Lucene Index Accessor contribution: http://www.nabble.com/Fwd%3A-Contribution%3A-LuceneIndexAccessor-t17416.html#a47049 Patrick On 06/08/07, Kai Hu [EMAIL PROTECTED] wrote: Hi, How do you solve the problems when adding, updating and deleting documents in multiple threads? Use synchronized?
Re: About multi-threads in Lucene
Hi Kai We use the Lucene Index Accessor contribution: http://www.nabble.com/Fwd%3A-Contribution%3A-LuceneIndexAccessor-t17416.html#a47049 Patrick On 06/08/07, Kai Hu [EMAIL PROTECTED] wrote: Hi, How do you solve the problems when adding, updating and deleting documents in multiple threads? Use synchronized?
Re: About multi-threads in Lucene
Hi Kai We keep a synchronized map of LuceneIndexAccessor instances, one instance per Directory. The map is keyed on the directory path. We then re-use the accessor rather than creating a new one each time. Patrick On 06/08/07, Kai Hu [EMAIL PROTECTED] wrote: Thanks, Patrick, it is useful. But I found a problem when I use new LuceneIndexAccessor(accessProvider): when a request comes in (B/S), LuceneIndexAccessor.getWriter() loses its effect and creates a new IndexWriter.
public IndexWriter getWriter() throws IOException {
  IndexWriter result;
  synchronized (this) { // here the synchronized loses its effect
    checkClosed();
    ...
    if (cachedWriter != null) {
      log.debug("returning cached writer");
      result = cachedWriter;
      writerUseCount++;
    } else {
      log.debug("opening new writer and caching it");
      result = accessProvider.getWriter(); // constructing a new LuceneIndexAccessor creates a new IndexWriter
      cachedWriter = result;
      writerUseCount = 1;
    }
  }
}
It will also throw an exception, "can't obtain the Lock". Should I use a single instance of LuceneIndexAccessor? Supposing I use a single instance of LuceneIndexAccessor, how do I set a different Directory or Analyzer at the same time? kai /// /// Hi Kai We use the Lucene Index Accessor contribution: http://www.nabble.com/Fwd%3A-Contribution%3A-LuceneIndexAccessor-t17416.html#a47049 Patrick On 06/08/07, Kai Hu [EMAIL PROTECTED] wrote: Hi, How do you solve the problems when adding, updating and deleting documents in multiple threads? Use synchronized?
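The "one accessor per Directory, keyed on path" pattern Patrick describes can be sketched with plain JDK classes. This is only an illustration of the caching idea, not the actual LuceneIndexAccessor code; the `Accessor` class is a made-up stand-in:

```java
import java.util.HashMap;
import java.util.Map;

// Keep one accessor per index directory in a synchronized map, keyed on the
// directory path, and re-use it instead of constructing a new accessor (and
// hence a new IndexWriter) per request.
public class AccessorCache {
    // Stand-in for LuceneIndexAccessor: one shared instance per directory.
    public static class Accessor {
        final String path;
        Accessor(String path) { this.path = path; }
    }

    private final Map<String, Accessor> accessors = new HashMap<String, Accessor>();

    // Because every caller gets the SAME Accessor for a given path, the
    // accessor's internal synchronized getWriter() actually serializes them;
    // constructing a fresh accessor per request defeats that lock, which is
    // why Kai saw "synchronized lose its sense".
    public synchronized Accessor get(String directoryPath) {
        Accessor a = accessors.get(directoryPath);
        if (a == null) {
            a = new Accessor(directoryPath);
            accessors.put(directoryPath, a);
        }
        return a;
    }
}
```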
Re: What replaced org.apache.lucene.document.Field.Text?
Hi Andy I think: Field.Text(name, value); has been replaced with: new Field(name, value, Field.Store.YES, Field.Index.TOKENIZED); Patrick On 25/07/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Please reference "How do I get code written for Lucene 1.4.x to work with Lucene 2.x?" http://wiki.apache.org/lucene-java/LuceneFAQ#head-86d479476c63a2579e867b75d4faa9664ef6cf4d Andy -Original Message- From: Lindsey Hess [mailto:[EMAIL PROTECTED] Sent: Wednesday, July 25, 2007 12:31 PM To: Lucene Subject: What replaced org.apache.lucene.document.Field.Text? I'm trying to get some relatively old Lucene code to compile (please see below), and it appears that Field.Text has been deprecated. Can someone please suggest what I should use in its place? Thank you. Lindsey
public static void main(String args[]) throws Exception {
    String indexDir = System.getProperty("java.io.tmpdir", "tmp") + System.getProperty("file.separator") + "address-book";
    Analyzer analyzer = new WhitespaceAnalyzer();
    boolean createFlag = true;
    IndexWriter writer = new IndexWriter(indexDir, analyzer, createFlag);
    Document contactDocument = new Document();
    contactDocument.add(Field.Text("type", "individual"));
    contactDocument.add(Field.Text("name", "Zane Pasolini"));
    contactDocument.add(Field.Text("address", "999 W. Prince St."));
    contactDocument.add(Field.Text("city", "New York"));
    contactDocument.add(Field.Text("province", "NY"));
    contactDocument.add(Field.Text("postalcode", "10013"));
    contactDocument.add(Field.Text("country", "USA"));
    contactDocument.add(Field.Text("telephone", "1-212-345-6789"));
    writer.addDocument(contactDocument);
    writer.close();
}
Re: Lucene 2.2, NFS, Lock obtain timed out
Hi Michael Just to let you know, I am on holiday for one week so will not be able to send a progress report until I return. I have deployed the new code to a test site so I will be informed if the users notice any issues. Thanks for your help Patrick On 04/07/07, Michael McCandless [EMAIL PROTECTED] wrote: Patrick Kimber [EMAIL PROTECTED] wrote: Yes, there are many lines in the logs saying: hit FileNotFoundException when loading commit segment_X; skipping this commit point ...so it looks like the new code is working perfectly. Super! I am sorry to be vague... but how do I check which segments file is opened when a new writer is created? Oh, sorry, it's not exactly obvious. Here's what to look for: On machine #1 (the machine that added docs then closed its writer) you should see lines like this, which are printed every time the writer flushes its docs: checkpoint: wrote segments file segments_X Find the last such line on machine #1 before it closes the writer, and that's the current segments_X in the index. Then on machine #2 (the machine that immediately opens a new writer after machine #1 closed its writer) you should see a line like this: [EMAIL PROTECTED] main: init: current segments file is segments_Y which indicates which segments file was loaded by this writer. The thing to verify is that X is always equal to Y whenever a writer quickly moves from machine #1 to machine #2. I will add a check to my test to see if all documents are added. This should tell us if any documents are being silently lost. Very good! Keep us posted, and good luck, Mike
Re: Lucene 2.2, NFS, Lock obtain timed out
Hi Michael Yes, there are many lines in the logs saying: hit FileNotFoundException when loading commit segment_X; skipping this commit point ...so it looks like the new code is working perfectly. I am sorry to be vague... but how do I check which segments file is opened when a new writer is created? I will add a check to my test to see if all documents are added. This should tell us if any documents are being silently lost. Thanks Patrick On 03/07/07, Michael McCandless [EMAIL PROTECTED] wrote: Patrick Kimber [EMAIL PROTECTED] wrote: I have been running the test for over an hour without any problem. The index writer log file is getting rather large so I cannot leave the test running overnight. I will run the test again tomorrow morning and let you know how it goes. Ahhh, that's good news, I'm glad to hear that! You should go ahead and turn off the logging and make sure things are still fine (just in case logging is changing timing of events since timing is a factor here). In your logs, do you see lines like this?: ... hit FileNotFoundException when loading commit segment_X; skipping this commit point That would confirm the new code (to catch the FileNotFoundException) is indeed being hit. Actually, could you also check the logs and try to verify that each time one machine closed its writer and a 2nd machine opened a new writer that the 2nd machine indeed loaded the newest segments_N file and not segments_N-1? (This is the possible new issue I was referring to). I fear that this new issue could silently lose documents added by another machine and possibly not throw an exception. Mike
Re: Lucene 2.2, NFS, Lock obtain timed out
Hi I have added more logging to my test application. I have two servers writing to a shared Lucene index on an NFS partition... Here is the logging from one server... [10:49:18] [DEBUG] LuceneIndexAccessor closing cached writer [10:49:18] [DEBUG] ExpirationTimeDeletionPolicy onCommit() delete [segments_n] and the other server (at the same time): [10:49:18] [DEBUG] LuceneIndexAccessor opening new writer and caching it [10:49:18] [DEBUG] IndexAccessProvider getWriter() [10:49:18] [ERROR] DocumentCollection update(DocumentData) com.company.lucene.LuceneIcmException: I/O Error: Cannot add the document to the index. [/mnt/nfstest/repository/lucene/lucene-icm-test-1-0/segments_n (No such file or directory)] at com.company.lucene.RepositoryWriter.addDocument(RepositoryWriter.java:182) I think the exception is being thrown when the IndexWriter is created: new IndexWriter(directory, false, analyzer, false, deletionPolicy); I am confused... segments_n should not have been touched for 3 minutes so why would a new IndexWriter want to read it? Here is the whole of the stack trace: com.company.lucene.LuceneIcmException: I/O Error: Cannot add the document to the index. 
[/mnt/nfstest/repository/lucene/lucene-icm-test-1-0/segments_n (No such file or directory)]
at com.company.lucene.RepositoryWriter.addDocument(RepositoryWriter.java:182)
at com.company.lucene.IndexUpdate.addDocument(IndexUpdate.java:364)
at com.company.lucene.IndexUpdate.addDocument(IndexUpdate.java:342)
at com.company.lucene.IndexUpdate.update(IndexUpdate.java:67)
at com.company.lucene.icm.DocumentCollection.update(DocumentCollection.java:390)
at lucene.icm.test.Write.add(Write.java:105)
at lucene.icm.test.Write.run(Write.java:79)
at lucene.icm.test.Write.main(Write.java:43)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:271)
at java.lang.Thread.run(Thread.java:534)
Caused by: java.io.FileNotFoundException: /mnt/nfstest/repository/lucene/lucene-icm-test-1-0/segments_n (No such file or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:204)
at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:506)
at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:536)
at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:531)
at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:440)
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:193)
at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:156)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:626)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:573)
at com.subshell.lucene.indexaccess.impl.IndexAccessProvider.getWriter(IndexAccessProvider.java:68)
at com.subshell.lucene.indexaccess.impl.LuceneIndexAccessor.getWriter(LuceneIndexAccessor.java:171)
at com.company.lucene.RepositoryWriter.addDocument(RepositoryWriter.java:176)
... 13 more
Thank you very much for your previous comments and emails. Any help solving this issue would be appreciated. Patrick On 30/06/07, Michael McCandless [EMAIL PROTECTED] wrote: Patrick Kimber wrote: I have been checking the application log. Just before the time when the lock file errors occur I found this log entry: [11:28:59] [ERROR] IndexAccessProvider java.io.FileNotFoundException: /mnt/nfstest/repository/lucene/lucene-icm-test-1-0/segments_h75 (No such file or directory) at java.io.RandomAccessFile.open(Native Method) I think this exception is the root cause. On hitting this IOException in reader.close(), that means this reader has not released its write lock. Is it possible to see the full stack trace? Having the wrong deletion policy or even a buggy deletion policy (if indeed file.lastModified() varies by too much across machines) can't cause this (I think). At worst, the wrong deletion policy should cause other already-open readers to hit "Stale NFS handle" IOExceptions during searching. So, you should use your ExpirationTimeDeletionPolicy when opening your readers if they will be doing deletes, but I don't think it explains this root-cause exception during close(). It's a rather spooky exception ... in close(), the reader initializes an IndexFileDeleter which lists the directory and opens any segments_N files that it finds. Do you have a writer on one machine closing, and then very soon thereafter this reader
Re: Lucene 2.2, NFS, Lock obtain timed out
Hi I am using the NativeFSLockFactory. I was hoping this would have stopped these errors. Patrick On 03/07/07, Neeraj Gupta [EMAIL PROTECTED] wrote: Hi, this is a case where an index created by one server is updated by another server, which results in index corruption. The exception occurs while creating the IndexWriter instance because, at creation time, the writer checks whether the index exists (if you are not creating a new index) and keeps that information. But when you go to add a document, the index has since been modified by the other server; the previous and current states no longer match, which results in the exception. What kind of locking are you using? I think you should use some kind of locking scheme so that while one server is updating the index, the other server does not interfere. Once a server finishes its updates, it should close all writers and readers to release all the locks. An alternative solution is to create separate indexes for each server; this helps because only one process will be updating each index, so there won't be any conflict. Cheers, Neeraj Patrick Kimber [EMAIL PROTECTED] 07/03/2007 03:47 PM Please respond to java-user@lucene.apache.org To java-user@lucene.apache.org cc Subject Re: Lucene 2.2, NFS, Lock obtain timed out Hi I have added more logging to my test application. I have two servers writing to a shared Lucene index on an NFS partition... Here is the logging from one server... [10:49:18] [DEBUG] LuceneIndexAccessor closing cached writer [10:49:18] [DEBUG] ExpirationTimeDeletionPolicy onCommit() delete [segments_n] and the other server (at the same time): [10:49:18] [DEBUG] LuceneIndexAccessor opening new writer and caching it [10:49:18] [DEBUG] IndexAccessProvider getWriter() [10:49:18] [ERROR] DocumentCollection update(DocumentData) com.company.lucene.LuceneIcmException: I/O Error: Cannot add the document to the index.
[/mnt/nfstest/repository/lucene/lucene-icm-test-1-0/segments_n (No such file or directory)]
at com.company.lucene.RepositoryWriter.addDocument(RepositoryWriter.java:182)
I think the exception is being thrown when the IndexWriter is created: new IndexWriter(directory, false, analyzer, false, deletionPolicy); I am confused... segments_n should not have been touched for 3 minutes so why would a new IndexWriter want to read it? Here is the whole of the stack trace:
com.company.lucene.LuceneIcmException: I/O Error: Cannot add the document to the index. [/mnt/nfstest/repository/lucene/lucene-icm-test-1-0/segments_n (No such file or directory)]
at com.company.lucene.RepositoryWriter.addDocument(RepositoryWriter.java:182)
at com.company.lucene.IndexUpdate.addDocument(IndexUpdate.java:364)
at com.company.lucene.IndexUpdate.addDocument(IndexUpdate.java:342)
at com.company.lucene.IndexUpdate.update(IndexUpdate.java:67)
at com.company.lucene.icm.DocumentCollection.update(DocumentCollection.java:390)
at lucene.icm.test.Write.add(Write.java:105)
at lucene.icm.test.Write.run(Write.java:79)
at lucene.icm.test.Write.main(Write.java:43)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:271)
at java.lang.Thread.run(Thread.java:534)
Caused by: java.io.FileNotFoundException: /mnt/nfstest/repository/lucene/lucene-icm-test-1-0/segments_n (No such file or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:204)
at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:506)
at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:536)
at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:531)
at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:440)
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:193)
at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:156)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:626)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:573)
at com.subshell.lucene.indexaccess.impl.IndexAccessProvider.getWriter(IndexAccessProvider.java:68)
at com.subshell.lucene.indexaccess.impl.LuceneIndexAccessor.getWriter
Re: Lucene 2.2, NFS, Lock obtain timed out
Hi Michael I am really pleased we have a potential fix. I will look out for the patch. Thanks for your help. Patrick On 03/07/07, Michael McCandless [EMAIL PROTECTED] wrote: Patrick Kimber [EMAIL PROTECTED] wrote: I am using the NativeFSLockFactory. I was hoping this would have stopped these errors. I believe this is not a locking issue and NativeFSLockFactory should be working correctly over NFS. Here is the whole of the stack trace:
Caused by: java.io.FileNotFoundException: /mnt/nfstest/repository/lucene/lucene-icm-test-1-0/segments_n (No such file or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:204)
at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:506)
at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:536)
at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:531)
at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:440)
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:193)
at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:156)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:626)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:573)
at com.subshell.lucene.indexaccess.impl.IndexAccessProvider.getWriter(IndexAccessProvider.java:68)
at com.subshell.lucene.indexaccess.impl.LuceneIndexAccessor.getWriter(LuceneIndexAccessor.java:171)
at com.company.lucene.RepositoryWriter.addDocument(RepositoryWriter.java:176)
... 13 more
OK, indeed the exception is inside IndexFileDeleter's initialization (this is what I had guessed might be happening). I have added more logging to my test application. I have two servers writing to a shared Lucene index on an NFS partition... Here is the logging from one server...
[10:49:18] [DEBUG] LuceneIndexAccessor closing cached writer
[10:49:18] [DEBUG] ExpirationTimeDeletionPolicy onCommit() delete [segments_n]
and the other server (at the same time):
[10:49:18] [DEBUG] LuceneIndexAccessor opening new writer and caching it
[10:49:18] [DEBUG] IndexAccessProvider getWriter()
[10:49:18] [ERROR] DocumentCollection update(DocumentData) com.company.lucene.LuceneIcmException: I/O Error: Cannot add the document to the index. [/mnt/nfstest/repository/lucene/lucene-icm-test-1-0/segments_n (No such file or directory)] at com.company.lucene.RepositoryWriter.addDocument(RepositoryWriter.java:182)
I think the exception is being thrown when the IndexWriter is created: new IndexWriter(directory, false, analyzer, false, deletionPolicy); I am confused... segments_n should not have been touched for 3 minutes so why would a new IndexWriter want to read it? Whenever a writer is opened, it initializes the deleter (IndexFileDeleter). During that initialization, we list all files in the index directory, and for every segments_N file we find, we open it and incref all index files that it's using. We then call the deletion policy's onInit to give it a chance to remove any of these commit points. What's happening here is the NFS directory listing is stale and is reporting that segments_n exists when in fact it doesn't. This is almost certainly due to the NFS client's caching (directory listing caches are in general not coherent for NFS clients, ie, they can lie for a short period of time, especially in cases like this). I think this fix is fairly simple: we should catch the FileNotFoundException and handle that as if the file did not exist. I will open a Jira issue and get a patch. Mike
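The fix Mike describes can be pictured with plain java.io: when walking a (possibly stale) NFS directory listing, treat a FileNotFoundException on open as "this file is already gone" and skip it, instead of failing. This is only an illustration of the idea, not Lucene's actual IndexFileDeleter code; the class and method names are made up:

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: tolerate stale NFS directory listings when scanning
// for segments_N commit files.
public class StaleListingScan {
    // Returns the commit-point files that could actually be opened,
    // silently skipping entries the (possibly stale) listing lied about.
    static List<String> loadCommitPoints(File dir) {
        List<String> loaded = new ArrayList<String>();
        String[] names = dir.list();
        if (names == null) return loaded;
        for (String name : names) {
            if (!name.startsWith("segments_")) continue;
            try {
                RandomAccessFile raf = new RandomAccessFile(new File(dir, name), "r");
                raf.close();
                loaded.add(name);
            } catch (FileNotFoundException e) {
                // Stale directory cache: the file was deleted by another
                // node after the listing was taken -- skip this commit point.
            } catch (IOException e) {
                // Other I/O problems are ignored in this sketch.
            }
        }
        return loaded;
    }

    public static void main(String[] args) throws IOException {
        File dir = File.createTempFile("idx", "");
        dir.delete();
        dir.mkdir();
        new File(dir, "segments_3").createNewFile();
        new File(dir, "_0.cfs").createNewFile();
        System.out.println(loadCommitPoints(dir)); // prints [segments_3]
    }
}
```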
Re: Lucene 2.2, NFS, Lock obtain timed out
Hi Michael I have been running the test for over an hour without any problem. The index writer log file is getting rather large so I cannot leave the test running overnight. I will run the test again tomorrow morning and let you know how it goes. Thanks again... Patrick On 03/07/07, Patrick Kimber [EMAIL PROTECTED] wrote: Hi Michael I am setting up the test with the take2 jar and will let you know the results as soon as I have them. Thanks for your help Patrick On 03/07/07, Michael McCandless [EMAIL PROTECTED] wrote: OK, I opened issue LUCENE-948 and attached a patch and a new 2.2.0 JAR. Please make sure you use the take2 versions (they have added instrumentation to help us debug): https://issues.apache.org/jira/browse/LUCENE-948 Patrick, could you please test the above take2 JAR? Could you also call IndexWriter.setDefaultInfoStream(...) and capture all output from both machines (it will produce quite a bit of output). However: I'm now concerned about another potential impact of stale directory listing caches, specifically that the writer on the 2nd machine will not see the current segments_N file written by the first machine and will incorrectly remove the newly created files. I think that take2 JAR should at least resolve this FileNotFoundException but I think likely you are about to hit this new issue. Mike Patrick Kimber [EMAIL PROTECTED] wrote: Hi Michael I am really pleased we have a potential fix. I will look out for the patch. Thanks for your help. Patrick On 03/07/07, Michael McCandless [EMAIL PROTECTED] wrote: Patrick Kimber [EMAIL PROTECTED] wrote: I am using the NativeFSLockFactory. I was hoping this would have stopped these errors. I believe this is not a locking issue and NativeFSLockFactory should be working correctly over NFS.
Here is the whole of the stack trace:
Caused by: java.io.FileNotFoundException: /mnt/nfstest/repository/lucene/lucene-icm-test-1-0/segments_n (No such file or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:204)
at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:506)
at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:536)
at org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:531)
at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:440)
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:193)
at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:156)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:626)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:573)
at com.subshell.lucene.indexaccess.impl.IndexAccessProvider.getWriter(IndexAccessProvider.java:68)
at com.subshell.lucene.indexaccess.impl.LuceneIndexAccessor.getWriter(LuceneIndexAccessor.java:171)
at com.company.lucene.RepositoryWriter.addDocument(RepositoryWriter.java:176)
... 13 more
OK, indeed the exception is inside IndexFileDeleter's initialization (this is what I had guessed might be happening). I have added more logging to my test application. I have two servers writing to a shared Lucene index on an NFS partition... Here is the logging from one server...
[10:49:18] [DEBUG] LuceneIndexAccessor closing cached writer
[10:49:18] [DEBUG] ExpirationTimeDeletionPolicy onCommit() delete [segments_n]
and the other server (at the same time):
[10:49:18] [DEBUG] LuceneIndexAccessor opening new writer and caching it
[10:49:18] [DEBUG] IndexAccessProvider getWriter()
[10:49:18] [ERROR] DocumentCollection update(DocumentData) com.company.lucene.LuceneIcmException: I/O Error: Cannot add the document to the index.
[/mnt/nfstest/repository/lucene/lucene-icm-test-1-0/segments_n (No such file or directory)] at com.company.lucene.RepositoryWriter.addDocument(RepositoryWriter.java:182) I think the exception is being thrown when the IndexWriter is created: new IndexWriter(directory, false, analyzer, false, deletionPolicy); I am confused... segments_n should not have been touched for 3 minutes so why would a new IndexWriter want to read it? Whenever a writer is opened, it initializes the deleter (IndexFileDeleter). During that initialization, we list all files in the index directory, and for every segments_N file we find, we open it and incref all index files that it's using. We then call the deletion policy's onInit to give it a chance to remove any of these commit points. What's happening here is the NFS directory listing is stale and is reporting that segments_n exists when in fact it doesn't. This is almost
Lucene 2.2, NFS, Lock obtain timed out
Hi, We are sharing a Lucene index in a Linux cluster over an NFS share. We have multiple servers reading and writing to the index. I am getting regular lock exceptions, e.g. Lock obtain timed out: NativeFSLock@/mnt/nfstest/repository/lucene/lock/lucene-2d3d31fa7f19eabb73d692df44087d81-n-write.lock
- We are using Lucene 2.2.0
- We are using kernel NFS and lockd is running.
- We are using a modified version of the ExpirationTimeDeletionPolicy found in the Lucene test suite: http://svn.apache.org/repos/asf/lucene/java/trunk/src/test/org/apache/lucene/index/TestDeletionPolicy.java I have set the expiration time to 600 seconds (10 minutes).
- We are using the NativeFSLockFactory with the lock folder being within the index folder: /mnt/nfstest/repository/lucene/lock/
- I have implemented a handler which will pause and retry an update or delete operation if a LockObtainFailedException or StaleReaderException is caught. The handler will retry the update or delete once every second for 1 minute before re-throwing the exception and aborting.
The issue appears to be caused by a lock file which is not deleted. The handlers keep retrying... the process holding the lock eventually aborts... this deletes the lock file - any applications still running then continue normally. The application does not throw these exceptions when it is run on a standard Linux file system or a Windows workstation. I would really appreciate some help with this issue. The chances are I am doing something stupid... but I cannot think what to try next. Thanks for your help Patrick
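The retry handler described above (retry once a second for a minute, then rethrow) can be sketched with plain JDK code. This is an illustration of the pattern, not the actual handler; LockObtainFailedException is modelled by a plain Exception so the sketch is self-contained:

```java
// Retry an index update when the write lock cannot be obtained, pausing
// between attempts, and rethrow once the attempt budget is exhausted.
public class RetryHandler {
    // Stand-in for the update/delete operation that may hit a lock timeout.
    public interface Op {
        void run() throws Exception;
    }

    // Returns the number of attempts made; rethrows the last exception
    // after maxAttempts failures. With maxAttempts=60 and sleepMillis=1000
    // this matches the once-a-second-for-a-minute policy described above.
    public static int retry(Op op, int maxAttempts, long sleepMillis) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                op.run();
                return attempt;
            } catch (Exception e) { // e.g. LockObtainFailedException / StaleReaderException
                if (attempt >= maxAttempts) throw e; // give up and abort
                Thread.sleep(sleepMillis);
            }
        }
    }
}
```

A caller would wrap each update in `RetryHandler.retry(op, 60, 1000)`; note that if the lock file is genuinely stale (the holder crashed), no amount of retrying helps until the lock is released, which is the behaviour described above.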
Re: Lucene 2.2, NFS, Lock obtain timed out
Hi Doron

Thanks for your reply. I am working on the details of the update pattern. It will take me some time as I cannot reproduce the issue on demand.

To answer your other questions: yes, we do have multiple writers, one writer per node in the cluster. I will post the results of my investigations as soon as possible.

Thanks for your help
Patrick

On 29/06/07, Doron Cohen [EMAIL PROTECTED] wrote:
Hi Patrick, Mike is the expert in this, but until he gets in, can you add details on the update pattern? Note that the DeletionPolicy you describe below is not (afaik) related to the write lock time-out issues you are facing. The DeletionPolicy manages the interaction between an IndexWriter that deletes old files and an IndexReader that might still use those files. The write lock, on the other hand, just synchronizes between multiple IndexWriter objects attempting to open the same index for write.

So, do you have multiple writers? Can you print/describe the writers' timing scenario when this time-out problem occurs, e.g. something like this:

w1.open w1.modify w1.close
w2.open w2.modify w2.close
w3.open w3.modify w3.close
w2.open ... time-out...

but w3 closed the index so the lock file was supposed to be removed - why wasn't it? Can write attempts come from different nodes in the cluster? Can you make sure that when the writer gets the lock time-out there is indeed no other active writer?

Doron

Patrick Kimber [EMAIL PROTECTED] wrote on 29/06/2007 02:01:08:
Hi, We are sharing a Lucene index in a Linux cluster over an NFS share. We have multiple servers reading and writing to the index. I am getting regular lock exceptions, e.g.:

Lock obtain timed out: NativeFSLock@/mnt/nfstest/repository/lucene/lock/lucene-2d3d31fa7f19eabb73d692df44087d81-n-write.lock

- We are using Lucene 2.2.0
- We are using kernel NFS and lockd is running.
- We are using a modified version of the ExpirationTimeDeletionPolicy found in the Lucene test suite: http://svn.apache.org/repos/asf/lucene/java/trunk/src/test/org/apache/lucene/index/TestDeletionPolicy.java I have set the expiration time to 600 seconds (10 minutes).
- We are using the NativeFSLockFactory with the lock folder being within the index folder: /mnt/nfstest/repository/lucene/lock/
- I have implemented a handler which will pause and retry an update or delete operation if a LockObtainFailedException or StaleReaderException is caught. The handler will retry the update or delete once every second for 1 minute before re-throwing the exception and aborting.

The issue appears to be caused by a lock file which is not deleted. The handlers keep retrying... the process holding the lock eventually aborts... this deletes the lock file - any applications still running then continue normally. The application does not throw these exceptions when it is run on a standard Linux file system or Windows workstation.

I would really appreciate some help with this issue. The chances are I am doing something stupid... but I cannot think what to try next.

Thanks for your help
Patrick

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
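[Editor's note: the retry handler Patrick describes (retry an update or delete once a second for a minute when a lock exception is caught) can be sketched as a small stand-alone helper. The `LockedOperation` interface and `RetryableException` class below are illustrative stand-ins, not Lucene or application classes.]

```java
// Sketch of the retry pattern described above: retry an index update once per
// second for up to one minute before re-throwing and aborting. The interface
// and exception class are stand-ins for the application's operation and for
// LockObtainFailedException / StaleReaderException.
class LockRetryHelper {

    /** Stand-in for the update/delete operation that may hit a lock timeout. */
    public interface LockedOperation {
        void run() throws Exception;
    }

    /** Stand-in for the lock-related exceptions that should trigger a retry. */
    public static class RetryableException extends Exception {
        public RetryableException(String msg) { super(msg); }
    }

    private final int maxAttempts;
    private final long delayMillis;

    public LockRetryHelper(int maxAttempts, long delayMillis) {
        this.maxAttempts = maxAttempts;
        this.delayMillis = delayMillis;
    }

    /** Runs the operation, pausing and retrying on RetryableException;
     *  re-throws after the final failed attempt. Other exceptions propagate
     *  immediately. */
    public void execute(LockedOperation op) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                op.run();
                return;
            } catch (RetryableException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and abort, as described above
                }
                Thread.sleep(delayMillis);
            }
        }
    }
}
```

With `new LockRetryHelper(60, 1000)` this matches the once-a-second-for-a-minute behaviour described in the email.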
Re: Lucene 2.2, NFS, Lock obtain timed out
Hi

As requested, I have been trying to improve the logging in the application so I can give you more details of the update pattern. I am using the Lucene Index Accessor contribution to co-ordinate the readers and writers: http://www.nabble.com/Fwd%3A-Contribution%3A-LuceneIndexAccessor-t17416.html#a47049

If the close method in the IndexAccessProvider fails, the exception is logged but not re-thrown:

public void close(IndexReader reader) {
    if (reader != null) {
        try {
            reader.close();
        } catch (IOException e) {
            log.error("", e);
        }
    }
}

I have been checking the application log. Just before the time when the lock file errors occur I found this log entry:

[11:28:59] [ERROR] IndexAccessProvider
java.io.FileNotFoundException: /mnt/nfstest/repository/lucene/lucene-icm-test-1-0/segments_h75 (No such file or directory)
        at java.io.RandomAccessFile.open(Native Method)

- I guess the missing segments file could result in the lock file not being removed?
- Is it safe to ignore this exception (probably not)?
- Why would the segments file be missing? Could this be connected to the NFS issues in some way?

Thanks for your help
Patrick

On 29/06/07, Patrick Kimber [EMAIL PROTECTED] wrote:
Hi Doron Thanks for your reply. I am working on the details of the update pattern. It will take me some time as I cannot reproduce the issue on demand. To answer your other questions: yes, we do have multiple writers, one writer per node in the cluster. I will post the results of my investigations as soon as possible. Thanks for your help Patrick

On 29/06/07, Doron Cohen [EMAIL PROTECTED] wrote:
Hi Patrick, Mike is the expert in this, but until he gets in, can you add details on the update pattern? Note that the DeletionPolicy you describe below is not (afaik) related to the write lock time-out issues you are facing. The DeletionPolicy manages the interaction between an IndexWriter that deletes old files and an IndexReader that might still use this file.
The write lock, on the other hand, just synchronizes between multiple IndexWriter objects attempting to open the same index for write.

So, do you have multiple writers? Can you print/describe the writers' timing scenario when this time-out problem occurs, e.g. something like this:

w1.open w1.modify w1.close
w2.open w2.modify w2.close
w3.open w3.modify w3.close
w2.open ... time-out...

but w3 closed the index so the lock file was supposed to be removed - why wasn't it? Can write attempts come from different nodes in the cluster? Can you make sure that when the writer gets the lock time-out there is indeed no other active writer?

Doron

Patrick Kimber [EMAIL PROTECTED] wrote on 29/06/2007 02:01:08:
Hi, We are sharing a Lucene index in a Linux cluster over an NFS share. We have multiple servers reading and writing to the index. I am getting regular lock exceptions, e.g.:

Lock obtain timed out: NativeFSLock@/mnt/nfstest/repository/lucene/lock/lucene-2d3d31fa7f19eabb73d692df44087d81-n-write.lock

- We are using Lucene 2.2.0
- We are using kernel NFS and lockd is running.
- We are using a modified version of the ExpirationTimeDeletionPolicy found in the Lucene test suite: http://svn.apache.org/repos/asf/lucene/java/trunk/src/test/org/apache/lucene/index/TestDeletionPolicy.java I have set the expiration time to 600 seconds (10 minutes).
- We are using the NativeFSLockFactory with the lock folder being within the index folder: /mnt/nfstest/repository/lucene/lock/
- I have implemented a handler which will pause and retry an update or delete operation if a LockObtainFailedException or StaleReaderException is caught. The handler will retry the update or delete once every second for 1 minute before re-throwing the exception and aborting.

The issue appears to be caused by a lock file which is not deleted. The handlers keep retrying... the process holding the lock eventually aborts... this deletes the lock file - any applications still running then continue normally.
The application does not throw these exceptions when it is run on a standard Linux file system or Windows workstation. I would really appreciate some help with this issue. The chances are I am doing something stupid... but I cannot think what to try next.

Thanks for your help
Patrick

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
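[Editor's note: the ExpirationTimeDeletionPolicy mentioned in this thread keeps every commit point younger than the expiration window (so readers on other NFS nodes can still open the files it references) and always keeps the newest commit. A minimal sketch of that selection rule, using plain arrays in place of Lucene's IndexCommit objects and file timestamps:]

```java
// Illustrative core of an expiration-time deletion policy like the one in
// Lucene's test suite: a commit is deleted only if it is not the newest
// commit and it is older than the expiration window.
import java.util.ArrayList;
import java.util.List;

class ExpirationPolicySketch {
    /**
     * @param commitAgesSeconds ages of the commit points, oldest first; the
     *        last entry is the most recent commit
     * @param expirationSeconds the window, e.g. 600 for the 10 minutes used
     *        in this thread
     * @return indexes of the commits that should be deleted
     */
    static List<Integer> commitsToDelete(long[] commitAgesSeconds, long expirationSeconds) {
        List<Integer> doomed = new ArrayList<Integer>();
        for (int i = 0; i < commitAgesSeconds.length; i++) {
            boolean isNewest = (i == commitAgesSeconds.length - 1);
            if (!isNewest && commitAgesSeconds[i] > expirationSeconds) {
                doomed.add(i); // old enough that no reader should still need it
            }
        }
        return doomed;
    }
}
```

The point of the window is exactly the trade-off discussed later in the thread: any reader that holds a commit open for longer than the window risks having its files deleted from under it.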
Re: Lucene 2.2, NFS, Lock obtain timed out
Hi Mark

Yes, thank you. I can see your point and I think we might have to pay some attention to this issue. But we sometimes see this error on an NFS share within 2 minutes of starting the test, so I don't think this is the only problem.

Once again, thanks for the idea. I will certainly be looking to modify the code in the LuceneIndexAccessor to take this into account.

Patrick

On 29/06/07, Mark Miller [EMAIL PROTECTED] wrote:
This is an interesting choice. Perhaps you have modified LuceneIndexAccessor, but it seems to me (without knowing much about your setup) that you would have odd reader behavior. On a 3-node system, if you add docs with nodes 1 and 2 but not 3, and you're doing searches against all 3 nodes, node 3 will have old readers open until you add a doc to node 3. This is an odd consistency issue (nodes 1 and 2 have current views because you are adding docs to them, but node 3 will be stale until it gets a doc), but also, if you keep adding docs to nodes 1 and 2, or just plain add no docs to node 3, won't node 3's readers' index files be pulled out from under it after 10 minutes? Node 3 (or 1 and 2, for that matter) will not give up its cached readers *until* you add a doc with that particular node.

Perhaps I am all wet on this (I haven't used NFS with Lucene), but I think you may need to somehow coordinate the delete policy with the LuceneIndexAccessor on each node. This may be unrelated to your problem, and perhaps you get around the issue somehow, but just to throw it out there...

- Mark

On 6/29/07, Patrick Kimber [EMAIL PROTECTED] wrote:
I am using the Lucene Index Accessor contribution to co-ordinate the readers and writers: http://www.nabble.com/Fwd%3A-Contribution%3A-LuceneIndexAccessor-t17416.html#a47049

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Lucene 2.2, NFS, Lock obtain timed out
Hi Mark

I just ran my test again... and the error occurred after 10 minutes - which is the time when my deletion policy is triggered. So... I think you might have found the answer to my problem. I will spend more time looking at it on Monday.

Thank you very much for your help, and enjoy your weekend.

Patrick

On 29/06/07, Mark Miller [EMAIL PROTECTED] wrote:
If you're getting java.io.FileNotFoundException: /mnt/nfstest/repository/lucene/lucene-icm-test-1-0/segments_h75 within 2 minutes, this is very odd indeed. That would seem to imply your deletion policy is not working. You might try just using one of the nodes as the writer. In Michael's comments, he always seems to mention the pattern of one writer, many readers on NFS. In this case you could use no LockFactory and perhaps gain a little speed there.

- Mark

Patrick Kimber wrote:
Hi Mark Yes, thank you. I can see your point and I think we might have to pay some attention to this issue. But we sometimes see this error on an NFS share within 2 minutes of starting the test, so I don't think this is the only problem. Once again, thanks for the idea. I will certainly be looking to modify the code in the LuceneIndexAccessor to take this into account. Patrick

On 29/06/07, Mark Miller [EMAIL PROTECTED] wrote:
This is an interesting choice. Perhaps you have modified LuceneIndexAccessor, but it seems to me (without knowing much about your setup) that you would have odd reader behavior. On a 3-node system, if you add docs with nodes 1 and 2 but not 3, and you're doing searches against all 3 nodes, node 3 will have old readers open until you add a doc to node 3. This is an odd consistency issue (nodes 1 and 2 have current views because you are adding docs to them, but node 3 will be stale until it gets a doc), but also, if you keep adding docs to nodes 1 and 2, or just plain add no docs to node 3, won't node 3's readers' index files be pulled out from under it after 10 minutes?
Node 3 (or 1 and 2, for that matter) will not give up its cached readers *until* you add a doc with that particular node. Perhaps I am all wet on this (I haven't used NFS with Lucene), but I think you may need to somehow coordinate the delete policy with the LuceneIndexAccessor on each node. This may be unrelated to your problem, and perhaps you get around the issue somehow, but just to throw it out there...

- Mark

On 6/29/07, Patrick Kimber [EMAIL PROTECTED] wrote:
I am using the Lucene Index Accessor contribution to co-ordinate the readers and writers: http://www.nabble.com/Fwd%3A-Contribution%3A-LuceneIndexAccessor-t17416.html#a47049

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
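[Editor's note: the coordination Mark suggests amounts to this rule: a node that caches an IndexReader must refresh it before the deletion policy's expiration window elapses, even if that node never writes; otherwise the reader's segment files can be deleted from under it. A stand-alone sketch of that decision, with plain longs standing in for real clocks and reader state:]

```java
// Decide whether a cached reader must be reopened, given the deletion
// policy's expiration window. Refreshing a safety margin before expiry
// keeps an idle node's reader from outliving the files it has open.
class ReaderRefreshSketch {
    /**
     * @param readerOpenedAtMillis when the cached reader was (re)opened
     * @param nowMillis            current time
     * @param expirationMillis     the deletion policy window (e.g. 10 minutes)
     * @param safetyMarginMillis   refresh this long before the window closes
     */
    static boolean shouldRefresh(long readerOpenedAtMillis, long nowMillis,
                                 long expirationMillis, long safetyMarginMillis) {
        long age = nowMillis - readerOpenedAtMillis;
        return age >= expirationMillis - safetyMarginMillis;
    }
}
```

A background task on each node could call this periodically and reopen the reader when it returns true, decoupling reader freshness from whether that node happens to receive writes.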
Re: Lucene indexing pdf
Hi Teresa

You need to convert the PDF file into text format before adding the text to the Lucene index. You may like to look at http://www.pdfbox.org/ for a library to convert PDF files to text format.

Patrick

On 27/06/06, mcarcelen [EMAIL PROTECTED] wrote:
Hi, I'm new with Lucene and I'm trying to index a PDF, but when I query, nothing is returned. Can anyone help me? Thanks a lot
Teresa

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
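[Editor's note: an outline of the extract-then-index flow Patrick describes. This is a sketch only - the `org.pdfbox.*` package names are from the PDFBox 0.7.x releases of that era (newer releases use `org.apache.pdfbox`), and exact signatures may differ between versions; treat it as an illustration, not working code.]

```java
// Sketch: extract text from a PDF with PDFBox, then add it to a Lucene
// index so it can actually be searched. Field names here are arbitrary.
import java.io.File;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.pdfbox.pdmodel.PDDocument;
import org.pdfbox.util.PDFTextStripper;

class PdfIndexerSketch {
    static void indexPdf(IndexWriter writer, File pdfFile) throws Exception {
        PDDocument pdf = PDDocument.load(pdfFile);
        try {
            // Raw bytes of a PDF are useless to the analyzer; extract text first.
            String text = new PDFTextStripper().getText(pdf);
            Document doc = new Document();
            doc.add(new Field("path", pdfFile.getPath(),
                              Field.Store.YES, Field.Index.UN_TOKENIZED));
            doc.add(new Field("contents", text,
                              Field.Store.NO, Field.Index.TOKENIZED));
            writer.addDocument(doc);
        } finally {
            pdf.close(); // PDDocument holds file resources
        }
    }
}
```

Indexing the PDF bytes directly is the usual cause of the "query returns nothing" symptom: the analyzer never sees real words.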
Re: similar ArrayIndexOutOfBoundsException on searching and optimizing
Hi Adam

We are getting the same error. Did you manage to work out what was causing the problem?

Thanks
Patrick

On 21/04/06, Adam Constabaris [EMAIL PROTECTED] wrote:
This is a puzzler. I'm not sure if I'm doing something wrong or whether I have a poisoned document, a corrupted index (failing to close my IndexModifier properly?) or what. The setup is this: I have two processes (the backend and frontend of a CMS) that run in two different VMs -- both use Lucene 1.9.1 with the PorterStemmerAnalyzer wrapper over the StandardAnalyzer (from lucene-memory AnalyzerUtils). The backend is responsible for index creation, updates, etc., while the frontend process uses the created index.

What's puzzling is that some queries will die with an ArrayIndexOutOfBoundsException being thrown out of the BitVector class:

Caused by: java.lang.ArrayIndexOutOfBoundsException: 240
        at org.apache.lucene.util.BitVector.get(BitVector.java:63)
        at org.apache.lucene.index.SegmentTermDocs.read(SegmentTermDocs.java:133)
        at org.apache.lucene.search.TermScorer.next(TermScorer.java:105)
        at org.apache.lucene.search.DisjunctionSumScorer.advanceAfterCurrent(DisjunctionSumScorer.java:151)
        at org.apache.lucene.search.DisjunctionSumScorer.next(DisjunctionSumScorer.java:125)
        at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.java:290)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
        at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:99)
        at org.apache.lucene.search.Hits.getMoreDocs(Hits.java:65)
        at org.apache.lucene.search.Hits.<init>(Hits.java:44)
        at org.apache.lucene.search.Searcher.search(Searcher.java:44)
        at org.apache.lucene.search.Searcher.search(Searcher.java:36)

The only pattern I've been able to discern in queries that cause this problem is that (a) they search the contents field (tokenized, unstored, TermVector.YES), and (b) it *seems* that it mostly happens with longer terms in the query.
Although the frontend defaults to a multifield query, the same happens when I use contents:term, and it does not happen if I specify the term in any other of the default fields used by the MultiFieldQueryParser.

Here's where it gets interesting: I've noticed that calling optimize() on the index as it's created by the server process also throws a hissy fit, with an *eerily similar* stack trace:

java.lang.ArrayIndexOutOfBoundsException: 239
        at org.apache.lucene.util.BitVector.get(BitVector.java:63)
        at org.apache.lucene.index.SegmentReader.isDeleted(SegmentReader.java:288)
        at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:185)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:88)
        at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:681)
        at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:658)
        at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:517)
        at org.apache.lucene.index.IndexWriter.addIndexes(IndexWriter.java:553)

Does anybody have any ideas about what I might be doing wrong, or if I've possibly uncovered a bug? I'm too new to the scene to know where I ought to start with this.

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: How to write to and read from the same index
Hi Nick

Have you tried the Lucene Index Accessor contribution? We have a similar update/search pattern and it works very well. http://www.nabble.com/Fwd%3A-Contribution%3A-LuceneIndexAccessor-t17416.html#a47049

Patrick

On 28/03/06, Nick Atkins [EMAIL PROTECTED] wrote:
I'm using Lucene running on Tomcat to index a large amount of email data, and as the indexer runs through the mailbox creating, merging and deleting documents, it does lots of searches at the same time to see if the document exists. Actually, all my modification operations are done in batch every x seconds or so. This seems to cause me lots of problems. I believe it is not possible to keep a single Searcher open while the index is being modified, so the only way is to detect the index changes, close the old one and create a new one. However, doing this causes the number of file handles to grow beyond the max allowed by the system.

I have tried using Luc's DelayCloseIndexSearcher with his Factory example, but as my index is modified frequently this creates lots of new DelayCloseIndexSearcher objects. The way it calls close on them when there are no more usages doesn't seem to keep the number of file handles down; they just grow. I would expect close to release file handles to the system when nothing is using the object (I even set it explicitly to null), but this does not happen.

If this problem makes sense, has anyone else faced it, and does anyone have a solution?

Cheers, Nick.

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
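[Editor's note: the usual fix for the file-handle growth Nick describes is reference counting: close a searcher only once it is both superseded by a newer one and no longer in use by any in-flight search. A stand-alone sketch of that idea, with a small `Handle` interface standing in for `IndexSearcher` so no Lucene classes are needed:]

```java
// Reference-counted wrapper: the searcher is closed exactly once, when a
// newer searcher has replaced it AND every in-flight search has released it.
class RefCountedSearcher {
    interface Handle { void close(); }   // stand-in for IndexSearcher.close()

    private final Handle handle;
    private int refCount = 1;            // the "current searcher" reference
    private boolean superseded = false;
    private boolean closed = false;

    RefCountedSearcher(Handle handle) { this.handle = handle; }

    /** A search takes a reference before using the searcher. */
    synchronized void acquire() {
        if (closed) throw new IllegalStateException("already closed");
        refCount++;
    }

    /** The search releases its reference when done. */
    synchronized void release() {
        refCount--;
        maybeClose();
    }

    /** Called when a newer searcher replaces this one. */
    synchronized void markSuperseded() {
        superseded = true;
        refCount--;              // drop the "current searcher" reference
        maybeClose();
    }

    synchronized boolean isClosed() { return closed; }

    private void maybeClose() {
        if (superseded && refCount == 0 && !closed) {
            closed = true;
            handle.close();      // this is where the file handles are released
        }
    }
}
```

Setting a searcher reference to null, as in the email, only makes it eligible for garbage collection; the underlying file handles are freed only by an explicit close(), which is what the counting above guarantees happens exactly once.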
Re: IndexSearcher and IndexWriter in conjuction
Hi Nikhil

We are using the index accessor contribution. For more information see: http://www.nabble.com/Fwd%3A-Contribution%3A-LuceneIndexAccessor-t17416.html#a47049 This should help you to co-ordinate the IndexSearcher and IndexWriter.

Patrick

On 13/03/06, Nikhil Goel [EMAIL PROTECTED] wrote:
Hi, Can someone please explain how IndexSearcher and IndexWriter work in conjunction? As far as I know after reading all the posts in the newsgroup, it seems everything works fine if we have one IndexWriter thread and multiple IndexSearcher threads. But my doubt here is: looking at the IndexSearcher class, it seems it first reads the segments file and then goes, one by one, to the respective .fnm files in the index. Hence a case can occur where it has read the segments file, but in the meantime the IndexWriter thread has updated the index and the corresponding .fnm file no longer exists in the index; this will give us the error that the .fnm doesn't exist, and we will get an IOException.

Am I missing something in making sure that there can be multiple IndexSearcher threads and one IndexWriter thread and still everything works fine?

thanks
-Nikhil

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
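[Editor's note: the coordination an accessor provides can be reduced to a version check: each completed write bumps the index version, searches keep using the snapshot they opened, and the cached searcher is reopened only when its version is stale. A minimal sketch with plain version counters standing in for Lucene's actual reader/writer objects:]

```java
// Version-based coordination: writers bump a version; the cached searcher is
// reopened only when it lags behind. Searches in flight keep their snapshot.
class IndexCoordinatorSketch {
    private long indexVersion = 0;     // bumped by each completed write
    private long searcherVersion = -1; // version the cached searcher has open

    /** Stand-in for getWriter()/commit: a write completes atomically. */
    synchronized void write() {
        indexVersion++;
    }

    /** Returns true if the cached searcher had to be reopened. */
    synchronized boolean refreshSearcherIfStale() {
        if (searcherVersion != indexVersion) {
            searcherVersion = indexVersion; // reopen against the new snapshot
            return true;
        }
        return false;
    }
}
```

The scenario Nikhil worries about (a searcher reading a segments file and then finding a referenced file gone) is avoided on a local file system because open files remain readable; coordinating reopens like this matters mainly when files can genuinely disappear, as on NFS.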
Re: steps for building lucene 1.9
Hi Haritha

Hope the following helps:

Build Lucene Core from SVN

Download the Lucene Subversion repository from:
http://svn.apache.org/repos/asf/lucene/java/trunk
Note: The CVS repository is still accessible but is out of date.

I downloaded to: C:\src\lucene-svn\

To build (using Ant):
cd C:\src\lucene-svn\
ant

The following jar file is produced:
C:\src\lucene-svn\build\lucene-core-1.9-rc1-dev.jar

I have just built Lucene using these instructions on my workstation and it builds without any errors.

Patrick

On 09/03/06, Haritha_Parvatham [EMAIL PROTECTED] wrote:
Hi, I have downloaded the Lucene 1.9 version. Please tell me how to build it. I am finding so many errors in the Lucene 1.9 source code. Thanks. Haritha

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: How to intergrate snowball in lucene
Hi

You should download the snowball contribution, which is in the Subversion repository:
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/snowball
This can be built using Ant.

Patrick

On 06/03/06, Haritha_Parvatham [EMAIL PROTECTED] wrote:
Hi, Can anyone guide me to integrate Snowball in Lucene? I have downloaded the Snowball sources, but some files are written in the C language. I have compiled it. Please tell me how I add Snowball to Lucene for multilingual support.

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Lucene, Cannot rename segments.new to segments
I am getting intermittent errors with Lucene. Here are two examples:

java.io.IOException: Cannot rename E:\lucene\segments.new to E:\lucene\segments
java.io.IOException: Cannot rename E:\lucene\_8ya.tmp to E:\lucene\_8ya.del

This issue has an open Bugzilla entry: http://issues.apache.org/bugzilla/show_bug.cgi?id=36241

I thought this error must be caused by an error in my application. To try and solve the error I used the LuceneIndexAccessor in my application: http://issues.apache.org/bugzilla/show_bug.cgi?id=34995 I am still getting the error.

1) Is there a reason (other than time and resource) why the bug report is still set to NEW after 6 months (since August 2005)?
2) Is the problem likely to be in my application? Any ideas how I could go about solving this issue?

Thanks for your help
Patrick

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
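[Editor's note: rename failures like these are typically seen on Windows when another process (a reader, or something like a virus scanner) still has the target file open, since Windows refuses to rename over an open file. A common workaround of that era was to retry the rename briefly before failing; this is a stand-alone sketch of that pattern, not Lucene's own code:]

```java
// Retry File.renameTo a few times before giving up, to ride out a
// transiently-held target file (common on Windows).
import java.io.File;

class RenameRetrySketch {
    static boolean renameWithRetry(File from, File to, int attempts, long delayMillis) {
        for (int i = 0; i < attempts; i++) {
            if (from.renameTo(to)) {
                return true;           // rename succeeded
            }
            try {
                Thread.sleep(delayMillis); // wait for the other holder to let go
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;                  // caller decides how to report failure
    }
}
```

Note that File.renameTo reports failure with a boolean rather than a reason, which is part of why these errors are hard to diagnose.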
Re: OT: how do I connect to the SVN repository to grab the latest source?
Hi Colin

Did you get some help? Are you using Windows? If so, you can install TortoiseSVN, which is a shell extension: http://tortoisesvn.tigris.org/

If you are using Windows or Linux you can use SmartSVN: http://www.smartcvs.com/smartsvn/

The URL for Lucene on SVN is: http://svn.apache.org/repos/asf/lucene/java/trunk

If you want to learn all about Subversion (SVN), the best source of information is the Subversion book: http://svnbook.red-bean.com/

Hope this helps
Patrick

On 04/01/06, Colin Young [EMAIL PROTECTED] wrote:
Normally I wouldn't post this here, but I haven't been able to find any info about how I would go about downloading the latest source from the SVN repository. I've got a bit of experience with CVS, but I can't even figure out where to start with SVN. If anyone could point me in the right direction I'd appreciate it (we could do it offline to avoid polluting this list any further). Thanks, Colin Young

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
http://www.textmining.org/ is hacked
Hi I am trying to download the source code for tm-extractors-0.4.jar from http://www.textmining.org/ Looks like the site has been hacked. Does anyone know the location of the CVS or SVN repository? Thanks for your help... Pat - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: http://www.textmining.org/ is hacked
Thanks for the very quick response.

On 24/11/05, Guilherme Barile [EMAIL PROTECTED] wrote:
I have it here; uploaded it to RapidShare: http://rapidshare.de/files/8097202/textmining.zip.html c ya

On Thu, 2005-11-24 at 16:46, Patrick Kimber wrote:
Hi, I am trying to download the source code for tm-extractors-0.4.jar from http://www.textmining.org/ Looks like the site has been hacked. Does anyone know the location of the CVS or SVN repository? Thanks for your help... Pat

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Deprecated API in BooleanQuery broken in Lucene from CVS?
Daniel

You are correct. The latest version from SVN works correctly. Very confusing - I only checked out Lucene from CVS a few days ago. I didn't realise that changes were only being made in the SVN repository. Thank you very much for your help.

Regards
Patrick

On 17/11/05, Daniel Naber [EMAIL PROTECTED] wrote:
On Tuesday 15 November 2005 11:24, Patrick Kimber wrote:
I have checked out the latest version of Lucene from CVS and have found a change in the results compared to version 1.4.3.

Lucene isn't in CVS anymore, it's in SVN. With the latest version from SVN, I cannot reproduce your problem.

Regards
Daniel
--
http://www.danielnaber.de

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]