RE: Can i use lucene to search the internet.
Hi,

Can we use Nutch on Windows?

-----Original Message-----
From: gekkokid [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 23, 2006 11:22 AM
To: java-user@lucene.apache.org
Subject: Re: Can i use lucene to search the internet.

Hi, are you asking whether it has a crawler? No, it doesn't, but Nutch does: http://lucene.apache.org/nutch/ :)

_gk

----- Original Message -----
From: Babu, KameshNarayana (GE, Research, consultant)
To: java-user@lucene.apache.org
Sent: Thursday, March 23, 2006 5:44 AM
Subject: Can i use lucene to search the internet.

Hi all,

Can I use Lucene to search the internet? Or do we have any open source applications?

Thanks in advance

GE Global Research
Kamesh NarayanaBabu
John F. Welch Technology Centre
Information Technology Management, Plot 122, Export Promotion Industrial Park, Phase II, Hoodi Village, Whitefield Road, Bangalore, Karnataka - 560066, INDIA.
Phone: +91 (80) 2503 0457 | GE Dial comm.: 8 * 901 0359 | Mobile: +91 9986259850 | Email: [EMAIL PROTECTED]
Re: Can i use lucene to search the internet.
Hi,

It can be used if you run Cygwin (the latest version). Please have a look at the Nutch wiki. Also, you are mailing the wrong list.

Rgds
Prabhu

On 3/23/06, Babu, KameshNarayana (GE, Research, consultant) [EMAIL PROTECTED] wrote:
> Hai All,
> Can Nutch be used on Windows?
RE: Can i use lucene to search the internet.
Hi,

Thanks for the reply. Can I do without Cygwin? Which list should I use for these queries? Kindly help me.

-----Original Message-----
From: Raghavendra Prabhu [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 23, 2006 3:48 PM
To: java-user@lucene.apache.org
Subject: Re: Can i use lucene to search the internet.

Hi, it can be used if you run Cygwin (the latest version). Please have a look at the Nutch wiki. And you are mailing the wrong list.

Rgds
Prabhu

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: FileNotFoundException: Corrupted Index? = Use jvm ShutdownHook
Hi Otis,

Thanks for your reply. I will also add the writer shutdown hook for this index, as you suggested. I had already done that in another part of our code where we use another Lucene index, but thought it would not be needed for this particular index because we rarely write to it. That was a mistake: the JVM can also be shut down during one of those rare writes, and this corruption proves it.

I will watch whether the problem still occurs, and if it does not, I'll update the wiki FAQ with the following code (left here for search history purposes and for other users):

    // clean up writer, reader and searcher correctly
    Thread shutdown = new Thread() {
      public void run() {
        if (writer != null) {
          try { writer.close(); } catch (Exception ex) { /*empty*/ }
          writer = null;
        }
        if (reader != null) {
          try { reader.close(); } catch (IOException ex) { /*empty*/ }
          reader = null;
        }
        if (searcher != null) {
          try { searcher.close(); } catch (IOException ex) { /*empty*/ }
          searcher = null;
        }
      }
    };
    Runtime.getRuntime().addShutdownHook(shutdown);

Otis Gospodnetic wrote:
> Hi Olivier,
>
> You have shutdown hooks for read-only operations. They won't corrupt your index. I'd add shutdown hooks for IndexWriter. If that fixes your problem, it would be great if you could add your shutdown hook code to the FAQ on the Wiki, or at least post it to java-user, so somebody else can put it there.
>
> Otis
>
> ----- Original Message ----
> From: Olivier Jaquemet [EMAIL PROTECTED]
> To: Lucene Java User ML java-user@lucene.apache.org
> Sent: Wednesday, March 22, 2006 10:08:28 AM
> Subject: FileNotFoundException: Corrupted Index?
>
> Hi all,
>
> We are using the latest version of Lucene (1.9.1), and sometimes we end up with the following error when opening one of the indexes our application uses:
>
>     java.io.FileNotFoundException: [...]/LuceneIndex/_ 46.fnm (No such file or directory)
>       at java.io.RandomAccessFile.open(Native Method)
>       at java.io.RandomAccessFile.<init>(RandomAccessFile.java:204)
>       at org.apache.lucene.store.FSIndexInput$Descriptor.<init>(FSDirectory.java:425)
>       at org.apache.lucene.store.FSIndexInput.<init>(FSDirectory.java:434)
>       at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:324)
>       at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:56)
>       at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:144)
>       at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:129)
>       at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:110)
>       at org.apache.lucene.index.IndexReader$1.doBody(IndexReader.java:154)
>       at org.apache.lucene.store.Lock$With.run(Lock.java:109)
>       at org.apache.lucene.index.IndexReader.open(IndexReader.java:143)
>       at org.apache.lucene.index.IndexReader.open(IndexReader.java:138)
>
> The only solution available in this case is to completely remove and recreate the index. I have the corrupted index available for testing should you need it. Apparently this corruption occurs if the JVM has crashed or was shut down too violently (kill -9).
>
> I was wondering how a Lucene index can become corrupted and how to prevent it, fix it on reopening or, as a last resort, detect it so that the index can be recreated.
>
> Note that I already have this kind of shutdown hook in the code:
>
>     // clean up reader and searcher correctly
>     Thread shutdown = new Thread() {
>       public void run() {
>         if (reader != null) {
>           try { reader.close(); } catch (IOException ex) { /*empty*/ }
>           reader = null;
>         }
>         if (searcher != null) {
>           try { searcher.close(); } catch (IOException ex) { /*empty*/ }
>           searcher = null;
>         }
>       }
>     };
>     Runtime.getRuntime().addShutdownHook(shutdown);
>
> And, on opening, code such as:
>
>     Directory indexDir = FSDirectory.getDirectory(luceneDir, !IndexReader.indexExists(luceneDir));
>     IndexReader.unlock(indexDir); // unlock directory in case of improper shutdown
>     if (!IndexReader.indexExists(luceneDir)) {
>       writer = new IndexWriter(indexDir, analyzer, true);
>       writer.close();
>     }
>
> Any suggestion or remark? Thanks!
>
> --
> Olivier Jaquemet [EMAIL PROTECTED]
> Ingénieur RD Jalios S.A.
> Tel: 01.39.23.92.83
> http://www.jalios.com/ http://support.jalios.com/
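The shutdown-hook idea in this thread can be sketched without Lucene on the classpath. The following is only an illustration under stated assumptions: `ShutdownCloser` and its methods are hypothetical names, and the resources are plain `java.io.Closeable` stand-ins for the writer, reader and searcher. As noted in the thread, a hook of this kind does not run after `kill -9` or a JVM crash.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper (not a Lucene class): closes registered resources once, newest first. */
final class ShutdownCloser {
    private final Deque<Closeable> resources = new ArrayDeque<>();
    private boolean closed = false;

    /** Registers a resource, e.g. a wrapper around an index writer's close() call. */
    synchronized void register(Closeable resource) {
        resources.push(resource);
    }

    /** Closes everything in reverse registration order; returns how many closed cleanly. */
    synchronized int closeAll() {
        if (closed) {
            return 0; // a second call is a no-op
        }
        closed = true;
        int ok = 0;
        while (!resources.isEmpty()) {
            try {
                resources.pop().close();
                ok++;
            } catch (IOException ex) {
                // Keep going so one failure does not prevent the remaining closes.
                System.err.println("close failed: " + ex.getMessage());
            }
        }
        return ok;
    }

    /** Installs the JVM shutdown hook; it is skipped on kill -9 or a JVM crash. */
    void install() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::closeAll, "shutdown-closer"));
    }
}
```

Registering the writer last means it is closed first, which is the part that matters for index integrity; unlike the empty catch blocks in the posted code, failures are at least reported.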
RE: java.lang.OutOfMemoryError in lucene
But I have the IBM JDK 1.4.2. Do you know whether this version still has the problem?
Re: Speed up Indexing
I run Lucene.Net as well. Your indexing performance depends on more factors than whether you're using the Java or the C# version.

As a basic suggestion, learn what you can about minMergeDocs and mergeFactor, as well as the compound file format. Try different combinations to understand which are faster and which are slower.

As a strategy for your specific scenario, you might consider building several indexes in parallel, then merging them at the end.

Hope this helps.

-- j

On 3/22/06, hu andy [EMAIL PROTECTED] wrote:
> Hi, everyone.
>
> I have a large amount of XML files, 1 GB in size. I use Lucene (the .NET edition) to index them. There are 8 fields per document: 4 keyword fields and 4 unstored fields. I have set minMergeDocs to 1 and mergeFactor to 100. It took about 2.5 hours (3 GB main memory, P4 CPU). I also tried in-memory indexing, which also took more than 2.5 hours.
>
> Due to the performance requirement, I need to complete the indexing in one hour without using a distributed or clustered system. Is that possible? Is Java Lucene faster than the .NET one? Any advice will be appreciated. Thank you in advance.
Re: Multiple threads in Lucene
Hi Otis,

Thanks for the reply, but I have one question. You said that opening multiple IndexWriters is a big no-no. I want to clarify:

1) Do you mean multiple IndexWriters open at the same time? I am not doing this. Only one IndexWriter is open at any time.

or

2) Do you mean I can't open another IndexWriter after closing the prior one? In my writing thread, I open a new IndexWriter for every file I index and then close it; as soon as a second file is available for indexing, I open an IndexWriter again and close it. The directory object is the same across all the threads and across the reopened IndexWriters.

If the latter is a no too, then how would a developer make sure that the index is closed when the program is killed? Suppose the program is killed midway and the index is not closed; the next time I run the program there will be a write.lock in the index, and it won't allow us to open another index.

Please let me know if I am wrong in what I said.

Thanks
-Nikhil

On 3/22/06, Otis Gospodnetic [EMAIL PROTECTED] wrote:
> Yes, 1 IndexWriter + multiple IndexSearchers definitely work together :) I can't tell what you're doing wrong with the threads... it looks like you might be opening multiple IndexWriters on the same index/directory (big no no).
>
> Otis
>
> ----- Original Message ----
> From: Nikhil Goel [EMAIL PROTECTED]
> To: java-user@lucene.apache.org
> Sent: Wednesday, March 22, 2006 6:04:41 PM
> Subject: Multiple threads in Lucene
>
> Hi Lucene Developers,
>
> According to the Lucene documentation, an IndexWriter can coexist with multiple IndexSearchers, and this is thread safe. To verify that, I wrote a simple program to simulate that condition, but unfortunately I get an exception. Please let me know if anyone has ever tested the Lucene claim that IndexWriter and IndexSearcher are thread safe.
>
> I have a program with 4 threads: 1) one IndexWriter thread, 2) three IndexSearcher threads.
>
> Every time we need to index a file, we run the following code in the IndexWriter thread:
>
>     function IndexFile(Document doc) {
>       writer = new IndexWriter(directory, new StandardAnalyzer(), false);
>       writer.addDocument(doc);
>       writer.close();
>     }
>
> Our IndexSearcher thread looks like this:
>
>     function IndexSearch(String termToBeSearched) {
>       IndexSearcher searcher = new IndexSearcher(directory);
>       // Note: This directory is the same reference as used to initiate IndexWriter
>       // in the IndexFile function. Hence this directory reference is used across
>       // all the threads.
>       Query query = QueryParser.parse(termToBeSearched, "contents", new StandardAnalyzer());
>       Hits hits = searcher.search(query);
>     }
>
> If I execute these 4 threads together, then whenever a search routine is executed while the IndexWriter is also in use, I get an error at the following line:
>
>     writer.close();
>
> The stack trace looks like this:
>
>     unable to close the writer stream
>     java.io.IOException: read past EOF
>       at org.apache.lucene.store.InputStream.refill(InputStream.java:192)
>       at org.apache.lucene.store.InputStream.readByte(InputStream.java:81)
>       at org.apache.lucene.store.InputStream.readBytes(InputStream.java:95)
>       at org.apache.lucene.index.SegmentReader.norms(SegmentReader.java:375)
>       at org.apache.lucene.index.SegmentReader.norms(SegmentReader.java:342)
>       at org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:306)
>       at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:99)
>       at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:430)
>       at org.apache.lucene.index.IndexWriter.flushRamSegments(IndexWriter.java:383)
>       at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:193)
>
> Thanks in advance
> -Nikhil
Re: Multiple threads in Lucene
Hi,

To prevent this kind of problem, here is how you should open your index:

    Directory indexDir = FSDirectory.getDirectory(luceneDir, !IndexReader.indexExists(luceneDir));
    IndexReader.unlock(indexDir); // unlock directory in case of improper shutdown
    if (!IndexReader.indexExists(luceneDir)) {
      writer = new IndexWriter(indexDir, analyzer, true);
      writer.close();
    }

And to prevent problems with the writer/reader/searcher not being closed properly on exit, here is how you should make sure they are closed (the JVM does not guarantee that the hook is called at all, but it's better than nothing):

    // clean up writer, reader and searcher correctly
    Thread shutdown = new Thread() {
      public void run() {
        if (writer != null) {
          try { writer.close(); } catch (Exception ex) { /*empty*/ }
          writer = null;
        }
        if (reader != null) {
          try { reader.close(); } catch (IOException ex) { /*empty*/ }
          reader = null;
        }
        if (searcher != null) {
          try { searcher.close(); } catch (IOException ex) { /*empty*/ }
          searcher = null;
        }
      }
    };
    Runtime.getRuntime().addShutdownHook(shutdown);

As another reminder if you are starting with Lucene: keep your reader/searcher open as long as possible, until you write to the index. It increases performance. You can use a class like this one (taken from this ML):

    /**
     * For optimized use of the searcher, we keep it open as much as possible and
     * delay its close until it is replaced by a new one when modifying the index.
     */
    public class IndexSearcherWrapper extends IndexSearcher {
      private int referenceCount;

      public IndexSearcherWrapper(Directory dir) throws IOException {
        super(dir);
        this.referenceCount = 1;
      }

      public IndexSearcherWrapper getReference() {
        referenceCount++;
        return this;
      }

      public void close() throws IOException {
        referenceCount--;
        // really close only when the last reference has been released
        if (referenceCount <= 0) {
          super.close();
        }
      }
    }

Use it like this:

    IndexSearcher localSearcher = searcher.getReference();
    Hits hits = localSearcher.search(query);
    [...]
And use a method such as this one every time you write to the index:

    /**
     * Renew internal reader and searcher, call this method after index change.
     */
    public void renewReaderAndSearcher() throws IOException {
      // Reader
      IndexReader oldReader = reader;
      reader = IndexReader.open(index);
      if (oldReader != null) {
        oldReader.close();
      }
      // Searcher
      IndexSearcherWrapper oldSearcher = searcher;
      searcher = new IndexSearcherWrapper(index);
      if (oldSearcher != null) {
        oldSearcher.close();
      }
    }

Hope it will help! :)

Nikhil Goel wrote:
> Hi Otis,
>
> Thanks for the reply, but I have one question. You said that opening multiple IndexWriters is a big no-no. I want to clarify:
>
> 1) Do you mean multiple IndexWriters open at the same time? I am not doing this. Only one IndexWriter is open at any time.
>
> or
>
> 2) Do you mean I can't open another IndexWriter after closing the prior one? In my writing thread, I open a new IndexWriter for every file I index and then close it; as soon as a second file is available for indexing, I open an IndexWriter again and close it. The directory object is the same across all the threads and across the reopened IndexWriters.
>
> If the latter is a no too, then how would a developer make sure that the index is closed when the program is killed? Suppose the program is killed midway and the index is not closed; the next time I run the program there will be a write.lock in the index, and it won't allow us to open another index.
>
> Please let me know if I am wrong in what I said.
>
> Thanks
> -Nikhil
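The reference-counting idea behind the searcher wrapper in this message (close the underlying resource only when the last holder lets go) can be shown in isolation. This is a minimal sketch, not Lucene code: `RefCounted`, `acquire` and `release` are hypothetical names, and any `java.io.Closeable` stands in for the searcher.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical reference-counted handle around any closeable resource. */
final class RefCounted<T extends Closeable> {
    private final T resource;
    // The owner holds the first reference, as in the wrapper's constructor.
    private final AtomicInteger refs = new AtomicInteger(1);

    RefCounted(T resource) {
        this.resource = resource;
    }

    /** Takes an extra reference; every acquire() must be paired with one release(). */
    T acquire() {
        refs.incrementAndGet();
        return resource;
    }

    /** Drops one reference; closes the resource and returns true when the count reaches zero. */
    boolean release() throws IOException {
        if (refs.decrementAndGet() == 0) {
            resource.close();
            return true;
        }
        return false;
    }
}
```

The atomic counter is a deliberate change from the plain `int` in the posted class, since search threads and the writer thread release concurrently. The sketch still assumes nobody calls `acquire()` after the count has reached zero.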
Re: Changing ranking
The place to start would be to look at the DefaultSimilarity, and the norms method there. Perhaps you want to create your own Similarity implementation that returns either a constant 1 or something else that will favour longer text. Somebody else with more experience in this area may have better or more precise suggestions.

Otis

----- Original Message ----
From: Leon Chaddock [EMAIL PROTECTED]
To: java-user@lucene.apache.org
Sent: Thursday, March 23, 2006 9:43:14 AM
Subject: Changing ranking

Hi,

At present Lucene seems to rank very short documents over longer documents where the phrase occurs more regularly. For instance, with the search term "cat", "the cat went home" ranks higher than "the black cat when home past some other cats, on cat street".

Is there any way I can change Lucene to rank longer documents with more phrase occurrences higher?

Many thanks
Leon
Re: Changing ranking
On Mar 23, 2006, at 11:22 AM, Otis Gospodnetic wrote:
> The place to start would be to look at the DefaultSimilarity, and the norms method there. Perhaps you want to create your own Similarity implementation that returns either a constant 1 or something else that will favour longer text. Somebody else with more experience in this area may have better or more precise suggestions.

Here's an implementation of lengthNorm() that stops the weighting at 100 tokens:

    public float lengthNorm(String fieldName, int numTerms) {
      numTerms = numTerms < 100 ? 100 : numTerms;
      return (float)(1.0 / Math.sqrt(numTerms));
    }

If you adopt it, you must boost short but important fields (e.g. title), or they won't contribute enough. KinoSearch (my loose Perl/C port of Lucene) uses this algorithm, and it seems to work well. To see an earlier discussion on this subject, perform a web search for "proposal defaultsimilarity lengthnorm".

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
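To make the effect of that lengthNorm() concrete, here is a standalone sketch that reads the snippet as flooring the token count at 100 (the comparison operator is assumed to be `<`, which matches "stops the weighting at 100 tokens"). `FlooredLengthNorm` is a hypothetical class for illustration, not part of Lucene or KinoSearch, and it ignores the lossy one-byte encoding Lucene applies to stored norms.

```java
/** Standalone illustration of a length norm with a floor of 100 tokens; not a Lucene class. */
final class FlooredLengthNorm {
    static final int MIN_TERMS = 100; // threshold taken from the post

    /** Fields shorter than MIN_TERMS are scored as if they had exactly MIN_TERMS tokens. */
    static float lengthNorm(int numTerms) {
        int effective = Math.max(numTerms, MIN_TERMS);
        return (float) (1.0 / Math.sqrt(effective));
    }

    public static void main(String[] args) {
        System.out.println(lengthNorm(4));   // 0.1  (a 4-token field gets no extra boost)
        System.out.println(lengthNorm(100)); // 0.1
        System.out.println(lengthNorm(400)); // 0.05 (longer fields are still damped)
    }
}
```

With the stock formula a 4-token field would get a norm of 0.5, five times that of a 100-token field. With the floor both get 0.1, which is why short but important fields such as titles then need an explicit boost.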
Joins between index and database
Hi - I have an application where I'm using Lucene to index the contents of a database. That's working fine. But I have a problem where I'd like to retrieve a subset of the documents that match a search, based on a join table in the database. How do people typically handle combining the results of a Lucene based search with the results of a database search? Thanks, Tom
Re: Joins between index and database
On Thursday 23 March 2006 20:51, Tom Hill wrote:
> Hi - I have an application where I'm using Lucene to index the contents of a database. That's working fine. But I have a problem where I'd like to retrieve a subset of the documents that match a search, based on a join table in the database. How do people typically handle combining the results of a Lucene based search with the results of a database search?

One way is to get the values of some key field from the database, create a Filter using terms created from these values, and use that Filter in a search, or in a FilteredQuery. See RangeFilter.bits() for some example code that creates a filter from terms. Sorting the key values beforehand helps performance when creating the filter. CachingWrapperFilter can also be handy.

In case you need a lot of filters for relatively few documents, have a look here: http://issues.apache.org/jira/browse/LUCENE-328

Regards,
Paul Elschot
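The key-to-bit-set step described in this reply can be sketched without Lucene. In the sketch below everything is a stand-in chosen for illustration: `KeyFilterSketch` and `bitsForKeys` are hypothetical names, and the `Map` plays the role of the index's term dictionary, which in a real Filter would be walked with the index reader as RangeFilter.bits() does.

```java
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical illustration of turning database key values into a document bit set. */
final class KeyFilterSketch {
    /**
     * @param dbKeys      key values returned by the database join query
     * @param docIdsByKey stand-in for the index: key term -> document numbers containing it
     * @param maxDoc      number of documents in the index
     */
    static BitSet bitsForKeys(List<String> dbKeys, Map<String, int[]> docIdsByKey, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        // Sorted, de-duplicated keys mirror the advice to sort before looking up the terms.
        for (String key : new TreeSet<>(dbKeys)) {
            int[] docs = docIdsByKey.get(key);
            if (docs == null) {
                continue; // key exists in the database but was never indexed
            }
            for (int doc : docs) {
                bits.set(doc);
            }
        }
        return bits;
    }
}
```

A search would then keep only hits whose document number has its bit set. Caching the resulting bit set per database query is the role the reply assigns to CachingWrapperFilter.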
Re: Multiple threads in Lucene
Olivier Jaquemet wrote:
> IndexReader.unlock(indexDir); // unlock directory in case of unproper shutdown

This should be used very carefully. In particular, you should only call it when you are certain that no other applications are accessing the index.

Doug
Re: Can i use lucene to search the internet.
Let's stop this thread.

> Can i use lucene to search the internet.

No. You may be able to use Lucene to *index* the internet, and then search the resulting index. Read the book Lucene in Action for a better idea of what this would entail.

Bill
Re: Query question
: Use Keyword (untokenized) field to index your paths.
: Consider using PerFieldAnalyzerWrapper to specify KeywordAnalyzer for your path field.
: Use the force, Luke - http://www.getopt.org/luke/ , to ensure your paths are indexed correctly.

You also don't want to use QueryParser.escape when you build the term query explicitly -- that's only needed if you are passing the string to QueryParser...

: Ex: Hits hits = multisearch.search(new TermQuery(new Term(key,
: QueryParser.escape(key;

-Hoss
Re: Changing ranking
: Is there any way I can change Lucene to rank longer documents with more
: phrase occurrences higher

If what you care about is only the number of occurrences, and you don't want the length to be a factor at all, then using Field.setOmitNorms(true) on the Field for every document you add will not only accomplish this, but will also save one byte per field per document in your index. That can add up if you have a lot of fields whose length you don't care about.

-Hoss
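As a rough worked example of "one byte per field per document", the arithmetic below uses made-up index sizes. The document and field counts are assumptions for illustration only, and `NormsSavings` is a hypothetical helper, not a Lucene class.

```java
/** Back-of-the-envelope estimate of the space saved by omitting norms. */
final class NormsSavings {
    /** One norm byte per field per document, as stated in the post. */
    static long bytesSaved(long numDocs, int fieldsWithoutNorms) {
        return numDocs * fieldsWithoutNorms;
    }

    public static void main(String[] args) {
        // Hypothetical index: 10 million documents, 8 fields with norms omitted.
        System.out.println(bytesSaved(10_000_000L, 8)); // 80000000 bytes, roughly 76 MiB
    }
}
```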
Native code compilation
Hi all, Has anyone tried to compile their Lucene applications into native code? Mine works fine in a VM but the call to search() on IndexSearcher is crashing the application, after I compile it into native code. There is apparently no problem in instantiating an IndexSearcher though. I tried this on both Linux and Windows and am getting the same problem. Thnx Seeta
lucene NFS support
Hi, Does anyone know whether Lucene plans to support NFS in a later release (2.0)? We are planning to integrate Lucene into our products, and cluster support is definitely needed. We want to check whether NFS support is planned before implementing a new file locking mechanism for it ourselves. Thanks. Chunhe
Re: Read past EOF error in Windows
No, that doesn't seem to be the problem. Does anyone have any other ideas?

On Tue, 21 Mar 2006 [EMAIL PROTECTED] wrote:
> I had a problem in the past with security on the folder where your index is located... but your error does not seem to show that... I would check anyway though...
>
> -----Original Message-----
> From: Chris Cain cbc20[at]hermes.cam.ac.uk
> To: java-user[at]lucene.apache.org
> Sent: Tue, 21 Mar 2006 15:33:26 + (GMT)
> Subject: Read past EOF error in Windows
>
> Hi all,
>
> I wrote a Lucene program which runs fine under Linux and Mac but fails on most Windows machines. (I have managed to get it to work on one version of XP, however.) Specifically, when I open or search the index I get the following error message. Any help would be appreciated.
>
> Cheers,
> Chris
>
>     caught a class java.io.IOException with message: read past EOF
>     java.io.IOException: read past EOF
>       at org.apache.lucene.store.FSIndexInput.readInternal(FSDirectory.java:451)
>       at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:45)
>       at org.apache.lucene.index.CompoundFileReader$CSIndexInput.readInternal(CompoundFileReader.java:219)
>       at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:64)
>       at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:33)
>       at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:46)
>       at org.apache.lucene.index.SegmentTermEnum.<init>(SegmentTermEnum.java:47)
>       at org.apache.lucene.index.TermInfosReader.<init>(TermInfosReader.java:48)
>       at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:147)
>       at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:129)
>       at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:115)
>       at org.apache.lucene.index.IndexReader$1.doBody(IndexReader.java:150)
>       at org.apache.lucene.store.Lock$With.run(Lock.java:109)
>       at org.apache.lucene.index.IndexReader.open(IndexReader.java:143)
>       at org.apache.lucene.index.IndexReader.open(IndexReader.java:127)
>       at org.apache.lucene.search.IndexSearcher.<init>(IndexSearcher.java:42)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe[at]lucene.apache.org
> For additional commands, e-mail: java-user-help[at]lucene.apache.org
Re: Native code compilation
> Native code

There is a C++ port called CLucene, if that suits you more than coffee beans...

Otis

----- Original Message ----
From: Seeta Somagani [EMAIL PROTECTED]
To: java-user@lucene.apache.org
Sent: Thursday, March 23, 2006 4:47:33 PM
Subject: Native code compilation

Hi all,

Has anyone tried to compile their Lucene applications into native code? Mine works fine in a VM but the call to search() on IndexSearcher is crashing the application, after I compile it into native code. There is apparently no problem in instantiating an IndexSearcher though. I tried this on both Linux and Windows and am getting the same problem.

Thnx
Seeta
RE: java.lang.OutOfMemoryError in lucene
> But i have the IBM JDK 1.4.2, do you know if this version still have the problem??

I'm sorry, I don't know. But you can try it, and if it solves the problem, you can add your experience to the FAQ :)

Koji
Re: lucene NFS support
Hi Chunhe,

There are no NFS-specific plans. Out of personal curiosity - why go for NFS and not NAS?

Otis

----- Original Message ----
From: Dai, Chunhe [EMAIL PROTECTED]
To: java-user@lucene.apache.org
Sent: Thursday, March 23, 2006 4:58:13 PM
Subject: lucene NFS support

Hi,

Does anyone know whether Lucene plans to support NFS in later release(2.0)? We are planning to integrate Lucene into our products and cluster support is definitely needed. We want to check whether NFS support is in the plan or not before implementing a new file locking ourselves with it.

Thanks.
Chunhe
Re: Joins between index and database
> See RangeFilter.bits() for some example code that creates a filter from terms.

Also see TermsFilter in the queries module in the contrib section.
RE: lucene NFS support
Thanks, Otis. The reason is that some of our customers definitely use NFS, and it is hard to convince all of our hundreds of customers not to use it. So naturally, the correct thing for us to do is to just support it, since we already have a file locking mechanism that works on NFS.

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 23, 2006 5:49 PM
To: java-user@lucene.apache.org
Subject: Re: lucene NFS support

Hi Chunhe,

There are no NFS-specific plans. Out of personal curiosity - why go for NFS and not NAS?

Otis
Re: Read past EOF error in Windows
Check whether it has anything to do with UTF. There is a newline difference between Windows and Linux.

Rgds
Prabhu

On 3/24/06, Chris Cain [EMAIL PROTECTED] wrote:
> No that doesnt seem to be the problem. Anyone have any other ideas?
Re: lucene NFS support
Dai, Chunhe wrote:
> Does anyone know whether Lucene plans to support NFS in later release(2.0)? We are planning to integrate Lucene into our products and cluster support is definitely needed. We want to check whether NFS support is in the plan or not before implementing a new file locking ourselves with it.

I think that nio-based locking would probably fix this, and could easily be provided in addition to or in place of the existing locking mechanism. I think the last time this was considered, Lucene was still attempting to be compatible with Java 1.3. But I think Lucene 2.0 is aimed at Java 1.4.

Doug
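For readers unfamiliar with what "nio-based locking" refers to, the sketch below shows the `java.nio.channels.FileLock` API that became available in Java 1.4. It is a generic illustration, not Lucene's lock implementation: `NioLockSketch` and `withLock` are hypothetical names, and whether such locks behave correctly over NFS depends on the NFS client and server, which this sketch does not address.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

/** Hypothetical illustration of an OS-level exclusive lock taken through java.nio. */
final class NioLockSketch {
    /**
     * Runs the action while holding an exclusive lock on lockFile.
     * Returns false without running the action if another process holds the lock.
     */
    static boolean withLock(File lockFile, Runnable action) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(lockFile, "rw");
             FileChannel channel = raf.getChannel()) {
            // tryLock() returns null when another process holds the lock. If this same
            // JVM already holds it, an OverlappingFileLockException is thrown instead.
            FileLock lock = channel.tryLock();
            if (lock == null) {
                return false;
            }
            try {
                action.run();
                return true;
            } finally {
                lock.release();
            }
        }
    }
}
```

Unlike a lock file that must be deleted by hand after a crash, an OS-level lock is released by the operating system when the owning process dies, which is the property that makes it attractive for the stale write.lock problem discussed elsewhere in this digest.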
RE: Native code compilation
Yeah, I'm too lazy to write the code again in C++. Was just trying to see if compiling to native code works.

Thanks
Seeta

-----Original Message-----
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 23, 2006 5:44 PM
To: java-user@lucene.apache.org
Subject: Re: Native code compilation

> Native code

There is a C++ port called CLucene, if that suits you more than coffee beans...

Otis