Re: optimize fails with Negative seek offset
Hi,

sorry for following up my own mail, but since no one has responded so far, I thought the stack trace might be of interest. The following exception always occurs when trying to optimize one of our indices, which has worked fine for about a year now. I just tried with 1.4-rc3, but with the same result:

java.io.IOException: Negative seek offset
    at java.io.RandomAccessFile.seek(Native Method)
    at org.apache.lucene.store.FSInputStream.readInternal(FSDirectory.java:405)
    at org.apache.lucene.store.InputStream.readBytes(InputStream.java:61)
    at org.apache.lucene.index.CompoundFileReader$CSInputStream.readInternal(CompoundFileReader.java:222)
    at org.apache.lucene.store.InputStream.refill(InputStream.java:158)
    at org.apache.lucene.store.InputStream.readByte(InputStream.java:43)
    at org.apache.lucene.store.InputStream.readVInt(InputStream.java:83)
    at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:63)
    at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:238)
    at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:185)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:92)
    at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:483)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:362)
    at LuceneRPCHandler.optimize(LuceneRPCHandler.java:398)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:324)
    at org.apache.xmlrpc.Invoker.execute(Invoker.java:168)
    at org.apache.xmlrpc.XmlRpcWorker.invokeHandler(XmlRpcWorker.java:123)
    at org.apache.xmlrpc.XmlRpcWorker.execute(XmlRpcWorker.java:185)
    at org.apache.xmlrpc.XmlRpcServer.execute(XmlRpcServer.java:151)
    at org.apache.xmlrpc.XmlRpcServer.execute(XmlRpcServer.java:139)
    at org.apache.xmlrpc.WebServer$Connection.run(WebServer.java:773)
    at org.apache.xmlrpc.WebServer$Runner.run(WebServer.java:656)
    at java.lang.Thread.run(Thread.java:534)

Any hint would be greatly appreciated.

Thanks,
Sascha

--
Gallileus - the power of knowledge
Gallileus GmbH, http://www.gallileus.info/
Pintschstraße 16, 10249 Berlin, Germany
fon +49-(0)30-41 93 43 43
fax +49-(0)30-41 93 43 45

++ CURRENT NOTICE (May 2004): Literature Alerts - literature search (as if) in your sleep! More at: http://www.gallileus.info/gallileus/about/products/alerts/ ++

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: new Lucene release: 1.4 RC3
I presume this still requires Java 1.4 to build, but will run with Java 1.3?

Regards,
Terry

- Original Message -
From: Doug Cutting [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Tuesday, May 11, 2004 4:51 PM
Subject: new Lucene release: 1.4 RC3

Version 1.4 RC3 of Lucene is available for download from:
http://cvs.apache.org/dist/jakarta/lucene/v1.4-rc3/

Changes are described at:
http://cvs.apache.org/viewcvs.cgi/*checkout*/jakarta-lucene/CHANGES.txt?rev=1.85

Doug
Re: new Lucene release: 1.4 RC3
I don't recall any JDK 1.4 methods/classes being used, and I just saw Doug replacing one AssertException (1.4) with RuntimeException. Are there some 1.4 dependencies I'm not aware of?

Otis

--- Terry Steichen [EMAIL PROTECTED] wrote:
> I presume this still requires Java 1.4 to build, but will run with Java 1.3?
Re: new Lucene release: 1.4 RC3
Last time I checked, JDK 1.4 was needed to compile the classes implementing the new sorting features. Part of the issue was the inclusion of the regex classes, but the other dependency had to do (as I recall) with some kind of inner-class construct that JDK 1.3 won't compile. I believe the contributor, Tim Jones, fixed some of them to work with JDK 1.3, but to the best of my knowledge not the inner-class stuff.

Regards,
Terry
Re: optimize fails with Negative seek offset
Looks like the same error I got when I tried to use Lucene version 1.3 to search an index I had created with Lucene version 1.4. The versions are not forward compatible. Did you by chance create the index with version 1.4 and are now searching with version 1.3? It's easy to get the dependencies out of sync across different apps, which is what happened to me.

-vito

On Wed, 2004-05-12 at 04:59, Sascha Ottolski wrote:
> The following exception always occurs when trying to optimize one of our indices [...]
> java.io.IOException: Negative seek offset
Re: optimize fails with Negative seek offset
On Wednesday, 12 May 2004 18:54, Anthony Vito wrote:
> Did you by chance create the index with version 1.4 and are now searching with version 1.3?

Hi vito,

thanks for the reply, but no, we have only upgraded so far and never downgraded. What's more, the failing index was rebuilt completely with 1.4-rc2 only two weeks ago. The problem started a short time afterwards (but not immediately).

Greets,
Sascha
RE: multivalue fields
I don't know if it will help, but take a look at the following email and the enclosing thread from a few weeks ago.

http://nagoya.apache.org/eyebrowse/[EMAIL PROTECTED]msgNo=7737

Ryan Sonnek [EMAIL PROTECTED] 05/11/04 12:40PM

Using Lucene 1.3-final, it appears to search only the first field with that name. Here's the code I'm using to construct the index; I'm using Luke to check that the index is created correctly. Everything looks fine, but my search returns nothing. Do I have to use a special query to work with multivalue fields? Is there a test case in the source that does this kind of work that I could look at?

    //indexing
    Document doc = new Document();
    Iterator values = myValues.iterator();
    while (values.hasNext()) {
        Object value = values.next();
        doc.add(Field.Keyword("test", value.toString()));
    }

    //searching
    BooleanQuery query = new BooleanQuery();
    Query fieldQuery = QueryParser.parse(searchValue, "test", ANALYZER);
    query.add(fieldQuery, true, false);

Ryan

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Tuesday, May 11, 2004 11:31 AM
To: Lucene Users List
Subject: Re: multivalue fields

Just add multiple Fields with the exact same name.

Otis

--- Ryan Sonnek [EMAIL PROTECTED] wrote:
> How can I construct a document that has multiple values for one field (e.g. locale en_US, de_DE, etc.)? I've been concatenating the values into one string and storing them in one field, but I think this affects the search rankings (more text to search produces a lower score). Is it possible to append the separate values to the same field without concatenating them?
>
> Ryan
Re: Mixing database and lucene searches
I think I follow what you're saying. Thanks, Phil.

Regards,
Glen

Phil brunet [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]

-- Snip --

If you can't guarantee a fixed number of Lucene results (and that is often the case!), a good way is to duplicate the last PK and so round up to a fixed number.

Hi... I'm not sure what you mean by that last bit.

Hi ... I'm going to try to express myself correctly ... in English :-)

We were talking about the need to cross Lucene results with DB results, and that it could be a good idea to execute a query like:

    SELECT *
    FROM my_table
    WHERE 1st criteria   // this criteria was not expressed in the Lucene query
      AND 2nd criteria   // this criteria was not expressed in the Lucene query
      AND ...
      AND my_pk IN (pk_value_1, pk_value_2, pk_value_n);

where the pk_values have been previously retrieved by the Lucene query.

In the JDBC statement, using bind variables is a good way to avoid useless query parsing time. But if the number of pk_values retrieved by the Lucene query differs from query to query, bind variables alone will not avoid the parsing time: the SQL query signature will be different each time, so the RDBMS will need to parse the query again.

To bypass this problem, you can round up the number of bind variables. For example, say you know that your Lucene queries will retrieve a maximum of 1000 results.

Sometimes only one result is retrieved => you have one pk_value
Sometimes 5 results are retrieved => you have five pk_values
Sometimes ... etc.

I suggest that in each case you duplicate the last pk_value so that the SQL statement always has the same number of bind variables. In my example, you will always have 1000 bind variables in the SQL statement, whether you had one, five or n results.

Especially for short SQL queries, avoiding parsing time is really precious (I work with Oracle DB, sic!)
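The padding idea Phil describes can be sketched in plain Java. This is only an illustration under stated assumptions: the class name `InListPadding` and the helpers `padKeys` and `buildSql` are hypothetical, `my_table` and `my_pk` are the placeholder names from the thread, the extra WHERE criteria are left out, and the fixed size of 5 stands in for the 1000 mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Sketch: pad a variable-length primary-key list to a fixed number of bind variables. */
final class InListPadding {

    /** Returns a copy of keys padded to exactly size entries by repeating the last key. */
    static List<Long> padKeys(List<Long> keys, int size) {
        if (keys.isEmpty()) {
            throw new IllegalArgumentException("need at least one key to pad");
        }
        if (keys.size() > size) {
            throw new IllegalArgumentException("more keys than bind variables");
        }
        List<Long> padded = new ArrayList<>(keys);
        Long last = keys.get(keys.size() - 1);
        while (padded.size() < size) {
            padded.add(last); // a repeated value does not change what IN (...) matches
        }
        return padded;
    }

    /** Builds the statement text; it depends only on size, never on the actual keys. */
    static String buildSql(int size) {
        String placeholders = String.join(", ", Collections.nCopies(size, "?"));
        return "SELECT * FROM my_table WHERE my_pk IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        int size = 5; // hypothetical fixed size, chosen small for readability
        List<Long> fromLucene = Arrays.asList(17L, 42L); // PKs assumed to come from a Lucene search
        System.out.println(buildSql(size));
        System.out.println(padKeys(fromLucene, size));
    }
}
```

With a real connection, the text from `buildSql` would be passed to `Connection.prepareStatement` once, and each padded key bound with `PreparedStatement.setLong(i + 1, key)`. Because the statement text is identical for every search, the database can reuse the parsed statement, which is the saving Phil is after.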
clean up html before indexing or add tags to ignore list
Hi,

This is a typical web crawler, indexing and search application. I have written my crawler and am planning to add Lucene next. One question comes to mind: in terms of performance, should I clean up the HTML by removing all tags before indexing, or should I add all tags to the ignore list at the indexing/search stage? Which is better?

Thanks,
Sebastian Ho
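For reference, the first option (cleaning the HTML before it reaches the indexer) could look roughly like the sketch below. It is a crude, regex-based approximation and not a real HTML parser: the class name `HtmlStripper` and the method `stripTags` are hypothetical, character entities such as `&amp;` are not decoded, and malformed markup may slip through.

```java
import java.util.regex.Pattern;

/** Sketch: reduce an HTML page to plain text before handing it to an indexer. */
final class HtmlStripper {

    // Whole <script>...</script> and <style>...</style> elements, contents included.
    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("(?is)<(script|style)\\b.*?</\\1\\s*>");

    // Any remaining tag, including ones that span several lines.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]*>");

    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static String stripTags(String html) {
        String text = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        text = TAG.matcher(text).replaceAll(" ");
        // Collapse the gaps left by removed tags into single spaces.
        return WHITESPACE.matcher(text).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String page = "<html><head><style>p { color: red }</style></head>"
                + "<body><p>Hello <b>Lucene</b> users</p></body></html>";
        System.out.println(stripTags(page));
    }
}
```

Stripping up front keeps tag names out of the index entirely, so no ignore list has to be maintained at indexing or search time.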
lucene-1.4-rc2 and JVM version
Hi,

We started learning and using Lucene about 3 weeks ago; it is really a great product! We have run into some problems with certain JVM versions (Sun JDK). We are using lucene-1.4-rc2 on the Solaris 2.8 platform:

(1) We have a program that indexes about 230 documents. With jdk1.4.1_02, the program often hangs at IndexWriter.addDocument(doc); the document at which it hangs is essentially random. My question is: are there any known issues with jdk1.4.1_02 and lucene-1.4-rc2 (BUILD.txt says any JDK later than 1.2 is OK)?

(2) We also found that for a trivial search program, jdk1.3.0 crashes, but jdk1.3.1_03 is OK (my search code is attached below). Running on jdk1.3.0, I got the following message (at the line calling IndexSearcher.search(...)):

#
# HotSpot Virtual Machine Error, Unexpected Signal 11
# Please report this error at
# http://java.sun.com/cgi-bin/bugreport.cgi
#
# Error ID: 4F533F534F4C415249530E435050079A 01
#
# Problematic Thread: prio=5 tid=0x29800 nid=0x1 runnable
#

Is this a known problem with jdk1.3.0? The same program runs fine with jdk1.3.1_03.

I would really appreciate any help and guidance on these two issues.

Best regards,
Lisheng

##
import java.io.*;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.*;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.Sort;
import org.apache.lucene.queryParser.QueryParser;

class UrSearch {

    private static void log(String msg) {
        System.out.println(msg);
    }

    public static void main(String[] argv) {
        try {
            Searcher searcher = new IndexSearcher("./myindex");
            Searcher[] searches = new Searcher[1];
            searches[0] = searcher;

            Analyzer analyzer = new StandardAnalyzer();
            Query query0 = simpleQuery(analyzer);
            log("Q=" + query0.toString());
            log("QueryClass=" + query0.getClass().toString());

            Sort sort = new Sort();
            // Crash on this line if jdk1.3.0 !!!
            Hits hits = searcher.search(query0, sort);
            log(hits.length() + " total matching documents");

            for (int i = 0; i < hits.length(); i++) {
                Document doc = hits.doc(i);
                log("docid=" + doc.get("docid"));
                log("score=" + hits.score(i));
            }
            searcher.close();
        } catch (Exception ex) {
            log("EXTYPE:" + ex.getClass().getName());
            log("EXMSG:" + ex.getMessage());
            try {
                // Write the full stack trace to a file for later inspection.
                PrintWriter mout = new PrintWriter(new FileOutputStream("err.dat"), true);
                ex.printStackTrace(mout);
            } catch (FileNotFoundException newex) {
                log("TERRIBLE:" + newex.getMessage());
                System.exit(0);
            }
        }
    }

    static Query simpleQuery(Analyzer analyzer) throws Exception {
        Query q1 = QueryParser.parse("iepeditorial", "all", analyzer);
        return q1;
    }
}
##