http://issues.apache.org/bugzilla/show_bug.cgi?id=27587

java.io.IOException: read past EOF when searching index

------- Additional Comments From [EMAIL PROTECTED] 2004-03-15 17:58 -------

I have not had a chance to look at this closely yet, but it sounds to me like a case where numSkipped is incorrectly computed. This algorithm has proven tricky to get right. The first version I checked in was buggy and failed in a similar way. I spent a day staring at it, and came up with the current version, which may not yet be right.

The workaround is to comment out lines 179 to 222. If things work when you do that, then this is probably a bug in the commented-out code.

If someone else who is good at debugging fussy algorithms has time, please look at this. Your prize will be much admiration from your peers. Otherwise I'll try to get to it when I next have time.

The skip data is written by SegmentMerger.java, lines 415 to 445. I think that code is correct. It writes a sequence of <docNumDelta,freqPointerDelta,proxPointerDelta> tuples. Each docNumDelta is the difference between that tuple's docNum and the docNum of the previous tuple. The docNums contained in this sequence are the docNum *before* every 16th entry in the TermDocs. The freqPointer and proxPointer indicate the position of every 16th entry in the TermDocs and TermPositions in the .frq and .prx files, respectively. The sequence is stored in the .frq file, at the end of the TermDocs for each term whose frequency is greater than 16.

I hope this makes some sense. I still need to add this to the 1.4 file format documentation...

The sequence is read only by TermDocs.skipTo(), to enable skipping ahead by 16 entries at a time, which can accelerate many kinds of queries. This is the logic that has proven tricky. Any volunteers looking for hacker points?
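
For anyone picking this up, here is a minimal sketch of the skip-data layout described in the comment, written from that description alone and not taken from SegmentMerger.java or the TermDocs implementation. The class SkipDataSketch, its methods, and the in-memory arrays are hypothetical. It also assumes that the first tuple's deltas are measured from docNum 0 and from the pointers of the term's first entry, which the comment does not state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Lucene code.
final class SkipDataSketch {
    // "every 16th entry", per the description above
    static final int SKIP_INTERVAL = 16;

    /** One <docNumDelta, freqPointerDelta, proxPointerDelta> tuple. */
    static final class SkipTuple {
        final int docDelta;
        final long freqDelta;
        final long proxDelta;

        SkipTuple(int docDelta, long freqDelta, long proxDelta) {
            this.docDelta = docDelta;
            this.freqDelta = freqDelta;
            this.proxDelta = proxDelta;
        }
    }

    /**
     * Builds the delta-encoded skip tuples for one term.
     * docs[i] is the docNum of entry i (ascending); freqPtrs[i] and
     * proxPtrs[i] are the positions of entry i in the .frq and .prx data.
     */
    static List<SkipTuple> encode(int[] docs, long[] freqPtrs, long[] proxPtrs) {
        List<SkipTuple> out = new ArrayList<>();
        if (docs.length == 0) {
            return out;
        }
        // Assumption: the first deltas are relative to docNum 0 and to the
        // pointers of the term's first entry.
        int lastDoc = 0;
        long lastFreq = freqPtrs[0];
        long lastProx = proxPtrs[0];
        for (int i = SKIP_INTERVAL; i < docs.length; i += SKIP_INTERVAL) {
            int skipDoc = docs[i - 1]; // the docNum *before* every 16th entry
            out.add(new SkipTuple(skipDoc - lastDoc,
                                  freqPtrs[i] - lastFreq,
                                  proxPtrs[i] - lastProx));
            lastDoc = skipDoc;
            lastFreq = freqPtrs[i];
            lastProx = proxPtrs[i];
        }
        return out;
    }

    /**
     * Returns how many leading entries can be skipped (a multiple of 16)
     * before a linear scan for the first docNum >= target must begin.
     */
    static int numSkipped(List<SkipTuple> skips, int target) {
        int skipDoc = 0;
        int skipped = 0;
        for (SkipTuple t : skips) {
            int nextSkipDoc = skipDoc + t.docDelta;
            if (nextSkipDoc >= target) {
                break; // target may lie inside the block ending at nextSkipDoc
            }
            skipDoc = nextSkipDoc;
            skipped += SKIP_INTERVAL;
        }
        return skipped;
    }

    public static void main(String[] args) {
        int n = 40;
        int[] docs = new int[n];
        long[] freq = new long[n];
        long[] prox = new long[n];
        for (int i = 0; i < n; i++) {
            docs[i] = 3 * i;
            freq[i] = 100 + 2L * i;
            prox[i] = 500 + 5L * i;
        }
        List<SkipTuple> skips = encode(docs, freq, prox);
        System.out.println("tuples: " + skips.size()
                + ", numSkipped(50) = " + numSkipped(skips, 50));
    }
}
```

With 40 entries this produces two tuples, and numSkipped(skips, 50) returns 16, so the linear scan resumes at entry 16. The point of the sketch is the invariant that seems easy to get wrong: a block may be skipped only if the docNum recorded just before it is strictly less than the target.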