Hi Andrew,

I also found a way to fix it, but I don't like it because it causes additional unnecessary seeks in the file. Could you please send me your version of the BufferedIndexInput class (just the files you have changed)? I would test it and publish it to the Git repository together with some extended tests.

Thanks,
Jiri

-----Original Message-----
From: Andrew McCann [mailto:mcc...@deviantart.com]
Sent: Tuesday, December 14, 2010 12:07 AM
To: clucene-developers@lists.sourceforge.net
Subject: Re: [CLucene-dev] read past EOF ERROR while searching

Hi guys,

I've never responded to this list. I have encountered this bug myself, and I made a fix (it took a while to track down). Jiri is on the right track; that was the problem I had. I refactored the BufferedIndexInput class slightly to prevent it from happening; it was a minimal change.

I'm not sure what the procedure is for submitting fixes, though.

-Andrew

2010/12/13 Šplíchal Jiří <splic...@tovek.cz>:
> Hi,
>
> some more details:
>
> The exception comes from the method:
>
> void FSDirectory::FSIndexInput::readInternal(uint8_t* b, const int32_t len)
>
> which is called from:
>
> void BufferedIndexInput::refill()
>
> The strange thing is: in the refill buffer there is a member start with the value 1024, but when calling the readInternal method, both members handle->_fpos and _pos have the value 1025 (one more), so it tries to read different data from the file.
>
> Hope it helps.
>
> Jiri
>
> From: Šplíchal Jiří [mailto:splic...@tovek.cz]
> Sent: Monday, December 13, 2010 9:28 PM
> To: clucene-developers@lists.sourceforge.net
> Subject: [CLucene-dev] read past EOF ERROR while searching
>
> Hi,
>
> we found a serious problem while searching. In some special situations (probably depending on the index size), repeating a search does not return correct results or even ends with the CLuceneError "read past EOF".
>
> To reproduce the error, the following sequence must be called:
>
> 1) run a query
> 2) delete an instance of Analyzer (it can be instantiated earlier, but in the same thread); this causes the ThreadLocals objects to be freed
> 3) run the same query - THIS FAILS!!
>
> In all the cases where the search failed, we had switched off compound files.
>
> Could someone help us with this issue? It seems that reading the index files is not working correctly.
>
> Jiri
>
> PS: The following code is a test that demonstrates the problem:
>
> /**
>  * Create index
>  */
> Directory* prepareDirectory1()
> {
>     const TCHAR * tszDocText = _T( "a b c d e f g h i j k l m n o p q r s t u v w x y z ab bb cb db eb fb gb hb ib jb kb lb mb nb ob pb qb rb sb tb ub vb wb xb yb zb ac bc cc dc ec fc gc hc ic jc kc lc mc nc oc pc qc rc sc tc uc vc wc xc yc zc ad bd cd dd ed fd gd hd id jd kd ld md nd od pd qd rd sd td ud vd wd xd yd zd ae be ce de ee fe ge he ie je ke le me ne oe pe qe re se te ue ve we xe ye ze af bf cf df ef ff gf hf if jf kf lf mf" );
>
>     char fsdir[CL_MAX_PATH];
>     _snprintf(fsdir,CL_MAX_PATH,"%s/%s",cl_tempDir, "test.search");
>
>     WhitespaceAnalyzer analyzer;
>     Directory* pDirectory = (Directory*)FSDirectory::getDirectory(fsdir);
>     IndexWriter writer( pDirectory, &analyzer, true );
>
>     writer.setUseCompoundFile( false );
>
>     Document* d = _CLNEW Document();
>     d->add( *_CLNEW Field( _T("_content"), tszDocText, Field::STORE_NO | Field::INDEX_TOKENIZED ));
>     writer.addDocument(d);
>     _CLDELETE( d );
>     writer.close();
>
>     return pDirectory;
> }
>
> /**
>  * Run test
>  */
> void testReadPastEOF(CuTest *tc)
> {
>     Directory* pDirectory = prepareDirectory1();
>     Analyzer * pAnalyzer = NULL;
>     Hits * pHits = NULL;
>     IndexReader* pReader = IndexReader::open( pDirectory );
>     IndexSearcher searcher( pReader );
>
>     CLUCENE_ASSERT( pReader->numDocs() == 1 );
>
>     Term * t1 = new Term( _T( "_content" ), _T( "ze" ) );
>     TermQuery * pQry1 = new TermQuery( t1 );
>     _CLDECDELETE( t1 );
>
>     pAnalyzer = new SimpleAnalyzer();
>     pHits = searcher.search( pQry1 );
>     _ASSERT( pHits->length() == 1 );
>     CLUCENE_ASSERT( pHits->length() == 1 );
>     _CLDELETE( pHits );
>
>     // Removing the analyzer causes removing of ThreadLocals - also cached SegmentTermEnum
>     _CLDELETE( pAnalyzer );
>
>     // THE NEXT CALL WILL FAIL
>     pHits = searcher.search( pQry1 );
>     _ASSERT( pHits->length() == 1 );
>     CLUCENE_ASSERT( pHits->length() == 1 );
>     _CLDELETE( pHits );
>
>     _CLDELETE( pQry1 );
>
>     searcher.close();
>     _CLDELETE( pReader );
>
>     pDirectory->close();
>     _CLDECDELETE( pDirectory );
> }
>
> ------------------------------------------------------------------------------
> Lotusphere 2011
> Register now for Lotusphere 2011 and learn how
> to connect the dots, take your collaborative environment
> to the next level, and enter the era of Social Business.
> http://p.sf.net/sfu/lotusphere-d2d
> _______________________________________________
> CLucene-developers mailing list
> CLucene-developers@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/clucene-developers
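
For readers following the thread, here is a minimal sketch of the general idea being discussed: several readers share one file handle, and each reader compares the handle's physical position with its own expected position before reading, seeking only when the two differ. This is not the actual CLucene code and not Andrew's fix. SharedHandle, SketchInput and readAt are hypothetical names invented for this illustration, and the "file" is an in-memory string so the example stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a shared OS file handle: the bytes of the
// "file" plus the position the next physical read would start from.
struct SharedHandle {
    std::string data;      // file contents (kept in memory for this sketch)
    std::size_t fpos = 0;  // current physical position of the handle
    int seeks = 0;         // number of explicit seeks, to make the cost visible
};

// Hypothetical reader; several instances may share one SharedHandle.
class SketchInput {
public:
    explicit SketchInput(std::shared_ptr<SharedHandle> h) : handle_(std::move(h)) {}

    // Move only the logical position; the physical seek is deferred.
    void seek(std::size_t pos) { pos_ = pos; }

    // Read len bytes starting at this reader's logical position.
    void readInternal(char* dst, std::size_t len) {
        // Re-synchronise only when another reader has moved the handle.
        if (handle_->fpos != pos_) {
            handle_->fpos = pos_;
            ++handle_->seeks;
        }
        if (handle_->fpos + len > handle_->data.size())
            throw std::out_of_range("read past EOF");
        std::memcpy(dst, handle_->data.data() + handle_->fpos, len);
        handle_->fpos += len;  // the handle advances by the bytes read
        pos_ += len;           // and so does this reader's logical position
    }

private:
    std::shared_ptr<SharedHandle> handle_;
    std::size_t pos_ = 0;
};

// Convenience helper for the sketch: position the reader, then read len bytes.
std::string readAt(SketchInput& in, std::size_t pos, std::size_t len) {
    std::string out(len, '\0');
    in.seek(pos);
    in.readInternal(&out[0], len);
    return out;
}
```

With two SketchInput objects on the same handle, a read through one leaves the handle at a position the other does not expect, so the next read through the other pays for one seek. Reads that continue where the handle already is cost no seek, which is the trade-off behind the concern about unnecessary seeks.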