While indexing a set of documents (about 10,000 were already indexed), I get the
exception "vector subscript out of range" from ArrayBase::operator[].
I did some research, and it seems to happen because threadState->postingEquals()
is called while threadState->p points to an invalid Posting.
postingsHash[hashPos] probably holds a pointer to an already deleted object,
since every member reads 0xfeeefeee (I'm running it under the MSVC 2005 debugger),
which is the pattern the Windows debug heap writes into freed memory.
See the call stack and the threadState->p dump below.
Source (documentswriterthreadstate.cpp:1010)
======
// Locate Posting in hash
threadState->p = postingsHash[hashPos];
if (threadState->p != NULL && !threadState->postingEquals(tokenText, tokenTextLen)) {
...
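
To narrow down when the entry goes stale, a debug-only scan of the table could be run after each flush or reset. The sketch below is not CLucene code: `PostingSketch`, `looksFreed` and `findStaleSlots` are hypothetical names, and the struct models only two of the Posting fields. It assumes a debug build where freed memory is filled with 0xfeeefeee; reading freed objects is undefined behaviour, so this is a diagnostic aid only, not a fix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for DocumentsWriter::Posting (two fields only).
struct PostingSketch {
    std::int32_t textStart;
    std::int32_t docFreq;
};

// Fill pattern seen in every member of the dumped Posting.
constexpr std::uint32_t kFreedPattern = 0xFEEEFEEEu;

// True when both modelled fields carry the freed-memory pattern.
inline bool looksFreed(const PostingSketch& p) {
    return static_cast<std::uint32_t>(p.textStart) == kFreedPattern &&
           static_cast<std::uint32_t>(p.docFreq) == kFreedPattern;
}

// Returns the slots whose non-null entry looks like freed memory.
inline std::vector<std::size_t> findStaleSlots(
        const std::vector<PostingSketch*>& table) {
    std::vector<std::size_t> stale;
    for (std::size_t i = 0; i < table.size(); ++i) {
        if (table[i] != nullptr && looksFreed(*table[i])) {
            stale.push_back(i);
        }
    }
    return stale;
}

// Small self-check using simulated (not actually freed) entries.
inline void selfCheckStaleScan() {
    PostingSketch live{0, 1};
    PostingSketch poisoned{static_cast<std::int32_t>(kFreedPattern),
                           static_cast<std::int32_t>(kFreedPattern)};
    std::vector<PostingSketch*> table{&live, nullptr, &poisoned};
    const std::vector<std::size_t> stale = findStaleSlots(table);
    assert(stale.size() == 1 && stale[0] == 2);
}
```

Calling such a scan right before the lookup at the line above would show whether the slot was already stale before addPosition() ran.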
Call stack
========
clucene-cored.dll!lucene::util::ArrayBase<wchar_t *>::operator[](unsigned int _Pos=0xfffffbbb) Line 92 C++
clucene-cored.dll!lucene::index::DocumentsWriter::ThreadState::postingEquals(const wchar_t * tokenText=0x032772a8, const int tokenTextLen=0x00000008) Line 577 + 0x25 bytes C++
clucene-cored.dll!lucene::index::DocumentsWriter::ThreadState::FieldData::addPosition(lucene::analysis::Token * token=0x0100c770) Line 1012 + 0x26 bytes C++
clucene-cored.dll!lucene::index::DocumentsWriter::ThreadState::FieldData::invertField(lucene::document::Field * field=0x04d2a9e0, lucene::analysis::Analyzer * analyzer=0x010a5fa0, const int maxFieldLength=0x00002710) Line 902 C++
clucene-cored.dll!lucene::index::DocumentsWriter::ThreadState::FieldData::processField(lucene::analysis::Analyzer * analyzer=0x010a5fa0) Line 797 C++
clucene-cored.dll!lucene::index::DocumentsWriter::ThreadState::processDocument(lucene::analysis::Analyzer * analyzer=0x010a5fa0) Line 554 + 0x1a bytes C++
clucene-cored.dll!lucene::index::DocumentsWriter::updateDocument(lucene::document::Document * doc=0x0012f600, lucene::analysis::Analyzer * analyzer=0x010a5fa0, lucene::index::Term * delTerm=0x00000000) Line 934 + 0xc bytes C++
clucene-cored.dll!lucene::index::DocumentsWriter::addDocument(lucene::document::Document * doc=0x0012f600, lucene::analysis::Analyzer * analyzer=0x010a5fa0) Line 919 C++
clucene-cored.dll!lucene::index::IndexWriter::addDocument(lucene::document::Document * doc=0x0012f600, lucene::analysis::Analyzer * analyzer=0x010a5fa0) Line 670 + 0x13 bytes C++
clucene-cored.dll!lucene::index::IndexModifier::addDocument(lucene::document::Document * doc=0x0012f600, lucene::analysis::Analyzer * docAnalyzer=0x010a5fa0) Line 100 C++
mkidx.exe!tovek::index::Index::indexDocument(tovek::index::Document & doc={...}, bool bInsert=false, unsigned long & ulPrevDoc=0x00000007, tovek::analysis::CachedAnalyzer * pCachedAnalyzer=0x010a5fa0) Line 472 C++
Problematic item in postingsHash:
=========================
- threadState->p    0x02538fd8 {textStart=0xfeeefeee docFreq=0xfeeefeee freqStart=0xfeeefeee ...}    lucene::index::DocumentsWriter::Posting *
    textStart       0xfeeefeee    int
    docFreq         0xfeeefeee    int
    freqStart       0xfeeefeee    int
    freqUpto        0xfeeefeee    int
    proxStart       0xfeeefeee    int
    proxUpto        0xfeeefeee    int
    lastDocID       0xfeeefeee    int
    lastDocCode     0xfeeefeee    int
    lastPosition    0xfeeefeee    int
  + vector          0xfeeefeee {p=??? lastOffset=??? offsetStart=??? ...}    lucene::index::DocumentsWriter::PostingVector *
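
The bogus subscript in the call stack appears to follow directly from the freed textStart in the dump above. The sketch below assumes that postingEquals() picks its char buffer with `textStart >> CHAR_BLOCK_SHIFT` and that the shift is 14, as in Java Lucene's DocumentsWriter; I have not confirmed either against the CLucene source, and `kCharBlockShift` and `bufferIndexFor` are names made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: mirrors CHAR_BLOCK_SHIFT = 14 from Java Lucene's DocumentsWriter.
constexpr int kCharBlockShift = 14;

// Computes the buffer index the way the debugger shows it (unsigned int _Pos).
// Right-shifting a negative int is implementation-defined before C++20, but
// MSVC, GCC and Clang all perform an arithmetic (sign-extending) shift.
inline std::uint32_t bufferIndexFor(std::int32_t textStart) {
    return static_cast<std::uint32_t>(textStart >> kCharBlockShift);
}

// Checks the value from the dump against the subscript in the call stack.
inline void checkReportedSubscript() {
    // textStart as dumped: 0xfeeefeee, i.e. a negative 32-bit int.
    const std::int32_t freedTextStart =
        static_cast<std::int32_t>(0xFEEEFEEEu);
    // _Pos as reported by ArrayBase::operator[] in the call stack.
    assert(bufferIndexFor(freedTextStart) == 0xFFFFFBBBu);
}
```

Under those assumptions, 0xfeeefeee shifted right by 14 with sign extension gives 0xfffffbbb, which is exactly the _Pos passed to ArrayBase::operator[], so the out-of-range subscript would be a symptom of the dangling Posting pointer rather than a separate bug.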