I don't understand how subclassing can help: the member in the base class is
private, so it isn't accessible even to derived classes.

I'm not a friend of friend classes (to me it is an ugly technique that
breaks encapsulation), and it would also require changes to DocumentWriter.

So the only way I see is to make the method public. I'm not very happy
doing so, but I cannot see any other way...
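
For what it's worth, here is a minimal sketch of the only variant of the subclassing idea I can see working, assuming the method were changed from private to protected. Writer and TestableWriter are made-up names for illustration, not CLucene classes:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for IndexWriter; NOT the real CLucene class.
class Writer {
public:
    virtual ~Writer() {}
    void addDocument(const std::string& doc) { segments_.push_back(doc); }

protected:
    // Assumption: the test-only accessor is declared protected, not private.
    // A private member would stay inaccessible to TestableWriter below.
    const std::string& newestSegment() const {
        assert(!segments_.empty());
        return segments_.back();
    }

private:
    std::vector<std::string> segments_;
};

// Would live in the test suite only, so the core class exposes nothing publicly.
class TestableWriter : public Writer {
public:
    using Writer::newestSegment;  // re-declare the inherited member as public
};
```

A test would then create a TestableWriter instead of a Writer and call newestSegment() on it directly. This still needs a (smaller) change to the core header, which is why it does not work with the member left private.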

Borek


> -----Original Message-----
> From: Itamar Syn-Hershko [mailto:ita...@code972.com]
> Sent: Thursday, June 24, 2010 12:11 AM
> To: clucene-developers@lists.sourceforge.net
> Subject: Re: [CLucene-dev] vector subscript out of range exception during indexing
> 
> In IndexWriter.h (line 1163) there are a few functions marked as being for
> test purposes only. From what I could tell, they are not being accessed from
> anywhere right now.
> 
> Your options as I see them are:
> 
> * Make them public (I'm not sure how Java gets around that one without doing
> this)
> * Subclass IndexWriter in the test suite and make them available only under
> it
> * "Friend" the classes
> 
> Decide which to do based on the way JL uses them (apparently we aren't using
> them at all at the moment, so don't look at CL for this). If it is possible
> to make this code available from within the test suite alone, I'd definitely
> prefer to compile those out of the core's IndexWriter. "Friend"ing is
> probably not possible to do without putting test code in CL, which as I said
> - the core is better left without.
> 
> HTH
> 
> Itamar.
> 
> > -----Original Message-----
> > From: Kostka Bořivoj [mailto:kos...@tovek.cz]
> > Sent: Thursday, June 24, 2010 12:22 AM
> > To: clucene-developers@lists.sourceforge.net
> > Subject: Re: [CLucene-dev] vector subscript out of range exception during indexing
> >
> > I started porting the test, but I have a problem with
> > private/protected methods. Some JLucene methods are used in
> > tests but are marked private in CLucene, e.g.
> >
> >     IndexWriter writer = new IndexWriter(dir, analyzer, true);
> >     writer.addDocument(testDoc);
> >     writer.flush();
> >     SegmentInfo info = writer.newestSegment();
> >
> > Can be easily ported to
> >
> >     IndexWriter * writer = _CLNEW IndexWriter(dir, analyzer, true);
> >     writer->addDocument(&testDoc);
> >     writer->flush();
> >     SegmentInfo * info = writer->newestSegment();
> >
> > But the newestSegment method is private, so the test cannot be compiled.
> >
> > Any hints on how to get around that?
> >
> > Borek
> >
> >
> >
> > > -----Original Message-----
> > > From: Kostka Bořivoj [mailto:kos...@tovek.cz]
> > > Sent: Wednesday, June 23, 2010 5:00 PM
> > > To: clucene-developers@lists.sourceforge.net
> > > Subject: Re: [CLucene-dev] vector subscript out of range exception during indexing
> > >
> > > I'll try to port the whole TestDocumentsWriter; it is not that big.
> > >
> > > > -----Original Message-----
> > > > From: Itamar Syn-Hershko [mailto:ita...@code972.com]
> > > > Sent: Wednesday, June 23, 2010 12:39 PM
> > > > To: clucene-developers@lists.sourceforge.net
> > > > Subject: Re: [CLucene-dev] vector subscript out of range exception during indexing
> > > >
> > > > Use Java Lucene 2.3.2, which the git master branch is based on. Grab
> > > > it from http://archive.apache.org/dist/lucene/java/, or you can use
> > > > tools like Krugle to read the code on-line.
> > > >
> > > > You may only need this to port TestDocumentsWriter as a whole. To
> > > > fix this specific issue I think it is enough to follow the patch
> > > > attached to the JIRA issue. I'm not sure it was deployed to the 2.3.2
> > > > sources, btw.
> > > >
> > > > Itamar.
> > > >
> > > > > -----Original Message-----
> > > > > From: Kostka Bořivoj [mailto:kos...@tovek.cz]
> > > > > Sent: Wednesday, June 23, 2010 12:10 PM
> > > > > To: clucene-developers@lists.sourceforge.net
> > > > > Subject: Re: [CLucene-dev] vector subscript out of range exception during indexing
> > > > >
> > > > > I'm not sure which JLucene version I should use (and where to get
> > > > > it)
> > > > >
> > > > > Borek
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Itamar Syn-Hershko [mailto:ita...@code972.com]
> > > > > > Sent: Wednesday, June 23, 2010 12:11 AM
> > > > > > To: clucene-developers@lists.sourceforge.net
> > > > > > Subject: Re: [CLucene-dev] vector subscript out of range exception during indexing
> > > > > >
> > > > > > Those are the postings array and its staging area for flushing. Once
> > > > > > flushed, a Posting object can be deleted.
> > > > > >
> > > > > > The code you quoted is originally written in Java as:
> > > > > >     Arrays.fill(postingsFreeList, postingsFreeCount-numToFree,
> > > > > >                 postingsFreeCount, null);
> > > > > >
> > > > > > Meaning, this is not a deletion but rather a nullification. This may
> > > > > > actually be proper behavior for Java, since its garbage collector
> > > > > > reclaims objects once nothing references them. However, it seems to
> > > > > > have caused issues with JLucene as well for documents with many terms:
> > > > > > https://issues.apache.org/jira/browse/LUCENE-1072. The only question
> > > > > > is how come we haven't seen this until now, and what's special
> > > > > > about the Reuters corpus?
> > > > > >
> > > > > > I think, if you could port TestDocumentsWriter to cl_test (at
> > > > > > least the relevant test case they have added) and check if it
> > > > > > crashes with the same characteristics as your issue, we could
> > > > > > verify this is the same issue. Then we can apply their patch (while
> > > > > > following the JIRA discussion) accordingly to DocumentsWriter.cpp.
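
To illustrate the difference, a rough sketch: Java's Arrays.fill(a, from, to, null) only clears the slots in the half-open range [from, to) and frees nothing, so an object that is still referenced elsewhere stays valid. Posting below is a hypothetical stand-in and nullifyRange is a made-up helper, not CLucene code:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Posting { int docFreq; };  // hypothetical stand-in, not the CLucene type

// Mirrors Arrays.fill(list, from, to, null): clears the slots in [from, to)
// without deleting the objects they pointed to.
inline void nullifyRange(std::vector<Posting*>& list,
                         std::size_t from, std::size_t to) {
    assert(from <= to && to <= list.size());
    std::fill(list.begin() + from, list.begin() + to,
              static_cast<Posting*>(nullptr));
}
```

Note that the upper bound in the Java call is postingsFreeCount (exclusive), not the length of the array. Whether a C++ port should delete the objects or only clear the slots depends on who owns them, which this sketch deliberately leaves open.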
> > > > > >
> > > > > > Itamar.
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Kostka Bořivoj [mailto:kos...@tovek.cz]
> > > > > > > Sent: Tuesday, June 22, 2010 11:53 PM
> > > > > > > To: clucene-developers@lists.sourceforge.net
> > > > > > > Subject: Re: [CLucene-dev] vector subscript out of range exception during indexing
> > > > > > >
> > > > > > > I did some research and found the following:
> > > > > > >
> > > > > > > The problem is caused by the freeing loop in balanceRAM()
> > > > > > > (documentswriter.cpp:1325)
> > > > > > >
> > > > > > >         for ( size_t i = this->postingsFreeCountDW-numToFree;
> > > > > > >               i < this->postingsFreeListDW.length; i++ ){
> > > > > > >           _CLDELETE(this->postingsFreeListDW.values[i]);
> > > > > > >         }
> > > > > > >
> > > > > > > Because this->postingsFreeListDW.values contains pointers which are
> > > > > > > also used in the postingsHash table, the _CLDELETE makes them invalid.
> > > > > > >
> > > > > > > So the main question is why Posting objects referenced in
> > > > > > > postingsHash are also referenced by postingsFreeListDW.
> > > > > > >
> > > > > > > So far I have not been able to find the reason.
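
One possible way to narrow this down, as a sketch only: before the freeing loop runs, count how many entries in the range about to be deleted are still present in the hash table. Posting is a hypothetical stand-in, countAliased is a made-up helper, and plain vectors stand in for the real CLucene containers:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Posting { int docFreq; };  // hypothetical stand-in, not the CLucene type

// Counts the non-null entries of freeList[from, to) that also appear in hash.
// A non-zero result means deleting that range would leave dangling pointers.
inline std::size_t countAliased(const std::vector<Posting*>& hash,
                                const std::vector<Posting*>& freeList,
                                std::size_t from, std::size_t to) {
    assert(from <= to && to <= freeList.size());
    std::size_t aliased = 0;
    for (std::size_t i = from; i < to; ++i) {
        Posting* p = freeList[i];
        if (p != nullptr &&
            std::find(hash.begin(), hash.end(), p) != hash.end()) {
            ++aliased;
        }
    }
    return aliased;
}
```

The linear search makes this suitable for a debug build only. Asserting that the count is zero right before the deletion would show at which point a Posting ends up in both structures.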
> > > > > > >
> > > > > > > Borek
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Itamar Syn-Hershko [mailto:ita...@divrei-tora.com]
> > > > > > > > Sent: Monday, June 21, 2010 2:08 PM
> > > > > > > > To: clucene-developers@lists.sourceforge.net
> > > > > > > > Subject: Re: [CLucene-dev] vector subscript out of range exception during indexing
> > > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > This seems to be the same error reported by Klemens Friedl
> > > > > > > > last week [1].
> > > > > > > >
> > > > > > > > I can confirm your findings. After setting the demo application to
> > > > > > > > index the Reuters corpus distributed with CLucene (see my patch
> > > > > > > > to master today), and setting maxFieldLength to MAX_INT, the
> > > > > > > > application fails on one of the files (for me it was reut2-002.sgm).
> > > > > > > > The call stack points to DocumentsWriterThreadState.cpp ln 1142,
> > > > > > > > where threadState->p is pointing to freed or invalid memory.
> > > > > > > >
> > > > > > > > Unfortunately at the moment I cannot work on tracing this
> > > > > > > > properly. If you can do this yourself, I'll be happy to assist with
> > > > > > > > whatever I can.
> > > > > > > >
> > > > > > > > Itamar.
> > > > > > > >
> > > > > > > > [1] http://comments.gmane.org/gmane.comp.jakarta.lucene.clucene.devel/3449 .
> > > > > > > > Also see
> > > > > > > > http://sourceforge.net/tracker/?func=detail&aid=2981449&group_id=80013&atid=558446.
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Kostka Bořivoj [mailto:kos...@tovek.cz]
> > > > > > > > Sent: Monday, June 21, 2010 2:50 PM
> > > > > > > > To: clucene-developers@lists.sourceforge.net
> > > > > > > > Subject: [CLucene-dev] vector subscript out of range exception during indexing
> > > > > > > >
> > > > > > > > While indexing a set of documents (about 10000 already
> > > > > > > > indexed) I get the exception "vector subscript out of range"
> > > > > > > > from ArrayBase operator [ ].
> > > > > > > > I did some research and it seems to happen because the
> > > > > > > > threadState->postingEquals() method is called with an invalid p set.
> > > > > > > > postingsHash[hashPos] probably contains a pointer to an
> > > > > > > > already deleted object, as 0xfeee is in all members (I'm running
> > > > > > > > it under the MSVC 2005 debugger).
> > > > > > > > See the call stack and threadState->p dump below.
> > > > > > > >
> > > > > > > > Source (documentswriterthreadstate.cpp:1010)
> > > > > > > > ======
> > > > > > > >
> > > > > > > >   // Locate Posting in hash
> > > > > > > >   threadState->p = postingsHash[hashPos];
> > > > > > > >
> > > > > > > >   if (threadState->p != NULL &&
> > > > > > > >       !threadState->postingEquals(tokenText, tokenTextLen)) { ...
> > > > > > > >
> > > > > > > >
> > > > > > > > Call stack
> > > > > > > > ========
> > > > > > > > clucene-cored.dll!lucene::util::ArrayBase<wchar_t *>::operator[](unsigned int _Pos=0xfffffbbb)  Line 92  C++
> > > > > > > > clucene-cored.dll!lucene::index::DocumentsWriter::ThreadState::postingEquals(const wchar_t * tokenText=0x032772a8, const int tokenTextLen=0x00000008)  Line 577 + 0x25 bytes  C++
> > > > > > > > clucene-cored.dll!lucene::index::DocumentsWriter::ThreadState::FieldData::addPosition(lucene::analysis::Token * token=0x0100c770)  Line 1012 + 0x26 bytes  C++
> > > > > > > > clucene-cored.dll!lucene::index::DocumentsWriter::ThreadState::FieldData::invertField(lucene::document::Field * field=0x04d2a9e0, lucene::analysis::Analyzer * analyzer=0x010a5fa0, const int maxFieldLength=0x00002710)  Line 902  C++
> > > > > > > > clucene-cored.dll!lucene::index::DocumentsWriter::ThreadState::FieldData::processField(lucene::analysis::Analyzer * analyzer=0x010a5fa0)  Line 797  C++
> > > > > > > > clucene-cored.dll!lucene::index::DocumentsWriter::ThreadState::processDocument(lucene::analysis::Analyzer * analyzer=0x010a5fa0)  Line 554 + 0x1a bytes  C++
> > > > > > > > clucene-cored.dll!lucene::index::DocumentsWriter::updateDocument(lucene::document::Document * doc=0x0012f600, lucene::analysis::Analyzer * analyzer=0x010a5fa0, lucene::index::Term * delTerm=0x00000000)  Line 934 + 0xc bytes  C++
> > > > > > > > clucene-cored.dll!lucene::index::DocumentsWriter::addDocument(lucene::document::Document * doc=0x0012f600, lucene::analysis::Analyzer * analyzer=0x010a5fa0)  Line 919  C++
> > > > > > > > clucene-cored.dll!lucene::index::IndexWriter::addDocument(lucene::document::Document * doc=0x0012f600, lucene::analysis::Analyzer * analyzer=0x010a5fa0)  Line 670 + 0x13 bytes  C++
> > > > > > > > clucene-cored.dll!lucene::index::IndexModifier::addDocument(lucene::document::Document * doc=0x0012f600, lucene::analysis::Analyzer * docAnalyzer=0x010a5fa0)  Line 100  C++
> > > > > > > > mkidx.exe!tovek::index::Index::indexDocument(tovek::index::Document & doc={...}, bool bInsert=false, unsigned long & ulPrevDoc=0x00000007, tovek::analysis::CachedAnalyzer * pCachedAnalyzer=0x010a5fa0)  Line 472  C++
> > > > > > > >
> > > > > > > >
> > > > > > > > Problematic item in PostingHash:
> > > > > > > > =========================
> > > > > > > >
> > > > > > > > -               threadState->p  0x02538fd8 {textStart=0xfeeefeee docFreq=0xfeeefeee freqStart=0xfeeefeee ...}    lucene::index::DocumentsWriter::Posting *
> > > > > > > >                 textStart       0xfeeefeee      int
> > > > > > > >                 docFreq 0xfeeefeee      int
> > > > > > > >                 freqStart       0xfeeefeee      int
> > > > > > > >                 freqUpto        0xfeeefeee      int
> > > > > > > >                 proxStart       0xfeeefeee      int
> > > > > > > >                 proxUpto        0xfeeefeee      int
> > > > > > > >                 lastDocID       0xfeeefeee      int
> > > > > > > >                 lastDocCode     0xfeeefeee      int
> > > > > > > >                 lastPosition    0xfeeefeee      int
> > > > > > > > +               vector  0xfeeefeee {p=??? lastOffset=??? offsetStart=??? ...}    lucene::index::DocumentsWriter::PostingVector *
> > > > > > > >
> > > > > > > >
> > > > > > > > ------------------------------------------------------------------------------
> > > > > > > > ThinkGeek and WIRED's GeekDad team up for the Ultimate
> > > > > > > > GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
> > > > > > > > lucky parental unit.  See the prize list and enter to win:
> > > > > > > > http://p.sf.net/sfu/thinkgeek-promo
> > > > > > > > _______________________________________________
> > > > > > > > CLucene-developers mailing list
> > > > > > > > CLucene-developers@lists.sourceforge.net
> > > > > > > > https://lists.sourceforge.net/lists/listinfo/clucene-developers
> > > > > > > >
