Hi Itamar,

Thanks for the reply. No, it's a single-threaded application. It used to
be multi-threaded: I would build the search engine on one thread and
then optimize it on another to increase speed. However, that was causing a
different problem, so I'm slowly simplifying it as much as
possible to get to the root of the problem. My dataset is too large to
release, but I'll try to strip it down as much as I can to
make a reproducible test case that I can send you. I've also turned
on the info stream logging, which I hope will be informative. I think
something might be clobbering the engine's RAM buffer; I'm just trying to find
out what that something is.
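
Since the real dataset is too large to share, one way to get a reproducible
test case might be to feed the writer synthetic documents generated from a
fixed seed. Below is a minimal sketch under that assumption. SyntheticDoc,
makeSyntheticDocs and the vocabulary are hypothetical names made up for
illustration and are not part of CLucene; turning each record into a
document for addDocument is left to the indexing code.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

// Hypothetical stand-in for one record of the real dataset.
struct SyntheticDoc {
    std::string id;
    std::string body;
};

// Builds `count` documents from a fixed seed. Only raw std::mt19937 output is
// used (no <random> distributions): the engine's sequence is fixed by the C++
// standard, while distribution results may differ between standard libraries.
std::vector<SyntheticDoc> makeSyntheticDocs(std::uint32_t seed,
                                            std::size_t count,
                                            std::size_t wordsPerDoc) {
    // Assumed toy vocabulary; a real test would use terms resembling the data.
    static const char* const kVocabulary[] = {
        "alpha", "bravo", "charlie", "delta",
        "echo", "foxtrot", "golf", "hotel"};
    const std::size_t vocabSize = sizeof(kVocabulary) / sizeof(kVocabulary[0]);

    std::mt19937 rng(seed);
    std::vector<SyntheticDoc> docs;
    docs.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        SyntheticDoc doc;
        doc.id = "doc-" + std::to_string(i);
        for (std::size_t w = 0; w < wordsPerDoc; ++w) {
            if (w > 0) doc.body += ' ';
            doc.body += kVocabulary[rng() % vocabSize];
        }
        docs.push_back(doc);
    }
    return docs;
}
```

Calling makeSyntheticDocs with the same seed and sizes yields the same
documents on every run, so if the assertion can be triggered with such input,
the seed and sizes are all that needs to be sent along with the test program.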


Regards,

Teryl


On Sat, May 7, 2011 at 8:35 PM, Itamar Syn-Hershko <ita...@code972.com> wrote:

>  Hi,
>
>
>  Is this a multi-threaded scenario?
>
>
>  Any chance you can send a failing test for us to work on?
>
>
>  Itamar.
>
>
>  On 07/05/2011 00:25, Teryl Taylor wrote:
>
> Hi there,
>
> I've been playing around with CLucene (great piece of software, by the way)
> and I get a random assertion failure every once in a while
> when I'm writing out indexes. The assertion failure is as follows:
>
> Assertion failed: (doc < numDocsInRAM), function appendPostings, file
> /Users/terylt/Projects/libs/clucene/src/core/CLucene/index/DocumentsWriter.cpp,
> line 759.
>
> The stack trace of the error is as follows:
>
> #12 0x00018daa in ConnectionBuffer::AddConnection (this=0x3b7f180,
> conn=0x337d880) at
> /Users/terylt/Projects/Canaris2/trunk/mega_collector/MegaCollector/ConnectionBuffer.cpp:410
> (gdb) frame 10
> #10 0x0022c518 in lucene::index::IndexWriter::addDocument (this=0x3b7f310,
> doc=0x0, analyzer=0x0) at
> /Users/terylt/Projects/Canaris2/libs/clucene/src/core/CLucene/index/IndexWriter.cpp:702
> (gdb) frame 9
> #9  0x00229ed3 in lucene::index::IndexWriter::flush (this=0x3b7f310,
> triggerMerge=true, _flushDocStores=false) at
> /Users/terylt/Projects/Canaris2/libs/clucene/src/core/CLucene/index/IndexWriter.cpp:1330
> (gdb) frame 8
> #8  lucene::index::IndexWriter::doFlush (this=0x3b7f310,
> _flushDocStores=false) at
> /Users/terylt/Projects/Canaris2/libs/clucene/src/core/CLucene/index/IndexWriter.cpp:0
> (gdb) frame 7
> #7  std::string::c_str () at basic_string.h:1559
> (gdb) frame 6
> #6  0x001eb24f in lucene::index::DocumentsWriter::flush (this=0x45f5b430,
> _closeDocStore=true) at
> /Users/terylt/Projects/Canaris2/libs/clucene/src/core/CLucene/index/DocumentsWriter.cpp:472
> (gdb) frame 5
> #5  0x001e9e6b in lucene::index::DocumentsWriter::writeSegment
> (this=0x45f5b430, flushedFiles=@0x45f5b55c) at
> /Users/terylt/Projects/Canaris2/libs/clucene/src/core/CLucene/index/DocumentsWriter.cpp:595
> (gdb) frame 4
> #4  0x001e58c4 in lucene::index::DocumentsWriter::appendPostings
> (this=0x45f5b430, fields=0xbfff9394, termsOut=0x53eeb3e0,
> freqOut=0x437b65d0, proxOut=0x437b65f0) at
> /Users/terylt/Projects/Canaris2/libs/clucene/src/core/CLucene/index/DocumentsWriter.cpp:760
> (gdb) frame 3
> #3  0x932283db in __assert_rtn ()
> (gdb)
>
>
> Basically, what I'm doing is building a set (a few hundred) of small search
> indexes, each about 250 MB in size and holding 500,000 documents. I build
> the indexes one at a time using code similar to the demo:
>
>
> *********************************Create Writer:
>
> memset(output, 0, 2048);
> strcpy(output, tempdir);  // tempdir must end in "XXXXXX" for mkdtemp
> char* temp = mkdtemp(output);  // replaces the XXXXXX in output in place
> if(temp == NULL)
> {
>       printf("Can't create temp indice file");
>       exit(1);
> }
> m_writer = _CLNEW IndexWriter( output ,&an, true);
> m_writer->setMaxFieldLength(0x7FFFFFFFL); // LUCENE_INT32_MAX_SHOULDBE
> m_writer->setRAMBufferSizeMB(250.0);
> m_writer->setMaxBufferedDocs(PARTITION_SIZE);
> // Turn this off to make indexing faster; we'll turn it on later, before
> // optimizing
> m_writer->setUseCompoundFile(false);
>
> **************************************Add document 500,000 times:
>
> sub->m_writer->addDocument(&(conn->m_doc), &(sub->an));
>
>
> ******************************************Then I clean up:
>
> sub->m_writer->setUseCompoundFile(true);
> sub->m_writer->optimize();
>
> // Close and clean up
> sub->m_writer->close();
>
> _CLLDELETE(sub->m_writer);
>
>
> And then I recreate the next writer. Everything works great, and the output is
> exactly how I want it. The problem is that I get the above
> assertion failure at random times. It doesn't seem to be tied to any
> particular input to the search engine, because if I run it again on the same
> files, it won't happen on that file. There seems to be a bit of an issue
> with the internal RAM buffer. Memory usage for the process seems to be
> good up until the assertion failure.
>
> Anyone have any idea what might be wrong or what I might be able to do to
> get a better understanding of the problem?
>
> Any ideas would be great.
>
> Best Regards,
>
>
> Teryl
>
>
>
> ------------------------------------------------------------------------------
> WhatsUp Gold - Download Free Network Management Software
> The most intuitive, comprehensive, and cost-effective network
> management toolset available today.  Delivers lowest initial
> acquisition cost and overall TCO of any competing solution.
> http://p.sf.net/sfu/whatsupgold-sd
>
>
> _______________________________________________
> CLucene-developers mailing list
> CLucene-developers@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/clucene-developers
>
>