Hello!
I want to search across many files with many different file extensions in CLucene.
Can you help me?
Thanks!
==============================
BKActive Group
No. 1 Dai Co Viet Street - Hai Ba Trung - Ha Noi
Website : www.bkactive.com
Sent from Hanoi, Vietnam
2011/5/10 <clucene-developers-requ...@lists.sourceforge.net>
> Today's Topics:
>
> 1. Re: Strange Assertion Error. clucene 2.3.3.4 (Teryl Taylor)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 9 May 2011 16:43:55 -0400
> From: Teryl Taylor <teryl.tay...@gmail.com>
> Subject: Re: [CLucene-dev] Strange Assertion Error. clucene 2.3.3.4
> To: Itamar Syn-Hershko <ita...@code972.com>
> Cc: clucene-developers@lists.sourceforge.net
> Message-ID: <BANLkTim0nFpYvA+ZwKLej2NHUfWq-BTa=a...@mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi Itamar,
>
> Thanks for the reply. No, it's a single-threaded application. It used to
> be multi-threaded: I would build the search engine on one thread and
> then optimize it on another to increase speed. However, that was causing a
> different problem, so I'm slowly simplifying it as much as
> possible to get to the root of the problem. My dataset is too large to
> release, but I'll try to strip it down as much as I can to
> make a reproducible test case that I can maybe send you. I've also turned
> on the info stream logging, which I hope will be informative. I think that
> something might be clobbering the engine's RAM buffer. I'm just trying to find
> out what that something is.
>
>
> Regards,
>
> Teryl
>
>
> On Sat, May 7, 2011 at 8:35 PM, Itamar Syn-Hershko <ita...@code972.com> wrote:
>
> > Hi,
> >
> >
> > Is this a multi-threaded scenario?
> >
> >
> > Any chance you can send a failing test for us to work on?
> >
> >
> > Itamar.
> >
> >
> > On 07/05/2011 00:25, Teryl Taylor wrote:
> >
> > Hi there,
> >
> > I've been playing around with clucene (great piece of software, by the way)
> > and I seem to be getting a random assertion error every once in a while
> > when I'm writing out indexes. The assertion error is as follows:
> >
> > Assertion failed: (doc < numDocsInRAM), function appendPostings, file
> > /Users/terylt/Projects/libs/clucene/src/core/CLucene/index/DocumentsWriter.cpp,
> > line 759.
> >
> > The stack trace of the error is as follows:
> >
> > #12 0x00018daa in ConnectionBuffer::AddConnection (this=0x3b7f180,
> > conn=0x337d880) at
> > /Users/terylt/Projects/Canaris2/trunk/mega_collector/MegaCollector/ConnectionBuffer.cpp:410
> > (gdb) frame 10
> > #10 0x0022c518 in lucene::index::IndexWriter::addDocument (this=0x3b7f310,
> > doc=0x0, analyzer=0x0) at
> > /Users/terylt/Projects/Canaris2/libs/clucene/src/core/CLucene/index/IndexWriter.cpp:702
> > (gdb) frame 9
> > #9 0x00229ed3 in lucene::index::IndexWriter::flush (this=0x3b7f310,
> > triggerMerge=true, _flushDocStores=false) at
> > /Users/terylt/Projects/Canaris2/libs/clucene/src/core/CLucene/index/IndexWriter.cpp:1330
> > (gdb) frame 8
> > #8 lucene::index::IndexWriter::doFlush (this=0x3b7f310,
> > _flushDocStores=false) at
> > /Users/terylt/Projects/Canaris2/libs/clucene/src/core/CLucene/index/IndexWriter.cpp:0
> > (gdb) frame 7
> > #7 std::string::c_str () at basic_string.h:1559
> > (gdb) frame 6
> > #6 0x001eb24f in lucene::index::DocumentsWriter::flush (this=0x45f5b430,
> > _closeDocStore=true) at
> > /Users/terylt/Projects/Canaris2/libs/clucene/src/core/CLucene/index/DocumentsWriter.cpp:472
> > (gdb) frame 5
> > #5 0x001e9e6b in lucene::index::DocumentsWriter::writeSegment
> > (this=0x45f5b430, flushedFiles=@0x45f5b55c) at
> > /Users/terylt/Projects/Canaris2/libs/clucene/src/core/CLucene/index/DocumentsWriter.cpp:595
> > (gdb) frame 4
> > #4 0x001e58c4 in lucene::index::DocumentsWriter::appendPostings
> > (this=0x45f5b430, fields=0xbfff9394, termsOut=0x53eeb3e0,
> > freqOut=0x437b65d0, proxOut=0x437b65f0) at
> > /Users/terylt/Projects/Canaris2/libs/clucene/src/core/CLucene/index/DocumentsWriter.cpp:760
> > (gdb) frame 3
> > #3 0x932283db in __assert_rtn ()
> > (gdb)
> >
> >
> > Basically, what I'm doing is building a set (a few hundred) of small search
> > engine files, each 250 MB in size and holding 500,000 documents. I build
> > the indexes one at a time using code similar to the demo:
> >
> >
> > *********************************Create Writer:
> >
> > memset(output, 0, 2048);
> > strcpy(output, tempdir);
> > char* temp = mkdtemp(output);
> > if(temp == NULL)
> > {
> > printf("Can't create temp indice file");
> > exit(1);
> > }
> > m_writer = _CLNEW IndexWriter( output ,&an, true);
> > m_writer->setMaxFieldLength(0x7FFFFFFFL); // LUCENE_INT32_MAX_SHOULDBE
> > m_writer->setRAMBufferSizeMB(250.0);
> > m_writer->setMaxBufferedDocs(PARTITION_SIZE);
> > // Turn this off to make indexing faster; we'll turn it on later before
> > // optimizing
> > m_writer->setUseCompoundFile(false);
> >
> > **************************************Add document 500,000 times:
> >
> > sub->m_writer->addDocument(&(conn->m_doc), &(sub->an));
> >
> >
> > ******************************************Then I clean up:
> >
> > sub->m_writer->setUseCompoundFile(true);
> > sub->m_writer->optimize();
> >
> > // Close and clean up
> > sub->m_writer->close();
> >
> > _CLLDELETE(sub->m_writer);
> >
> >
> > And then I recreate the next writer. Everything works great, and the output is
> > exactly how I want it. The problem is that I get the above
> > assertion error at random times. It doesn't seem to be tied to any
> > particular input to the search engine, because if I run it again on the same
> > files, it won't happen on that file. There seems to be a bit of an issue
> > with the internal RAM buffer. Memory usage for the process seems to be
> > good up until the assertion error.
> >
> > Anyone have any idea what might be wrong or what I might be able to do to
> > get a better understanding of the problem?
> >
> > Any ideas would be great.
> >
> > Best Regards,
> >
> >
> > Teryl
> >
> >
> >
> >
>
> ------------------------------
>
>
> ------------------------------------------------------------------------------
> Achieve unprecedented app performance and reliability
> What every C/C++ and Fortran developer should know.
> Learn how Intel has extended the reach of its next-generation tools
> to help boost performance applications - including clusters.
> http://p.sf.net/sfu/intel-dev2devmay
>
> ------------------------------
>
> _______________________________________________
> CLucene-developers mailing list
> CLucene-developers@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/clucene-developers
>
>
> End of CLucene-developers Digest, Vol 61, Issue 3
> *************************************************
>