Hello Muhammad,

What analyzer are you using? It seems to me that your code is based on the "clucene_qt" application. I think you are using an old version of that application, which has several bugs related to this problem, and its ArabicAnalyzer is neither well implemented nor optimized. You should use version 0.6 or 0.8 of this application.
Ahmed

2011/1/6, muhammad ismael <m.ismae...@gmail.com>:
> Hello ben,
> the files are more than 6 GB.
> I am sorry, I meant to include this function but I forgot; here it is:
>
> Document* IndexEngine::fileDocument(const QString &id, const QString &bookid,
>                                     const QString &text)
> {
>     // make a new, empty document
>     Document* doc = _CLNEW Document();
>
>     // page ID
>     doc->add( *_CLNEW Field(_T("id"), QSTRING_TO_TCHAR(id),
>                             Field::STORE_YES | Field::INDEX_UNTOKENIZED) );
>
>     doc->add( *_CLNEW Field(_T("bookid"), QSTRING_TO_TCHAR(bookid),
>                             Field::STORE_YES | Field::INDEX_UNTOKENIZED) );
>
>     doc->add( *_CLNEW Field(_T("text"), QSTRING_TO_TCHAR(text),
>                             Field::STORE_NO | Field::INDEX_TOKENIZED) );
>
>     return doc;
> }
>
> I also tried removing the addDocument() call, and the memory usage did not
> increase, which means the leaks are in addDocument(). I am trying to debug
> it but I am lost.
>
>> sounds pretty high. how big are the files? could you be leaking memory in
>> the 'fileDocument' function?
>>
>> as a test, try not actually adding the document
>>
>> ben
>>
>> On Thu, Jan 6, 2011 at 7:43 AM, muhammad ismael <m.ismae...@gmail.com> wrote:
>>
>> > Hello,
>> > I am trying to index large files as follows:
>> >
>> > for(int j = 0; (j < pagesIds.count()) && !m_stop; j++)
>> > {
>> >     pagesText = m_DbManager->getBookPage(m_booksIds.at(i), pagesIds.at(j)).toUtf8();
>> >
>> >     if(!pagesText.isEmpty())
>> >     {
>> >         Document* doc = fileDocument(QString::number(pagesIds.at(j)),
>> >                                      QString::number(m_booksIds.at(i)),
>> >                                      pagesText);
>> >         writer->addDocument(doc);
>> >         _CLDELETE(doc);
>> >     }
>> > }
>> >
>> > but when the number of files exceeds 5000, the application uses 2 GB of
>> > my computer's RAM.
>> > I tried to debug and found that this happens in
>> >
>> > IndexWriter::addDocument(Document*)
>> >
>> > I tried to set
>> > IndexWriter->setMergeFactor(5);
>> > and also
>> > IndexWriter->setRAMBufferSizeMB(10);
>> > I know the default RAM usage should be 16 MB, but I just tried.
>> >
>> > I am working on the master branch and I merged the memory_leaks branch
>> > into it.
>> > Am I missing something?
>> >
>
> Mohammad Ismael
>

--
Sent from my mobile

_______________________________________________
CLucene-developers mailing list
CLucene-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/clucene-developers