Hi Bill,
 
Yes, this is a known leak. Unfortunately fixing it requires diving into the
inner workings of BooleanScorer2, and having too much on my plate at the
moment I don't expect to get to this in the next few weeks.
 
In short (based on commit 8a704a4a27a9e79d9c871ac8226dcaee53ebb368),
what's happening is this: scorers are added to an array, but when the array
is deleted it does not delete the scorer objects it contains. The scorer
arrays are defined at BooleanScorer2.cpp lines 548-550 with false in their
constructor, meaning the values will not get deleted when the array does.
If changed to true, we risk double-deletion (some of our tests in cl_test
test just that). If we keep it the way it is now, we have memory leaks.
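
To illustrate the trade-off, here is a minimal sketch of a pointer list
with a "delete values" flag. FakeScorer, PtrList and liveAfterListDestroyed
are hypothetical names made up for this example; this is not the actual
CLucene array class, only the general pattern as I understand it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a scorer; counts live instances so that
// objects surviving their container are visible.
struct FakeScorer {
    static int live;
    FakeScorer()  { ++live; }
    ~FakeScorer() { --live; }
};
int FakeScorer::live = 0;

// Hypothetical pointer list with a "delete values" constructor flag.
class PtrList {
public:
    explicit PtrList(bool deleteValues) : deleteValues_(deleteValues) {}
    ~PtrList() {
        if (deleteValues_) {
            for (std::size_t i = 0; i < items_.size(); ++i) delete items_[i];
        }
    }
    void push_back(FakeScorer* s) { assert(s != NULL); items_.push_back(s); }
private:
    PtrList(const PtrList&);             // non-copyable
    PtrList& operator=(const PtrList&);
    bool deleteValues_;
    std::vector<FakeScorer*> items_;
};

// Returns how many scorers are still alive right after the list is destroyed.
int liveAfterListDestroyed(bool deleteValues) {
    int before = FakeScorer::live;
    FakeScorer* a = new FakeScorer();
    FakeScorer* b = new FakeScorer();
    {
        PtrList list(deleteValues);
        list.push_back(a);
        list.push_back(b);
    }
    int survivors = FakeScorer::live - before;
    // With the flag off, some other owner has to delete the objects;
    // if nobody does, this is the leak.
    if (!deleteValues) { delete a; delete b; }
    return survivors;
}
```

With the flag set to false the two objects outlive the list, so they leak
unless another owner deletes them. With true the list deletes them, which
turns into a double deletion if another owner deletes them as well.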
 
Another small leak is due to BooleanScorer2::BSConjunctionScorer deriving
from ConjunctionScorer and not cleaning up ConjunctionScorer::_scorers. If
you look at ~ConjunctionScorer you'll notice _CLDELETE(scorers) is
commented out, for exactly the same reason as above (possible double
deletion vs. memory leak). Here it should be safe to delete the scorers
object from BooleanScorer2::~BSConjunctionScorer, but it requires some
testing and planning on ConjunctionScorer (how will it be cleaned up when
it is used directly?).
 
Someone needs to take this piece of code and dive into it to spec the
various ways this class can be called and used, and find a way to properly
manage its memory. This means either having a flag to indicate that those
items have already been deleted (set internally or by the caller, for
example through the constructor with a default value), or a function to be
called by friend classes to make such a deletion on demand. It is possible
that the class needs a rewrite (partial or full) to get this done properly.
While you're at it, porting tests for it from the Java test suite could help
a lot in verifying that the code works and that no memory leaks are present.
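
A rough sketch of how the two ideas could be combined, assuming an
ownership flag passed through the constructor plus an explicit cleanup
function. ScorerHolder, CountedScorer and destroyedByHolder are
hypothetical names for illustration only; the real ConjunctionScorer may
well need a different design.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a scorer; counts destructor calls.
struct CountedScorer {
    static int destroyed;
    ~CountedScorer() { ++destroyed; }
};
int CountedScorer::destroyed = 0;

// Hypothetical holder, not the real ConjunctionScorer.
class ScorerHolder {
public:
    // ownsScorers defaults to true; a caller that deletes the scorers
    // itself passes false.
    explicit ScorerHolder(bool ownsScorers = true) : ownsScorers_(ownsScorers) {}
    ~ScorerHolder() { deleteScorers(); }

    void add(CountedScorer* s) { assert(s != NULL); scorers_.push_back(s); }

    // On-demand cleanup. Clearing the vector makes a second call
    // (including the one from the destructor) a no-op.
    void deleteScorers() {
        if (ownsScorers_) {
            for (std::size_t i = 0; i < scorers_.size(); ++i) delete scorers_[i];
        }
        scorers_.clear();
    }
private:
    ScorerHolder(const ScorerHolder&);             // non-copyable
    ScorerHolder& operator=(const ScorerHolder&);
    bool ownsScorers_;
    std::vector<CountedScorer*> scorers_;
};

// Returns how many scorers the holder itself destroyed, after an explicit
// deleteScorers() call followed by the holder's destructor.
int destroyedByHolder(bool ownsScorers) {
    int before = CountedScorer::destroyed;
    CountedScorer* a = new CountedScorer();
    CountedScorer* b = new CountedScorer();
    {
        ScorerHolder holder(ownsScorers);
        holder.add(a);
        holder.add(b);
        holder.deleteScorers();   // explicit, on-demand cleanup
    }                             // destructor calls deleteScorers() again
    int byHolder = CountedScorer::destroyed - before;
    if (!ownsScorers) { delete a; delete b; }  // the caller is the owner here
    return byHolder;
}
```

In this sketch each scorer is destroyed exactly once when the holder owns
them, even though cleanup runs twice, and never by the holder when the
caller keeps ownership.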
 
Jim Weir was the original author of this code; it might be useful to get
hold of him for advice or help if he's not too busy.
 
Hopefully you guys can take this up and help us all get this issue closed.
 
Itamar.

  _____  

From: Miller, Bill (QuickWire) [mailto:[email protected]] 
Sent: Thursday, November 26, 2009 11:20 PM
To: [email protected]
Subject: [CLucene-dev] BooleanQuery memory leak (Branch 2.3.2)



Hi all, I've been testing 2.3.2 this week and found a memory leak with
multiple clause Boolean queries.

The clauses must be added as occur::MUST or occur::MUST_NOT for the leak to
show.

Could someone verify this for me?

I'm running under Windows, Visual Studio 2008 using DLL build. 

 

Here's some sample code to reproduce it:

 

void DoTest(CuTest *tc)
{
      StandardAnalyzer a((const TCHAR**)_T("\0"));

      RAMDirectory ram;
      IndexWriter writer(&ram, &a, true);

      Document doc;
      doc.add(*(_CLNEW Field(_T("first"), _T("Blah blah blah aaa")
            , Field::STORE_YES | Field::INDEX_TOKENIZED)));
      writer.addDocument(&doc);
      writer.close();
      IndexSearcher searcher(&ram);

      BooleanQuery *q = _CLNEW BooleanQuery();

      Term *t = _CLNEW Term(_T("first"), _T("aaa"));
      q->add(_CLNEW TermQuery(t), true, BooleanClause::MUST);
      _CLDECDELETE(t);

      // adding second clause causes leak
      // less leaks with BooleanClause::MUST_NOT
      // no leaks with BooleanClause::SHOULD
      t = _CLNEW Term(_T("first"), _T("blah"));
      q->add(_CLNEW TermQuery(t), true, BooleanClause::MUST);
      _CLDECDELETE(t);

      // of course I can also reproduce simply using the query parser
      // Query* q = QueryParser::parse(_T("+blah +aaa "), _T("first"), &a);

      Hits* h = searcher.search(q);
      _CLLDELETE(h);
      _CLLDELETE(q);
}

 

 

Here is the MS mem leak dump:

 

Dumping objects ->
{1454} normal block at 0x03ACAB00, 1024 bytes long.
 Data: <                > 00 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00
{1453} normal block at 0x03ACDE58, 8 bytes long.
 Data: <        > B8 CA AC 03 20 D3 AC 03
.\CLucene\search\ConjunctionScorer.cpp(32) : {1452} normal block at 0x03ACDE08, 16 bytes long.
 Data: < {...@j    X       > BC 7B 40 6A CD CD CD CD 58 DE AC 03 02 00 00 00
.\CLucene\search\TermQuery.cpp(154) : {1443} normal block at 0x03ACD320, 560 bytes long.
 Data: <  @j        (U  > E8 86 40 6A 10 FF CD 00 04 D2 AC 03 28 55 AB 03
.\CLucene\index\SegmentReader.cpp(540) : {1440} normal block at 0x03ACD1A8, 100 bytes long.
 Data: <  >j      >j    > 94 8B 3E 6A CD CD CD CD A4 8B 3E 6A CD CD CD CD
.\CLucene\search\TermQuery.cpp(154) : {1436} normal block at 0x03ACCAB8, 560 bytes long.
 Data: <  @j        (U  > E8 86 40 6A 10 FF CD 00 9C C9 AC 03 28 55 AB 03
.\CLucene\config\threads.cpp(43) : {1434} normal block at 0x03ACCA60, 24 bytes long.
 Data: <h6              > 68 36 BC 00 FF FF FF FF 00 00 00 00 00 00 00 00
.\CLucene\index\CompoundFile.cpp(118) : {1433} normal block at 0x03ACC9E0, 64 bytes long.
 Data: <( ?j      ?j`   > 28 20 3F 6A CD CD CD CD 14 20 3F 6A 60 CA AC 03
.\CLucene\index\SegmentReader.cpp(540) : {1432} normal block at 0x03ACC940, 100 bytes long.
 Data: <  >j      >j    > 94 8B 3E 6A CD CD CD CD A4 8B 3E 6A CD CD CD CD
Object dump complete.

 

 

Thanks for your time - it's extremely appreciated!

Bill 

 

_______________________________________________
CLucene-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/clucene-developers
