Hi all, This is a Request For Comment, discussing the way CLucene currently manages memory, and requesting any further thoughts / comments on a suggested alternative.
Roughly, there are 3 types of memory allocations we need to worry about within CLucene. 1) Classes which are impossible to track instances of without ref-counting (or that is too difficult a task). index::Term is a good sample for such. 2) Other classes being used internally, and are more easily managed without ref-counting. 3) All CLucene classes which are exposed to the end user, and being used by them, currently via the _CLNEW, _CLDELETE etc. macros. In our current codebase, objects of type #1 are tracked using an internal ref-counting mechanism (located under debug::LuceneBase), via the _CLNEW and _CLDECDELETE macros. Objects of types #2 and #3 are using raw pointers - without ref-counting, and no smart pointers. Ownerwhip is being passed when necessary, usually through constructors. Consuming code (the library's end user) has to take care of memory management while abiding sometimes confusing memory contracts, what could result in memory leaks (or AVs). A move to smart pointers (from boost libs, and std::auto_ptr) has been recently suggested by a few of our users, to ease integration with consuming code, and to help in eliminating some stuborn memory leaks. The way I see it, this should be carried out as follows: 1. All internal classes being ref-counted (classes of type #1), should remain as such, but move from our custom coded ref-counting to boost::shared_ptr. This will probably require removing LuceneBase - see question 1 below. 2. All other classes being used internally should lose their derivation from LuceneVoidBase, and that psuedo-class should be removed for good. Those classes which aren't being ref-counted today, should remain as such. 3. Move code to using simple smart pointers (scoped_ptr / auto_ptr) in places where it could simplify things, or easily fix memory leaks (for example, the BooleanScorer2 one reported here a few weeks ago). For this first stage I think we rather put efforts on making things better where they're really needed than to refactor 100% tested and working (and not leaking) code, so raw pointers will still be used where there's no compelling reason not to. 4. Based on the approach smart pointers will be implemented with in CLucene, make sure taking advantage of them is kept easy and simple for the end-user (for example, by providing a special header file containing typedef's to wrap classes with shared_ptr / scoped_ptr). 5. Deprecating all CLucene's new/delete macros. Attention to detail is required, as we have quite a variety of those meant to provide x-platform compatibility, and also macros for objects, strings and arrays. 6. Better cmake script to detect and link to boost libs, and allow it to fail and use a local copy which will be distributed along with the CL package. Half is already done (see the isidor_working branch). Questions: 1. How would naming conventions be kept? Do we have any other option than to have a class named "index::TermImpl" and then wrap it as "typedef boost::shared_ptr<TermImpl> Term;"? Since Term is being used on client code as well, forcing him to explicitly call boost::shared_ptr<Term> each time is not an option... 2. What is your opinion regarding using raw pointers (vs. smart pointers) where no ref-counting is required? Is it something we should "get rid of", or if the code using them is working and tested we can leave it be? 3. Other comments on the path suggested above? Your opinion or advice highly matters. Itamar. ------------------------------------------------------------------------------ Return on Information: Google Enterprise Search pays you back Get the facts. http://p.sf.net/sfu/google-dev2dev _______________________________________________ CLucene-developers mailing list CLucene-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/clucene-developers