Digging a bit deeper into this memory issue, it seems to me that the
problem is a new zip stream object that is created but never
deleted.

Within OBConversion::Convert(istream* is, ostream* os) there is this
piece of code:

#ifdef HAVE_LIBZ
     zlib_stream::zip_istream *zIn;

     // only try to decode the gzip stream once
     if (!CheckedForGzip) {
       zIn = new zlib_stream::zip_istream(*pInStream);
       if (zIn->is_gzip()) {
         pInStream = zIn;
         CheckedForGzip = true;
       }
       else
         delete zIn;
     }
#endif

As far as I understand, zIn is purely an object local to
OBConversion::Convert(istream* is, ostream* os). When the input is an
actual gzipped stream, zIn is assigned to pInStream and is not deleted
when the function exits. So I guess adding some code to delete this
zip stream would help.
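
To illustrate the kind of fix I have in mind, here is a minimal,
self-contained sketch. It is not the actual OpenBabel code:
FakeZipStream is a hypothetical stand-in for zlib_stream::zip_istream
that only counts live instances, and convert_once and live_after_batch
are made-up helper names. It also assumes a compiler with C++11
support for std::unique_ptr; with an older compiler an explicit delete
before every return would play the same role.

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical stand-in for zlib_stream::zip_istream. It does no
// decompression; it only counts live instances so a leak is observable.
class FakeZipStream {
public:
    explicit FakeZipStream(std::istream& src) : src_(src) { ++live; }
    ~FakeZipStream() { --live; }

    // Simplified check: looks only at the first gzip magic byte (0x1f).
    bool is_gzip() { return src_.peek() == 0x1f; }

    static int live;  // number of instances currently alive

private:
    std::istream& src_;
};
int FakeZipStream::live = 0;

// Sketch of one conversion call. The wrapper is owned by a
// std::unique_ptr, so it is deleted when the function exits,
// whether or not the input turned out to be gzipped.
bool convert_once(std::istream& in) {
    std::unique_ptr<FakeZipStream> zIn(new FakeZipStream(in));
    const bool gz = zIn->is_gzip();
    // ... the real code would read from *zIn if gz, else from in ...
    return gz;
}

// Mimics the batch loop: one fresh gzip-looking stream per iteration.
// Returns how many wrappers are still alive afterwards.
int live_after_batch(int n) {
    for (int i = 0; i < n; ++i) {
        std::istringstream in(std::string("\x1f\x8b", 2) + "payload");
        const bool gz = convert_once(in);
        assert(gz);  // the input was built with the gzip magic bytes
        (void)gz;
    }
    return FakeZipStream::live;
}
```

With this ownership, live_after_batch should report no surviving
wrappers however many files are processed. In the real class the
stream may have to outlive a single call, since pInStream is kept
around, in which case the owning pointer would probably have to be a
member that is released in the destructor or before the next input is
set. That part is a guess on my side.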




On Sep 15, 2010, at 4:28 PM, Noel O'Boyle wrote:

> How does it perform with an unzipped SD file?
>
> On 15 September 2010 15:19, Gert Thijs <gert.th...@silicos.com> wrote:
>> Dear all,
>>
>> I have encountered a memory issue when using OBConversion in a large
>> batch run. What I am trying to do is to process a large set of gzipped
>> SD files, transform them into canonical SMILES and write these
>> SMILES strings to std::cout. The file names are generated on
>> the fly based on some information about the directory structure.
>>
>> Below I have copied the main code used in the test script in which I
>> encountered a serious memory error.
>>
>> OpenBabel::OBConversion conv;
>> conv.SetInFormat("sdf");
>> conv.SetOutFormat("can");
>>
>> for ( unsigned int i=1; i<100000; ++i ){
>>        // get file name of sd file from i and store it in d1
>>        // d1 is then of the form "/here/is/my/sdf/dir/mol.sdf.gz"
>>        int2dir(i,d1);
>>
>>        std::ifstream ifs(d1.c_str());
>>
>>        conv.Convert(&ifs,&std::cout);
>>
>>        ifs.close();
>> }
>>
>>
>> If I run this code, I can see that it gradually eats all the RAM until
>> the program crashes with a memory allocation error. I have done
>> several tests to check where the problem could come from. As far as I
>> understand it, it seems that OBConversion is the main source of the
>> problem. For instance when I open the stream, read one line from it
>> and print this line (and do not use OBConversion), the same program
>> can easily handle more than 1,000,000 files without any hassle.
>>
>> Furthermore, when I use the same code but recreate the
>> OBConversion object each time within the for loop, exactly the same
>> kind of behavior is observed.
>> for ( unsigned int i=1; i<100000; ++i ){
>>        // get file name of sd file from i
>>        // d1 = /my/dir/mol.sdf.gz
>>        int2dir(i,d1);
>>
>>        std::ifstream ifs(d1.c_str());
>>
>>        OpenBabel::OBConversion conv;
>>        conv.SetInFormat("sdf");
>>        conv.SetOutFormat("can");
>>        conv.Convert(&ifs,&std::cout);
>>
>>        ifs.close();
>> }
>>
>> So my guess is that there is something strange going on within
>> OBConversion. But as I am not really familiar with the inner workings
>> of OBConversion, I am not sure where to start looking.
>>
>>
>> Any thoughts on this one?
>>
>> I am working on Mac OS X 10.5.8 using g++ 4.0.1
>>
>> many thanks,
>> Gert
>>
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Start uncovering the many advantages of virtual appliances
>> and start using them to simplify application deployment and
>> accelerate your shift to cloud computing.
>> http://p.sf.net/sfu/novell-sfdev2dev
>> _______________________________________________
>> OpenBabel-Devel mailing list
>> OpenBabel-Devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/openbabel-devel
>>

Gert Thijs

Director Chemoinformatics
Silicos NV.
Wetenschapspark 7
B-3590 Diepenbeek
Belgium

Tel:   +32 11 350703
Fax:  +32 11 220525

http://www.silicos.com/




