Looking at OBConversion.cpp, I already had the impression that it might
have something to do with the creation of a new zipstream every time I
call Convert(). So I tried passing a zip_istream instead of
the ifstream directly, as shown here:
                
OpenBabel::OBConversion conv;
conv.SetInFormat("sdf");
conv.SetOutFormat("can");
for ( unsigned int i=1; i<100000; ++i ){
        // get file name of sd file from i and store it in d1
        // d1 is then of the form "/here/is/my/sdf/dir/mol.sdf.gz"
        int2dir(i,d1);
        
        std::ifstream ifs(d1.c_str());
        zlib_stream::zip_istream zIn(ifs);              
        conv.Convert(&zIn,&std::cout);

        ifs.close();
}

And yes, adding this zip_istream line solves the memory issue.
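
For comparison, here is a minimal sketch of the control test described in my original message below: open each file, read one line, print it, and never touch OBConversion. The helper names first_line and control_run are only illustrative and were not part of the original script. For gzipped input the printed line is just raw compressed bytes; the point is only to exercise the open/read/close cycle.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical helper: open `path` and read its first line into `line`.
// Returns false if the file cannot be opened or contains no data.
bool first_line(const std::string& path, std::string& line) {
    assert(!path.empty());  // debug-only precondition: caller supplies a path
    std::ifstream ifs(path.c_str(), std::ios::in | std::ios::binary);
    if (!ifs) return false;
    // The stream is closed by its destructor when the function returns.
    return static_cast<bool>(std::getline(ifs, line));
}

// Control loop: touch every file once without involving OBConversion.
// Returns the number of files from which a line could be read.
std::size_t control_run(const std::vector<std::string>& paths, std::ostream& out) {
    std::size_t ok = 0;
    for (std::size_t i = 0; i < paths.size(); ++i) {
        std::string line;
        if (first_line(paths[i], line)) {
            out << line << '\n';
            ++ok;
        }
    }
    return ok;
}
```

If memory stays flat with a loop like this but grows once Convert() is called on the same files, that points at the conversion step rather than at the file handling.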

Gert



On Sep 15, 2010, at 4:28 PM, Noel O'Boyle wrote:

> How does it perform with an unzipped SD file?
>
> On 15 September 2010 15:19, Gert Thijs <gert.th...@silicos.com> wrote:
>> Dear all,
>>
>> I have encountered a memory issue when using OBConversion in a large
>> batch run. What I am trying to do is to process a large set of gzipped
>> SD files, transform them into canonical SMILES, and write these
>> SMILES strings to std::cout. The file names are generated on
>> the fly based on some information about the directory structure.
>>
>> Below I have copied the main code used in the test script in which I
>> encountered a serious memory error.
>>
>> OpenBabel::OBConversion conv;
>> conv.SetInFormat("sdf");
>> conv.SetOutFormat("can");
>>
>> for ( unsigned int i=1; i<100000; ++i ){
>>        // get file name of sd file from i and store it in d1
>>        // d1 is then of the form "/here/is/my/sdf/dir/mol.sdf.gz"
>>        int2dir(i,d1);
>>
>>        std::ifstream ifs(d1.c_str());
>>
>>        conv.Convert(&ifs,&std::cout);
>>
>>        ifs.close();
>> }
>>
>>
>> If I run this code, I can see that it gradually eats all the RAM until
>> the program crashes with a memory allocation error. I have done
>> several tests to check where the problem could come from. As far as I
>> understand it, it seems that OBConversion is the main source of the
>> problem. For instance when I open the stream, read one line from it
>> and print this line (and do not use OBConversion), the same program
>> can handle easily more than 1,000,000 files without any hassle.
>>
>> Furthermore, when I use the same code but recreate the
>> OBConversion object each time within the for loop, exactly the same
>> kind of behavior is observed:
>> for ( unsigned int i=1; i<100000; ++i ){
>>        // get file name of sd file from i
>>        // d1 = /my/dir/mol.sdf.gz
>>        int2dir(i,d1);
>>
>>        std::ifstream ifs(d1.c_str());
>>
>>        OpenBabel::OBConversion conv;
>>        conv.SetInFormat("sdf");
>>        conv.SetOutFormat("can");
>>        conv.Convert(&ifs,&std::cout);
>>
>>        ifs.close();
>> }
>>
>> So my guess is that there is something strange going on within
>> OBConversion. But as I am not really familiar with the inner workings
>> of OBConversion, I am not sure where to start looking.
>>
>>
>> Any thoughts on this one?
>>
>> I am working on Mac OS X 10.5.8 using g++ 4.0.1
>>
>> many thanks,
>> Gert
>>
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Start uncovering the many advantages of virtual appliances
>> and start using them to simplify application deployment and
>> accelerate your shift to cloud computing.
>> http://p.sf.net/sfu/novell-sfdev2dev
>> _______________________________________________
>> OpenBabel-Devel mailing list
>> OpenBabel-Devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/openbabel-devel
>>

Gert Thijs

Director Chemoinformatics
Silicos NV.
Wetenschapspark 7
B-3590 Diepenbeek
Belgium

Tel:   +32 11 350703
Fax:  +32 11 220525

http://www.silicos.com/



