Looking at OBConversion.cpp, I already had the impression that it might have
something to do with the creation of a new zipstream every time I call
Convert(). So I tried working with the zipstream instead of the ifstream
directly, as shown here:

    OpenBabel::OBConversion conv;
    conv.SetInFormat("sdf");
    conv.SetOutFormat("can");

    for ( unsigned int i=1; i<100000; ++i ){
        // get file name of sd file from i and store it in d1
        // d1 is then of the form "/here/is/my/sdf/dir/mol.sdf.gz"
        int2dir(i,d1);

        std::ifstream ifs(d1.c_str());
        zlib_stream::zip_istream zIn(ifs);

        conv.Convert(&zIn,&std::cout);

        ifs.close();
    }

And yes, adding this zip_istream line solves the memory issue.

Gert

On Sep 15, 2010, at 4:28 PM, Noel O'Boyle wrote:

> How does it perform with an unzipped SD file?
>
> On 15 September 2010 15:19, Gert Thijs <gert.th...@silicos.com> wrote:
>> Dear all,
>>
>> I have encountered a memory issue when using OBConversion in a large
>> batch run. What I am trying to do is to process a large set of gzipped
>> SD files, transform them into canonical SMILES and write these SMILES
>> strings to std::cout. The file names are generated on the fly based on
>> some information about the directory structure.
>>
>> Below I have copied the main code used in the test script in which I
>> encountered a serious memory error.
>>
>> OpenBabel::OBConversion conv;
>> conv.SetInFormat("sdf");
>> conv.SetOutFormat("can");
>>
>> for ( unsigned int i=1; i<100000; ++i ){
>>     // get file name of sd file from i and store it in d1
>>     // d1 is then of the form "/here/is/my/sdf/dir/mol.sdf.gz"
>>     int2dir(i,d1);
>>
>>     std::ifstream ifs(d1.c_str());
>>
>>     conv.Convert(&ifs,&std::cout);
>>
>>     ifs.close();
>> }
>>
>> If I run this code, I can see that it gradually eats all the RAM until
>> the program crashes with a memory allocation error. I have done
>> several tests to check where the problem could come from. As far as I
>> understand it, OBConversion seems to be the main source of the
>> problem. For instance, when I open the stream, read one line from it
>> and print this line (and do not use OBConversion), the same program
>> can easily handle more than 1,000,000 files without any hassle.
>>
>> Furthermore, when I use the same code but recreate the OBConversion
>> object each time within the for loop, exactly the same kind of
>> behavior is observed.
>>
>> for ( unsigned int i=1; i<100000; ++i ){
>>     // get file name of sd file from i
>>     // d1 = /my/dir/mol.sdf.gz
>>     int2dir(i,d1);
>>
>>     std::ifstream ifs(d1.c_str());
>>
>>     OpenBabel::OBConversion conv;
>>     conv.SetInFormat("sdf");
>>     conv.SetOutFormat("can");
>>     conv.Convert(&ifs,&std::cout);
>>
>>     ifs.close();
>> }
>>
>> So my guess is that there is something strange going on within
>> OBConversion. But as I am not really familiar with the inner workings
>> of OBConversion, I am not sure where to start looking.
>>
>> Any thoughts on this one?
>>
>> I am working on Mac OS X 10.5.8 using g++ 4.0.1
>>
>> many thanks,
>> Gert
>>
>> _______________________________________________
>> OpenBabel-Devel mailing list
>> OpenBabel-Devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/openbabel-devel

Gert Thijs
Director Chemoinformatics
Silicos NV.
Wetenschapspark 7
B-3590 Diepenbeek
Belgium
Tel: +32 11 350703
Fax: +32 11 220525
http://www.silicos.com/
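
For anyone who wants to repeat the control test mentioned in the quoted message (open each file, read one line, print it, and leave OBConversion out entirely), a minimal sketch could look like the following. The helper name readFirstLine is made up for illustration and is not part of Open Babel; the sketch uses only the standard library. Note that for a .gz file the "line" is just raw compressed bytes up to the first newline, which is good enough for watching memory use but not for reading the molecule.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not an Open Babel function): read the first line
// of the file at `path` into `line` without involving OBConversion.
// Returns false if the file cannot be opened or contains no data.
bool readFirstLine(const std::string& path, std::string& line)
{
    std::ifstream ifs(path.c_str());
    if (!ifs) {
        return false;            // could not open the file
    }
    // The stream is closed automatically when ifs goes out of scope.
    return static_cast<bool>(std::getline(ifs, line));
}
```

Calling this inside the same for loop, with the file name produced by int2dir(i,d1) and the result written to std::cout, gives a baseline for memory use with plain ifstream handling, so any growth seen with Convert() can be attributed to the conversion step rather than to opening the files.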