Dear Evgueni,

On Mon, Apr 13, 2009 at 11:20 AM, Evgueni Kolossov <[email protected]> wrote:
>
> Looking like I have passed successfully the first stage - writing
> fingerprints into database as BLOB.
> I have enclosed file where you will find for one structure the SMILES string
> and the fingerprints as it extracted from the database (created with
> RDKFingerprintMol(*mol)).
> It will be very nice if you can check that I am storing/extracting the right
> fingerprints for this structure.
> Second stage - extract and re-create BitVector from the extracted string -
> there is a problem: it failing in ExplicitBitVect vRtn(strNew) probably
> because of allowOldFormat is false or I need to do something with the string
> - can you suggest anything?

I don't think it has anything to do with allowOldFormat, since you
probably aren't saving fingerprints in the old format. I'd guess you
aren't extracting the full blob into the string. Again, this depends
on the details of the database system you are using and is probably
covered in the documentation for your database.

> Based on all this conversions and the cost of extracting BLOBs I am thinking
> may be it better to store just SMILES and create fingerprints on fly? Have
> you tried/compare this two ways?

I would be very, very surprised if there was any substantial overhead
associated with using BLOBs in your database. In sqlite, postgresql,
and firebird it's pretty much none (maybe a few copies). In any case,
it's nothing compared to the time required to build a molecule from
SMILES and then generate a fingerprint for it. If this is not true for
MySQL, then something is badly wrong.

-greg

> Regards,
> Evgueni
>
> -----Original Message-----
> From: Greg Landrum [mailto:[email protected]]
> Sent: 08 April 2009 18:53
> To: Evgueni Kolossov
> Subject: Re: [Rdkit-discuss] Fingerprints writing
>
> There's RDKit code either in python:
> $RDBASE/Projects/DbCLI
> or in C++ for sqlite:
> $RDBASE/Code/Demos/sqlite/rdk_funcs.cpp
>
> Maybe there's enough there to get you started with mysql
>
> On Wed, Apr 8, 2009 at 5:55 PM, Evgueni Kolossov <[email protected]>
> wrote:
>> Thanks Greg,
>> Can you describe how are you doing this?
>>
>> regards,
>> Evgueni
>>
>> 2009/4/8 Greg Landrum <[email protected]>
>>>
>>> Evgueni,
>>>
>>> I'm afraid this is something specific to the database you're using and
>>> I don't think I can help. The key is not to forget that the strings
>>> from ToString() are *binary*; any operation that's expecting standard
>>> ASCII text is very, very unlikely to work.
>>>
>>> On Wed, Apr 8, 2009 at 12:59 PM, Evgueni Kolossov <[email protected]>
>>> wrote:
>>> > Hi Greg,
>>> >
>>> > You probably getting sick with my questions.... Sorry.
>>> > I still cannot manage to create SQL string for insert ToString() into
>>> > the DB (MySQL).
>>> > When I add this to my string:
>>> >
>>> > ............
>>> > strSQL += "'";
>>> > strSQL += fp->ToString(); //or std::string generated by this method
>>> > strSQL += "'";
>>> > The last single quote will not be inserted and nothing can be inserted
>>> > into this string after ToString().
>>> > Any replacement of the single quotes will do the same.
>>> > SMILES string works without problem
>>> >
>>> > Can you suggest something?
>>> > Or I need to use file upload instead?
>>> >
>>> > Regards,
>>> > Evgueni
>>> >
>>> > 2009/4/7 Evgueni Kolossov <[email protected]>
>>> >>
>>> >> Thanks,
>>> >>
>>> >> I still cannot figure out why it failing when I am trying to insert
>>> >> ToString() value....
>>> >> May be I need replace single quote to something else...
>>> >>
>>> >> Regards,
>>> >> Evgueni
>>> >>
>>> >> 2009/4/7 Greg Landrum <[email protected]>
>>> >>>
>>> >>> On Tue, Apr 7, 2009 at 7:32 PM, Evgueni Kolossov
>>> >>> <[email protected]> wrote:
>>> >>> > Thanks Greg - you are right as usual.
>>> >>> > Can you tell me - what are you storing in database: string from
>>> >>> > ToString()
>>> >>> > or string from BitVectorToText?
>>> >>> >
>>> >>>
>>> >>> I use the ToString form, because it's more compact and faster to
>>> >>> reconstruct (I believe).
>>> >>> The argument in favor of the BitVectorToText form is that it's
>>> >>> theoretically more interoperable; I figure that if I need that I
>>> >>> can always add a bitstring column later.
>>> >>>
>>> >>> -greg
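
A note on the binary-string problem discussed in this thread: the
symptoms described (nothing appearing after ToString() in the SQL
text, and an incomplete blob on extraction) are what one would expect
if the fingerprint bytes contain NUL or quote characters and some step
treats them as a C string. That diagnosis is an assumption; the thread
does not confirm it. The sketch below is not RDKit or MySQL code. It
shows one possible workaround: hex-encode the bytes so the SQL text
stays plain ASCII, and always rebuild strings from a pointer plus an
explicit length. The helper names (toHex, fromHex, hexLiteral,
blobToString) are hypothetical and introduced only for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hex-encode every byte of a binary buffer, embedded NULs included.
std::string toHex(const std::string &bytes) {
  static const char digits[] = "0123456789ABCDEF";
  std::string out;
  out.reserve(bytes.size() * 2);
  for (unsigned char c : bytes) {
    out.push_back(digits[c >> 4]);
    out.push_back(digits[c & 0x0F]);
  }
  return out;
}

// Value of a single hex digit; throws on anything else.
int hexValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  throw std::invalid_argument("not a hex digit");
}

// Inverse of toHex: rebuild the original bytes from a hex string.
std::string fromHex(const std::string &hex) {
  if (hex.size() % 2 != 0) {
    throw std::invalid_argument("odd-length hex string");
  }
  std::string out;
  out.reserve(hex.size() / 2);
  for (std::size_t i = 0; i < hex.size(); i += 2) {
    int byte = (hexValue(hex[i]) << 4) | hexValue(hex[i + 1]);
    out.push_back(static_cast<char>(byte));
  }
  return out;
}

// Wrap the bytes as a hexadecimal literal of the form X'...'.
// The result contains only ASCII, so quotes and NULs in the
// fingerprint can no longer break the surrounding SQL statement.
std::string hexLiteral(const std::string &bytes) {
  return "X'" + toHex(bytes) + "'";
}

// Rebuild a std::string from a fetched blob. The explicit length
// matters: std::string(data) alone would stop at the first NUL byte.
std::string blobToString(const char *data, std::size_t length) {
  return std::string(data, length);
}
```

With helpers like these, the statement from the thread could be built
as strSQL += hexLiteral(fp->ToString()); and the column read back with
blobToString(ptr, len), where ptr and len are whatever pointer and
byte count the database client reports for the field. The resulting
string is what would be handed to the ExplicitBitVect constructor.
Whether the hex literal or a length-aware query call is the better
route depends on the client library, so check its documentation.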

