Hi Greg,

Thank you very much for your help. I should have read the source instead
of playing with grep. Once again, thanks.

--
Kind regards,
Matthew




On 07/19/2013 09:25 PM, Greg Landrum wrote:
> Hi Matthew,
>
>
> On Fri, Jul 19, 2013 at 1:55 PM, Maciej Szymkiewicz
> <[email protected] <mailto:[email protected]>> wrote:
>
>     Hello everyone,
>
>     I am new here, so to begin with I'd like to introduce myself. My
>     name is Matthew and I'm a student of Bioinformatics at the
>     University of Warsaw.
>     I am also an RDKit newbie, so please be patient.
>
>
> Welcome.
>  
>
>
>     Currently I'm working on a couple of small services using the
>     PostgreSQL cartridge and the RDKit Python wrapper.
>     Debian 3.9.6-1 x86_64 GNU/Linux
>     PostgreSQL 9.1.9
>     RDKit 2013_03_2
>
>     I obtain quite different similarity values using Postgres and Python.
>     For example, for the simple script available here:
>     http://pastebin.com/M8j3dMCj (empty db named foo, cartridge installed
>     with schema rdkit)
>     I get output like the one below.
>
>     Morgan: python = 0.145833333333, postgres = 0.179775280899
>     RDKit : python = 0.427549194991, postgres = 0.485889570552
>     MACCS : python = 0.597402597403, postgres = 0.597402597403
>     Atompair : python = 0.21935483871, postgres = 0.322335025381
>     Torsion (dice) : python = 0.102941176471, postgres = 0.246153846154
>     Layered: python = 0.555211558308, postgres = 0.654569892473
>
>     I assume it's mainly because of a difference in fingerprint size.
>     I tried changing parameters on the Python side, but no luck so far.
>     I would be grateful for any help.
>
>
> It is indeed the fingerprint size.
> Here are the size parameters used by the cartridge
> (https://github.com/rdkit/rdkit/blob/master/Code/PgSQL/rdkit/adapter.cpp):
>
> const unsigned int SSS_FP_SIZE=2048;
> const unsigned int LAYERED_FP_SIZE=1024;
> const unsigned int MORGAN_FP_SIZE=512;
> const unsigned int HASHED_TORSION_FP_SIZE=1024;
> const unsigned int HASHED_PAIR_FP_SIZE=2048;
>
> The RDKit fingerprint uses LAYERED_FP_SIZE.
>
> Here's a sample with Morgan:
> In [14]:
> DataStructs.FingerprintSimilarity(*[rdMolDescriptors.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(s),
> 2,nBits=512) for s in smi])
> Out[14]: 0.1797752808988764
>
> and with AtomPairs:
> In [15]:
> DataStructs.FingerprintSimilarity(*[rdMolDescriptors.GetHashedAtomPairFingerprintAsBitVect(Chem.MolFromSmiles(s),
> nBits=2048) for s in smi])
> Out[15]: 0.32233502538071068
>
> The rest is left as an exercise for the reader. ;-)
>
> Seriously, let us know if you need more info.
>
> -greg
>
>  
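
To illustrate why the fingerprint size alone can shift the similarity values discussed above, here is a minimal pure-Python sketch. It is a toy model, not RDKit's actual hashing: the feature IDs are made up, and folding is assumed to be a simple modulo on the bit count. The names `fold`, `tanimoto`, `mol_a` and `mol_b` are hypothetical and appear nowhere in RDKit or the cartridge.

```python
def fold(feature_ids, n_bits):
    """Fold integer feature IDs onto n_bits bit positions (toy model: modulo)."""
    return {fid % n_bits for fid in feature_ids}


def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two sets of on-bit positions."""
    union = bits_a | bits_b
    if not union:
        return 0.0
    return len(bits_a & bits_b) / len(union)


# Hypothetical feature IDs for two molecules (not produced by RDKit).
mol_a = {1, 5, 9, 2053}
mol_b = {5, 9, 517, 3000}

# The same features give different similarities once folded to different
# lengths, because different pairs of features collide on the same bit.
for n_bits in (2048, 512):
    sim = tanimoto(fold(mol_a, n_bits), fold(mol_b, n_bits))
    print(f"nBits={n_bits}: Tanimoto = {sim:.3f}")
```

In this toy case the shorter fingerprint merges two distinct features of the second molecule into one bit, which shrinks the union and raises the score. That is the same kind of effect that makes the Python and cartridge numbers disagree until the sizes are matched.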


_______________________________________________
Rdkit-discuss mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
