Thank you, Greg!

That is a great example. It works perfectly on my toy example. :)

I would like to clarify a question about indexing in the DB. In the examples on the RDKit cartridge page, GiST indices are created on the columns containing fingerprints. In the notebook you provided, you did not do this. Do indices speed up similarity search, or is there another reason to add them?
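
For readers following the thread: the RDKit cartridge documentation creates GiST indices on bfp columns so that the threshold-based similarity operator `%` can use the index instead of comparing the query against every row. The sketch below shows roughly what that could look like from Python. The table name `fps`, the columns `id` and `fp`, and the helper function names are hypothetical, and the cursor is assumed to come from a DB-API driver such as psycopg2 (which is why the `%` operator is written as `%%` in the parameterized query). It is an illustration under those assumptions, not the notebook's code.

```python
# Hypothetical schema: table "fps" with columns "id" and "fp" (cartridge type bfp).
CREATE_INDEX_SQL = "CREATE INDEX fps_fp_idx ON fps USING gist(fp);"

# Cutoff used by the cartridge's % (Tanimoto) similarity operator.
SET_THRESHOLD_SQL = "SET rdkit.tanimoto_threshold = %s;"

# "%%" is a literal % under psycopg2-style parameter substitution.
SEARCH_SQL = (
    "SELECT id, tanimoto_sml(fp, bfp_from_binary_text(%s)) AS sim "
    "FROM fps "
    "WHERE fp %% bfp_from_binary_text(%s) "
    "ORDER BY sim DESC;"
)


def create_fp_index(cursor):
    """Create the GiST index, typically once after the table is populated."""
    cursor.execute(CREATE_INDEX_SQL)


def similarity_search(cursor, query_fp_bytes, threshold=0.5):
    """Return (id, similarity) rows that pass the Tanimoto cutoff.

    query_fp_bytes is assumed to be the binary-text form of the query
    fingerprint (bytes), e.g. from RDKit's DataStructs.BitVectToBinaryText.
    """
    cursor.execute(SET_THRESHOLD_SQL, (threshold,))
    cursor.execute(SEARCH_SQL, (query_fp_bytes, query_fp_bytes))
    return cursor.fetchall()
```

The same query runs without the index; the index only changes how PostgreSQL finds the matching rows, so it matters most for large tables.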

Pavel.

On 04/26/2017 02:36 PM, Greg Landrum wrote:
Hi Pavel,

On Wed, Apr 26, 2017 at 8:41 AM, Pavel Polishchuk <pavel_polishc...@ukr.net <mailto:pavel_polishc...@ukr.net>> wrote:

    Hello,

       Is it possible to store custom fingerprints in a psql DB and use
    them for similarity search? If so, how?


That's a great question and it took me a bit to get the answer working.

Since the answer isn't completely trivial and since it may be useful to others, it seems like a good topic for an RDKit blog post. Those take a while to write, but I've got an early version that lacks most of the explanatory text up here:
https://github.com/greglandrum/rdkit_blog/blob/master/notebooks/Custom%20fingerprint%20in%20PostgreSQL.ipynb

I hope this helps,
-greg
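
As a rough illustration of the approach (not the notebook's actual code), a fingerprint computed in Python could be serialized to bytes and converted to the cartridge's bfp type on insert with bfp_from_binary_text. The sketch assumes a hypothetical table `fps(id integer, fp bfp)`, a DB-API cursor such as one from psycopg2, and that the bytes come from RDKit's DataStructs.BitVectToBinaryText. The function name `store_fingerprints` is also made up for this example.

```python
# Hypothetical table; it requires the RDKit cartridge to be installed:
#   CREATE TABLE fps (id integer PRIMARY KEY, fp bfp);
INSERT_SQL = "INSERT INTO fps (id, fp) VALUES (%s, bfp_from_binary_text(%s));"


def store_fingerprints(cursor, fingerprints):
    """Insert (id, fp_bytes) pairs, converting the bytes to bfp on the server.

    Each fp_bytes is assumed to be the output of
    rdkit.DataStructs.BitVectToBinaryText(fp) for a fingerprint computed
    in Python; psycopg2 adapts Python bytes to bytea.
    """
    rows = [(mol_id, bytes(fp_bytes)) for mol_id, fp_bytes in fingerprints]
    cursor.executemany(INSERT_SQL, rows)
    return len(rows)
```

With RDKit available, the input could be built as `[(1, DataStructs.BitVectToBinaryText(fp))]`, where `fp` is any bit-vector fingerprint generated in Python.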

       I found two commands, bfp_to_binary_text(bfp) and
    bfp_from_binary_text(bytea), in the RDKit cartridge but cannot
    understand how to use them.
       I want to store pharmacophore fingerprints. There is no built-in
    command in the RDKit cartridge to calculate them, so I have to
    calculate them in a Python script. Then I need to store them in a
    psql DB and create a similarity search index, but I have not found
    a solution yet.
       It might be of general interest how to store and use arbitrary
    fingerprints in a DB.

    An example of pharmacophore FP generation:

    import os

    from rdkit import Chem
    from rdkit import RDConfig
    from rdkit.Chem import ChemicalFeatures
    from rdkit.Chem.Pharm2D import Generate
    from rdkit.Chem.Pharm2D.SigFactory import SigFactory

    # Build a feature factory from RDKit's base feature definitions.
    fdefName = os.path.join(RDConfig.RDDataDir, 'BaseFeatures.fdef')
    factory = ChemicalFeatures.BuildFeatureFactory(fdefName)

    # 2- and 3-point pharmacophores with three distance bins.
    sigFactory = SigFactory(factory, minPointCount=2, maxPointCount=3,
                            trianglePruneBins=False)
    sigFactory.SetBins([(0, 2), (2, 5), (5, 8)])
    sigFactory.Init()

    mol = Chem.MolFromSmiles(
        'Cc1nc(CN(C)c2ncnc3ccc(-c4ccc5c(c4)OCO5)cc23)cs1')
    fp = Generate.Gen2DFingerprint(mol, sigFactory)

    Pavel.

    
------------------------------------------------------------------------------
    Check out the vibrant tech community on one of the world's most
    engaging tech sites, Slashdot.org! http://sdm.link/slashdot
    _______________________________________________
    Rdkit-discuss mailing list
    Rdkit-discuss@lists.sourceforge.net
    <mailto:Rdkit-discuss@lists.sourceforge.net>
    https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
    <https://lists.sourceforge.net/lists/listinfo/rdkit-discuss>


