Re: [Rdkit-discuss] reaction fingerprint as bitstring
Thanks Greg.

I am trying to pre-calculate reaction fingerprints for all the reactions in my database and store them there, so that for any new reaction I can run a Tanimoto similarity (or similar) calculation and pick out similar reactions. That is why I decided to convert the fingerprint to a BitString of fixed length, but I take your point that I am losing information this way. Any suggestions on how it can be done?

On Fri, Mar 24, 2017 at 3:08 AM Greg Landrum wrote:
> [...]
> I don't think this is the best strategy since it treats positive and
> negative values the same, but without more information on what you want to
> do it's the best I can do.
>
> Best,
> -greg

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
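The lookup step described in this message (score a new reaction against the stored fingerprints and pick the closest ones) could look roughly like the sketch below. It assumes each reaction is stored as a fixed-length string of '0' and '1' characters; the function names `tanimoto` and `rank_by_similarity` and the reaction ids are hypothetical, not part of RDKit.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two equal-length '0'/'1' strings."""
    if len(bits_a) != len(bits_b):
        raise ValueError("fingerprints must have the same length")
    a = int(bits_a, 2)
    b = int(bits_b, 2)
    union = bin(a | b).count("1")
    if union == 0:
        # Two empty fingerprints: define the similarity as 0.0.
        return 0.0
    return bin(a & b).count("1") / union


def rank_by_similarity(query, stored, top_n=5):
    """Return up to top_n (reaction_id, score) pairs, most similar first."""
    scored = [(rxn_id, tanimoto(query, bits)) for rxn_id, bits in stored.items()]
    # Sort by descending score, then by id so that ties are deterministic.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical stored fingerprints keyed by reaction id.
stored = {"rxn1": "1100", "rxn2": "0011", "rxn3": "1110"}
hits = rank_by_similarity("1100", stored, top_n=2)
print(hits)
```

For a large database, the same comparison is usually pushed into the database or done with RDKit bit vectors rather than Python strings; the string form here only keeps the sketch free of dependencies.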
Re: [Rdkit-discuss] reaction fingerprint as bitstring
Hi Ambrish,

Assuming that I understand correctly what you want to do, here's an example using built-in RDKit functionality that generates a reaction fingerprint (using default parameters, you can change these) and then converts it into a bit vector using a simple: "if the bit is set in the original fingerprint set it in the bit vector":

In [3]: from rdkit.Chem import rdChemReactions

In [4]: fp = rdChemReactions.CreateDifferenceFingerprintForReaction(rxn)

In [5]: fp
Out[5]:

In [6]: from rdkit import DataStructs

In [7]: ebv = DataStructs.ExplicitBitVect(2048)

In [8]: for bit in fp:
   ...:     ebv.SetBit(bit%ebv.GetNumBits())
   ...:

In [9]: ebv.GetNumOnBits()
Out[9]: 5

I don't think this is the best strategy since it treats positive and negative values the same, but without more information on what you want to do it's the best I can do.

Best,
-greg

On Thu, Mar 23, 2017 at 6:10 PM, Ambrish wrote:
> Hi RDKitters,
>
> I am trying to calculate reaction fingerprints and store it in database.
> The transformation fingerprint created using the routine below is a
> IntSparseIntVect and I would like to convert it to a BitString of a
> particular length. How do we do that .
> [...]
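One possible way around the caveat that positive and negative values end up treated the same is to fold them into separate halves of the bit string. The sketch below is only an illustration of that idea, not the method used in this thread: it works on a plain dict of {index: signed count}, such as the one returned by calling GetNonzeroElements() on an RDKit sparse count vector, and the name `fold_signed_counts` is hypothetical.

```python
def fold_signed_counts(nonzero, n_bits=2048):
    """Fold a {index: signed count} mapping into a fixed-length '0'/'1' string.

    Positive counts set a bit in the first half and negative counts set a
    bit in the second half, so gained and lost features stay distinguishable.
    The magnitude of each count is still discarded.
    """
    if n_bits <= 0 or n_bits % 2:
        raise ValueError("n_bits must be a positive even number")
    half = n_bits // 2
    bits = ["0"] * n_bits
    for index, count in nonzero.items():
        if count > 0:
            bits[index % half] = "1"
        elif count < 0:
            bits[half + index % half] = "1"
    return "".join(bits)


# Made-up difference fingerprint, standing in for
# dict(fp.GetNonzeroElements()) on a real reaction fingerprint.
example = {5: 2, 1029: -1, 4000000000: 1}
bitstring = fold_signed_counts(example, n_bits=16)
print(bitstring)  # 1000010000000100
```

In the example, indices 5 and 1029 collide when folded modulo 8, but because their signs differ they land in different halves (bits 5 and 13). Collisions between features of the same sign are still possible, as with any folded fingerprint.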
[Rdkit-discuss] reaction fingerprint as bitstring
Hi RDKitters,

I am trying to calculate reaction fingerprints and store them in a database. The transformation fingerprint created by the routine below is an IntSparseIntVect, and I would like to convert it to a BitString of a particular length. How do we do that?

from rdkit import Chem
from rdkit.Chem import AllChem


def create_transformation_FP(rxn, fp_size, fp_type):
    rkfp = None
    rfp = None
    pfp = None
    # Sum the fingerprints of all reactant templates.
    for react in range(rxn.GetNumReactantTemplates()):
        mol = rxn.GetReactantTemplate(react)
        mol.UpdatePropertyCache(strict=False)
        Chem.GetSSSR(mol)

        try:
            if fp_type == 'AP':
                fp = AllChem.GetAtomPairFingerprint(mol=mol, maxLength=fp_size)
            elif fp_type == 'Morgan':
                fp = AllChem.GetMorganFingerprint(mol=mol, radius=fp_size)
            elif fp_type == 'Topological':
                fp = AllChem.GetTopologicalTorsionFingerprint(mol=mol)
            else:
                print("Unsupported fingerprint type")
        except:
            print("Cannot build reactant fingerprint")
        if rfp is None:
            rfp = fp
        else:
            rfp += fp

    # Sum the fingerprints of all product templates.
    for product in range(rxn.GetNumProductTemplates()):
        mol = rxn.GetProductTemplate(product)
        mol.UpdatePropertyCache(strict=False)
        Chem.GetSSSR(mol)
        try:
            if fp_type == 'AP':
                fp = AllChem.GetAtomPairFingerprint(mol=mol, maxLength=fp_size)
            elif fp_type == 'Morgan':
                fp = AllChem.GetMorganFingerprint(mol=mol, radius=fp_size)
            elif fp_type == 'Topological':
                fp = AllChem.GetTopologicalTorsionFingerprint(mol=mol)
            else:
                print("Unsupported fingerprint type")
        except:
            print("Cannot build product fingerprint")
        if pfp is None:
            pfp = fp
        else:
            pfp += fp

    # The transformation fingerprint is products minus reactants.
    if pfp is not None and rfp is not None:
        rkfp = pfp - rfp

    return rkfp

Thanks.