Hi Dave,
On Thu, Sep 25, 2014 at 9:53 AM, Dave Wood <[email protected]> wrote:
> Hi All,
>
> I have been generating explicit bit strings of RDKit fingerprints and was
> surprised by this result. Is this the expected behaviour? From the
> documentation it looks like the default length should be 2048.
>
> Interestingly if I call the RDKFingerprint method directly the lengths are
> all the expected 2048.
>
I wouldn't really recommend using the functionality in the FingerprintMols
module. It probably should be deprecated. It is much safer (and not much
more difficult) to call the fingerprint functions directly.
Here's what's going on: The function FingerprintMols.FingerprintsFromMols()
uses FingerprintMols.FingerprintMol() internally. This sets fingerprint
options using the class FingerprintMols.FingerprinterDetails, which
includes a default value of 0.3 for the tgtDensity argument. Setting
tgtDensity with the RDKit fingerprint can result in the fingerprinter
folding the fingerprint to achieve a particular target density of on bits:
In [6]: m = Chem.MolFromSmiles("CCCCNC(=O)[C@@H]1CCCN(C(=O)CCC(C)C)C1")
In [7]: Chem.RDKFingerprint(m).GetNumBits()
Out[7]: 2048
In [8]: Chem.RDKFingerprint(m,tgtDensity=0.3).GetNumBits()
Out[8]: 512
This explains the differences you are seeing.
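
To make the folding step concrete, here is a toy pure-Python sketch of
the idea. It is not the RDKit implementation: fold_to_density() and
toy_fp() are hypothetical names, the bit lists are made up rather than
computed from molecules, and the min_size lower bound is an assumption
of the sketch. It only shows why sparser fingerprints end up shorter
when a target density is set.

```python
def fold_to_density(bits, tgt_density, min_size=128):
    """Halve a bit list, OR-ing the two halves together, until the
    fraction of on bits reaches tgt_density or another fold would
    make the list shorter than min_size (assumed lower bound)."""
    bits = list(bits)
    while sum(bits) / len(bits) < tgt_density and len(bits) >= 2 * min_size:
        half = len(bits) // 2
        # Bit i of the folded vector is on if bit i or bit i + half was on.
        bits = [bits[i] | bits[i + half] for i in range(half)]
    return bits


def toy_fp(n_on, size=2048):
    """Made-up fingerprint with the first n_on bits set (illustration only)."""
    return [1] * n_on + [0] * (size - n_on)


# Three toy fingerprints with different numbers of on bits.
for n_on in (200, 700, 400):
    folded = fold_to_density(toy_fp(n_on), tgt_density=0.3)
    print(n_on, len(folded))
```

With a target density of 0.3 the toy fingerprint with 200 on bits is
folded twice (2048 -> 1024 -> 512), the one with 400 on bits once, and
the one with 700 on bits not at all, since 700/2048 is already above
0.3. With tgt_density=0.0 nothing is folded and every length stays at
2048.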
-greg
> In [1]: smi1 = "CCCCNC(=O)[C@@H]1CCCN(C(=O)CCC(C)C)C1"
>
> In [2]: smi2 = "COC(=O)c1cccc(CN2C(=O)N[C@@](C)(c3ccc4c(c3)OCCO4)C2=O)c1"
>
> In [3]: smi3 = "CN(C)[C@@H](Cc1ccccc1)C(=O)NNC(=O)c1ccccc1O"
>
> In [4]: from rdkit import Chem
>
> In [5]: mols = [("mol1", Chem.MolFromSmiles(smi1)), ("mol2",
> Chem.MolFromSmiles(smi2)), ("mol3", Chem.MolFromSmiles(smi3))]
>
> In [6]: from rdkit.Chem.Fingerprints import FingerprintMols
>
>
> In [7]: print [ len(fp[1].ToBitString()) for fp in
> FingerprintMols.FingerprintsFromMols(mols) ]
> [512, 2048, 1024]
>
> In [8]: from rdkit.Chem.rdmolops import RDKFingerprint
>
> In [9]: print [ len(RDKFingerprint(mol[1]).ToBitString()) for mol in mols ]
> [2048, 2048, 2048]
>
> I can use the RDKFingerprint method as a solution, but I thought it was
> worth mentioning.
>
> Dave
>
>
> ------------------------------------------------------------------------------
> Meet PCI DSS 3.0 Compliance Requirements with EventLog Analyzer
> Achieve PCI DSS 3.0 Compliant Status with Out-of-the-box PCI DSS Reports
> Are you Audit-Ready for PCI DSS 3.0 Compliance? Download White paper
> Comply to PCI DSS 3.0 Requirement 10 and 11.5 with EventLog Analyzer
>
> http://pubads.g.doubleclick.net/gampad/clk?id=154622311&iu=/4140/ostg.clktrk
> _______________________________________________
> Rdkit-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>
>