Hello all, I am curious about how to fold a count-vector fingerprint. I understand that the most common way to fold a bit vector is to split it in half and combine the two halves with a bitwise OR. I think this is how rdkit.DataStructs.FoldFingerprint works in RDKit; please correct me if I am wrong.
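To make the bit-vector case concrete, here is a minimal pure-Python sketch of the "split and OR" idea. The function name fold_bits and the example bits are made up for illustration, and this is not taken from the RDKit source; it only shows that OR-ing equal-sized chunks is the same as sending bit i to position i modulo the folded length.

```python
def fold_bits(bits, fold_factor=2):
    """Fold a list of 0/1 values by OR-ing equal-sized chunks together.

    Hypothetical helper for illustration only; not an RDKit function.
    """
    if len(bits) % fold_factor != 0:
        raise ValueError("length must be divisible by fold_factor")
    new_len = len(bits) // fold_factor
    folded = [0] * new_len
    for i, bit in enumerate(bits):
        # Bit i of the long vector lands on position i modulo the new length.
        folded[i % new_len] |= bit
    return folded


bits = [1, 0, 0, 0, 0, 0, 1, 1]   # made-up 8-bit fingerprint
folded = fold_bits(bits)          # halves [1,0,0,0] and [0,0,1,1] OR-ed together
print(folded)                     # [1, 0, 1, 1]
```

Folding can only keep or merge set bits, so the folded vector is denser but different features may collide on the same position.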
How does RDKit fold count vectors such as AtomPair, Morgan, and Topological Torsion fingerprints, and/or what is the appropriate way to do so? I thought about turning the fingerprint into a bit vector using its respective "AsBitVect" method and then folding it with rdkit.DataStructs.FoldFingerprint, but topological torsion doesn't have an "AsBitVect" method [https://www.rdkit.org/docs/GettingStartedInPython.html].

As an explicit example, the AtomPair fingerprint below is extremely sparse. Could this AtomPair fingerprint be folded to increase the density?

>>> from rdkit import Chem
>>> from rdkit.Chem import AllChem
>>> mol = Chem.MolFromSmiles('CC1CCCCC1')
>>> ap_fp = AllChem.GetAtomPairFingerprint(mol, minLength=1, maxLength=3)
>>> number_of_nonzero_elements = len(ap_fp.GetNonzeroElements().values())
>>> print((ap_fp.GetLength(), number_of_nonzero_elements))
(8388608, 9)

Very Respectfully,
Ben
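One possible way to fold a sparse count vector, by analogy with the bitwise-OR fold for bit vectors, is to map each index to index modulo the folded length and add the counts that collide. The sketch below is an assumption about what "folding" could mean here, not a description of what RDKit does internally. It works on a plain dict of {index: count}, the same shape that GetNonzeroElements() returns, so it runs without RDKit; the name fold_counts and the example values are made up.

```python
from collections import Counter


def fold_counts(nonzero, folded_length):
    """Fold a sparse count vector given as {index: count}.

    Hypothetical helper: indices are reduced modulo folded_length and
    colliding counts are summed (an assumed convention, not RDKit's API).
    """
    if folded_length <= 0:
        raise ValueError("folded_length must be positive")
    folded = Counter()
    for index, count in nonzero.items():
        folded[index % folded_length] += count
    return dict(folded)


# Made-up sparse counts; with RDKit this dict would come from
# ap_fp.GetNonzeroElements().
sparse = {5: 2, 1029: 1, 2053: 3, 4100: 1}
folded = fold_counts(sparse, 1024)
print(folded)                                        # {5: 6, 4: 1}
print(sum(folded.values()) == sum(sparse.values()))  # True
```

Summing keeps the total count unchanged, whereas taking the maximum of colliding counts would be closer in spirit to the bitwise OR; which of the two is appropriate depends on the similarity measure used afterwards.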
_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss