Hi Jason,

This gist shows how to generate fingerprints for the molecules in a pandas dataframe and then use them to do similarity searches:
https://gist.github.com/greglandrum/045ccf8009fde91fc985864e70ee72a1
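For reference, here is a minimal sketch of the same idea. It is not copied from the gist. It assumes a dataframe named cpds with SMILES and ID columns, as in your message; the four example rows, the Morgan fingerprint settings (radius 2, 2048 bits), and the FP and Similarity column names are illustrative choices only. The Molecule column could equally come from PandasTools.AddMoleculeColumnToFrame as in your snippet.

```python
import pandas as pd
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

# Hypothetical stand-in for the `cpds` frame read from the SMILES file.
cpds = pd.DataFrame({
    "ID": ["cpd1", "cpd2", "cpd3", "cpd4"],
    "SMILES": ["CCO", "CCCO", "c1ccccc1O", "c1ccccc1N"],
})

# Parse each SMILES into an RDKit molecule; unparsable entries become None.
cpds["Molecule"] = cpds["SMILES"].apply(Chem.MolFromSmiles)
cpds = cpds[cpds["Molecule"].notnull()].reset_index(drop=True)

# Store one bit-vector fingerprint per row (Morgan, radius 2, 2048 bits;
# these settings are an assumption, any RDKit bit-vector fingerprint works).
cpds["FP"] = [
    AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)
    for mol in cpds["Molecule"]
]

# Similarity search: compare one query fingerprint against the whole column.
query_fp = cpds.loc[0, "FP"]
cpds["Similarity"] = DataStructs.BulkTanimotoSimilarity(query_fp, list(cpds["FP"]))

# Rank the compounds by similarity to the query.
hits = cpds.sort_values("Similarity", ascending=False)[["ID", "SMILES", "Similarity"]]
print(hits)
```

The query here is just the first row of the frame; in practice it would be the fingerprint of whatever molecule you want to search with, generated with the same settings.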
This is a reasonably efficient way of working with a smallish (<10K) number of molecules.

-greg

On Thu, Jan 3, 2019 at 7:10 PM Jason Ochoada <jocho...@gmail.com> wrote:

> Hi Everyone!
>
> I'm a newbie making the shift from RDKit in KNIME to working with the full
> package. I have been working (hacking) my way through the tutorials I could
> find on pandas, Jupyter, RDKit etc. I'm using RDKit in the Anaconda 3
> environment. I'm struggling to figure out how to do what I imagine is a
> very simple task. I have read in a flat file (SMILES file) and have it in
> a pandas data frame named cpds. It contains SMILES and ID. I have been
> able to add a molecule column to the dataframe:
>
> PandasTools.AddMoleculeColumnToFrame(cpds,'SMILES','Molecule',includeFingerprints=False)
> print([str(x) for x in cpds.columns])
>
> But I can't seem to figure out how to create and append a fingerprint.
> I'm open to any options as I'm new and don't have any particular structure
> I like to work in. Of course, once I have this I'd like to do similarity
> searches either in RDKit or chemfp etc. someday.
>
> Can you point me to where this might have been done? I've searched and
> searched but I can't seem to find a solution that will work for me.
>
> Thanks,
> Jason Ochoada
> St. Jude Children's Research Hospital
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss