Dan,

You can look for an exact match in a large dataset using the fastsearch format.

Index the data file (which must not be compressed):

  obabel dataset.xxx -ofs

Do an exact structure search:

  obabel dataset.fs -O out.yyy -s "SMILES" exact

where the SMILES string can be replaced by a filename containing one or more structures.

To get the 5 most similar matches:

  obabel dataset.fs -osmi -s "SMILES" -at5 -aa

or all the matches with Tanimoto > 0.75:

  obabel dataset.fs -osmi -s "SMILES" -at0.75 -aa

Get information:

  obabel -L fs
  obabel -L s
  obabel -L fpt

Chris

On 24/01/2011 09:42, Noel O'Boyle wrote:
> Hi Dan,
>
> Regarding (1) the relevant section in the docs is at
> http://openbabel.org/docs/2.3.0/Fingerprints/fingerprints.html. I
> think that the section on Similarity Searching answers this question.
>
> Question (2) is about searching for exact matches. Currently the only
> way to do this is matching by canonical SMILES or by InChI, e.g. see
> the section on the InChI descriptor at
> http://openbabel.org/docs/2.3.0/Command-line_tools/babel.html#inchi-descriptor.
> If you are doing multiple searches, I would use the substructure
> search described at
> http://openbabel.org/docs/2.3.0/Fingerprints/fingerprints.html to
> extract a small set of potential exact matches and then search those
> using the InChI descriptor.
>
> I hope this answers your questions. I'm ccing to the openbabel-discuss
> list where someone else might have a better idea.
>
> Regards,
> Noel
>
> On 23 January 2011 14:50, Daniel Zaharevitz <zahar...@mail.nih.gov> wrote:
>> Hi Noel,
>>
>> I've been playing with Open Babel some more. I had no trouble getting it
>> working on my Mac using fink or downloading and compiling from scratch on my
>> Linux box. Things work generally as I expect but there are a few things that
>> I'm not sure about. There are two things I'm most interested in doing: 1)
>> using it to get a similarity score for a given structure w/r to a
>> (300K-500K) set of structures and 2) for a given structure check to see if
>> there are exact structure matches in a (300K-500K) set of structures.
>> If you have any pointers to documentation, suggestions or experience to pass on,
>> I'd appreciate it. Right now I'm not sure I understand the parameters
>> associated with similarity scores and I can't seem to find anything in the
>> docs beyond the tutorial examples. It seems the query is taken as a
>> substructure and thus if the entire substructure is present in the test
>> molecule you can get a score of 1.0 even if the test molecule contains more
>> than the query. If possible, I'm looking for 1.0 to be returned only if the
>> entire test molecule matches the query.
>>
>> Thanks,
>> DanZ
>>
>>
>> /********************************************
>> * Daniel Zaharevitz
>> * Chief, Information Technology Branch
>> * Developmental Therapeutics Program
>> * National Cancer Institute
>> * zahar...@mail.nih.gov
>> *
>> ********************************************/
>
> ------------------------------------------------------------------------------
> Special Offer-- Download ArcSight Logger for FREE (a $49 USD value)!
> Finally, a world-class log management solution at an even better price-free!
> Download using promo code Free_Logger_4_Dev2Dev. Offer expires
> February 28th, so secure your free ArcSight Logger TODAY!
> http://p.sf.net/sfu/arcsight-sfd2d
> _______________________________________________
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
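
For anyone following the thread who is unsure what the Tanimoto cutoff in the similarity commands means, here is a minimal Python sketch of the coefficient itself: the number of bits set in both fingerprints divided by the number set in either. It is not Open Babel's implementation. The function names, the toy bit sets and the strict "greater than" cutoff are all assumptions made up for illustration.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def similar(query_bits, library, cutoff=0.75):
    """Return (name, score) pairs scoring above the cutoff, best first."""
    hits = [(name, tanimoto(query_bits, bits)) for name, bits in library.items()]
    hits = [(name, score) for name, score in hits if score > cutoff]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical toy fingerprints (bit positions), not real Open Babel output.
query = {1, 4, 7, 9}
library = {
    "same_bits": {1, 4, 7, 9},
    "superset": {1, 4, 7, 9, 12},
    "different": {2, 3, 9},
}

print(similar(query, library))
# → [('same_bits', 1.0), ('superset', 0.8)]
```

A score of 1.0 here only says that the two bit sets are identical. Fingerprints are lossy, so identical bits do not by themselves prove identical structures, which is why the exact search is a separate step in the commands above.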