On Dec 3, 2016, at 3:02 PM, Brian Kelley wrote:
> If I had to pick, I would just use the normal MolFromSmiles, if you don't
> expect many actual smiles strings in your corpus, it's plenty fast.

I couldn't tell from your timings how you decided whether something was a SMILES candidate. Was it word splitting, or was it my regex? Would it detect the SMILES in my examples:

  The combination of phenol (c1ccccc1O) and ....
  The SMILES for phenol is c1ccccc1O.

Precision and recall can, of course, be more important than performance.

Anyone want to take on the boring job of developing a corpus and putting together a benchmark? It certainly isn't going to be me. :) Or perhaps it already exists?

  Andrew
  da...@dalkescientific.com

_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
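
For concreteness, here is a minimal sketch of what a word-splitting candidate detector might look like for the two phenol examples in the message. It is not the code behind the timings under discussion. The helper names `smiles_candidates` and `find_smiles`, the permissive character set, and the punctuation-stripping rules are all assumptions made for illustration. The actual validity check is left to a caller-supplied `is_valid` function.

```python
import re

# Permissive set of characters that can occur in a SMILES string. This is an
# assumption made for the sketch: a cheap pre-filter, not a SMILES grammar.
_SMILES_CHARS = re.compile(r"[A-Za-z0-9@+\-\[\]()\\/%=#$.:*]+")


def smiles_candidates(text):
    """Return whitespace-delimited tokens that could conceivably be SMILES.

    Hypothetical helper: plain English words also pass this filter, so a
    real validity check is still needed afterwards.
    """
    candidates = []
    for token in text.split():
        # Drop sentence punctuation stuck to the end of the word.
        token = token.rstrip(".,;:!?")
        # Unwrap a token that is entirely parenthesized, e.g. "(c1ccccc1O)".
        while len(token) >= 2 and token[0] == "(" and token[-1] == ")":
            token = token[1:-1]
        if token and _SMILES_CHARS.fullmatch(token):
            candidates.append(token)
    return candidates


def find_smiles(text, is_valid):
    """Keep only the candidates accepted by the caller-supplied `is_valid`."""
    return [tok for tok in smiles_candidates(text) if is_valid(tok)]
```

With RDKit installed, `is_valid` could be something like `lambda s: Chem.MolFromSmiles(s) is not None`, since `MolFromSmiles` returns `None` when parsing fails. The sketch does not handle a SMILES that shares a parenthesis with surrounding prose, such as a token ending in an unbalanced `)`, which is exactly the kind of case a benchmark corpus would need to cover.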