On Dec 3, 2016, at 3:02 PM, Brian Kelley wrote:
> If I had to pick, I would just use the normal MolFromSmiles, if you don't 
> expect many actual smiles strings in your corpus, it's plenty fast.

I couldn't tell from your timings what you used to decide whether 
something was a SMILES candidate.

Was it word splitting, or was it my regex? Would it detect the SMILES in my 
examples:

   The combination of phenol (c1ccccc1O) and ....
   The SMILES for phenol is c1ccccc1O.
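
To make the question concrete, here is a minimal sketch of the word-splitting approach, assuming that candidates are whitespace-separated tokens with wrapping punctuation trimmed off. The names `trim_token`, `smiles_candidates`, and `SMILES_CHARS` are hypothetical, and the character filter is a deliberately loose guess, not the regex discussed earlier in the thread.

```python
import re

# Loose, assumed filter: characters that may appear somewhere in a SMILES.
SMILES_CHARS = re.compile(r"[A-Za-z0-9@+\-\[\]()\\/%=#$.:*]+")

def trim_token(token):
    """Strip punctuation that wraps a token but cannot belong to a SMILES."""
    while token:
        if token[-1] in ".,;:!?":
            # Sentence punctuation: "c1ccccc1O." -> "c1ccccc1O"
            token = token[:-1]
        elif token[0] == "(":
            # A SMILES cannot start with a branch, so a leading "(" is prose.
            token = token[1:]
        elif token[-1] == ")" and token.count(")") > token.count("("):
            # An unmatched trailing ")" closes a prose parenthesis.
            token = token[:-1]
        else:
            break
    return token

def smiles_candidates(text):
    """Return trimmed tokens made only of characters allowed in SMILES."""
    candidates = []
    for word in text.split():
        token = trim_token(word)
        if token and SMILES_CHARS.fullmatch(token):
            candidates.append(token)
    return candidates

for sentence in (
    "The combination of phenol (c1ccccc1O) and ....",
    "The SMILES for phenol is c1ccccc1O.",
):
    print(smiles_candidates(sentence))
```

Both example sentences yield `c1ccccc1O` as a candidate, but so do ordinary words such as "The" and "phenol", because the filter only checks characters. Each candidate would still have to go through `MolFromSmiles`, which returns `None` for strings it cannot parse, and short words like "I" or "CO" would parse anyway.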

Precision and recall can, of course, be more important than performance.

Anyone want to take on the boring job of developing a corpus and putting 
together a benchmark? It certainly isn't going to be me. :) Or perhaps it 
already exists?
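
For whoever does take it on, the scoring side of such a benchmark is small. This is only a sketch, assuming the corpus provides a hand-annotated set of SMILES per document and the extractor returns a set of strings; `precision_recall` is a hypothetical helper, and comparing exact strings (rather than canonical SMILES) is a simplifying assumption.

```python
def precision_recall(predicted, expected):
    """Compare extracted strings against hand-annotated ones by exact match."""
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    # Precision: fraction of extracted strings that were annotated as SMILES.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of annotated SMILES that the extractor found.
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall

# Hypothetical result for one sentence: the extractor found the phenol
# SMILES but also reported the word "I", which parses as iodine.
print(precision_recall(["c1ccccc1O", "I"], ["c1ccccc1O"]))
```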



                                Andrew
                                da...@dalkescientific.com



_______________________________________________
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
