Here are some numbers from my laptop for parsing:

Normal SMILES parser:
=====================
Proper SMILES:    11K/s
Non-SMILES words: 94K/s

Don't make molecules (n.b. accepts some 'bad' SMILES like C1CCC3):
==================================================================
Proper SMILES:    110K/s
Non-SMILES words: 130K/s


If I had to pick, I would just use the normal MolFromSmiles. If you don't
expect many actual SMILES strings in your corpus, it's plenty fast.
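
To get comparable numbers for your own word list, a rough timing sketch
along these lines might do. It is not the script behind the figures above:
words_per_second and looks_like_smiles are made-up names, and the
character-set check is only a crude stand-in so the snippet runs without
RDKit. To time the real parser, pass a wrapper around Chem.MolFromSmiles,
which returns None when a string fails to parse.

```python
import time

# Crude, incomplete character set; an assumption for illustration only.
_CRUDE_CHARS = set("BCNOPSFIHbcnopsrl0123456789()[]=#+-@/\\.%:")


def looks_like_smiles(word):
    """Hypothetical stand-in check. This is NOT a SMILES parser."""
    return bool(word) and all(c in _CRUDE_CHARS for c in word)


def words_per_second(words, check, repeats=3):
    """Return the best observed throughput of check() over words."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for w in words:
            check(w)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a very coarse clock.
    return len(words) / best if best > 0 else float("inf")


# With RDKit installed, the real parser could be timed like this:
#   from rdkit import Chem
#   rate = words_per_second(words, lambda w: Chem.MolFromSmiles(w) is not None)
```

Taking the best of a few repeats is just one way to reduce noise; the
absolute numbers will depend on the machine and on the mix of words.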

Cheers,
 Brian


On Fri, Dec 2, 2016 at 5:08 PM, Andrew Dalke <da...@dalkescientific.com>
wrote:

> On Dec 2, 2016, at 10:05 PM, Brian Kelley wrote:
> > Here is a very old version of Andrew's parser in code form: ... It was
> > fairly well tested on the sigma catalog back in the day.  It might be fun to
> > resurrect and use it in some form.
>
> There's also my OpenSMILES parser written for Ragel:
>
>   https://bitbucket.org/dalke/opensmiles-ragel
>
> Taking that path goes more along the lines of what NextMove has done.
>
> BTW, upon consideration,
>
> >>   [^]]*   # ignore anything up to the ']'
>
> should be more restrictive and exclude '[', ' ', newline ... or really,
> only allow those characters which are valid after the element (+, -, 0-9,
> @, :, T, H, and a few others).
>
> The exercise is left for the students. ;)
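
One possible take on that exercise, as a sketch only. The full pattern from
the thread is not reproduced here, so the isotope and element parts below
are simplified assumptions, and the names BRACKET_LOOSE and BRACKET_STRICT
are hypothetical. The tail allows +, -, digits, @, :, T and H as listed
above; the extra letters A, L, S, P, B and O are a guess at the "few
others", taken from the OpenSMILES chirality classes (@AL, @SP, @TB, @OH).

```python
import re

# Loose form: accept anything up to the closing ']'.
BRACKET_LOOSE = re.compile(r"\[[^]]*\]")

# Stricter form: optional isotope digits, a symbol-shaped token, then only
# characters that can follow the element inside a bracket atom.
# The symbol part is NOT checked against the periodic table.
BRACKET_STRICT = re.compile(
    r"\["
    r"[0-9]*"                        # isotope (assumed optional digits)
    r"(?:[A-Z][a-z]?|[a-z]{1,2}|\*)" # element-like symbol or wildcard
    r"[-+0-9@:THALSPBO]*"            # chirality, H count, charge, atom class
    r"\]"
)


def is_bracket_atom(text, strict=True):
    """Return True if text as a whole matches the chosen bracket pattern."""
    pattern = BRACKET_STRICT if strict else BRACKET_LOOSE
    return pattern.fullmatch(text) is not None
```

With this, is_bracket_atom("[C@@H]") passes in both modes, while
"[C [N]" and "[hello world]" pass only with strict=False, since the
stricter tail rejects the space and the nested '['.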
>
>
>                                 Andrew
>                                 da...@dalkescientific.com
>
>
>
> ------------------------------------------------------------
> ------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, SlashDot.org! http://sdm.link/slashdot
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>