Hi Igor,
On Mon, Dec 29, 2014 at 3:43 AM, Igor Filippov <igor.v.filip...@gmail.com>
wrote:
> I was wondering how complicated it would be to add
> the ability to read polymers from molfiles. Right now I am getting
> something like this:
> Unhandled CTAB feature: S group SRU on line: 75. Molecule skipped
>
The RDKit doesn't do much with most information from the S Group section of
CTABs. SRU is a special case because it's a clear indication that the CTAB
contains information about something that the RDKit cannot currently
properly represent: a polymer. Rather than constructing a molecule which is
definitely wrong, the code generates an error. This is the usual RDKit
approach. What is somewhat unusual in this case is that there is no way to
disable the check.
> What I would prefer:
> 1) The molecule is read in and some kind of flag is set to signify that it
> is a polymer
> 2) the position of the brackets is saved in some structure a user can query
>
> A crude way to achieve "1" would be to just skip the "M STY" and similar
> lines
> while setting "is_polymer" flag, not sure if this is the right approach
> though.
>
The problem with having the standard mol block parser set an isPolymer flag
by default is that any code expecting "normal" molecules would then have to
check that flag to make sure it isn't being handed a polymer.
I could imagine a couple of solutions to this:
1) adding additional arguments to the mol file parser that allow calling
code to specify that it is willing to accept polymers, and then using
some new data structure to return info about the polymer.
2) extending the applicability of the "strictParsing" flag (this already
exists) to disable the tests for S groups and either just ignore them or
return them as molecule properties.
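
To make the two options a bit more concrete, here is a rough sketch in plain
Python of how they could look from the caller's side. It is not RDKit code and
not how the parser is implemented: split_sgroups, SGroupInfo and the
accept_polymers argument are all hypothetical names, the list of S group line
prefixes is an assumed and incomplete subset of the V2000 property block, and
the coordinates are split on whitespace on the assumption that the
fixed-width fields do not run together. The example input is only a fragment
of a property block, not a complete mol file.

```python
from dataclasses import dataclass, field

# V2000 property-block prefixes that describe S groups.
# Assumption: this list is illustrative, not exhaustive.
SGROUP_PREFIXES = ("M  STY", "M  SST", "M  SLB", "M  SCN", "M  SAL",
                   "M  SBL", "M  SDI", "M  SMT")


@dataclass
class SGroupInfo:
    """Hypothetical container for what was stripped from the mol block."""
    types: dict = field(default_factory=dict)     # S group index -> type
    brackets: dict = field(default_factory=dict)  # S group index -> coords

    @property
    def is_polymer(self):
        # Only SRU is treated as "polymer" here, as in the error message.
        return "SRU" in self.types.values()


def split_sgroups(mol_block, accept_polymers=False):
    """Separate S group lines from the rest of a mol block (hypothetical)."""
    info = SGroupInfo()
    kept = []
    for line in mol_block.splitlines():
        if not line.startswith(SGROUP_PREFIXES):
            kept.append(line)
            continue
        tokens = line.split()
        if tokens[1] == "STY":
            # M STY <count> <idx> <type> [<idx> <type> ...]
            for idx, sg_type in zip(tokens[3::2], tokens[4::2]):
                info.types[int(idx)] = sg_type
        elif tokens[1] == "SDI":
            # M SDI <idx> 4 x1 y1 x2 y2  (one bracket per line)
            coords = tuple(float(t) for t in tokens[4:8])
            info.brackets.setdefault(int(tokens[2]), []).append(coords)
        # Other S group lines are simply dropped in this sketch.
    if info.is_polymer and not accept_polymers:
        # Callers that did not opt in never see a silently wrong molecule.
        raise ValueError("mol block contains an SRU S group (polymer)")
    return "\n".join(kept), info


example = "\n".join([
    "M  STY  1   1 SRU",
    "M  SAL   1  2   2   3",
    "M  SDI   1  4    1.0000    2.0000    1.0000   -2.0000",
    "M  SDI   1  4    4.0000   -2.0000    4.0000    2.0000",
    "M  END",
])
cleaned, info = split_sgroups(example, accept_polymers=True)
print(info.is_polymer, info.brackets[1])
```

With a complete mol file, the cleaned text could then be handed to
Chem.MolFromMolBlock. The result would still only be the repeat unit as an
ordinary molecule; the polymer information lives in the separate structure,
which is why the caller has to opt in.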
> Commercial packages seem to be able to handle this - cactvs, chemaxon,
> accelrys draw,
> so there should be no technical reason RDKit cannot read such files.
>
Ignoring the S group is easy. :-)
-greg
_______________________________________________
Rdkit-devel mailing list
Rdkit-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-devel