In addition to Andrew's suggestions, I'd also recommend that you submit a
bug report to the maker of your other tool! They probably want to know
about this issue - I know I would if it's one of ours...

*dan nealschneider* | lead developer


On Fri, Oct 2, 2020 at 2:26 PM Andrew Dalke <da...@dalkescientific.com>
wrote:

> Hi Markus,
>
> > On Oct 2, 2020, at 19:56, Markus Metz <metm...@gmail.com> wrote:
> > I have a question about the SD file format.
> > When I write charged molecules via rdkit I noticed that the charge
> > definition in the atom block is not written.
> > The charge is written at the end of the entry.
> > So far this worked perfectly fine for me.
>
>
> The ctfile documentation I have from 2011 says this of the charge
> definition in the atom block:
>
>    Wider range of values in M CHG and M RAD lines below. Retained
>    for compatibility with older Ctabs, M CHG and M RAD lines take
>    precedence.
>
> and
>
>    With Ctab version V2000, the dd and ccc fields have been
>    superseded by the M ISO, M CHG, and M RAD lines in the properties
>    block, described below. For compatibility, all releases since ISIS 1.0:
>
>     • Write appropriate values in both places if the values
>       are in the old range.
>
>     • Use the atom block fields if there are no M  ISO, M  CHG, or
>       M  RAD lines in the properties block.
>
>    Support for these atom block fields might be removed in future
>    releases of Symyx software.
>
> Further, I looked into this when I wrote the blog post
> http://www.dalkescientific.com/writings/diary/archive/2020/09/25/mixing_text_and_chemistry_toolkits.html
> a couple of weeks ago, and found the 1992 JCICS paper "Description of
> Several Chemical Structure File Formats Used by Computer Programs Developed
> at Molecular Design Limited" by Dalby et al. has the "Wider range ...
> Retained for compatibility with older Ctabs" in it.
>
> So including the charge in the atom block as well as in the properties
> block is a 28+ year old backwards compatibility practice.
>
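To make the precedence rule quoted above concrete, here is a minimal Python sketch of how a reader might resolve formal charges from a V2000 record. It is not RDKit's or chemfp's code. The function name read_charges is made up for illustration, the sketch assumes well-formed fixed-width V2000 lines, and it follows my reading of the rule: if any "M  CHG" or "M  RAD" line is present, the atom block charge field is ignored. The atom block codes (1 = +3, 2 = +2, 3 = +1, 5 = -1, 6 = -2, 7 = -3, with 4 meaning doublet radical) come from the ctfile documentation.

```python
# Atom-block "ccc" code -> formal charge (V2000 ctfile format).
# Code 4 means "doublet radical" and carries no charge, so it is omitted.
CCC_TO_CHARGE = {0: 0, 1: 3, 2: 2, 3: 1, 5: -1, 6: -2, 7: -3}


def read_charges(molblock):
    """Return {atom_index: charge} (1-based) for one V2000 molblock.

    Hypothetical helper: 'M  CHG' lines win over the atom block; the
    atom block is used only when no 'M  CHG' or 'M  RAD' line exists.
    """
    lines = molblock.splitlines()
    num_atoms = int(lines[3][0:3])          # "aaa" field of the counts line
    atom_lines = lines[4:4 + num_atoms]
    prop_lines = lines[4 + num_atoms:]

    charges = {}
    if any(l.startswith(("M  CHG", "M  RAD")) for l in prop_lines):
        for line in prop_lines:
            if not line.startswith("M  CHG"):
                continue
            count = int(line[6:9])          # number of (atom, charge) pairs
            for i in range(count):
                start = 9 + 8 * i
                atom_index = int(line[start:start + 4])
                charge = int(line[start + 4:start + 8])
                if charge:
                    charges[atom_index] = charge
    else:
        for atom_index, line in enumerate(atom_lines, 1):
            code = int(line[36:39].strip() or "0")   # "ccc" field
            charge = CCC_TO_CHARGE.get(code, 0)
            if charge:
                charges[atom_index] = charge
    return charges
```

A reader written this way gets the same answer whether or not the writer duplicated the charge into the atom block, which is why RDKit's output has worked fine with most tools.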
>
> > Now, I am using a program which reads the atom block charge info only.
> > Is there a way in rdkit to enable the charge written in the atom block?
>
> No. The code in Code/GraphMol/FileParsers/MolFileWriter.cpp has it
> hard-coded to 0.
>
> > Do you have any thoughts on this?
>
> The two I can think of are:
>   - post-processing to add it back in,
>   - pass it through another toolkit which adds the duplicated charge
> information
>
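The first option can be sketched in a few lines of Python. This is not the attached set_atom_block_charges.py. The function name copy_chg_to_atom_block is hypothetical, the sketch handles a single V2000 molblock (no SDF record splitting, no V3000), and charges outside -3..+3 are left as 0 in the atom block because the old field cannot represent them.

```python
# Formal charge -> atom-block "ccc" code (V2000 ctfile format).
CHARGE_TO_CCC = {3: 1, 2: 2, 1: 3, 0: 0, -1: 5, -2: 6, -3: 7}


def copy_chg_to_atom_block(molblock):
    """Copy charges from 'M  CHG' lines into the atom block's ccc field.

    Hypothetical helper; assumes fixed-width V2000 lines separated by '\n'.
    """
    lines = molblock.split("\n")
    counts = lines[3]
    if "V2000" not in counts:
        raise ValueError("only V2000 molblocks are supported")
    num_atoms = int(counts[0:3])
    first_atom = 4                            # atom block starts on line 5

    for line in lines[first_atom + num_atoms:]:
        if line.startswith("M  END"):
            break
        if not line.startswith("M  CHG"):
            continue
        count = int(line[6:9])                # number of (atom, charge) pairs
        for i in range(count):
            start = 9 + 8 * i
            atom_index = int(line[start:start + 4])
            charge = int(line[start + 4:start + 8])
            ccc = CHARGE_TO_CCC.get(charge)
            if ccc is None:
                continue                      # not representable; leave as 0
            row = first_atom + atom_index - 1
            atom_line = lines[row].ljust(39)
            # The ccc field occupies columns 37-39 (0-based slice 36:39).
            lines[row] = atom_line[:36] + f"{ccc:3d}" + atom_line[39:]
    return "\n".join(lines)
```

Applied to the RDKit output for a record whose properties block has "M  CHG  1   1   1", this rewrites the charge field of atom 1 from 0 to 3, the atom block code for +1.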
>
> I've attached a program for the first of these options. The command-line
> tool reads an SDF and generates a new SDF with the charges from the
> "M  CHG" lines copied into the atom block. Here's the --help:
>
> ===================
> usage: set_atom_block_charges.py [-h] [--output FILENAME] [--roundtrip]
>                                  [--verify] [--no-set] [FILENAME]
>
> copy charge information from the 'M CHG' data line to the atom block
>
> positional arguments:
>   FILENAME              input filename (default: stdin)
>
> optional arguments:
>   -h, --help            show this help message and exit
>   --output FILENAME, -o FILENAME
>                         output file name (default: stdout)
>   --roundtrip           use RDKit to parse the record and regenerate the
>                         SDF record
>   --verify              ensure the input and output SMILES match
>   --no-set              don't set the charges (useful if you want to see
>                         the round-trip output)
> ===================
>
> This depends on the latest commercial version of chemfp to identify
> records in an SDF and to help with the verification.
>
> While chemfp is not open source, the base license lets you use this
> functionality for in-house use. (See the file for installation details; the
> pre-compiled package only installs on Linux-based OSes.)
>
> Or, you can grab set_atom_block_charges() from the code (and some code it
> depends on) so you don't need chemfp at all.
>
> In the following, I round-trip the input through RDKit but don't set the
> atom block charges:
>
> % python set_atom_block_charges.py piperidine.sdf --roundtrip --no-set
> piperidine
>      RDKit          3D
>
>   6  6  0  0  1  0  0  0  0  0999 V2000
>    -1.4650    0.7843   -0.9210 N   0  0  0  0  0  0  0  0  0  0  0  0
>     0.0601    0.7265   -0.6801 C   0  0  0  0  0  0  0  0  0  0  0  0
>     0.6663   -0.3976   -1.5418 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.0188   -1.7539   -1.2886 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.5436   -1.6645   -1.4884 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -2.1760   -0.5554   -0.6261 C   0  0  0  0  0  0  0  0  0  0  0  0
>   1  2  1  0
>   1  6  1  0
>   2  3  1  0
>   3  4  1  0
>   4  5  1  0
>   5  6  1  0
> M  CHG  1   1   1
> M  END
> $$$$
>
> In the following, I round-trip it through RDKit then let the tool set the
> charges in the atom block.
>
> % python set_atom_block_charges.py piperidine.sdf --roundtrip
> piperidine
>      RDKit          3D
>
>   6  6  0  0  1  0  0  0  0  0999 V2000
>    -1.4650    0.7843   -0.9210 N   0  3  0  0  0  0  0  0  0  0  0  0
>     0.0601    0.7265   -0.6801 C   0  0  0  0  0  0  0  0  0  0  0  0
>     0.6663   -0.3976   -1.5418 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -0.0188   -1.7539   -1.2886 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -1.5436   -1.6645   -1.4884 C   0  0  0  0  0  0  0  0  0  0  0  0
>    -2.1760   -0.5554   -0.6261 C   0  0  0  0  0  0  0  0  0  0  0  0
>   1  2  1  0
>   1  6  1  0
>   2  3  1  0
>   3  4  1  0
>   4  5  1  0
>   5  6  1  0
> M  CHG  1   1   1
> M  END
> $$$$
>
> You can also use the --verify flag to generate and compare the SMILES
> strings before and after the conversion.
>
> Best regards,
>
>
>                                 Andrew
>                                 da...@dalkescientific.com
>
> _______________________________________________
> Rdkit-discuss mailing list
> Rdkit-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
>