On 2017-01-11 19:26, Milinda Samaraweera wrote:
Dear Experts,

I was trying to read in the attached SD file (downloaded from HMDB) and trying to calculate the exact mass of each entry:

[...]

By running the script, I got a barrage of errors as:

[13:15:14] ERROR: Could not sanitize molecule ending on line 1993855
[13:15:14] ERROR: Explicit valence for atom # 9 O, 3, is greater than permitted
[13:15:14] Explicit valence for atom # 9 O, 3, is greater than permitted
[13:15:14] ERROR: Could not sanitize molecule ending on line 1994014
[13:15:14] ERROR: Explicit valence for atom # 9 O, 3, is greater than permitted
[13:15:14] Explicit valence for atom # 9 O, 3, is greater than permitted
[13:15:14] ERROR: Could not sanitize molecule ending on line 1996036
[13:15:14] ERROR: Explicit valence for atom # 9 O, 3, is greater than permitted
[13:15:16] Explicit valence for atom # 46 N, 4, is greater than permitted
[13:15:16] ERROR: Could not sanitize molecule ending on line 2302532
[13:15:16] ERROR: Explicit valence for atom # 46 N, 4, is greater than permitte
[13:15:16] Explicit valence for atom # 16 N, 4, is greater than permitted
[13:15:16] ERROR: Could not sanitize molecule ending on line 2302918
[13:15:16] ERROR: Explicit valence for atom # 16 N, 4, is greater than permitte
[13:15:17] Explicit valence for atom # 11 N, 4, is greater than permitted
[13:15:17] ERROR: Could not sanitize molecule ending on line 2556541
[13:15:17] ERROR: Explicit valence for atom # 11 N, 4, is greater than permitte
[13:15:18]  S group SUP ignored on line 2836416
[13:15:18] Explicit valence for atom # 1 Cl, 4, is greater than permitted
[13:15:18] ERROR: Could not sanitize molecule ending on line 2841449
[13:15:18] ERROR: Explicit valence for atom # 1 Cl, 4, is greater than permitte
[13:15:19] Warning: conflicting stereochemistry at atom 10 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 10 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 17 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 17 ignored.
[13:15:19] Explicit valence for atom # 3 B, 4, is greater than permitted
[13:15:19] ERROR: Could not sanitize molecule ending on line 3107498
[13:15:19] ERROR: Explicit valence for atom # 3 B, 4, is greater than permitted
[13:15:19] Warning: conflicting stereochemistry at atom 6 ignored.
[13:15:19] Warning: conflicting stereochemistry at atom 6 ignored.
[13:15:20] Unhandled CTAB feature: S group SRU on line: 3205922. Molecule skip
[13:15:20] Explicit valence for atom # 0 Mg, 4, is greater than permitted
[13:15:20] ERROR: Could not sanitize molecule ending on line 3222378
[13:15:20] ERROR: Explicit valence for atom # 0 Mg, 4, is greater than permitte
[13:15:20] Explicit valence for atom # 2 N, 4, is greater than permitted
[13:15:20] ERROR: Could not sanitize molecule ending on line 3265386
[13:15:20] ERROR: Explicit valence for atom # 2 N, 4, is greater than permitted
[13:15:20] Explicit valence for atom # 31 N, 4, is greater than permitted
[13:15:20] ERROR: Could not sanitize molecule ending on line 3305754
[13:15:20] ERROR: Explicit valence for atom # 31 N, 4, is greater than permitte
[13:15:21] Explicit valence for atom # 45 N, 4, is greater than permitted
[13:15:21] ERROR: Could not sanitize molecule ending on line 3437055
[13:15:21] ERROR: Explicit valence for atom # 45 N, 4, is greater than permitte
[13:15:56] Explicit valence for atom # 3 C, 5, is greater than permitted
[13:15:56] ERROR: Could not sanitize molecule ending on line 8391489
[13:15:56] ERROR: Explicit valence for atom # 3 C, 5, is greater than permitted

What causes these errors? Is there a way to suppress or solve them, or at least a way to stop printing them to the command prompt?

--
Thanks,
Milinda Samaraweera,

Hi Milinda,

The errors are caused by valence errors in the SD file.

> [13:15:14] ERROR: Could not sanitize molecule ending on line 1993855
> [13:15:14] ERROR: Explicit valence for atom # 9 O, 3, is greater than permitted

In this molecule one of the oxygens has both a single bond and a double bond, so 3 valences are used. The standard valence for uncharged oxygen is 2, so sanitization fails. Given that there is a negatively charged Cl in the molecule, it is likely that this oxygen should have been assigned a formal charge of +1. That would make the error go away.
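As a small illustration of that point, here is a sketch using two made-up SMILES strings (they are not taken from the HMDB file): the same connectivity fails to parse with a neutral oxygen and parses once the oxygen carries a +1 charge.

```python
from rdkit import Chem

# Illustrative structures only (assumed, not from the HMDB file).
# Neutral oxygen with one double and one single bond: 3 valences used.
neutral = Chem.MolFromSmiles("CC(C)=[O]C")

# Same connectivity with a formal +1 charge on the oxygen (an oxonium ion).
charged = Chem.MolFromSmiles("CC(C)=[O+]C")

# MolFromSmiles returns None when sanitization fails.
print("neutral parsed:", neutral is not None)
print("charged parsed:", charged is not None)
```

The first call also logs an "Explicit valence ... is greater than permitted" message of the same kind as the ones quoted above.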

The nitrogens with valence 4 should probably also have a positive charge applied - or bond orders adjusted.
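A minimal sketch of the "apply a positive charge" option, assuming the molecule was loaded without sanitization. The helper name `charge_tetravalent_nitrogens` and the test SMILES are made up for illustration, and the rule is a blunt heuristic: it cannot tell whether the charge or a bond order is what was actually wrong in the source record.

```python
from rdkit import Chem

def charge_tetravalent_nitrogens(mol):
    """Set +1 on neutral N atoms whose explicit bond orders sum to 4.

    Hypothetical helper; modifies the molecule in place.
    """
    for atom in mol.GetAtoms():
        if atom.GetSymbol() != "N" or atom.GetFormalCharge() != 0:
            continue
        bond_order_sum = sum(b.GetBondTypeAsDouble() for b in atom.GetBonds())
        if bond_order_sum == 4:
            atom.SetFormalCharge(1)
    return mol

# Illustrative input: neutral nitrogen with four single bonds.
mol = Chem.MolFromSmiles("C[N](C)(C)C", sanitize=False)
charge_tetravalent_nitrogens(mol)
Chem.SanitizeMol(mol)  # raises if the valence problem is still present
print(Chem.MolToSmiles(mol))
```

Adjusting bond orders instead would need knowledge of the intended structure, so it is not attempted here.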

> [13:15:56] ERROR: Could not sanitize molecule ending on line 8391489
> [13:15:56] ERROR: Explicit valence for atom # 3 C, 5, is greater than permitted

Pentavalent carbon... hmm... could in principle be fixed with a negative charge on the carbon, but where is the counterion?

I would recommend that you leave the error messages turned on - they do tell you that there is something wrong with the input structures. But if you really want to turn them off there is a discussion about it here:
https://sourceforge.net/p/rdkit/mailman/message/30036309/
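For reference, one commonly used way to silence the messages is RDKit's `RDLogger` module. This is a sketch and may not be exactly what the linked thread describes; the SMILES string is only an illustrative invalid structure.

```python
from rdkit import Chem, RDLogger

# Silence error-level messages from RDKit's logger.
RDLogger.DisableLog("rdApp.error")

# Parsing still fails for invalid structures; it just fails quietly.
bad = Chem.MolFromSmiles("CC(C)=[O]C")  # neutral trivalent oxygen
print("parsed:", bad is not None)

# Re-enable the messages once the noisy section is done.
RDLogger.EnableLog("rdApp.error")
```

Passing `"rdApp.*"` instead of `"rdApp.error"` should also hide the warning-level lines, such as the conflicting stereochemistry warnings.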

In principle it should be possible to fix these errors on the fly: load the molecules without sanitization, detect the valence issues and somehow automagically correct them, then sanitize and calculate the mol weights. In this case, though, it looks like it would be easier and faster to locate the relatively few molecules with valence errors and fix them manually.
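A rough sketch of the "locate the problem molecules" part, assuming the molecules are read without sanitization and sanitized one at a time. The function name `split_by_sanitization` and the three test SMILES are hypothetical; no automatic correction is attempted.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

def split_by_sanitization(mols):
    """Return (masses, failures) for an iterable of unsanitized molecules.

    masses:   list of (index, exact mass) for molecules that sanitize
    failures: list of (index, message) for molecules that do not
    """
    masses, failures = [], []
    for idx, mol in enumerate(mols):
        if mol is None:  # the record could not be parsed at all
            failures.append((idx, "could not be parsed"))
            continue
        try:
            Chem.SanitizeMol(mol)
        except Exception as exc:  # e.g. a valence error
            failures.append((idx, str(exc)))
            continue
        masses.append((idx, Descriptors.ExactMolWt(mol)))
    return masses, failures

# Illustrative input; the middle entry has a neutral trivalent oxygen.
smiles = ["CCO", "CC(C)=[O]C", "CC(C)=[O+]C"]
mols = [Chem.MolFromSmiles(s, sanitize=False) for s in smiles]
masses, failures = split_by_sanitization(mols)
print(masses)
print(failures)
```

For an SD file, the same function could be fed from `Chem.SDMolSupplier(path, sanitize=False)`, where `path` is the location of the file. The indices in `failures` then point at the records to inspect and fix by hand.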

Cheers
-- Jan Holst Jensen, Biochemfusion