Hi all,

This is my first post on the RDKit mailing list, but I've been using RDKit for a
few months now (and think it's awesome, by the way).

I've found a slightly quirky behaviour.

RDKit can read the mol block below, but the SMILES it then produces cannot be
read back in.

I think the problem is the lack of explicit hydrogen on the aromatic sulphur, 
leading to an inability to kekulize.

I was wondering why this might occur?

Thanks,

Anthony

Smi_1 = 'C=CCn1c2ccc(S(N)(=O)=O)cc2sc1NS(=O)(=O)c1ccc(Cl)s1' # parses to None; this is what RDKit outputs

Smi_2 = 'C=CCn1c2ccc(S(N)(=O)=O)cc2sc1=NS(=O)(=O)c1ccc(Cl)s1' # Not None

Smi_3 = 'C=CCn1c2ccc(S(N)(=O)=O)cc2[sH]c1NS(=O)(=O)c1ccc(Cl)s1' # Not None

# Smi_2 and Smi_3 give the sulphur different oxidation states.
# According to the SDF it should be Smi_3.

from rdkit import Chem

sdf = """probmol
     RDKit          3D

 26 28  0  0  0  0  0  0  0  0999 V2000
   40.2640  -47.5920   65.9800 N   0  0  0  0  0  0  0  0  0  0  0  0
   41.1750  -46.7850   66.5750 C   0  0  0  0  0  0  0  0  0  0  0  0
   41.4340  -46.7010   67.9480 C   0  0  0  0  0  0  0  0  0  0  0  0
   42.4150  -45.8090   68.3990 C   0  0  0  0  0  0  0  0  0  0  0  0
   43.1240  -45.0120   67.4860 C   0  0  0  0  0  0  0  0  0  0  0  0
   42.8540  -45.1070   66.1140 C   0  0  0  0  0  0  0  0  0  0  0  0
   41.8740  -45.9990   65.6750 C   0  0  0  0  0  0  0  0  0  0  0  0
   41.4130  -46.2370   64.0380 S   0  0  0  0  0  0  0  0  0  0  0  0
   40.2720  -47.4090   64.6280 C   0  0  0  0  0  0  0  0  0  0  0  0
   39.4590  -48.0810   63.8510 N   0  0  0  0  0  0  0  0  0  0  0  0
   39.3560  -48.5030   66.6990 C   0  0  0  0  0  0  0  0  0  0  0  0
   39.9550  -49.8630   66.8630 C   0  0  0  0  0  0  0  0  0  0  0  0
   40.2440  -50.3500   68.0660 C   0  0  0  0  0  0  0  0  0  0  0  0
   39.3310  -47.9450   62.1280 S   0  0  0  0  0  0  0  0  0  0  0  0
   40.7120  -48.0440   61.5180 O   0  0  0  0  0  0  0  0  0  0  0  0
   38.4830  -49.0830   61.6020 O   0  0  0  0  0  0  0  0  0  0  0  0
   38.5560  -46.4150   61.7050 C   0  0  0  0  0  0  0  0  0  0  0  0
   39.4690  -45.0310   61.2360 S   0  0  0  0  0  0  0  0  0  0  0  0
   37.9620  -44.2200   61.0510 C   0  0  0  0  0  0  0  0  0  0  0  0
   36.8440  -44.9740   61.3330 C   0  0  0  0  0  0  0  0  0  0  0  0
   37.8570  -42.5080   60.5230 Cl  0  0  0  0  0  0  0  0  0  0  0  0
   37.1890  -46.2470   61.7120 C   0  0  0  0  0  0  0  0  0  0  0  0
   44.3650  -43.8910   68.0560 S   0  0  0  0  0  0  0  0  0  0  0  0
   45.0700  -44.4650   69.2670 O   0  0  0  0  0  0  0  0  0  0  0  0
   45.3810  -43.6550   66.9600 O   0  0  0  0  0  0  0  0  0  0  0  0
   43.6310  -42.3950   68.5100 N   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
  1 11  1  0
  2  3  2  0
  3  4  1  0
  5 23  1  0
  5  4  2  0
  6  5  1  0
  7  6  2  0
  7  2  1  0
  8  9  2  0
  8  7  1  0
  9  1  1  0
 10  9  1  0
 11 12  1  0
 12 13  2  0
 14 10  1  0
 15 14  2  0
 16 14  2  0
 17 22  2  0
 17 14  1  0
 18 17  1  0
 19 18  1  0
 19 20  2  0
 20 22  1  0
 21 19  1  0
 23 26  1  0
 23 24  2  0
 25 23  2  0
M  END"""

mol = Chem.MolFromMolBlock(sdf)

print(mol is None)  # False: the mol block is read successfully

# Then convert to SMILES and back
smimol = Chem.MolFromSmiles(Chem.MolToSmiles(mol))

print(smimol is None)  # True: the SMILES that RDKit wrote cannot be read back in
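
As a rough cross-check of the claim that the SDF corresponds to Smi_3, the bond orders on the ring sulphur (atom 8) can be summed directly from the bond block. This is only a minimal sketch and not RDKit's own valence code: `bond_order_sum` and `ALLOWED_S_VALENCES` are hypothetical names, only a few bond lines from the mol block are copied in, and the allowed valences for neutral sulphur (2, 4, 6) are an assumption about the valence model being applied.

```python
# A few bond lines copied from the mol block; the first two involve atom 8,
# the ring sulphur. Each line is: atom1, atom2, bond order, stereo flag.
BOND_LINES = [
    "  8  9  2  0",
    "  8  7  1  0",
    "  7  2  1  0",
    "  9  1  1  0",
]

# Assumption: valences permitted for neutral sulphur in the valence model.
ALLOWED_S_VALENCES = (2, 4, 6)


def bond_order_sum(bond_lines, atom_index):
    """Sum the bond orders of every listed bond that touches atom_index."""
    total = 0
    for line in bond_lines:
        # Whitespace splitting is enough here because all indices are below 100;
        # a full V2000 parser would slice fixed-width columns instead.
        atom1, atom2, order = (int(field) for field in line.split()[:3])
        if atom_index in (atom1, atom2):
            total += order
    return total


explicit = bond_order_sum(BOND_LINES, 8)
# Smallest allowed valence that can accommodate the explicit bonds.
target = min(v for v in ALLOWED_S_VALENCES if v >= explicit)
implicit_h = target - explicit
print(explicit, implicit_h)  # 3 1
```

Under these assumptions the sulphur has a bond order sum of 3, so one hydrogen is needed to reach valence 4, which is consistent with the [sH] written in Smi_3.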

