I've been trying to get my head around what happens when I read and
write isomeric SMILES. As a user, I expect the same molecule to always
give the same isomeric SMILES. However, consider the following
examples using cinfony, each of which reads a SMILES string and writes
an isomeric SMILES string...

I'm trying to specify the chirality of the carbon in
chlorobromomethane, but RDKit is not picking up on the chirality:

>>> rdk.readstring("smi", "[C](Cl)Br").write("iso")
'ClCBr'
(No chirality, as expected)

>>> rdk.readstring("smi", "[C@@H](Cl)Br").write("iso")
'Cl[CH]Br'
>>> rdk.readstring("smi", "[...@](Cl)Br").write("iso")
'ClCBr'
>>> rdk.readstring("smi", "c...@]br").write("iso")
'ClCBr'
>>> rdk.readstring("smi", "Cl[C@@H]Br").write("iso")
'Cl[CH]Br'
(Expected chirality, but didn't get it)

Let's try 1-bromo-1-chloroethane:

>>> rdk.readstring("smi", "Cl[C@@](Br)C").write("iso")
'CC(Cl)Br'
(Expected chirality, but didn't get it)
>>> rdk.readstring("smi", "Cl[C@@H](Br)C").write("iso")
'C[C@@H](Cl)Br'
(Expected chirality, and got it)
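
As a sanity check that the output in the last example denotes the same
configuration as the input, here is a minimal sketch in plain Python
(it does not use RDKit or cinfony). It relies on the SMILES convention
that reordering the neighbours of a tetrahedral centre by an odd
permutation swaps @ and @@, while an even permutation leaves the tag
unchanged. The function names permutation_parity and retag are my own,
made up for illustration.

```python
def permutation_parity(src, dst):
    """Return 0 if dst is an even permutation of src, 1 if it is odd."""
    if len(set(src)) != len(src) or sorted(src) != sorted(dst):
        raise ValueError("both orders must list the same distinct neighbours")
    index = {label: i for i, label in enumerate(src)}
    perm = [index[label] for label in dst]
    # The parity of a permutation equals the parity of its inversion count.
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2


def retag(tag, src, dst):
    """Chirality tag ('@' or '@@') after reordering neighbours src -> dst."""
    if tag not in ("@", "@@"):
        raise ValueError("tag must be '@' or '@@'")
    if permutation_parity(src, dst) == 0:
        return tag
    return "@@" if tag == "@" else "@"


# Neighbour order as written in the SMILES: the preceding atom first,
# then the H inside the brackets, then the remaining neighbours.
written = ["Cl", "H", "Br", "C"]  # Cl[C@@H](Br)C
output = ["C", "H", "Cl", "Br"]   # C[C@@H](Cl)Br
result = retag("@@", written, output)
print(result)
```

If the sketch is right, the two strings differ by an even permutation
of the neighbours, so the @@ tag in the output is consistent with the
@@ tag in the input.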

Is the problem with me or with RDKit?

On a related note, I have found that RDKit, when reading SDF files,
turns all of the hydrogens into implicit hydrogens. However, when
reading SMILES strings, it retains any explicit hydrogens specified in
bracket expressions such as [C@@H]. This seems inconsistent, and it
requires the user to remove hydrogens before creating a canonical
SMILES string.

Apologies in advance if my understanding of SMILES is shaky.

Regards,
    Noel
