Hi Noel,

You already figured out the problem with the chirality of
chlorobromomethane, but I want to clarify a couple of things below.

On Mon, Apr 14, 2008 at 12:50 PM, Noel O'Boyle <[email protected]> wrote:
>
>  I'm trying to specify the chirality of the carbon in
>  chlorobromomethane, but RDKit is not picking up on the chirality:
>
>  >>> rdk.readstring("smi", "[C](Cl)Br").write("iso")
>  'ClCBr'
>  (No chirality, as expected)

Just to be clear on this one: the output here is not technically
correct. You've input a molecule with the formula CClBr (putting the C
in square brackets tells the software that it has no implicit Hs), but
the output is for something with the formula CH2ClBr. This is actually
a bug; thanks for finding it. :-)
https://sourceforge.net/tracker/index.php?func=detail&aid=1942220&group_id=160139&atid=814650
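
To make the bracket rule concrete, here is a minimal plain-Python sketch
(no RDKit involved) of how the H count of a single bracket atom is read:
whatever is written inside the brackets is the whole story, so no "H"
means zero hydrogens. The helper name bracket_h_count is hypothetical, and
the pattern deliberately ignores charges, atom classes and two-letter
aromatic symbols.

```python
import re

# Toy pattern for one SMILES bracket atom: optional isotope, element symbol,
# optional chirality tag (@ or @@), optional H count. Charges and atom
# classes are not handled in this sketch.
_BRACKET_ATOM = re.compile(r"^\[(\d+)?([A-Z][a-z]?|[a-z])(@{1,2})?(H(\d*))?")


def bracket_h_count(atom):
    """Return the hydrogen count stated by a bracket atom (hypothetical helper)."""
    match = _BRACKET_ATOM.match(atom)
    if match is None:
        raise ValueError("not a bracket atom: %r" % atom)
    if match.group(4) is None:
        # No H written inside the brackets means no hydrogens at all.
        return 0
    # A bare "H" means one hydrogen; "H2", "H3", ... give the count directly.
    return int(match.group(5)) if match.group(5) else 1


for atom in ("[C]", "[C@@H]", "[CH2]"):
    print(atom, bracket_h_count(atom))
```

By this reading, "[C](Cl)Br" carries no hydrogens on the carbon, while the
unbracketed "ClCBr" leaves the hydrogens to be filled in from normal valence.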

>  >>> rdk.readstring("smi", "[C@@H](Cl)Br").write("iso")
>  'Cl[CH]Br'
>  >>> rdk.readstring("smi", "[...@](Cl)Br").write("iso")
>  'ClCBr'
>  >>> rdk.readstring("smi", "c...@]br").write("iso")
>  'ClCBr'
>  >>> rdk.readstring("smi", "Cl[C@@H]Br").write("iso")
>  'Cl[CH]Br'
>  (Expected chirality, but didn't get it)

As you've realized: this molecule isn't chiral, so the RDKit is doing
the right thing by not marking chirality. It's doing something
arguable with the canonical smiles though, because it's showing the
explicit H (inside the square brackets). If you input exactly the same
molecule as ClCBr, you'd get a different canonical smiles. This is a
known oddity of the way things are currently handled internally and I
haven't quite figured out a solution yet. Basically, explicit Hs always
remain explicit, even when they don't need to be.

>  Let's try 1-chloro,1-bromoethane:
>
>  >>> rdk.readstring("smi", "Cl[C@@](Br)C").write("iso")
>  'CC(Cl)Br'
>  (Expected chirality, but didn't get it)

Again, the molecule as provided isn't chiral because carbon 1 only has
three neighbors (you've told it that there are no implicit Hs).
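
The neighbor-counting argument used in this thread can be sketched in a few
lines of plain Python. This is only an illustration under simplifying
assumptions, not how the RDKit perceives stereochemistry: each substituent
is reduced to a single label, whereas real perception compares whole
branches. The function name may_be_stereocenter is hypothetical.

```python
def may_be_stereocenter(heavy_neighbors, h_count=0):
    """Rough test for a tetrahedral stereocenter (hypothetical helper).

    heavy_neighbors: labels standing in for the non-hydrogen substituents.
    h_count: number of hydrogens on the central atom.
    """
    substituents = list(heavy_neighbors) + ["H"] * h_count
    # Four substituents are required, and all four must differ.
    return len(substituents) == 4 and len(set(substituents)) == 4


# The cases from this thread, with the central carbon's substituents:
print(may_be_stereocenter(["Cl", "Br"], h_count=1))         # [C@@H](Cl)Br
print(may_be_stereocenter(["Cl", "Br"], h_count=2))         # ClCBr
print(may_be_stereocenter(["Cl", "Br", "CH3"], h_count=0))  # Cl[C@@](Br)C
print(may_be_stereocenter(["Cl", "Br", "CH3"], h_count=1))  # Cl[C@@H](Br)C
```

Only the last case has four distinct substituents; the first and third have
just three, and the second has two identical hydrogens.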

>  >>> rdk.readstring("smi", "Cl[C@@H](Br)C").write("iso")
>  'C[C@@H](Cl)Br'
>  (Expected chirality, and got it)

It's even the right chirality, which is good to see. :-)

>  Is the problem with me or with RDKit?

I'll answer that "or" question with a "yes", because it's a little of both. :-)

>  On a related note, I have found that RDKit, when reading SDF files,
>  turns all of the hydrogens into implicit hydrogens.

Correct.

>  However, when
>  reading SMILES strings, it retains any explicit hydrogens specified in
>  C@@H expressions. This doesn't seem to be consistent and requires the
>  user to remove hydrogens if he/she wants to create a canonical smiles
>  string.

I commented on this above. It's a known problem and I've been stewing
over how to solve it for a while. Now that someone other than me is
complaining, I'll bump it up a bit in priority.

-greg
