On 2 March 2010 11:23, Greg Landrum <greg.land...@gmail.com> wrote:
> Dear Noel,
>
> Thanks for the repost; this helps.
>
> My 2 cents are below.
>
> On Tue, Mar 2, 2010 at 11:34 AM, Noel O'Boyle <baoille...@gmail.com> wrote:
>> On 2 March 2010 09:40, Peter Murray-Rust <pm...@cam.ac.uk> wrote:
>>> Thanks,
>>> This is a useful initiative
>>>
>>> On Tue, Mar 2, 2010 at 9:14 AM, Noel O'Boyle <baoille...@gmail.com> wrote:
>>>>
>>>> (Reposted from my blog following Greg's suggestion)
>>>>
>>>> Hello all,
>>>>
>>>> Right now, I'm adding stereo (i.e. double bond stereochemistry, and
>>>> chirality) to the MDL Mol format in OpenBabel. There are three places
>>>> where stereochemical information can be stored in these files: the
>>>> coordinates, the atom parity (in the atom block), the bond stereo (in
>>>> the bond block).
>>>>
>>>> My current understanding is that where 3D coordinates are present,
>>>> there's no need to store stereochemical information in either the atom
>>>> parity or the bond block. I think I'll probably set the atom parity
>>>> anyway (since I've already written the code, and it helps when you
>>>> look at the file to be able to easily identify the chiral centers).
>
> Agreed that setting parity is a useful service to human readers but,
> as is already mentioned below, the spec is quite clear that these
> flags should be ignored on read.
>
>>>>
>>>
>>> The main problem is lack of information as to whether the geometry (2D or
>>> 3D) is definitive or arbitrary. It is impossible to construct a 3D model of
>>> (say) alanine without a perceived stereochemistry at the Carbon. Similarly
>>> most modern 2D graphic programs will draw a double bond as cis or trans (not
>>> normally linear although this was common in typesetting). If the (arbitrary)
>>> geometry is then transmitted without details of authoring, then the reader
>>> may assume a definitive stereochemistry. Put another way, there is no way of
>>> indicating by coordinates alone that stereochemistry is unknown. I think
>>> it's very important not to use the geometry as definitive unless it is clear
>>> that the author specified it (which normally only comes from crystal
>>> structures or computational chemistry).
>>
>> Sure, but I think this is outside the scope here.
>
> I'm not sure I agree. I think this is one of the critical points when
> doing CTABs: when writing 3D or 2D coordinates how do you indicate
> what you *don't* know as well as indicating what you *do* know.
>
> In 2D (and 3D) the problem is stereochemistry around double bonds: the
> coordinates provided in the output determine the stereochemistry.
> Luckily here the CTAB spec provides a way to indicate what isn't
> known: you use the 4th field in the bond line to indicate that the
> bond is an "either" bond (value 3). Technically this is what should be
> done by any toolkit that builds a molecule from the SMILES CC=CC.
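
A minimal sketch of reading that 4th field, assuming the fixed-width V2000
bond line layout (three characters per field). The names parse_bond_line and
describe_stereo are made up for illustration and are not from OpenBabel or
any other toolkit; the value tables follow the CTAB spec as quoted above.

```python
# Meanings of the bond stereo field (4th field) in a V2000 bond line.
SINGLE_STEREO = {0: "not stereo", 1: "up (wedge)", 4: "either", 6: "down (hash)"}
DOUBLE_STEREO = {0: "cis/trans from coordinates", 3: "either (cis or trans unknown)"}


def parse_bond_line(line):
    """Return (begin_atom, end_atom, bond_type, stereo) from a V2000 bond line.

    Fields are fixed-width, three characters each. A missing stereo field
    is treated as 0 (an assumption made for this sketch).
    """
    begin = int(line[0:3])
    end = int(line[3:6])
    bond_type = int(line[6:9])
    stereo = int(line[9:12].strip() or 0)
    return begin, end, bond_type, stereo


def describe_stereo(bond_type, stereo):
    """Translate the stereo field into words for single and double bonds."""
    if bond_type == 1:
        return SINGLE_STEREO.get(stereo, "unrecognised")
    if bond_type == 2:
        return DOUBLE_STEREO.get(stereo, "unrecognised")
    return "no stereo meaning for this bond type"


# CC=CC written without known double bond geometry: bond 2-3 carries stereo 3.
begin, end, bond_type, stereo = parse_bond_line("  2  3  2  3")
print(begin, end, describe_stereo(bond_type, stereo))
```

The same value means different things on single and double bonds, which is
why the lookup is keyed on the bond type first.
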
>
> With atomic stereochemistry in 3D structures, the coordinates again
> determine the stereochemistry. As far as I know, the CTAB spec doesn't
> provide specific guidance about what to do when you have a
> stereocenter that's undetermined in your molecule. One possibility is
> to make sure that the bonds from that atom have 0 in field 4. Maybe
> it's "polite" to assign an either bond here as well (value 4 in this
> case) to make explicit to the viewer that the stereochemistry isn't
> known. But either of these raises the question of what to do if you
> *do* know the stereochemistry. My opinion here, and I'm aware it's one
> that many people do not share, is that it's best to treat the 3D case
> the same as the 2D one and use a wedged bond to mark atoms where the
> stereochemistry is known. It's somewhat ugly, but it has the advantage
> of being consistent (yes, yes, I know, when foolish it's the hobgoblin
> of little minds... but I don't think it's foolish here).
>
>>> P.
>>>
>>>>
>>>> For 2D coordinates, there's no need to store the bond stereochemistry
>>>> (as this can be worked out from the coordinates), but chirality needs
>>>> to be stored explicitly. The normal way to store this is not using
>>>> atom parity (but I'll set this anyway for the same reasons as above),
>>>> but by setting one of the bonds on the tetrahedral center to up or
>>>> down.
>>>>
>>>> For 0D coordinates, there are no guidelines. I propose to store
>>>> cis/trans stereo using the bond stereo (you know, UP [or DOWN] at both
>>>> ends of a double bond means cis), and chirality using the atom parity.
>>>> The MDL spec states that atom parity should be ignored when read,
>>>
>>> I know this is the spec and I don't want to get into more arguments about
>>> whether it should be changed. At this stage I think it is useful if programs
>>> have the capability to read and interpret this field.
>>
>> I think that I may move this to an option. So, if you don't explicitly
>> ask for it, you will just get what the spec says - i.e. no
>> stereochemistry will be stored if there are no coordinates.
>
> This is what I would suggest. Anything else involves introducing
> conventions that will work with OB, but that may or may not work with
> other toolkits. Since there's no clear answer, or anything that even
> really makes much sense, it's probably best to not include stereo info
> in 0D CTABs (except for atomic parity).
>
>>
>>>>
>>>> but
>>>> the alternative is to just forget the stereochemistry, or else to
>>>> store both cis/trans stereo *and* chirality in the bond block, which
>>>> may just about be possible but is likely to be a real mess.
>>>>
>>> Is it ambiguous or merely complicated? If the latter then we should use it
>>> to remove ambiguity.
>>
>> As it is (for 2D), it's already ambiguous. The interpretation of a
>> hash or wedge bond between two stereocentres is ambiguous (as in one
>> toolkit may interpret as describing the stereo only at the start,
>> while another might interpret it as describing the stereo at the
>> beginning and end).
>
> As you say, the spec here is ambiguous. I believe that the convention
> "wedged bonds only affect the begin atom" is fairly broadly used
> though, so that one should be safe. Note: I just tested this in
> MarvinSketch and ChemDraw. ChemDraw actually complains about having a
> wedged bond connect two stereocenters, while Marvin assigns stereo only
> to the start atom.
>
>> In the case of 0D, if you cram all of the
>> stereochemical information into the bond block it will only get worse;
>> you will have situations like a stereochemical center attached to a
>> double bond. Can the same single bond be used to indicate both
>> cis/trans across the double bond, and the chirality of the center? All
>> of these problems can be avoided using conventions, but the spec
>> doesn't go that far.
>
> nasty stuff... better to avoid stereochem in 0D files.
>
> -greg

Are some of the wedge/hash bonds in typical MOL files unrelated to
stereochemistry? That is, are some purely for depiction? If I knew
this for sure, I would not retain the wedge/hash bond designations in
the input but just work them out from the perceived stereo.
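
One rough way to look for such bonds in a set of files is sketched below. It
assumes V2000 fixed-width records and the "wedge applies to the begin atom"
convention discussed above, and it uses a deliberately crude test: a begin
atom with fewer than three explicit neighbours cannot be an ordinary
tetrahedral center, so a wedge or hash starting there is probably depiction
only (or an error). The function name and the heuristic are my own
assumptions, not something taken from the spec or from any toolkit.

```python
def wedge_bonds_without_center(molblock):
    """List (begin, end, stereo) for wedge/hash single bonds whose begin
    atom has fewer than three explicit neighbours."""
    lines = molblock.splitlines()
    counts = lines[3]  # counts line follows the three header lines
    n_atoms, n_bonds = int(counts[0:3]), int(counts[3:6])
    bond_lines = lines[4 + n_atoms : 4 + n_atoms + n_bonds]

    bonds = []
    degree = {}
    for line in bond_lines:
        begin, end, bond_type = int(line[0:3]), int(line[3:6]), int(line[6:9])
        stereo = int(line[9:12].strip() or 0)
        bonds.append((begin, end, bond_type, stereo))
        degree[begin] = degree.get(begin, 0) + 1
        degree[end] = degree.get(end, 0) + 1

    # Stereo 1 = up (wedge), 6 = down (hash) on single bonds.
    return [
        (begin, end, stereo)
        for begin, end, bond_type, stereo in bonds
        if bond_type == 1 and stereo in (1, 6) and degree[begin] < 3
    ]


# Hypothetical test input: atom 1 has three neighbours and a wedge to atom 2;
# atom 5 is terminal but still starts a hash bond to atom 3.
example = "\n".join([
    "example",
    "  hand-written",
    "",
    "  5  4  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.0000    1.5000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.3000   -0.7500    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -1.3000   -0.7500    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    2.6000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  1",
    "  1  3  1  0",
    "  1  4  1  0",
    "  5  3  1  6",
    "M  END",
])
print(wedge_bonds_without_center(example))
```

This only catches the obvious cases. A wedge on an atom with three or more
neighbours that still is not a stereocenter (two identical substituents, for
instance) would need real stereo perception to detect.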

- Noel
