Re: [Rdkit-discuss] newbie help cleaning up sterochemistry in SMILES string

2010-09-19 Thread Greg Landrum
Dear Paul,

On Sun, Sep 19, 2010 at 3:16 PM, Paul Emsley paul.ems...@bioch.ox.ac.uk wrote:

> I was hoping that the tools of RDKit can be used in the generation of
> such files (starting from SMILES or a 2D mol2 description [1]). It
> seems to me that many of the data items *can* be generated. The
> question I had was (something like): how hard would it be to fill the
> data for these columns:
>
> _chem_comp_bond.value_dist_esd, _chem_comp_angle.value_angle_esd,
> _chem_comp_tor.value_angle_esd and _chem_comp_plane_atom.dist_esd?
>
> (I am not sure that this needs an answer any more than you have already
> given - I'll start digging).

Feel free to ask in case you encounter anything missing/unexpected/confusing.
:-)

-greg

--
Start uncovering the many advantages of virtual appliances
and start using them to simplify application deployment and
accelerate your shift to cloud computing.
http://p.sf.net/sfu/novell-sfdev2dev
___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss


Re: [Rdkit-discuss] newbie help cleaning up sterochemistry in SMILES string

2010-09-18 Thread Paul Emsley
On 18/09/10 05:19, Greg Landrum wrote:
> On Sat, Sep 18, 2010 at 12:05 AM, Geoffrey Hutchison
> ge...@geoffhutchison.net wrote:
>
>>> So now with the replace function in python I can easily remove
>>> stereochem information from the molecule.
>>>
>>> smiles_corrected = smiles_broken.replace('@', '')
>>>
>>> Once I remove the stereochemistry, libcheck does the right thing and
>>> gives me the right 3D coordinates.
>>
>> This doesn't make chemical sense, though. If libcheck is operating on
>> a SMILES without stereochemistry, there's no way it can always give
>> the right 3D coordinates. If you have N stereo centers, the chance of
>> a correct 3D structure will be (0.5)^N.
>>
>> I'd suggest using a different tool. For example, the upcoming Open
>> Babel 2.3 will handle 3D coordinate generation while ensuring
>> stereochemistry.
>>
>> But you don't have to use OB -- I'm just saying that your 3D
>> coordinates won't respect stereo with your approach.
>
> Geoff's point is a good one: if you remove the stereochemistry
> information from the SMILES and then generate 3d coordinates, your
> odds of getting a correct 3d structure are not good. I had assumed
> that you had bad stereochemistry info in the SMILES that you wanted to
> get rid of. If the stereochem is correct, then it might be a good idea
> to try Geoff's idea and use OB 2.3 when it's released or to use the
> RDKit's 3D coordinate generation (also respects stereochemistry),
> write the files as SDF, and then use the current version of OB to
> translate to a PDB if you need things in that format.
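
The (0.5)^N estimate quoted above can be made concrete with a few lines of Python. This is only a minimal sketch: it assumes each stereocenter is assigned independently and at random between two equally likely configurations, which is the assumption behind the estimate and not a description of what libcheck actually does. The function name is made up for illustration.

```python
def chance_all_centers_correct(n_centers: int) -> float:
    """Probability that n independently, randomly assigned stereocenters
    all happen to match the intended configuration (assumed 50/50 each)."""
    if n_centers < 0:
        raise ValueError("n_centers must be non-negative")
    return 0.5 ** n_centers


# Print the estimate for a few molecule sizes.
for n in (1, 2, 4, 8):
    print(f"{n} stereocenters: {chance_all_centers_correct(n):.4f}")
```

With four stereocenters the estimate is already down to about 6%, which is why stripping correct stereochemistry before 3D generation is risky.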



The (additional) useful thing libcheck can do is generate esd geometry
restraints for crystallographic refinement. These are something like
spring constants: e.g. 0.02 A for a C-C single bond, 3 degrees for a
C-C-C angle, etc. (atom type dependent, of course), and likewise for
planes and torsions. I wonder how hard it would be to get
similar/compatible/corresponding numbers by digging into RDKit's UFF
(presumably that would be the way to do it). Any thoughts/advice?

Thanks,

Paul.
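
One possible way to relate a force-field spring constant to an esd-like number is sketched below. It is not a method confirmed anywhere in this thread. It assumes a harmonic term E = 0.5 * k * x^2 and uses the thermal (Boltzmann) spread sqrt(kB*T/k) as a stand-in for the standard deviation. The function name and the sample force constant are hypothetical placeholders, not values read from RDKit's UFF, and whether numbers obtained this way are compatible with libcheck's esds is an open question.

```python
import math

# Boltzmann constant in kcal/(mol*K)
K_B = 0.0019872041


def esd_from_force_constant(k: float, temperature: float = 298.15) -> float:
    """Thermal standard deviation of x for a harmonic term E = 0.5 * k * x**2.

    k in kcal/(mol*A^2) gives a result in A (bond stretch);
    k in kcal/(mol*rad^2) gives a result in radians (angle bend).
    """
    if k <= 0:
        raise ValueError("force constant must be positive")
    return math.sqrt(K_B * temperature / k)


# Illustrative placeholder force constant for a bond, NOT taken from RDKit.
k_bond = 700.0
print(round(esd_from_force_constant(k_bond), 3))
```

A stiffer spring gives a smaller spread, which matches the intuition that the esd plays the inverse role of a spring constant in a restraint.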





Re: [Rdkit-discuss] newbie help cleaning up sterochemistry in SMILES string

2010-09-16 Thread Greg Landrum
Dear Hari,

On Thu, Sep 16, 2010 at 8:44 PM, hari jayaram hari...@gmail.com wrote:
> I am working with several ligands from a database stored in SMILES
> format. I am using the SMILES string to get three-dimensional
> coordinates (a PDB-format file) using a third-party program called
> libcheck.
>
> For some of these molecules the stereochemistry in the database
> SMILES string was entered incorrectly, so that passing the SMILES to
> libcheck returns a mangled coordinate file with rings clashing with
> each other. Inputting the SMILES string without the stereochemistry
> makes libcheck behave correctly.
>
> Is there a way to use rdkit to clean up the stereochemistry in the
> SMILES string?

To be certain I understand: you would like to remove the
stereochemistry from the SMILES string?

One way to do this is to read in the SMILES then generate a new SMILES
without stereochemistry information:

In [1]: from rdkit import Chem

In [2]: m = Chem.MolFromSmiles('Cl[C@H](F)Br')

In [3]: Chem.MolToSmiles(m)
Out[3]: 'FC(Cl)Br'

A potential problem with this is that it changes the atom ordering.

However, the simplest way to remove stereochemistry information from
SMILES doesn't use the RDKit at all: you just remove the @ characters
from the string:

In [4]: smi = 'Cl[C@H](F)Br'

In [5]: smi.replace('@','')
Out[5]: 'Cl[CH](F)Br'

Hope this helps,
-greg
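
The string-replacement approach above can be wrapped in a small helper. This is a minimal sketch using only the Python standard library. The function name and the optional `double_bonds` flag are made up for illustration; stripping the '/' and '\' bond-direction symbols goes beyond what the message describes and is included on the assumption that double-bond stereo should sometimes be discarded as well.

```python
def strip_stereo_markers(smiles: str, double_bonds: bool = False) -> str:
    """Remove '@' chirality tags from a SMILES string.

    If double_bonds is True, also remove the '/' and '\\' bond-direction
    symbols that encode cis/trans geometry.
    """
    stripped = smiles.replace("@", "")
    if double_bonds:
        stripped = stripped.replace("/", "").replace("\\", "")
    return stripped


print(strip_stereo_markers("Cl[C@H](F)Br"))
print(strip_stereo_markers("F/C=C/F", double_bonds=True))
```

Like the one-liner it wraps, this keeps the original atom ordering and leaves bracket atoms such as [CH] in place.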
