Re: [Rdkit-discuss] newbie help cleaning up sterochemistry in SMILES string
Dear Paul,

On Sun, Sep 19, 2010 at 3:16 PM, Paul Emsley paul.ems...@bioch.ox.ac.uk wrote:
> I was hoping that the tools of RDKit can be used in the generation of such files (starting from SMILES or a 2D mol2 description [1]). It seems to me that many of the data items *can* be generated.
>
> The question I had was (something like): how hard would it be to fill the data for these columns: _chem_comp_bond.value_dist_esd, _chem_comp_angle.value_angle_esd, _chem_comp_tor.value_angle_esd and _chem_comp_plane_atom.dist_esd?
>
> (I am not sure that this needs an answer any more than you have already given - I'll start digging).

Feel free to ask in case you encounter anything missing/unexpected/confusing. :-)

-greg

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] newbie help cleaning up sterochemistry in SMILES string
On 18/09/10 05:19, Greg Landrum wrote:
> On Sat, Sep 18, 2010 at 12:05 AM, Geoffrey Hutchison ge...@geoffhutchison.net wrote:
>>> So now with the replace function in python I can easily remove stereochem information from the molecule.
>>>
>>>     smiles_corrected = smiles_broken.replace('@', '')
>>>
>>> Once I remove the stereochemistry, libcheck does the right thing and gives me the right 3D coordinates.
>>
>> This doesn't make chemical sense, though. If libcheck is operating on a SMILES without stereochemistry, there's no way it can always give the right 3D coordinates. If you have N stereo centers, the chance of a correct 3D structure will be (0.5)^N.
>>
>> I'd suggest using a different tool. For example, the upcoming Open Babel 2.3 will handle 3D coordinate generation while ensuring stereochemistry. But you don't have to use OB -- I'm just saying that your 3D coordinates won't respect stereo with your approach.
>
> Geoff's point is a good one: if you remove the stereochemistry information from the SMILES and then generate 3D coordinates, your odds of getting a correct 3D structure are not good. I had assumed that you had bad stereochemistry info in the SMILES that you wanted to get rid of.
>
> If the stereochem is correct, then it might be a good idea to try Geoff's idea and use OB 2.3 when it's released, or to use the RDKit's 3D coordinate generation (which also respects stereochemistry), write the files as SDF, and then use the current version of OB to translate to a PDB if you need things in that format.

The (additional) useful thing libcheck can do is generate esd geometry restraints for crystallographic refinement (something like the spring constants, e.g. 0.02 Å for a C-C single bond, 3 degrees for C-C-C angles etc. (atom type dependent, of course) - also planes and torsions). I wonder how hard it would be to get similar/compatible/corresponding numbers by digging into RDKit's UFF (presumably that would be the way to do it).

Any thoughts/advice?

Thanks,

Paul.
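
For reference, the RDKit route Greg describes (generate 3D coordinates that respect the stereochemistry, then write an SDF) could look roughly like the sketch below in a recent RDKit release. The function name smiles_to_sdf, the output file name and the example SMILES are made up for illustration and are not from the thread; the final check re-derives the stereo tags from the coordinates so that a mismatch would be noticed.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def smiles_to_sdf(smiles: str, sdf_path: str, seed: int = 42) -> bool:
    """Embed a SMILES in 3D, write it as SDF, and return True if the
    stereochemistry perceived from the 3D coordinates matches the input.

    Hypothetical helper for illustration only.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")

    # Explicit hydrogens generally give better embedded geometries.
    mol_h = Chem.AddHs(mol)

    # EmbedMolecule returns the new conformer id, or -1 on failure.
    conf_id = AllChem.EmbedMolecule(mol_h, randomSeed=seed)
    if conf_id < 0:
        raise RuntimeError("3D embedding failed")

    writer = Chem.SDWriter(sdf_path)
    writer.write(mol_h)
    writer.close()

    # Re-assign the stereo tags from the coordinates alone on a copy,
    # then compare canonical SMILES with the input molecule.
    check = Chem.Mol(mol_h)
    Chem.AssignStereochemistryFrom3D(check)
    return Chem.MolToSmiles(Chem.RemoveHs(check)) == Chem.MolToSmiles(mol)


if __name__ == "__main__":
    # Assumed example input with one stereo center.
    print(smiles_to_sdf("Cl[C@H](F)Br", "ligand.sdf"))
```

The resulting SDF could then be converted to PDB with Open Babel, as suggested above. Note that this only helps when the stereochemistry in the SMILES is correct in the first place.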
Re: [Rdkit-discuss] newbie help cleaning up sterochemistry in SMILES string
Dear Hari,

On Thu, Sep 16, 2010 at 8:44 PM, hari jayaram hari...@gmail.com wrote:
> I am working with several ligands from a database stored in a SMILES format. I am using the SMILES string to get three dimensional coordinates (pdb format file) using a third-party program called libcheck.
>
> For some of these molecules the SMILES string stereochemistry in the database is entered incorrectly, such that the SMILES input to libcheck returns a mangled coordinate file with rings clashing with each other. Inputting the SMILES string without the stereochemistry makes libcheck behave correctly.
>
> Is there a way to use rdkit to clean up the stereochemistry in the SMILES string?

To be certain I understand: you would like to remove the stereochemistry from the SMILES string?

One way to do this is to read in the SMILES, then generate a new SMILES without stereochemistry information:

    [1] from rdkit import Chem
    [2] m = Chem.MolFromSmiles('c...@h](F)Br')
    [3] Chem.MolToSmiles(m)
    Out[3] 'FC(Cl)Br'

A potential problem with this is that it changes the atom ordering.

However, the simplest way to remove stereochemistry information from SMILES doesn't use the RDKit at all: you just remove the @ characters from the string:

    [4] smi = 'c...@h](F)Br'
    [5] smi.replace('@','')
    Out[5] 'Cl[CH](F)Br'

Hope this helps,

-greg
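
A small sketch of the string-only approach, extended slightly: besides '@' (tetrahedral centers), SMILES also encodes double-bond geometry with '/' and '\', so a helper that is meant to drop all stereo information probably wants to remove those too. The function name strip_stereo is made up for illustration, and the example inputs are assumptions (the SMILES in the message above was mangled by the list archive, so 'Cl[C@H](F)Br' is only a stand-in consistent with the output shown).

```python
import re

# '@' marks tetrahedral chirality (also written '@@');
# '/' and '\' mark double-bond (cis/trans) geometry.
_STEREO_MARKERS = re.compile(r"[@/\\]")


def strip_stereo(smiles: str) -> str:
    """Return the SMILES with its stereo markers deleted.

    Pure string edit: atom order is unchanged and nothing is validated.
    Hypothetical helper for illustration only.
    """
    return _STEREO_MARKERS.sub("", smiles)


if __name__ == "__main__":
    print(strip_stereo("Cl[C@H](F)Br"))  # Cl[CH](F)Br
    print(strip_stereo("F/C=C/F"))       # FC=CF
```

Unlike the round trip through Chem.MolToSmiles, this keeps the original atom ordering. It does not handle the rarer extended chirality classes (such as @TH1 or @SP1), which would leave stray characters behind, so it is worth re-parsing the result if those might occur in the database.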