Re: [Rdkit-discuss] Beta of Q3 2013 release available
Hi James and Greg,

On Oct 25, 2013, at 4:03 AM, Greg Landrum wrote:
> > 1. Do I remember correctly that there was a proposal (from Roger) to add some auto bond-type perception to the PDB parser for ligands (or is that just wishful thinking!)?
>
> Roger will have to confirm this, but I believe he said something along the lines of "that way lies madness".

My first comment is that a computational chemistry toolkit's function for assigning bond orders, formal charges and protonation states from 3D coordinates is (or should be) a sanitize-like step, independent of its PDB file reader. For one thing, this functionality is required for reading XYZ format files, Schrodinger Maestro files, and quantum mechanics file formats such as Gaussian and MOPAC. For another, many applications that read PDB files don't require bond orders (e.g. GRASP surfaces and many docking functions/forcefield calculations), so handling bond order perception independently of PDB reading has some merit.

All I'll say at this stage is that correctly perceiving bonds, formal charges and protonation state (they're all interdependent) is probably more complicated than most folks think. Indeed, many of the crystallographers at the RDKit meeting claimed it was impossible. The "bondage" algorithm used in OpenEye's OEChem is several thousand lines of C++, and was still improving (on things like iron-sulfur clusters and oxime vs. nitroso perception) up to the point I left Santa Fe in 2010.

The state of the art from a decade ago is described at:
http://www.daylight.com/meetings/mug01/Sayle/m4xbondage.html
and was used at the time to produce a searchable database of PDB ligands:
http://www.metaphorics.com/products/luna.html

> > 3. Is there some explanation for what the 'flavor' option does for reading/writing PDB?
>
> I'm not sure about the reader. Roger, can you answer that?
>
> This is what's in the C++ for the PDBWriter:
>
> // PDBWriter support multiple flavors of PDB output
> // flavor 1 : Write MODEL/ENDMDL lines around each record
> // flavor 2 : Don't write any CONECT records
> // flavor 4 : Write CONECT records in both directions
> // flavor 8 : Don't use multiple CONECTs to encode bond order
> // flavor 16 : Write MASTER record
> // flavor 32 : Write TER record
>
> This is now in the docs for both the Python and C++ code.

The use of an integer file format flavor argument allows the caller to customize the behavior of the readers and writers. The semantics is that a reasonable default is zero (for all bits), but that new features may be added without changing the API/ABI. Most of the bits above (for the writer) control strict compliance with the PDB format specification. For example, a flavor of 12 (4 + 8) will write bond orders the way the RCSB expects them, which both throws away bond orders and increases the size of the PDB file.

For the reader, the flavor argument controls whether alternate locations are read (for use by PDB power users), or whether a sensible subset of atoms is used for the RDKit::ROMol.

> > 5. It seems to me that GetResidueNumber() and GetSerialNumber() may have got mixed-up at some point(?). At least, when I call GetSerialNumber() I see what appears to be the residue number; and when I call GetResidueNumber() I get "0"!
>
> This was another dumb bug from me. It's fixed.

Greg is being modest. At the time of the RDKit meeting, the MonomerInfo data structure had just a SerialNumber field, which was used for storing residue numbers. One of my suggestions back to Greg was that, although everything worked, this nomenclature might be confusing to folks using the API, so it was suggested to rename the field for the Q3 beta. The better solution was to support fields for both ResidueNumber and SerialNumber, but following that change I failed to send the patch to make the reader/writer use the correct (changed) residueNumber field, and record/honour the serial number field. My apologies. I share some of the blame for this one.

> > 6. I also seem to be seeing all of the bonds (for all residues) being written out in CONECT records - such that they all appear as single bonds in eg PyMOL - is this expected behaviour at the moment?
>
> Another one for Roger.

I believe this should work fine. RDKit's PDB file writer by default encodes the bond orders, which should be interpreted by PyMol. In the words of the late great Warren:
http://www.phenix-online.org/pipermail/phenixbb/2008-April/012188.html

We need to check where the bond orders are getting lost. If you read the PDB file back with RDKit's PDB file reader and write out the SMILES, does it have double bonds?

I hope this helps. Many thanks again to Greg for all the code polishing described above.

Roger
--
Roger Sayle, Ph.D.
CEO and founder
NextMove Software Limited
Registered in England No. 07588305
Registered Office: Innovation Centre (Unit 23), Cambridge Science Park, Cambridge CB4 0EY
Re: [Rdkit-discuss] Beta of Q3 2013 release available
Hi James,

Regarding the AssignBondOrdersFromTemplate() method: as far as I understand, the PDB reader assigns bond orders to the amino acids in a protein, but if a ligand is present it sets all of the ligand's bonds to SINGLE, because auto bond-type perception is not trivial (see Roger's comments). However, one usually knows which ligand was crystallized (i.e. the SMILES is available), so the AssignBondOrdersFromTemplate() method can be used to set the bond orders based on the known ligand structure. This is the idea of the method.

Now, to your real-world application. I'm sorry, but I don't think I understand it completely. Do you want to set only the bond orders of a specific substructure? Or would you like to give the function a set of ligands and a set of templates, and have it figure out which template belongs to which ligand and set the bond orders accordingly?

Best,
Sereina

2013/10/24 Greg Landrum greg.land...@gmail.com

> James,
>
> On Thu, Oct 24, 2013 at 7:27 PM, James Davidson j.david...@vernalis.com wrote:
> > Hi Greg (et al.),
> >
> > Thanks for the beta! I have been going through some of the recently-added functionality, and had a couple of questions regarding the PDB reading / writing.
>
> Thanks for the bug reports!
>
> > 1. Do I remember correctly that there was a proposal (from Roger) to add some auto bond-type perception to the PDB parser for ligands (or is that just wishful thinking!)?
>
> Roger will have to confirm this, but I believe he said something along the lines of "that way lies madness".
>
> > 2. If not, I notice that there is an AssignBondOrdersFromTemplate() method - but the example in the doc-string only shows (I think) the case where the input PDB is just a single small molecule - so the matching is pretty easy! I think a more real-World case is when one wants to set the bond orders for multiple ligands (HETATM residues) based on substructure matches - which will then return an atom index selection that can be used as a start point. Is there any way to have the AssignBondOrdersFromTemplate() convenience function optionally accept a list of atom indexes to specify a substructure?
>
> Sereina? Is that doable?
>
> > 3. Is there some explanation for what the 'flavor' option does for reading/writing PDB?
>
> I'm not sure about the reader. Roger, can you answer that?
>
> This is what's in the C++ for the PDBWriter:
>
> // PDBWriter support multiple flavors of PDB output
> // flavor 1 : Write MODEL/ENDMDL lines around each record
> // flavor 2 : Don't write any CONECT records
> // flavor 4 : Write CONECT records in both directions
> // flavor 8 : Don't use multiple CONECTs to encode bond order
> // flavor 16 : Write MASTER record
> // flavor 32 : Write TER record
>
> This is now in the docs for both the Python and C++ code.
>
> > 4. Having read in a PDB file I see the correct atoms flagged as HETATM (from GetIsHeteroAtom()). But when I call Chem.MolToPDBBlock() these atoms get written as ATOM records... Also, a Chem.MolToPDBFile() method would be nice for completeness / symmetry : )
>
> The HETATM thing was the result of a dumb copy and paste error from me. It's fixed.
>
> Re: Chem.MolToPDBFile(), that's missing because there's no corresponding Chem.MolToMolFile(). This is an odd oversight, which I've now fixed.
>
> > 5. It seems to me that GetResidueNumber() and GetSerialNumber() may have got mixed-up at some point(?). At least, when I call GetSerialNumber() I see what appears to be the residue number; and when I call GetResidueNumber() I get "0"!
>
> This was another dumb bug from me. It's fixed.
>
> > 6. I also seem to be seeing all of the bonds (for all residues) being written out in CONECT records - such that they all appear as single bonds in eg PyMOL - is this expected behaviour at the moment?
>
> Another one for Roger.
>
> -greg

___
Rdkit-discuss mailing list
Rdkit-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/rdkit-discuss
Re: [Rdkit-discuss] Beta of Q3 2013 release available
Hi Sereina,

Sereina wrote:
> Regarding the AssignBondOrdersFromTemplate() method: As far as I understood, the PDB reader assigns bond orders to the amino acids in a protein, but if a ligand is present it puts all bonds of it to SINGLE bonds as auto bond-type perception is not trivial (see Roger's comments). However, usually one knows which ligand was crystallized (i.e. the SMILES is available), so the AssignBondOrdersFromTemplate() method can be used to set the bond orders based on the known ligand structure. This is the idea of the method.
>
> Now, to your real-world application. I'm sorry but I don't think I understand it completely. Do you want to set only the bond orders of a specific substructure? Or would you like to give the function a set of ligands and a set of templates and it figures out which template belongs to which ligand and sets the bond orders accordingly?

This is very likely to be me being stupid - so please bear with me! If I read in a complex (pdb), and already have my reference ligand (lig), then AllChem.AssignBondOrdersFromTemplate(lig, pdb) fails because the reference ligand has not been matched to the ligand in the pdb 'complex' (dot-separated list of molecules). The doc-string states that the method works on two molecules - but I want to work on a reference molecule (lig) and a *substructure* of the macromolecule (pdb).

How should I be getting the bound ligand out as a molecule object to then use the AssignBondOrdersFromTemplate() method? Am I missing some new PDB-related methods, or have I forgotten some fundamental RDKit methods for dealing with multi-component molecules?

I guess a sensible process would be:

1. Identify any HETATM residues
2. For each residue (or at least those that have bonds!) extract or copy the mol (unless it can be addressed 'in place'?)
3. Use AssignBondOrdersFromTemplate() - relying on lookup by e.g. residue name, etc
4. Insert the molecule back into the complex (or update the info if it has been modified 'in place')

Is this how the method is intended to be used with complexes (and if so, do you have an example for steps 2 and 4)?

Thanks
James

__
PLEASE READ: This email is confidential and may be privileged. It is intended for the named addressee(s) only and access to it by anyone else is unauthorised. If you are not an addressee, any disclosure or copying of the contents of this email or any action taken (or not taken) in reliance on it is unauthorised and may be unlawful. If you have received this email in error, please notify the sender or postmas...@vernalis.com. Email is not a secure method of communication and the Company cannot accept responsibility for the accuracy or completeness of this message or any attachment(s). Please check this email for virus infection for which the Company accepts no responsibility. If verification of this email is sought then please request a hard copy. Unless otherwise stated, any views or opinions presented are solely those of the author and do not represent those of the Company.

The Vernalis Group of Companies
100 Berkshire Place
Wharfedale Road
Winnersh, Berkshire
RG41 5RD, England
Tel: +44 (0)118 938

To access trading company registration and address details, please go to the Vernalis website at www.vernalis.com and click on the Company address and registration details link at the bottom of the page.
__
Re: [Rdkit-discuss] Beta of Q3 2013 release available
On 25/10/13 08:09, James Davidson wrote:
> Hi Roger,
>
> Thanks for the response
>
> > The use of an integer file format flavor argument allows the caller to customize the behavior of the readers and writers. The semantics is that a reasonable default is zero (for all bits), but that new features may be added without changing the API/ABI. Most of the bits above (for the writer) control strict compliance with the PDB format specification. For example, a flavor of 12 will write bond orders the way the RCSB expects them both throwing away bond orders and increasing the size of the PDB file.
>
> As a test, I am using 2VCI, and am retrieving the PDB data from the RCSB using the following:
>
>     import requests
>     url = "http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=pdb&compression=NO&structureId=2VCI"
>     response = requests.get(url)
>     pdb_block = response.content
>     response.close()
>
> pdb_block shows CONECT records only for the HETATM records. If I now read into RDKit, using the defaults, and write back out using the defaults, I see CONECT records for every atom (ie protein as well). And I can't see any double-bonds rendered in PyMOL:
>
>     from rdkit import Chem
>     from rdkit.Chem import AllChem
>     pdb = Chem.MolFromPDBBlock(pdb_block)
>     pdb_block_out = Chem.MolToPDBBlock(pdb)
>
> First 10 CONECT records of output:
>
>     CONECT 1 2
>     CONECT 2 3 5
>     CONECT 3 4 4 10
>     CONECT 5 6
>     CONECT 6 7
>     CONECT 7 8 8 9
>     CONECT 10 11
>     CONECT 11 12 14
>     CONECT 12 13 13 17
>     CONECT 14 15 16
>
> If I use Chem.MolToPDBBlock(pdb, flavor=12) I do, indeed, see the ligand CONECT records in what looks like the original format (albeit now numbered differently), and I still see CONECT records for the protein - but this PDB *will* render double bonds in PyMOL.
>
> First 10 CONECT records of output:
>
>     CONECT 3 4 4
>     CONECT 7 8 8
>     CONECT 12 13 13
>     CONECT 19 20 20
>     CONECT 23 24 24
>     CONECT 28 29 29
>     CONECT 35 36 36
>     CONECT 38 39 39
>     CONECT 40 42 42
>     CONECT 41 43 43

If I may be so bold, I believe an important part of the puzzle is missing. The residue-name/3-letter-code/comp-id in the PDB file is a pointer to an entry in the mmCIF-formatted chemical component dictionary that describes the compound, for all compounds for all entries released by the PDB.

http://deposit.pdb.org/cc_dict_tut.html

If this is an internal PDB file there will very likely be a similar mmCIF file used for crystallographic refinement.

Only when these options fail would I consider turning to bond-order perception and CONECT records.

Paul.
Re: [Rdkit-discuss] Beta of Q3 2013 release available
Hi James,

Okay, now it's clear. I somehow (wrongly) thought the PDB reader would give you the protein and the ligand as two molecules, and then it wouldn't have been a problem... I will discuss with Greg how best to do this and get back to you.

Best,
Sereina
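Steps 1 to 3 of the workflow James outlines (find the HETATM residues, copy each one out as its own molecule, then apply the template) might look roughly like the sketch below. It is an illustration under stated assumptions rather than the approach Sereina and Greg settled on: the helper names are hypothetical, the "complex" is a toy built from two small molecules (ethanol relabelled as ATOM records standing in for the protein, acetamide as the ligand), and step 4 (writing the result back into the complex) is not attempted.

```python
from collections import defaultdict

from rdkit import Chem
from rdkit.Chem import AllChem


def pdb_block_without_bond_orders(smiles, seed=42):
    """Toy input: a heavy-atom PDB block whose CONECT records carry no bond orders."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mol, randomSeed=seed)
    return Chem.MolToPDBBlock(Chem.RemoveHs(mol), flavor=8)


def hetero_residues(mol):
    """Step 1: group atom indices of HETATM atoms by (chain, residue name, residue number)."""
    groups = defaultdict(list)
    for atom in mol.GetAtoms():
        info = atom.GetPDBResidueInfo()
        if info is not None and info.GetIsHeteroAtom():
            key = (info.GetChainId(), info.GetResidueName().strip(), info.GetResidueNumber())
            groups[key].append(atom.GetIdx())
    return dict(groups)


def extract_atoms(mol, keep):
    """Step 2: copy the molecule and delete every atom that is not in `keep`."""
    keep = set(keep)
    rw = Chem.RWMol(mol)
    for idx in sorted((a.GetIdx() for a in mol.GetAtoms() if a.GetIdx() not in keep), reverse=True):
        rw.RemoveAtom(idx)
    sub = rw.GetMol()
    Chem.SanitizeMol(sub)
    return sub


# Build the toy complex: ATOM records for the "protein", HETATM records for the ligand.
protein_part = Chem.MolFromPDBBlock(pdb_block_without_bond_orders("CCO").replace("HETATM", "ATOM  "))
ligand_part = Chem.MolFromPDBBlock(pdb_block_without_bond_orders("CC(N)=O"))
complex_mol = Chem.CombineMols(protein_part, ligand_part)

# Step 3: apply the known ligand structure to each extracted HETATM residue.
template = Chem.MolFromSmiles("CC(N)=O")
fixed_ligands = {}
for key, atom_ids in hetero_residues(complex_mol).items():
    ligand = extract_atoms(complex_mol, atom_ids)
    fixed_ligands[key] = AllChem.AssignBondOrdersFromTemplate(template, ligand)
    print(key, Chem.MolToSmiles(ligand), "->", Chem.MolToSmiles(fixed_ligands[key]))
```

With real structures one would pick the template per residue name instead of using a single template for every HETATM group, and AssignBondOrdersFromTemplate() raises ValueError when a residue does not match.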
Re: [Rdkit-discuss] Beta of Q3 2013 release available
On Oct 25, 2013, at 10:11 AM, Roger Sayle wrote:
> The use of an integer file format flavor argument allows the caller to customize the behavior of the readers and writers. The semantics is that a reasonable default is zero (for all bits), but that new features may be added without changing the API/ABI.

For some background, this is the API style used by OpenEye's high-level readers and writers. There's more explanation at:
http://www.eyesopen.com/docs/toolkits/current/html/OEChem_TK-python/molreadwrite.html#flavored-input-and-output

It solves a difficult problem, which is that there is no such thing as "the" PDB format. (For that matter, there are also variations of the MDL format, if only because the output writer could use V3000 format for all cases, vs. V3000 only when V2000 can't support the structure.)

RDKit also supports different input and output flavors, though it uses parameter attributes, like sanitize=False or removeHs=False for reading an SD file. OEChem's interface is more generic, in that the single 'flavor' parameter exists for the high-level readers, which is easier to pass around in a C++ toolkit. (OTOH, this is less important for Python code. In chemfp, I just pass around a Python dictionary of kwargs and apply it like: SDMolSupplier(filename, **kwargs).)

However, these integer flags are tricky to use in practice. For example, if you see flavor=49, what does it mean? Few people will be able to look at that number and know it's:

    bit 1 = Write MODEL/ENDMDL lines around each record
    bit 16 = Write MASTER record
    bit 32 = Write TER record

For OEChem support, I ended up writing my own conversion routines between the integer and a string notation. After all, I would rather people do:

    rdkit2fps input.pdb --flavor MASTER|MODEL|TER

than have to do bitwise or-ing themselves for:

    rdkit2fps input.pdb --flavor 49

Bitflags also don't mix well with non-binary states. Consider an SD file writer which supports a three-state option:

- only V2000 output (ignore or generate corrupt records otherwise?)
- V3000 output if required, otherwise V2000
- always V3000

It's of course possible to encode this using 2 bits, but it loses some of its elegance.

Think, though, of RDKit's SMILES file reader. It supports a 'delimiter' option, in order to support space, tab, comma, and I presume other delimiters as well. It also supports the ability to say that the SMILES come from something other than the first column, and the name from something other than the second. These are even harder to encode in a single flavor.

BTW, OEChem doesn't support a delimiter option. Their 'SMILES file' comes from the Daylight practice of SMILES + whitespace + rest_of_line_as_title, vs. the RDKit practice of assuming the file is a set of delimited columns, with a possible header.

Roger said above that a reasonable default is zero (for all bits), but that new features may be added without changing the API/ABI. Most file formats work nicely with binary flags, as OEChem's practice well shows. Some do not, as RDKit's SMILES file format suggests.

There are other possible APIs which can handle the requirement of supporting new features without changing the API/ABI. RDKit's current method, that of passing additional arguments to the function or constructor, is not scalable. I may have multiple layers before I get to the actual reader or writer, and I don't want to update the intermediate APIs every time something changes.

I think it's very interesting that OEChem's new InChI support (added only recently, so Roger might not know about it) takes an InChIOptions object.
http://www.eyesopen.com/docs/toolkits/current/html/OEChem_TK-python/OEChemClasses/OEInChIOptions.html

    OEInChIOptions(unsigned int flavor = OEOFlavor::INCHI::Default)

with methods like:

    .GetChiral()
    .GetFixedHLayer()
    ...
    .SetChiral()
    .SetFixedHLayer()
    ...

I don't know why they switched to this style for this case. I wonder if part of it was to insulate themselves from any odd specifications InChI might add in the future.

I prefer this style - an instance which contains the different parameters - though I haven't used it in earnest.

This style too has difficulties, especially in C++. Ideally you want to support programs which work with, say, both version 2013 (without a given feature and associated method) and version 2014 (with it). You can't do that in a language like C++, which requires all methods to be resolved in order for the program to run.

The XMLReader API supports a 'getFeature(name)' and associated 'getProperty()'/'setProperty()', which might provide the right generic API.

That said, you should read my email as commentary, and not as a statement for or against the current code. While I don't like it that much, bit flags without doubt do work for this task. And because of C++ overloading, there's also a migration path to support an options class API like I promoted just now.

Andrew
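The conversion between an integer flavor and a readable string that Andrew describes writing for OEChem can be sketched in a few lines of standard-library Python. This is not his chemfp code: the class name, member names and helper functions below are hypothetical, and only the bit values are taken from the PDBWriter comment quoted earlier in the thread.

```python
from enum import IntFlag


class PDBWriterFlavor(IntFlag):
    """Hypothetical names for the writer bits listed in the thread."""
    MODEL = 1            # write MODEL/ENDMDL lines around each record
    NO_CONECT = 2        # don't write any CONECT records
    CONECT_BOTH = 4      # write CONECT records in both directions
    NO_BOND_ORDERS = 8   # don't use multiple CONECTs to encode bond order
    MASTER = 16          # write MASTER record
    TER = 32             # write TER record


def parse_flavor(text):
    """Accept either an integer string ('49') or names joined by '|' ('MASTER|MODEL|TER')."""
    text = text.strip()
    if text.isdigit():
        return PDBWriterFlavor(int(text))
    flavor = PDBWriterFlavor(0)
    for name in text.split("|"):
        flavor |= PDBWriterFlavor[name.strip().upper()]
    return flavor


def describe_flavor(value):
    """Spell out which named bits are set, in definition order."""
    value = int(value)
    return "|".join(flag.name for flag in PDBWriterFlavor if flag.value & value)


print(int(parse_flavor("MASTER|MODEL|TER")))
print(describe_flavor(49))
```

Because IntFlag members are still integers, a value built this way could be passed straight to an integer flavor argument, which is one way to keep the compact bit-flag API while sparing callers the arithmetic.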
Re: [Rdkit-discuss] Beta of Q3 2013 release available
Hi James,

There's something very strange going on here with PyMol.

On Oct 25, 2013, at 1:09 PM, James Davidson wrote:
> I can't see any double-bonds rendered in PyMOL:
> CONECT 3 4 4 10

Here atom 3 has two bonds to atom 4. Why isn't it displayed double?

> This PDB *will* render double bonds in PyMOL.
> CONECT 3 4 4

As expected.

> (and, again, I also see double bonds in PyMOL).
> CONECT 3 2 4 10

No explicit double bond. Where is the double bond coming from?

I'd expect two of the above cases to show double bonds, and one to only have single bonds. What is confusing is that which is which doesn't make any sense.

> Can you (or Greg) post a list of what the current input flavors do?

Currently the reader only has a single flavor...

    flavor 1 : Read alternate locations, XPLOR/NMR pseudo atoms, and PDB dummy residues.

By default the PDB file reader only returns atoms with alternate location fields of space, 'A' or '1'. It also ignores atoms with co-ordinates .000, .000, .000 that appear in XPLOR output for leaving group atoms in covalently bonded ligands. Likewise, it ignores atoms with atomic symbol Q, which are typically dummy atoms used as refinement constraints in NMR refinement.

If the flavor parameter has the value 1, all these pseudo-atoms are read into the RDKit::ROMol, but clearly their semantics isn't understood by the rest of the toolkit. Valences will be incorrect, and a protein with multiple alternate sidechain conformations for some residues will likely fail sanitization.

I hope this helps.

Roger
--
Roger Sayle, Ph.D.
CEO and founder
NextMove Software Limited
Registered in England No. 07588305
Registered Office: Innovation Centre (Unit 23), Cambridge Science Park, Cambridge CB4 0EY
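The reader behaviour described in this message can be checked on a tiny hand-made PDB block. The sketch below is illustrative only: the helper function and the three water-oxygen records are invented for the example (one atom with no alternate location, one position with alternates 'A' and 'B'), and sanitization is switched off for the flavor=1 read because, as noted above, valences are not guaranteed to make sense once every alternate location is included.

```python
from rdkit import Chem


def hetatm_line(serial, name, alt_loc, res_name, chain, res_num, x, y, z, element):
    """Format one fixed-column HETATM record (hypothetical helper for this example)."""
    return (
        "HETATM{:5d} {:<4}{}{:>3} {}{:4d}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}          {:>2}"
        .format(serial, name, alt_loc, res_name, chain, res_num, x, y, z, 1.0, 0.0, element)
    )


# Invented data: atoms are kept several angstroms apart so that proximity-based
# bonding does not connect them.
pdb_block = "\n".join([
    hetatm_line(1, " O  ", " ", "HOH", "A", 301, 0.0, 0.0, 0.0, "O"),
    hetatm_line(2, " O  ", "A", "HOH", "A", 302, 5.0, 0.0, 0.0, "O"),
    hetatm_line(3, " O  ", "B", "HOH", "A", 302, 5.0, 0.0, 3.0, "O"),
    "END",
])

default_mol = Chem.MolFromPDBBlock(pdb_block)                          # flavor 0
all_locs_mol = Chem.MolFromPDBBlock(pdb_block, sanitize=False, flavor=1)

for label, mol in (("flavor 0", default_mol), ("flavor 1", all_locs_mol)):
    alt_locs = [atom.GetPDBResidueInfo().GetAltLoc() for atom in mol.GetAtoms()]
    print(label, mol.GetNumAtoms(), "atoms, alternate locations:", alt_locs)
```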
Re: [Rdkit-discuss] Beta of Q3 2013 release available
Hi Greg (et al.),

Thanks for the beta! I have been going through some of the recently-added functionality, and had a couple of questions regarding the PDB reading / writing.

1. Do I remember correctly that there was a proposal (from Roger) to add some auto bond-type perception to the PDB parser for ligands (or is that just wishful thinking!)?

2. If not, I notice that there is an AssignBondOrdersFromTemplate() method - but the example in the doc-string only shows (I think) the case where the input PDB is just a single small molecule - so the matching is pretty easy! I think a more real-World case is when one wants to set the bond orders for multiple ligands (HETATM residues) based on substructure matches - which will then return an atom index selection that can be used as a start point. Is there any way to have the AssignBondOrdersFromTemplate() convenience function optionally accept a list of atom indexes to specify a substructure?

3. Is there some explanation for what the 'flavor' option does for reading/writing PDB?

4. Having read in a PDB file I see the correct atoms flagged as HETATM (from GetIsHeteroAtom()). But when I call Chem.MolToPDBBlock() these atoms get written as ATOM records... Also, a Chem.MolToPDBFile() method would be nice for completeness / symmetry : )

5. It seems to me that GetResidueNumber() and GetSerialNumber() may have got mixed-up at some point(?). At least, when I call GetSerialNumber() I see what appears to be the residue number; and when I call GetResidueNumber() I get 0!

6. I also seem to be seeing all of the bonds (for all residues) being written out in CONECT records - such that they all appear as single bonds in eg PyMOL - is this expected behaviour at the moment?

Cheers
James
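Items 4 and 5 of this message concern the per-atom PDB residue information and the record type used on output. A minimal way to inspect both, assuming a release in which the fixes Greg mentions elsewhere in the thread are present, is sketched below. Acetamide written and re-read through the PDB writer and reader stands in for a real ligand, and the variable names are made up for the example.

```python
import os
import tempfile

from rdkit import Chem
from rdkit.Chem import AllChem

# Stand-in ligand with 3D coordinates (an assumption for the example).
mol = Chem.AddHs(Chem.MolFromSmiles("CC(N)=O"))
AllChem.EmbedMolecule(mol, randomSeed=42)

# Round-trip through the PDB writer and reader so every atom carries residue info.
pdb_mol = Chem.MolFromPDBBlock(Chem.MolToPDBBlock(mol))

rows = []
for atom in pdb_mol.GetAtoms():
    info = atom.GetPDBResidueInfo()
    rows.append((info.GetSerialNumber(), info.GetResidueName(),
                 info.GetResidueNumber(), info.GetIsHeteroAtom()))
    print("serial", rows[-1][0], "residue", rows[-1][1], rows[-1][2], "hetero", rows[-1][3])

# Item 4: atoms flagged as hetero atoms should come back out as HETATM records.
block_out = Chem.MolToPDBBlock(pdb_mol)
record_types = {line[:6] for line in block_out.splitlines()
                if line.startswith(("ATOM", "HETATM"))}
print(record_types)

# Chem.MolToPDBFile() writes the same content straight to disk.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ligand.pdb")
    Chem.MolToPDBFile(pdb_mol, path)
    file_size = os.path.getsize(path)
```

If the serial numbers and residue numbers printed here ever coincide in the way item 5 describes, that would point to an installation that predates the fix.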