Re: [ccp4bb] LINKR in refmac
Ian,

> Perhaps someone from one of the PDB deposition sites could comment and verify my reading of this?

Let me take the liberty. Your reading of the PDB documentation is absolutely correct. The PDB format has three types of links: SSBOND, LINK and CISPEP. And indeed, residue numbers have no significance in the PDB whatsoever. The connectivity is given by SEQRES and by the order of residues in the coordinate section (which must be identical to SEQRES). Where this order is insufficient to describe (extra) connectivity, LINK etc. records are used.

LINKR was never in the PDB standard and for this reason is not admissible. I think (Garib will give an exhaustive explanation if he wishes) Refmac has used them for purely technical purposes since long ago. At the end of processing, they should become one of the PDB's link records - either at the depositor or the PDB side - or be removed if they are redundant.

I am sure Garib has reasons for having LINKR records in Refmac, however confusing this may be. It is, indeed, not a very clean practice to use self-invented additions to the PDB format, but for as long as they are used only locally there is no terrible harm in it, as it seems.

Best,
Eugene.

On Mon, 17 Aug 2009, Ian Tickle wrote:

> Tim, Garib,
>
> Sorry, maybe I'm missing something here but how does the user specify that (s)he wants a TRANS link between standard amino-acids (ASN-GLY) in this case? Isn't that the default? I always thought the answer was to add a LINK record for those two residues in the PDB file using the format specified in the PDB guide, e.g.
>
> LINK C ASN B 729 N GLY B 741
>
> (or just paste the LINKR record from the output PDB file and change LINKR to 'LINK ').
>
> But this raises an important issue. The PDB entries contain many examples of this, i.e. where there's a gap in the numbering but not in the sequence, and the PDB guide on the LINK record states:
>
> The LINK records specify connectivity between residues *that is not implied by the primary structure*.
>
> (my emphasis). My reading of this is that it's the primary structure (i.e. the SEQRES records) that specify that the residues are contiguous, *not* the residue numbering. Perhaps someone from one of the PDB deposition sites could comment and verify my reading of this?
>
> If this is the case then Refmac is ignoring a perfectly valid PDB format, and requiring that the user supplies a non-agreed format! - but of course I could be wrong in my interpretation (in which case of course I withdraw from the argument!). But if I'm right then it seems to me that refinement programs should at the very minimum be able to treat completely valid PDB entries correctly, and not require the user to make non-standard changes.
>
> Cheers
>
> -- Ian
>
> -Original Message-
> From: owner-ccp...@jiscmail.ac.uk [mailto:owner-ccp...@jiscmail.ac.uk] On Behalf Of Garib Murshudov
> Sent: 16 August 2009 22:09
> To: Tim Fenn
> Cc: CCP4BB@jiscmail.ac.uk
> Subject: Re: [ccp4bb] LINKR in refmac
>
> Tim is right. The link you want is TRANS. And if you want link between alternative position then you need to add alt codes before residue names. Link ids must be defined in the dictionary. There are definitions for standard links in the dictionary: $CLIBD_MON/list/mon_lib_list.cif. For templates how to use various forms of links please have a look:
>
> http://www.ysbl.york.ac.uk/refmac/data/template_link.txt
>
> If you experience further difficulties please let me know and I will try to sort this out.
>
> regards
> Garib
>
> 2009/8/14 Tim Fenn f...@stanford.edu
>
> On Fri, 14 Aug 2009 13:24:16 -0700 Jan Abendroth jan.abendr...@gmail.com wrote:
>
> How can I tell refmac to maintain the peptide link? Here is what I tried - the numbers above just for orientation
>
> 1 2 3 4 5 6 7 8
> 123456789012345678901234567890123456789012345678901234567890123456789012 34 567890
> LINKRC ASN B 729 N GLY B 741 ASN-GLY
>
> refmac comments in the log file ... however, still pulls the residues apart.
>
> WARNING : description of link:ASN-GLY is not in the dictionary
> link will be created with bond_lenth = 1.260
>
> So, in my understanding it comes down to the question: how is a peptide bond referenced to in the dictionary?
>
> take a look at the data_link_list loop in mon_lib_list.cif (there may be an easier way to view this info):
>
> TRANS    . . peptide . . peptide  default-peptide-link
> PTRANS   . . peptide PRO . .      default-peptide-link_pro
> NMTRANS  . . peptide PRO . .      default-peptide-link_cn
> CIS      . . peptide . . peptide  cis-peptide-link
> PCIS     . . peptide PRO . .      cis-peptide-link_pro
> NMCIS    . . peptide PRO . .      cis-peptide-link_cn
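The warning in Jan's log can be read as a failed dictionary lookup: the last field of the LINKR record is taken as a link id, and ASN-GLY is not one of the ids in the data_link_list loop. Below is a minimal Python sketch of that lookup. It assumes the six rows quoted in this thread, with the run-together `..` read as two separate `.` placeholders, and it assumes the eight column names shown; both are a reading of the quoted text and should be checked against your own mon_lib_list.cif. It is not refmac's code.

```python
# Rows as quoted in the thread; ".." is ASSUMED to be two "." placeholders.
LINK_ROWS = """
TRANS   . . peptide . . peptide default-peptide-link
PTRANS  . . peptide PRO . . default-peptide-link_pro
NMTRANS . . peptide PRO . . default-peptide-link_cn
CIS     . . peptide . . peptide cis-peptide-link
PCIS    . . peptide PRO . . cis-peptide-link_pro
NMCIS   . . peptide PRO . . cis-peptide-link_cn
"""

# ASSUMED column names for the loop; verify them in your dictionary file.
COLUMNS = ("id", "comp_id_1", "mod_id_1", "group_comp_1",
           "comp_id_2", "mod_id_2", "group_comp_2", "name")


def parse_link_rows(text):
    """Map each link id to a dict of its whitespace-separated columns."""
    links = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS):
            raise ValueError(f"unexpected row: {line!r}")
        links[fields[0]] = dict(zip(COLUMNS, fields))
    return links


def check_link_id(link_id, links):
    """Return the link's descriptive name, or None if the id is unknown."""
    entry = links.get(link_id)
    return entry["name"] if entry else None


links = parse_link_rows(LINK_ROWS)
print(check_link_id("ASN-GLY", links))  # the id Jan used
print(check_link_id("TRANS", links))    # the id Garib and Tim suggest
```

In this sketch the id Jan supplied is simply absent from the table, while TRANS resolves to the default peptide link, which matches the advice given in the thread.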
Re: [ccp4bb] LINKR in refmac
Yes. There is some confusion at the moment. REFMAC LINK or LINKR records require a link record that is absent in the PDB. If the primary structure (the sequence of amino acids/RNA/DNA, and closeness in 3D for sugars) allows linking, then links between residues are used automatically (TRANS/CIS/PTRANS/PCIS for peptides, p or variations for RNA/DNA, and corresponding links for sugars). If residues are in different chains then these links are not created automatically. (There is an option to use them automatically, but for an incomplete model the results may be misleading.) In these cases a user can define links explicitly between residues. For example:

LINK LYS A 27 TRP B 555TRANS

A simple link record as it is in the PDB does not allow the chemical description of polymers to be created fully. For example, if you write:

LINK C ASN B 729 N GLY B 741

it is not clear whether the bonds are single or double, or whether there is any other modification of the residues before a covalent bond is formed. I hope this issue will be sorted out with the PDB soon.

There is an option in refmac to create links if residues are close to each other, but it may cause problems with incomplete models. When there are deletions (a gap between residues, as in the above case) it is not clear whether this is because of incompleteness of the model or because of actual deletions. With insertion codes the situation is easier.

About LINK or LINKR: refmac tries to analyse the record, and if it has an id then it assumes that it is a refmac-style link; if it does not, then it is treated as PDB style.

I hope this clarifies this confusing case a little.

regards
Garib
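Garib's rule of thumb (a record carrying a link id is treated as refmac style, otherwise as PDB style) can be illustrated with a toy classifier. This is only a sketch under stated assumptions: the archived records have lost their fixed-column layout, so the function looks for a known id at the end of the line instead of reading columns, and the list of ids is limited to those quoted in this thread. The function names are hypothetical and this is not how refmac itself parses records.

```python
# Link ids quoted in this thread; the real dictionary defines more.
KNOWN_LINK_IDS = ("TRANS", "PTRANS", "NMTRANS", "CIS", "PCIS", "NMCIS")


def trailing_link_id(record, known_ids=KNOWN_LINK_IDS):
    """Return the longest known link id that the record ends with, or None."""
    text = record.rstrip()
    matches = [link_id for link_id in known_ids if text.endswith(link_id)]
    return max(matches, key=len) if matches else None


def link_style(record):
    """Toy classification: 'refmac' if a known id trails the record, else 'pdb'."""
    return "refmac" if trailing_link_id(record) else "pdb"


print(link_style("LINK LYS A 27 TRP B 555TRANS"))   # Garib's example
print(link_style("LINK C ASN B 729 N GLY B 741"))   # plain PDB-style record
```

Note that this toy only recognises ids from its own list. Jan's original record ended in ASN-GLY, which refmac evidently did read as an id and then failed to find in the dictionary, so a real implementation must distinguish "no id" from "unknown id".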
Re: [ccp4bb] LINKR in refmac
Hi,

> The LINK records specify connectivity between residues *that is not implied by the primary structure*. (my emphasis). My reading of this is that it's the primary structure (i.e. the SEQRES records) that specify that the residues are contiguous, *not* the residue numbering. Perhaps someone from one of the PDB deposition sites could comment and verify my reading of this?

Ian is correct. In the PDB, LINK records are only really used to describe bonds that are not immediately obvious based on residue name. For example, LINK records could be used to describe coordinate bonds between metals and amino acids, connections between modified residues and standard residues, or even glycosidic bonds between sugars. Standard connectivity is assumed between all standard polymer chains based on the sequence (SEQRES) and not residue numbering, as has been pointed out. As long as two residues are contiguous inside a SEQRES record, it is assumed they have standard connections and therefore no LINK records are necessary (as far as the PDB is concerned) for such residues.

best regards-
Jawahar Swaminathan
PDB Depositions
Protein Databank in Europe (PDBe)
http://www.ebi.ac.uk/pdbe

> If this is the case then Refmac is ignoring a perfectly valid PDB format, and requiring that the user supplies a non-agreed format! - but of course I could be wrong in my interpretation (in which case of course I withdraw from the argument!). But if I'm right then it seems to me that refinement programs should at the very minimum be able to treat completely valid PDB entries correctly, and not require the user to make non-standard changes.
>
> Cheers
>
> -- Ian
>
> -Original Message-
> From: owner-ccp...@jiscmail.ac.uk [mailto:owner-ccp...@jiscmail.ac.uk] On Behalf Of Garib Murshudov
> Sent: 16 August 2009 22:09
> To: Tim Fenn
> Cc: CCP4BB@jiscmail.ac.uk
> Subject: Re: [ccp4bb] LINKR in refmac
>
> Tim is right. The link you want is TRANS. And if you want link between alternative position then you need to add alt codes before residue names. Link ids must be defined in the dictionary. There are definitions for standard links in the dictionary: $CLIBD_MON/list/mon_lib_list.cif. For templates how to use various forms of links please have a look:
>
> http://www.ysbl.york.ac.uk/refmac/data/template_link.txt
>
> If you experience further difficulties please let me know and I will try to sort this out.
>
> regards
> Garib
>
> 2009/8/14 Tim Fenn f...@stanford.edu
>
> On Fri, 14 Aug 2009 13:24:16 -0700 Jan Abendroth jan.abendr...@gmail.com wrote:
>
> How can I tell refmac to maintain the peptide link? Here is what I tried - the numbers above just for orientation
>
> 1 2 3 4 5 6 7 8
> 123456789012345678901234567890123456789012345678901234567890123456789012 34 567890
> LINKRC ASN B 729 N GLY B 741 ASN-GLY
>
> refmac comments in the log file ... however, still pulls the residues apart.
>
> WARNING : description of link:ASN-GLY is not in the dictionary
> link will be created with bond_lenth = 1.260
>
> So, in my understanding it comes down to the question: how is a peptide bond referenced to in the dictionary?
>
> take a look at the data_link_list loop in mon_lib_list.cif (there may be an easier way to view this info):
>
> TRANS    . . peptide . . peptide  default-peptide-link
> PTRANS   . . peptide PRO . .      default-peptide-link_pro
> NMTRANS  . . peptide PRO . .      default-peptide-link_cn
> CIS      . . peptide . . peptide  cis-peptide-link
> PCIS     . . peptide PRO . .      cis-peptide-link_pro
> NMCIS    . . peptide PRO . .      cis-peptide-link_cn
>
> so you probably want TRANS.
>
> HTH, Tim
>
> --
> Tim Fenn
> f...@stanford.edu
> Stanford University, School of Medicine
> James H. Clark Center
> 318 Campus Drive, Room E300
> Stanford, CA 94305-5432
> Phone: (650) 736-1714
> FAX: (650) 736-1961
>
> Disclaimer
> This communication is confidential and may contain privileged information intended solely for the named addressee(s). It may not be used or disclosed except for the purpose for which it has been sent. If you are not the intended recipient you must not review, use, disclose, copy, distribute or take any action in reliance upon it. If you have received this
Re: [ccp4bb] LINKR in refmac
Hi Eugene, Jawahar

Thanks for responding!

> let me take the liberty. Your reading of PDB documentation is absolutely correct. PDB format has got 3 types of links: SSBOND, LINK and CISPEP. And indeed, residue numbers have no significance in the PDB whatsoever. The connectivity is given by SEQRES and by the order of residues in the coordinate section (which must be identical to SEQRES). Where this order is insufficient to describe (extra) connectivity, LINK etc. records are used.

I understand that to mean that it's the *order* of residues that must be identical, not the residues themselves, i.e. it's valid for residues to be omitted from the co-ordinate section (but not of course from the SEQRES section!).

Let's take a couple of concrete examples. First, suppose the sequence is ...AGGA... and the co-ordinate section contains: ... A4 G6 G7 A8 ... . Then it's unambiguous from the sequence that A4-G6, G6-G7 and G7-A8 are linked (i.e. residue no 5 is not used in this case), so a LINK record for A4-G6 is *not* mandatory (though I assume it's not an error to give it).

Now, assuming the same sequence, suppose that one of the Gs could not be seen in the structure, so the co-ordinate section contains only ... A4 G6 A8 ... . Now there's an ambiguity: is the sequence actually A4 G5 G6 A8 or is it A4 G6 G7 A8 ? Clearly it makes a difference! Presumably then a LINK record would be mandatory in order to resolve the ambiguity and identify the missing residue (i.e. either a G6-A8 link with G5 missing or a A4-G6 link with G7 missing).

> LINKR was never in PDB standard and for this is not admissible. I think (Garib will give an exhaustive explanation if he wishes) Refmac uses them for purely technical purposes from long ago. In the end of processing, they should become one of the PDB's link records - either at depositor or PDB side, or be removed if they are redundant. I am sure Garib has reasons for having LINKR records in Refmac, however confusing this may be. It is, indeed, not a very clean practice to use self-invented additions to PDB format, but for as long as they are used only locally there is no a terrible harm in it as seems.

It's not the presence of LINKR records that really concerns me: as you say there's no harm done if they are only used locally - and as long as users are not confused into thinking that the LINKR records are valid for deposition! My concern is that in cases (such as my first example above) where the residue numbers are non-contiguous but no atoms have actually been omitted, users are required to insert additional LINK records in order to re-refine already perfectly valid and unambiguous PDB entries. This makes automated refinement of PDB entries very difficult!

Cheers

-- Ian

Disclaimer
This communication is confidential and may contain privileged information intended solely for the named addressee(s). It may not be used or disclosed except for the purpose for which it has been sent. If you are not the intended recipient you must not review, use, disclose, copy, distribute or take any action in reliance upon it. If you have received this communication in error, please notify Astex Therapeutics Ltd by emailing i.tic...@astex-therapeutics.com and destroy all copies of the message and any attached documents. Astex Therapeutics Ltd monitors, controls and protects all its messaging traffic in compliance with its corporate email policy. The Company accepts no liability or responsibility for any onward transmission or use of emails and attachments having left the Astex Therapeutics domain. Unless expressly stated, opinions in this message are those of the individual sender and not of Astex Therapeutics Ltd. The recipient should check this email and any attachments for the presence of computer viruses. Astex Therapeutics Ltd accepts no liability for damage caused by any virus transmitted by this email.
E-mail is susceptible to data corruption, interception, unauthorized amendment, and tampering, Astex Therapeutics Ltd only send and receive e-mails on the basis that the Company is not liable for any such alteration or any consequences thereof. Astex Therapeutics Ltd., Registered in England at 436 Cambridge Science Park, Cambridge CB4 0QA under number 3751674
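Ian's two ...AGGA... examples can be checked mechanically by counting the order-preserving ways in which the observed residues fit onto the SEQRES sequence. The Python sketch below is only an illustration of that counting argument; the function name `placements` and the use of one-letter residue codes are choices made here, not anything prescribed by the PDB format or used by refmac.

```python
from itertools import combinations


def placements(seqres, observed):
    """List every way to place the observed residues, in order, on SEQRES.

    Each placement is a tuple of 0-based SEQRES positions, one per
    observed residue. Brute force, so only suitable for short fragments.
    """
    return [
        positions
        for positions in combinations(range(len(seqres)), len(observed))
        if all(seqres[p] == res for p, res in zip(positions, observed))
    ]


# First example: A4 G6 G7 A8 observed, nothing missing.
print(placements("AGGA", "AGGA"))

# Second example: A4 G6 A8 observed, one G missing.
print(placements("AGGA", "AGA"))
```

A single placement means the connectivity follows from the sequence alone, whatever the residue numbers say. Two placements, as in the second example, mean that the sequence and the residue order cannot tell which G is missing, which is the ambiguity Ian describes.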
Re: [ccp4bb] LINKR in refmac
Hi Ian,

I may be wrong here, in which case Jawahar will correct me.

> I understand that to mean that it's the *order* of residues that must be identical, not the residues themselves, i.e. it's valid for residues to be omitted from the co-ordinate section (but not of course from the SEQRES section!).

- correct, however: I thought that the primary purpose of LINKs in the PDB is to specify extra connectivity, such as between ligands and polypeptides, or where polypeptides loop back and make SS or other bonds. If I were to deal with your example, I would look into the distance profile between residues in the coordinate section, which then gives the answer to your question.

> It's not the presence of LINKR records that really concerns me: as you say there's no harm done if they are only used locally - and as long as users are not confused into thinking that the LINKR records are valid for deposition!

LINKR simply wouldn't validate at deposition.

> My concern is that in cases (such as my first example above) where the residue numbers are non-contiguous but no atoms have actually been omitted, users are required to insert additional LINK records in order to re-refine already perfectly valid and unambiguous PDB entries. This makes automated refinement of PDB entries very difficult!

Obviously LINKR will help where the distance profile is insufficient to reliably derive linking. I believe this is what Garib says in his e-mail. The problem, as far as I can see it, is simply that at the refinement stage residue numbers are equated with residue positions, which is wrong in general, but acceptable locally. As in George's reply of today, a generic solution would be to keep the original residue numbers merely as labels.

Hope I did not mess it up completely.

Cheers,
Eugene.
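The "distance profile" Eugene mentions can be sketched as the list of C-to-N distances between residues that follow each other in the coordinate section. The Python example below is a rough illustration only: the 1.7 Å cutoff is a hypothetical tolerance picked for this sketch (a peptide C-N bond is about 1.33 Å), the coordinates are invented, and neither the PDB nor refmac is claimed to work this way.

```python
import math

# HYPOTHETICAL tolerance in Angstrom; a peptide C-N bond is about 1.33 A.
CUTOFF = 1.7


def distance_profile(residues, cutoff=CUTOFF):
    """Report the C(i)-N(i+1) distance for consecutive residues.

    `residues` is a list of (label, c_xyz, n_xyz) tuples in the order
    they appear in the coordinate section. Returns a list of
    (label_i, label_next, distance, looks_bonded) tuples.
    """
    profile = []
    for (label1, c_xyz, _), (label2, _, n_xyz) in zip(residues, residues[1:]):
        d = math.dist(c_xyz, n_xyz)
        profile.append((label1, label2, round(d, 2), d <= cutoff))
    return profile


# Invented coordinates, for illustration only: (label, C position, N position).
chain = [
    ("A4", (0.0, 0.0, 0.0), (-2.4, 0.0, 0.0)),
    ("G6", (3.7, 0.0, 0.0), (1.33, 0.0, 0.0)),
    ("A8", (11.0, 0.0, 0.0), (8.7, 0.0, 0.0)),
]
for row in distance_profile(chain):
    print(row)
```

With these made-up positions the A4-G6 pair sits at bonding distance despite the gap in numbering, while G6-A8 does not, which would point to the missing residue lying between G6 and A8. As Ian notes in his reply, such a test is only as reliable as the input geometry.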
Re: [ccp4bb] LINKR in refmac
Hi Eugene

> If I were to deal with your example, I would look into distance profile between residues in coordinate section, which then gives answer to your question.

OK that's interesting, I hadn't realised that the interatomic distances played an active role in determining connectivity: I had assumed the whole point of the various connectivity specifiers was to avoid reliance on distance calculations. But resolution of the kind of ambiguity I illustrated would not occur automatically, or would it? - i.e. would the need for it only become clear when the structure was checked manually?

Of course it makes sense to use distance information for fully-refined structures such as those that are (hopefully!) deposited with the PDB, but it would be more of an issue with refinement programs that have to deal with all kinds of errant input geometry! I suppose the fundamental problem is that we're trying to use the same format for both purposes and there are fundamentally conflicting requirements.

Cheers

-- Ian
[ccp4bb] imosflm with multiple data sets
I am an imosflm novice and have a relatively simple question. I have a 360 deg data set collected in two swathes of 180 deg (one with phi = 0 and omega going 0-180, and the second with phi = 180 and omega going 0-180). What is the easiest way to process the two datasets using a matching orientation matrix (or one rotated by 180 deg, as it were) so that all the data can be merged together? Is there an easy way to do it in imosflm, or must one process the two sets separately and then manipulate them later with pointless before scaling and merging everything together?

Thanks in advance.

-Tom

Tom J. Brett, PhD
Assistant Professor of Medicine
Division of Pulmonary and Critical Care
Washington University School of Medicine
Campus Box 8052, 660 S. Euclid
Saint Louis, MO 63110
Re: [ccp4bb] imosflm with multiple data sets
Dear Tom,

There is a straightforward way to do what you want. It is probably simplest to start by reading in only the images from the first segment (0-180). Then do the indexing, cell refinement and integration in the usual way.

Then read in the second segment of data. You will notice that in this second segment, underneath the Sector name, there is a line starting Matrix and this will be Unknown.

If you go to the Matrix line of the first segment, the matrix will have a name (based on the image template). Double click on the name of the matrix. A popup window (Matrix properties) will appear. Click on the save matrix file icon (a blue disc) and save the matrix with an appropriate filename.

Now go to the Matrix line of the second segment, double click (on Unknown) as before, and this time click on the Open matrix file icon (a folder) and read in the matrix that you saved from the first sector. You can now process the second segment using this matrix.

It would be even nicer if you could drag and drop the matrix; this is on our to-do list.

Best wishes,
Andrew
Re: [ccp4bb] LINKR in refmac
>> If I were to deal with your example, I would look into distance profile between residues in coordinate section, which then gives answer to your question.
>
> OK that's interesting, I hadn't realised that the interatomic distances played an active role in determining connectivity: I had assumed the whole point of the various connectivity specifiers was to avoid reliance on distance calculations.

Yes, where distances are unreliable or where you cannot unambiguously determine the link point, e.g. when it is a priori unclear which atom leaves upon linking, how many hydrogens are involved, etc.

> But resolution of the kind of ambiguity I illustrated would not occur automatically, or would it? - i.e. would the need for it only become clear when the structure was checked manually?

I think this depends on the quality of your data. If you have a remote model and poor resolution then the difficulties arise. You do not LINKR everything, or do you? If two atoms may be automatically referenced in a residue, the same can be done for the mainchain, so I would think that one uses LINKR when things are not self-evident from structural positions.

> Of course it makes sense to use distance information for fully-refined structures such as those that are (hopefully!) deposited with the PDB, but it would be more of an issue with refinement programs that have to deal with all kinds of errant input geometry! I suppose the fundamental problem is that we're trying to use the same format for both purposes and there are fundamentally conflicting requirements.

Probably that's why there are LINKs and LINKRs, I guess.

Regards,
Eugene.

> Cheers
>
> -- Ian
Re: [ccp4bb] imosflm with multiple data sets
I didnt realize the following: You read in images from the two wedges collected with the same crystal orientation. mydata_1_###.img mydata_101_###.img Now when you index ,if you say use images from both datasets mydata_1_###.img use image 1,90 mydata_101_###.img use image 30 , 120 The matrix for the second wedge (mydata_101_###.img) is still marked unknown? Isnt this different from the behaviour in the X- mosflm . SHould the matrixes be the same since the orientation was calculated using images from both. Now , If I did not force the second wedge to have the same matrix , using the save to file and read from file method you just described , does the new imosflm use the last calculated matrix from the running session or calculate a new matrix ?..I guess I have to check some of the data I processed with my erroneous assumption to make sure that the matrixes for the two wedges are the same . Thanks for clarifying this.. hari On Mon, Aug 17, 2009 at 9:13 AM, Andrew Leslieand...@mrc-lmb.cam.ac.uk wrote: Dear Tom, There is a straightforward way to do what you want. It is probably simplest to start by reading in only the images from the first segment (0-180). Then do the indexing, cell refinement and integration in the usual way. Then read in the second segment of data. You will notice that in this second segment, underneath the Sector name, there is a line starting Matrix and this will be Unknown. If you go to the Matrix line of the first segment, the matrix will have a name (based on the image template). Double click on the name of the matrix. A popup window (Matrix properties) will appear. Click on the save matrix file icon (a blue disc) and save the matrix with an appropriate filename. Now go to the Matrix line of the second segment, double click (on Unknown) as before and this time click on the Open matrix file icon (a folder) and read in the matrix that you saved from the first sector. You can now process the second segment using this matrix. 
It would be even nicer if you could drag and drop the matrix; this is on our to-do list.

Best wishes,
Andrew

On 17 Aug 2009, at 13:33, Brett, Thomas wrote:

I am an imosflm novice and have a relatively simple question. I have a 360 deg data set collected in two swathes of 180 deg (one with phi = 0 and omega going 0-180 and the second with phi = 180 and omega going 0-180). What is the easiest way to process the two datasets using a matching orientation matrix (or one rotated by 180 deg, as it were) so that all the data can be merged together? Is there an easy way to do it in imosflm, or must one process the two sets separately and then manipulate them later with pointless before scaling and merging everything together?

Thanks in advance.
-Tom

Tom J. Brett, PhD
Assistant Professor of Medicine
Division of Pulmonary and Critical Care
Washington University School of Medicine
Campus Box 8052, 660 S. Euclid
Saint Louis, MO 63110
Re: [ccp4bb] Questions about Phaser
Peter Zwart wrote:

Hi,

This might well be due to a wrong space group. I suggest fixing the space group by hand and running it in each possible space group (P 31 2 1, P 32 2 1 and P 3 2 1 are your options). Did you run xtriage already?

P

2009/8/17 Yuan Cheng ych...@email.unc.edu:

Hey,

I am trying to use phaser to solve a protein structure. There are predicted to be 8 mol/asu based on Matthews coefficient analysis. I am using a protein that shares about 35% identity with my protein as a search model. Phaser found the first five solutions and then failed to find the 6th. The LLG and Z-scores are as follows. The possible loop in the chainsawed model has been truncated.

SOLU SET
RFZ=5.2 TFZ=7.1 PAK=0 LLG=44
RFZ=5.3 TFZ=13.8 PAK=0 LLG=169
RFZ=4.3 TFZ=69.6 PAK=0 LLG=1209
RFZ=4.6 TFZ=60.6 PAK=0 LLG=2200
RFZ=4.3 TFZ=5.7 PAK=0 LLG=2163

I used coot to check the difference map made with the model including the above five solutions. The first four solutions fit the density very well (I didn't see much positive or negative density). The 5th solution didn't fit the density at all. I saw a lot of empty density in the map, indicating I still need to find more solutions. The space group I am using is P3 2 1. Could this be caused by a wrong space group? Could anyone give me some suggestions about this?

Thanks a lot!
Yuan

Hi everyone,

Thanks a lot for your replies! Actually I am pretty confused about the use of different space groups. The .mtz file I am using now as the input file for phaser is in space group P3. Phaser gave me the first four solutions like I mentioned in my last email, but failed at the 5th one. Then I realized there might be something wrong with the space group or the data. I used phenix.xtriage to re-analyze my data (P3 space group). Merohedral twinning and pseudo-translational symmetry were found, as follows:

Statistics depending on twin laws
------------------------------------------------------------------
| Operator  | type | R obs. | Britton alpha | H alpha | ML alpha |
------------------------------------------------------------------
| -h,-k,l   | M    | 0.738  | -0.137        | 0.000   | 0.022    |
| h,-h-k,-l | M    | 0.047  | 0.390         | 0.429   | 0.478    |
| -k,-h,-l  | M    | 0.746  | -0.143        | 0.000   | 0.022    |
------------------------------------------------------------------

The analyses of the Patterson function reveals a significant off-origin peak that is 82.25 % of the origin peak, indicating pseudo translational symmetry.

Also, the analysis indicates that P 3 2 1 or its alternatives P31 2 1 and P32 2 1 might be the correct space group. Then I used the Sort/reindex MTZ files module to change the space group to P 3 2 1. I forced Phaser to use P 3 2 1 as the space group to search for more solutions with the four solutions already found fixed, but I didn't get any better result. I used phenix.xtriage to analyze the data in P 3 2 1. It indicates there might be one twinning operator, but different tests gave different answers. There is still a pseudo-translational symmetry.

Statistics depending on twin laws
------------------------------------------------------------------
| Operator  | type | R obs. | Britton alpha | H alpha | ML alpha |
------------------------------------------------------------------
| -h,-k,l   | M    | 0.759  | -0.152        | 0.000   | 0.022    |
------------------------------------------------------------------

The analyses of the Patterson function reveals a significant off-origin peak that is 82.17 % of the origin peak, indicating pseudo translational symmetry.

I have a couple of questions now:

1) Do I need to go back to HKL2000 and redo the indexing, integration and scaling, since the .sca and .mtz files I have now are in P3? I don't know whether the unit cell dimensions are going to change if I redo it in P3 2 1.
2) What does pseudo-translational symmetry actually mean? I don't quite understand this concept; what should I do about it?

This is a really long email. I appreciate your attention very much.

Yuan
[ccp4bb] Summary of Extracting Amino Acid Sequence from PDB File
Dear All,

Thanks to everyone who replied to my query about extracting an amino acid sequence from a PDB file! Here is a summary of the responses to my query:

1. Use SwissPDBViewer
2. Use Pymol 1.2:
   load $TUT/1hpv.pdb
   save 1hpv.fasta
   # or by selection
   save 1hpv_A.fasta, chain A
3. With Phenix, you can use the phenix.print_sequence tool to output the sequence in FASTA format.
4. If the PDB file is in the PDB, you can download the primary sequence from here.
5. Use MOLEMAN2
6. Use PDBSET (part of CCP4)

All the best,
---Buz
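For anyone without one of the tools in the summary above to hand, the same job can be roughed out in a few lines of Python. This is only a minimal sketch, not what any of the listed programs actually do: the function name sequence_from_atom_records is made up for illustration, it reads the CA atoms from ATOM records using the fixed PDB column positions, and it therefore reports only the residues that were modelled (gaps are silently skipped, and anything outside the 20 standard amino acids becomes X).

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequence_from_atom_records(pdb_lines, chain_id):
    """Return the one-letter sequence of modelled residues in one chain.

    Hypothetical helper: only ATOM records with atom name CA are used,
    so unmodelled residues and HETATM residues are not reported.
    """
    sequence = []
    seen = set()
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        # Columns 13-16 hold the atom name, column 22 the chain ID.
        if line[12:16].strip() != "CA" or line[21] != chain_id:
            continue
        # Columns 23-27: residue number plus insertion code.
        residue_key = line[22:27]
        if residue_key in seen:
            continue  # second alternate conformation of the same residue
        seen.add(residue_key)
        sequence.append(THREE_TO_ONE.get(line[17:20], "X"))
    return "".join(sequence)
```

Typical use would be something like sequence_from_atom_records(open("1hpv.pdb"), "A"), with the result written out after a ">1hpv_A" header line to get a FASTA file.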
[ccp4bb] space groups not supported by Refmac5
Hi, I have a question; recently I have encountered a few space groups which are not supported by Refmac5. Examples include I2 and P21221. Both these space groups have been identified as the best solutions for the two different datasets I am working on using Pointless. However, I am faced with difficulties in Refmac5, and the program fails to complete when I select refinement cycles with Arp-waters...with the message saying 'space group not supported'. Any suggestions on how this problem could be overcome? Regards, Arefeh Arefeh Seyedarabi, PhD Postdoctoral research assistant School of Biological and Chemical sciences Queen Mary, University of London Mile End road London E1 4NS Based at Joseph Priestley Building G.35 020 78828480
Re: [ccp4bb] space groups not supported by Refmac5
For the latter, P21221, you could reindex to get P21212, which is supported. - === You can't possibly be a scientist if you mind people thinking that you're a fool. - Wonko the Sane === David J. Schuller modern man in a post-modern world MacCHESS, Cornell University schul...@cornell.edu On Mon, 2009-08-17 at 17:37 +0100, Arefeh Seyedarabi wrote: Hi, I have a question; recently I have encountered a few space groups which are not supported by Refmac5. Examples include I2 and P21221. Both these space groups have been identified as the best solutions for the two different datasets I am working on using Pointless. However, I am faced with difficulties in Refmac5, and the program fails to complete when I select refinement cycles with Arp-waters...with the message saying 'space group not supported'. Any suggestions on how this problem could be overcome? Regards, Arefeh Arefeh Seyedarabi, PhD Postdoctoral research assistant School of Biological and Chemical sciences Queen Mary, University of London Mile End road London E1 4NS Based at Joseph Priestley Building G.35 020 78828480
Re: [ccp4bb] space groups not supported by Refmac5
On Monday 17 August 2009 09:37:40 Arefeh Seyedarabi wrote:

Hi, I have a question; recently I have encountered a few space groups which are not supported by Refmac5. Examples include I2 and P21221.

This is not correct. Refmac is perfectly happy with these settings, although there are some glitches to be careful of when you eventually get to the point of data deposition. (See the earlier thread about mtz2various mangling the space group description when converting to *.cif.)

Both these space groups have been identified as the best solutions for the two different datasets I am working on using Pointless. However, I am faced with difficulties in Refmac5, and the program fails to complete when I select refinement cycles with Arp-waters...with the message saying 'space group not supported'.

Ah. Perhaps this is a message from Arp/wArp? I don't know. But having recently refined several I2 structures in refmac, I can attest that it works fine in recent versions.

Ethan

--
Ethan A Merritt
Biomolecular Structure Center
University of Washington, Seattle 98195-7742
Re: [ccp4bb] space groups not supported by Refmac5
Hi,

The following mail (retrieved from the ccp4bb archive) may help with interchanging the cell axes.

HTH
S.Karthikeyan

I have a CCP4 script for this: http://bl831.als.lbl.gov/~jamesh/elves/scripts/reindex.com

If you run the script like this:

reindex.com yourdata.mtz P22121

then you will get a file called reindexed.mtz that has the space group P21212 with the cell permuted so that the shortest edge is b. Specifying P21221 will give you P21212 with the shortest edge at c, and P21212 will give you the shortest edge at a, like it is supposed to be. Like George said, this way you are always working in a canonical space group, so you will have far fewer portability issues between programs.

-James Holton
MAD Scientist

For the latter, P21221, you could reindex to get P21212, which is supported.

You can't possibly be a scientist if you mind people thinking that you're a fool. - Wonko the Sane

David J. Schuller
modern man in a post-modern world
MacCHESS, Cornell University
schul...@cornell.edu

On Mon, 2009-08-17 at 17:37 +0100, Arefeh Seyedarabi wrote:

Hi, I have a question; recently I have encountered a few space groups which are not supported by Refmac5. Examples include I2 and P21221. Both these space groups have been identified as the best solutions for the two different datasets I am working on using Pointless. However, I am faced with difficulties in Refmac5, and the program fails to complete when I select refinement cycles with Arp-waters...with the message saying 'space group not supported'. Any suggestions on how this problem could be overcome?

Regards,
Arefeh

Arefeh Seyedarabi, PhD
Postdoctoral research assistant
School of Biological and Chemical sciences
Queen Mary, University of London
Mile End road
London E1 4NS
Based at Joseph Priestley Building G.35
020 78828480
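To make the axis permutation discussed in this thread concrete, here is a minimal sketch in Python. It is not James Holton's reindex.com script and does not follow its shortest-edge convention: it simply finds the pure twofold axis in a P2x2x2x setting name and applies the cyclic (handedness-preserving) permutation that puts it along c. The function name to_p21212 and the return format are assumptions made for this illustration only. Since all angles are 90 degrees in an orthorhombic cell, only the cell edges are permuted.

```python
def to_p21212(setting, cell_edges):
    """Cyclically permute an orthorhombic cell so the pure 2-fold lies along c.

    Hypothetical helper. `setting` is a name such as "P21221" or "P22121";
    `cell_edges` is the (a, b, c) tuple. Returns the permuted edges and a
    reindexing operator string giving the new h,k,l in terms of the old.
    """
    symbols = setting.replace(" ", "")
    if not symbols.startswith("P"):
        raise ValueError("expected a primitive orthorhombic setting")
    body = symbols[1:]

    # Split the name into one symbol per axis: "21" (screw) or "2" (twofold).
    axes = []
    i = 0
    while i < len(body):
        if body[i] != "2":
            raise ValueError("unexpected symbol in %r" % setting)
        if i + 1 < len(body) and body[i + 1] == "1":
            axes.append("21")
            i += 2
        else:
            axes.append("2")
            i += 1
    if len(axes) != 3 or axes.count("2") != 1:
        raise ValueError("expected exactly one pure 2-fold among three axes")

    twofold = axes.index("2")
    # Cyclic shift that moves the old twofold axis into the c position.
    order = [(n + twofold + 1) % 3 for n in range(3)]
    new_edges = tuple(cell_edges[n] for n in order)
    reindex_op = ",".join("hkl"[n] for n in order)
    return new_edges, reindex_op
```

For P21221 with a cell of 50, 60, 70 this gives the edges in the order 70, 50, 60 and the operator l,h,k, which could then be passed to whatever reindexing program is in use.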
Re: [ccp4bb] space groups not supported by Refmac5
Yeah, I2 works indeed just fine (for me, right now). Andreas Ethan Merritt wrote: On Monday 17 August 2009 09:37:40 Arefeh Seyedarabi wrote: Hi, I have a question; recently I have encountered a few space groups which are not supported by Refmac5. Examples include I2 and P21221. This is not correct. Refmac is perfectly happy with these settings, although there are some glitches to be careful of when you eventually get to the point of data deposition. (See earlier thread about mtz2various mangling the spacegroup description when converting to *.cif). Both these space groups have been identified as the best solutions for the two different datasets I am working on using Pointless. However, I am faced with difficulties in Refmac5, and the program fails to complete when I select refinement cycles with Arp-waters...with the message saying 'space group not supported'. Ah. Perhaps this is a message from Arp/wArp? I don't know. But having recently refined several I2 structures in refmac, I can attest that it works fine in recent versions. Ethan -- Andreas Förster, Research Associate Paul Freemont Xiaodong Zhang Labs Department of Biochemistry, Imperial College London http://www.msf.bio.ic.ac.uk
[ccp4bb] Protein Protein interactions
Hello All, Can anyone tell me what are the programs used to find out the different interactions in a protein. I am talking about both intra and intermolecular interactions. Thanks in advance. Mariah -- Mariah Jones Department of Biochemistry University of Florida
Re: [ccp4bb] Protein Protein interactions
CCP4 has a program called contact where you can specify the chains, distances and atoms you want to explore for contacts, and it will give a list of all contacts that fit those criteria, along with the distances. It works best if you already have an idea of which chain interfaces you want to explore.

A much slower way is to use coot to bring up the model and look at the residue environments. Again, this is much better if you already know which residues to use.

A better predictor for unknown interfaces would be PISA, accessible either directly through EMBL-EBI or indirectly through ExPASy.

Regina

--- On Mon, 8/17/09, protein.chemist protein.chemist pp73...@gmail.com wrote:

From: protein.chemist protein.chemist pp73...@gmail.com
Subject: [ccp4bb] Protein Protein interactions
To: CCP4BB@JISCMAIL.AC.UK
Date: Monday, August 17, 2009, 1:13 PM

Hello All,

Can anyone tell me what are the programs used to find out the different interactions in a protein. I am talking about both intra and intermolecular interactions.

Thanks in advance.
Mariah

--
Mariah Jones
Department of Biochemistry
University of Florida
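As an illustration of the distance-based approach described above, here is a minimal Python sketch that lists atom pairs between two chains closer than a cutoff. It is not the CCP4 contact program and does not reproduce its keywords or output; the names parse_atoms and find_contacts and the 4.0 A default cutoff are assumptions chosen for the example. It does a plain all-against-all search, so it is only suitable for small selections.

```python
import math


def parse_atoms(pdb_lines):
    """Read ATOM/HETATM records into simple dictionaries (fixed PDB columns)."""
    atoms = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atoms.append({
            "name": line[12:16].strip(),
            "resname": line[17:20].strip(),
            "chain": line[21],
            "resnum": int(line[22:26]),
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return atoms


def find_contacts(atoms, chain_a, chain_b, cutoff=4.0):
    """Return (atom_a, atom_b, distance) for pairs within `cutoff` angstroms.

    Hypothetical helper for two different chains; the cutoff default is an
    assumption, not a recommended value. Results are sorted by distance.
    """
    first = [a for a in atoms if a["chain"] == chain_a]
    second = [a for a in atoms if a["chain"] == chain_b]
    contacts = []
    for a in first:
        for b in second:
            distance = math.dist(a["xyz"], b["xyz"])
            if distance <= cutoff:
                contacts.append((a, b, distance))
    return sorted(contacts, key=lambda contact: contact[2])
```

Each returned entry carries the residue name and number of both atoms, so printing them gives a list similar in spirit to a contact table. For unknown interfaces PISA, as mentioned above, remains the better choice.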
[ccp4bb] MX beamtime at CLS
Dear Crystallographers, At the Canadian Light Source the Call for Proposals is currently open AND accepting General User Proposals. * Macromolecular Crystallography (08ID-1 beamline) * Deadline: September 2, 2009 * Scheduling Period: October 27 - December 18, 2009 * A call for proposals is issued four times per year * Process to apply for beamtime http://www.lightsource.ca/uso/macromolecular_crystallography.php Please visit http://ex.lightsource.ca/cmcf/ to gain updated information about our experimental setup. Regards, Pawel __ Pawel Grochulski, Ph.D., D.Sc. Canadian Macromolecular Crystallography Facility Canadian Light Source Inc., 101 Perimeter Road, Saskatoon, SK S7N 0X4, Canada. Phone: 306-657-3538; Fax: 306-657-3535 http://ex.lightsource.ca/cmcf/ http://ex.lightsource.ca/cmcf/
[ccp4bb] Scale datasets from different crystals
Hello all, I have incomplete data from two different crystals. Is there a program in ccp4 that I can use to merge the two scaled datasets? I used CAD and scaleit, but it does not give any scaling statistics like completeness and Rmerge. Any help would be appreciated. -Donald.
Re: [ccp4bb] Scale datasets from different crystals
If you want to treat the end result as a single dataset, you're probably better off combining unmerged datasets (two different batches in scala, or two different scaling sets in scalepack). As far as I understand it, the data model in the mtz files (project/crystal/dataset/column) isn't really set up to allow combining two datasets post-merging; so most likely you'd have to cook up a custom procedure (most likely involving some duct tape, or equivalently dumping to ascii after scaling, merging overlapping reflections, and reconversion to mtz). Pete Donald Damian Raymond wrote: Hello all, I have incomplete data from two different crystals. Is there a program in ccp4 that I can use to merge the two scaled datasets? I used CAD and scaleit, but it does not give any scaling statistics like completeness and Rmerge. Any help would be appreciated. -Donald.
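The reason merging from unmerged data gives the statistics Donald is missing can be seen in a small Python sketch. This is not what scala or scalepack do internally: it assumes the observations from both crystals are already on a common scale and already reduced to the same asymmetric unit, uses a plain unweighted mean, and the names merge_observations and completeness are made up for the example. Rmerge here is the usual sum of |I - <I>| over the sum of I, taken over reflections measured more than once.

```python
from collections import defaultdict


def merge_observations(observations):
    """Merge repeated measurements and compute Rmerge.

    `observations` is an iterable of ((h, k, l), intensity) pairs, assumed
    (for this sketch) to be scaled and symmetry-reduced beforehand.
    Returns (merged, rmerge); rmerge is None if nothing was measured twice.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    merged = {}
    numerator = 0.0
    denominator = 0.0
    for hkl, values in groups.items():
        mean = sum(values) / len(values)
        merged[hkl] = mean
        if len(values) > 1:
            # Only multiply-measured reflections contribute to Rmerge.
            numerator += sum(abs(value - mean) for value in values)
            denominator += sum(values)
    rmerge = numerator / denominator if denominator else None
    return merged, rmerge


def completeness(merged, expected_unique):
    """Fraction of the expected unique reflections that were observed."""
    return len(merged) / expected_unique
```

Once the data have been merged separately for each crystal, the individual observations are gone, which is why combining the merged files with CAD cannot report an Rmerge.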
[ccp4bb] How to determine water number?
Hi all, I am a novice working on protein structure. When I pick waters using COOT, too many waters are picked, filling the whole cell. My question is: how can I determine which waters are needed and which are not?
Re: [ccp4bb] How to determine water number?
Hi Antonio,

My question is how can I determine which water is needed, which is not needed?

Maybe this will give you some clues. Here are the approximate criteria for water picking that are used in automatic water picking and refinement in phenix.refine:

1) peak in the mFo-DFc map is higher than ~3 sigma, and
2) peak center is within hydrogen-bonding distance of another atom (water or macromolecule), and
3) peak has approximately the same shape as a water molecule would have at this resolution and local environment, and
4) peak in the 2mFo-DFc map is higher than ~1.5 sigma,

after a round of coordinate and B-factor refinement, the criteria (2-3-4) are still OK, and

5) refined B-factor of the newly placed water is meaningful (didn't jump to a large value),

otherwise the water is deleted.

There are a few other technical tricks to make this process robust and efficient at high resolution, higher than ~1.2A or so, but this is, I guess, beyond what you were asking about.

Pavel.
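The criteria listed above can be expressed as a simple filter, sketched below in Python. This is not the phenix.refine implementation: the WaterCandidate class and keep_water function are hypothetical, the ~3 sigma and ~1.5 sigma thresholds are the approximate values quoted in the message, and the hydrogen-bond distance range and maximum B-factor are assumptions added only to make the example complete. The peak-shape test (criterion 3) is not modelled at all.

```python
from dataclasses import dataclass


@dataclass
class WaterCandidate:
    """Values a picking program might record for one candidate water."""
    diff_map_sigma: float    # mFo-DFc peak height, in sigma
    map_sigma: float         # 2mFo-DFc peak height, in sigma
    nearest_distance: float  # angstroms to the nearest possible H-bond partner
    b_factor: float          # refined B-factor, in A^2


def keep_water(water,
               min_diff_sigma=3.0,       # from the message (~3 sigma)
               min_map_sigma=1.5,        # from the message (~1.5 sigma)
               hbond_range=(2.3, 3.5),   # assumption, not from the message
               max_b_factor=80.0):       # assumption, not from the message
    """Return True if the candidate passes all of the modelled criteria."""
    shortest, longest = hbond_range
    return (
        water.diff_map_sigma >= min_diff_sigma
        and water.map_sigma >= min_map_sigma
        and shortest <= water.nearest_distance <= longest
        and water.b_factor <= max_b_factor
    )
```

Applied to a list of picked waters, something like [w for w in candidates if keep_water(w)] would discard peaks that are weak, isolated from any hydrogen-bond partner, or that refine to very large B-factors, which is usually what fills the cell when too many waters are picked.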
Re: [ccp4bb] imosflm with multiple data sets
Actually, drag-and-drop DOES work, and is *dead* handy! (But there is a considerable annoyance: you HAVE to open the sector to be able to click on the matrix line, and then you have to drag that matrix past all the 300 (or whatever) images to get to the next sector. For many images, this is really slow. It would be better to put the matrix and the images on separate sub-nodes.)

Andrew Leslie wrote:

Dear Tom,

There is a straightforward way to do what you want. It is probably simplest to start by reading in only the images from the first segment (0-180). Then do the indexing, cell refinement and integration in the usual way. Then read in the second segment of data. You will notice that in this second segment, underneath the Sector name, there is a line starting Matrix and this will be Unknown. If you go to the Matrix line of the first segment, the matrix will have a name (based on the image template). Double click on the name of the matrix. A popup window (Matrix properties) will appear. Click on the save matrix file icon (a blue disc) and save the matrix with an appropriate filename. Now go to the Matrix line of the second segment, double click (on Unknown) as before and this time click on the Open matrix file icon (a folder) and read in the matrix that you saved from the first sector. You can now process the second segment using this matrix. It would be even nicer if you could drag and drop the matrix; this is on our to-do list.

Best wishes,
Andrew

On 17 Aug 2009, at 13:33, Brett, Thomas wrote:

I am an imosflm novice and have a relatively simple question. I have a 360 deg data set collected in two swathes of 180 deg (one with phi = 0 and omega going 0-180 and the second with phi = 180 and omega going 0-180). What is the easiest way to process the two datasets using a matching orientation matrix (or one rotated by 180 deg, as it were) so that all the data can be merged together? Is there an easy way to do it in imosflm, or must one process the two sets separately and then manipulate them later with pointless before scaling and merging everything together?

Thanks in advance.
-Tom

Tom J. Brett, PhD
Assistant Professor of Medicine
Division of Pulmonary and Critical Care
Washington University School of Medicine
Campus Box 8052, 660 S. Euclid
Saint Louis, MO 63110