Re: [ccp4bb] LINKR in refmac

2009-08-17 Thread Eugene Krissinel

Ian,


Perhaps someone from one of the PDB deposition sites
could comment and verify my reading of this?


Let me take the liberty. Your reading of the PDB documentation is
absolutely correct. The PDB format has three types of link records:
SSBOND, LINK and CISPEP. And indeed, residue numbers have no significance
in the PDB whatsoever. The connectivity is given by SEQRES and
by the order of residues in the coordinate section (which must
be identical to SEQRES). Where this order is insufficient to
describe (extra) connectivity, LINK etc. records are used.

LINKR was never in the PDB standard and for this reason is not admissible.
I think (Garib will give an exhaustive explanation if he wishes)
Refmac has used these records for purely technical purposes for a long time.
At the end of processing, they should become one of the PDB's
link records - either on the depositor's or the PDB's side - or be removed
if they are redundant.

I am sure Garib has reasons for having LINKR records in Refmac,
however confusing this may be. It is, indeed, not a very clean
practice to use self-invented additions to the PDB format, but for
as long as they are used only locally there is no terrible harm
in it, it seems.

Best,

Eugene.


On Mon, 17 Aug 2009, Ian Tickle wrote:


Tim, Garib,

Sorry, maybe I'm missing something here but how does the user specify
that (s)he wants a TRANS link between standard amino-acids (ASN-GLY) in
this case?  Isn't that the default?  I always thought the answer was to
add a LINK record for those two residues in the PDB file using the
format specified in the PDB guide, e.g.

LINK C   ASN B 729 N   GLY B 741

(or just paste the LINKR record from the output PDB file and change
LINKR to 'LINK ').

But this raises an important issue.  The PDB entries contain many
examples of this, i.e. where there's a gap in the numbering but not in
the sequence, and the PDB guide on the LINK record states:

The LINK records specify connectivity between residues *that is not
implied by the primary structure*. (my emphasis).

My reading of this is that it's the primary structure (i.e. the SEQRES
records) that specify that the residues are contiguous, *not* the
residue numbering.  Perhaps someone from one of the PDB deposition sites
could comment and verify my reading of this?  If this is the case then
Refmac is ignoring a perfectly valid PDB format, and requiring that the
user supplies a non-agreed format! - but of course I could be wrong in
my interpretation (in which case of course I withdraw from the
argument!).  But if I'm right then it seems to me that refinement
programs should at the very minimum be able to treat completely valid
PDB entries correctly, and not require the user to make non-standard
changes.

Cheers

-- Ian


-Original Message-
From: owner-ccp...@jiscmail.ac.uk [mailto:owner-ccp...@jiscmail.ac.uk] On Behalf Of Garib Murshudov
Sent: 16 August 2009 22:09
To: Tim Fenn
Cc: CCP4BB@jiscmail.ac.uk
Subject: Re: [ccp4bb] LINKR in refmac

Tim is right. The link you want is TRANS. And if you want a link between
alternative positions then you need to add alt codes before the residue
names.

Link ids must be defined in the dictionary. There are definitions for
standard links in the dictionary: $CLIBD_MON/list/mon_lib_list.cif.

For templates showing how to use various forms of links, please have a look at:
http://www.ysbl.york.ac.uk/refmac/data/template_link.txt

If you experience further difficulties please let me know and I will try
to sort this out.

regards
Garib



2009/8/14 Tim Fenn f...@stanford.edu


On Fri, 14 Aug 2009 13:24:16 -0700
Jan Abendroth jan.abendr...@gmail.com wrote:

 How can I tell refmac to maintain the peptide link?
 Here is what I tried - the numbers above just for orientation

         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
 LINKR C   ASN B 729 N   GLY B 741 ASN-GLY

 refmac comments in the log file ... however, still pulls the residues
 apart.
 WARNING : description of link:ASN-GLY  is not in the dictionary
 link will be created with bond_lenth =   1.260

 So, in my understanding it comes down to the question:
 how is a peptide bond referred to in the dictionary?



take a look at the data_link_list loop in mon_lib_list.cif (there may
be an easier way to view this info):

TRANS    .  .  peptide  .    .  peptide  default-peptide-link
PTRANS   .  .  peptide  PRO  .  .        default-peptide-link_pro
NMTRANS  .  .  peptide  PRO  .  .        default-peptide-link_cn
CIS      .  .  peptide  .    .  peptide  cis-peptide-link
PCIS     .  .  peptide  PRO  .  .        cis-peptide-link_pro
NMCIS    .  .  peptide  PRO  .  .        cis-peptide-link_cn

so you probably want TRANS.

Re: [ccp4bb] LINKR in refmac

2009-08-17 Thread Garib Murshudov

Yes, there is some confusion at the moment.
A REFMAC LINK or LINKR record requires link information that is absent
from the PDB record.
If the primary structure (the sequence of amino acids/RNA/DNA, and
closeness in 3D for sugars) allows linking, then links between residues
are used automatically (TRANS/CIS/PTRANS/PCIS for peptides, p or its
variations for RNA/DNA, and corresponding links for sugars). If residues
are in different chains then these links are not created automatically.
(There is an option to use them automatically, but for an incomplete model
the results may be misleading.) In these cases a user can define links
explicitly between residues. For example:


LINK LYS A  27 TRP B 555 TRANS


A simple link record as it is in the PDB does not allow the chemical
description of polymers to be created fully. For example, if you write:


 LINK C   ASN B 729 N   GLY B 741

it is not clear whether the bonds are single or double, or whether there
is any other modification of the residues before forming a covalent bond.


I hope this issue will be sorted out with pdb soon.

There is an option in refmac to create links if residues are close to
each other, but it may cause problems with incomplete models. When there
are deletions (a gap between residues, as in the above case) it is not
clear whether this is because of incompleteness of the model or because
of actual deletions. With insertion codes the situation is easier.


About LINK or LINKR: refmac tries to analyse the record, and if it has a
link id then it assumes that it is a refmac-style link; if it does not,
then it is treated as PDB style.
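
The rule described above (a record carrying a link id is read as refmac style, otherwise as PDB style) can be sketched roughly as follows. This is only an illustration, not Refmac's actual code: the function name `link_style` and the small set of ids are hypothetical, and the record is split on whitespace because the fixed-column layout did not survive in this thread. The ids are the peptide link ids quoted from mon_lib_list.cif elsewhere in the thread.

```python
# Peptide link ids quoted in this thread from mon_lib_list.cif.
# The real dictionary holds many more; this subset is an assumption.
KNOWN_LINK_IDS = {"TRANS", "PTRANS", "NMTRANS", "CIS", "PCIS", "NMCIS"}


def link_style(record, known_ids=KNOWN_LINK_IDS):
    """Classify a LINK/LINKR line as 'refmac' or 'pdb' style.

    Hypothetical helper: it only checks whether the last
    whitespace-separated field is a known link id.
    """
    fields = record.split()
    if not fields or fields[0] not in ("LINK", "LINKR"):
        return "not a link record"
    return "refmac" if fields[-1] in known_ids else "pdb"


print(link_style("LINK C   ASN B 729 N   GLY B 741"))
print(link_style("LINK LYS A  27 TRP B 555 TRANS"))
```

A real parser would read the id from its fixed columns rather than from the last field.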


I hope this clarifies this confusing case a little.

regards
Garib



Re: [ccp4bb] LINKR in refmac

2009-08-17 Thread Jawahar Swaminathan

Hi,


The LINK records specify connectivity between residues *that is not
implied
by the primary structure*. (my emphasis).

My reading of this is that it's the primary structure (i.e. the SEQRES
records) that specify that the residues are contiguous, *not* the
residue numbering.  Perhaps someone from one of the PDB deposition sites
could comment and verify my reading of this?

Ian is correct. In the PDB, LINK records are only really used to 
describe bonds that are not immediately obvious based on residue name. 
For example, Link records could be used to describe coordinate bonds 
between metal and amino acids, connections between modified residues and 
standard residues, or even glycosidic bonds between sugars. Standard 
connectivity is assumed between all standard polymer chains based on the 
sequence (SEQRES) and not residue numbering as has been pointed out. As 
long as two residues are contiguous inside a SEQRES record, it is 
assumed they have standard connections and therefore no LINK records are 
necessary (as far as the PDB is concerned) for such residues.


best regards-
Jawahar Swaminathan
PDB Depositions
Protein Databank in Europe (PDBe)
http://www.ebi.ac.uk/pdbe


Re: [ccp4bb] LINKR in refmac

2009-08-17 Thread Ian Tickle
Hi Eugene & Jawahar

Thanks for responding!

 let me take the liberty. Your reading of PDB documentation is
 absolutely correct. PDB format has got 3 types of links: SSBOND,
 LINK and CISPEP. And indeed, residue numbers have no significance
 in the PDB whatsoever. The connectivity is given by SEQRES and
 by the order of residues in the coordinate section (which must
 be identical to SEQRES). Where this order is insufficient to
 describe (extra) connectivity, LINK etc. records are used.

I understand that to mean that it's the *order* of residues that must be
identical, not the residues themselves, i.e. it's valid for residues to
be omitted from the co-ordinate section (but not of course from the
SEQRES section!).

Let's take a couple of concrete examples.  First, suppose the sequence
is ...AGGA... and the co-ordinate section contains:  ... A4 G6 G7 A8 ... .
Then it's unambiguous from the sequence that A4-G6, G6-G7 and G7-A8
are linked (i.e. residue no 5 is not used in this case), so a LINK
record for A4-G6 is *not* mandatory (though I assume it's not an error
to give it).

Now, assuming the same sequence, suppose that one of the Gs could not be
seen in the structure, so the co-ordinate section contains only ... A4
G6 A8 ... .  Now there's an ambiguity: is the sequence actually A4 G5 G6
A8 or is it A4 G6 G7 A8 ?  Clearly it makes a difference!  Presumably
then a LINK record would be mandatory in order to resolve the ambiguity
and identify the missing residue (i.e. either a G6-A8 link with G5
missing or a A4-G6 link with G7 missing).
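
The two cases above can be checked mechanically with a short Python sketch. It is not taken from any existing program; the function name `placements` is hypothetical. It lists every order-preserving way of placing the observed residues onto the SEQRES sequence: one placement means the connectivity is implied by the sequence, more than one means extra information (such as a LINK record) is needed.

```python
from itertools import combinations


def placements(seqres, observed):
    """Return every order-preserving placement of `observed` on `seqres`.

    Each placement is a tuple of SEQRES positions (0-based), one per
    observed residue. Residue numbers are deliberately not used.
    """
    hits = []
    for idx in combinations(range(len(seqres)), len(observed)):
        if all(seqres[i] == res for i, res in zip(idx, observed)):
            hits.append(idx)
    return hits


seqres = "AGGA"
print(placements(seqres, "AGGA"))  # co-ordinates A4 G6 G7 A8
print(placements(seqres, "AGA"))   # co-ordinates A4 G6 A8
```

The first call gives a single placement and the second gives two, which is exactly the ambiguity between a missing G5 and a missing G7.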

 LINKR was never in PDB standard and for this is not admissible.
 I think (Garib will give an exhaustive explanation if he wishes)
 Refmac uses them for purely technical purposes from long ago.
 In the end of processing, they should become one of the PDB's
 link records - either at depositor or PDB side, or be removed
 if they are redundant.
 
 I am sure Garib has reasons for having LINKR records in Refmac,
 however confusing this may be. It is, indeed, not a very clean
 practice to use self-invented additions to PDB format, but for
 as long as they are used only locally there is no a terrible harm
 in it as seems.

It's not the presence of LINKR records that really concerns me: as you
say there's no harm done if they are only used locally - and as long as
users are not confused into thinking that the LINKR records are valid
for deposition!

My concern is that in cases (such as my first example above) where the
residue numbers are non-contiguous but no atoms have actually been
omitted, users are required to insert additional LINK records in order
to re-refine already perfectly valid and unambiguous PDB entries.  This
makes automated refinement of PDB entries very difficult!

Cheers

-- Ian


Disclaimer
This communication is confidential and may contain privileged information 
intended solely for the named addressee(s). It may not be used or disclosed 
except for the purpose for which it has been sent. If you are not the intended 
recipient you must not review, use, disclose, copy, distribute or take any 
action in reliance upon it. If you have received this communication in error, 
please notify Astex Therapeutics Ltd by emailing 
i.tic...@astex-therapeutics.com and destroy all copies of the message and any 
attached documents. 
Astex Therapeutics Ltd monitors, controls and protects all its messaging 
traffic in compliance with its corporate email policy. The Company accepts no 
liability or responsibility for any onward transmission or use of emails and 
attachments having left the Astex Therapeutics domain.  Unless expressly 
stated, opinions in this message are those of the individual sender and not of 
Astex Therapeutics Ltd. The recipient should check this email and any 
attachments for the presence of computer viruses. Astex Therapeutics Ltd 
accepts no liability for damage caused by any virus transmitted by this email. 
E-mail is susceptible to data corruption, interception, unauthorized amendment, 
and tampering, Astex Therapeutics Ltd only send and receive e-mails on the 
basis that the Company is not liable for any such alteration or any 
consequences thereof.
Astex Therapeutics Ltd., Registered in England at 436 Cambridge Science Park, 
Cambridge CB4 0QA under number 3751674


Re: [ccp4bb] LINKR in refmac

2009-08-17 Thread Eugene Krissinel

Hi Ian,

I may be wrong here, in which case Jawahar will correct me.


I understand that to mean that it's the *order* of residues that must be
identical, not the residues themselves, i.e. it's valid for residues to
be omitted from the co-ordinate section (but not of course from the
SEQRES section!).


- correct, however:

I thought that the primary purpose of LINKs in the PDB is to specify
extra connectivity, such as between ligands and polypeptides,
or where polypeptides loop back and make SS or other bonds,
etc.

If I were to deal with your example, I would look at the distance
profile between residues in the coordinate section, which then
gives the answer to your question.
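
A distance profile of this kind could be sketched in Python as below. This is an illustration under stated assumptions, not what any deposition or refinement program actually does: the names `looks_linked` and `chain_breaks` and the 0.5 A tolerance are hypothetical, the coordinates are made up, and only the typical peptide C-N bond length of about 1.33 A is a standard value.

```python
import math

PEPTIDE_CN = 1.33  # typical peptide C-N bond length in angstroms
TOLERANCE = 0.5    # assumed slack for imperfect geometry (hypothetical)


def looks_linked(c_xyz, n_xyz, ideal=PEPTIDE_CN, tol=TOLERANCE):
    """True if the C of one residue is within bonding range of the next N."""
    return math.dist(c_xyz, n_xyz) <= ideal + tol


def chain_breaks(residues):
    """Return label pairs of consecutive residues that do not look bonded.

    `residues` is a list of (label, n_xyz, c_xyz) in file order; the
    labels are residue numbers kept purely as names.
    """
    breaks = []
    for (lab1, _, c1), (lab2, n2, _) in zip(residues, residues[1:]):
        if not looks_linked(c1, n2):
            breaks.append((lab1, lab2))
    return breaks


# Made-up coordinates for the "A4 G6 A8" case: G6 sits next to A4.
model = [
    ("A4", (-2.4, 0.0, 0.0), (0.0, 0.0, 0.0)),
    ("G6", (1.33, 0.0, 0.0), (3.8, 0.0, 0.0)),
    ("A8", (8.0, 0.0, 0.0), (10.4, 0.0, 0.0)),
]
print(chain_breaks(model))
```

With these coordinates the only break reported is between G6 and A8, so the missing glycine would be the one after G6. With poor input geometry a fixed cutoff like this can give the wrong answer.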


It's not the presence of LINKR records that really concerns me: as you
say there's no harm done if they are only used locally - and as long as
users are not confused into thinking that the LINKR records are valid
for deposition!


LINKR simply wouldn't validate at deposition


My concern is that in cases (such as my first example above) where the
residue numbers are non-contiguous but no atoms have actually been
omitted, users are required to insert additional LINK records in order
to re-refine already perfectly valid and unambiguous PDB entries.  This
makes automated refinement of PDB entries very difficult!


Obviously LINKR will help where the distance profile is insufficient to
reliably derive linking. I believe this is what Garib says in his
e-mail. The problem, as far as I can see it, is simply that at
the refinement stage residue numbers are equated with residue
positions, which is wrong in general, but acceptable locally. As
in George's reply of today, a generic solution would be to keep the
original residue numbers merely as labels.

Hope I did not mess it up completely.

Cheers,

Eugene.




Re: [ccp4bb] LINKR in refmac

2009-08-17 Thread Ian Tickle
Hi Eugene

 If I were to deal with your example, I would look into distance
 profile between residues in coordinate section, which then
 gives answer to your question.

OK that's interesting, I hadn't realised that the interatomic distances
played an active role in determining connectivity: I had assumed the
whole point of the various connectivity specifiers was to avoid reliance
on distance calculations.  But resolution of the kind of ambiguity I
illustrated would not occur automatically, or would it? - i.e. would the
need for it only become clear when the structure was checked manually?
Of course it makes sense to use distance information for fully-refined
structures such as those that are (hopefully!) deposited with the PDB,
but it would be more of an issue with refinement programs that have to
deal with all kinds of errant input geometry!  I suppose the fundamental
problem is that we're trying to use the same format for both purposes
and there are fundamentally conflicting requirements.

Cheers

-- Ian




[ccp4bb] imosflm with multiple data sets

2009-08-17 Thread Brett, Thomas
I am an imosflm novice and have a relatively simple question. I have a 360 deg 
data set collected in two swathes of 180 deg (one with phi=0 and omega going 
0-180 and the second with phi = 180 and omega going 0-180). What is the easiest 
way to process the two datasets using a matching orientation matrix (or one 
rotated by 180 deg as it were) so that all the data can be merged together? Is 
there an easy way to do it in imosflm or must one process the two sets 
separately and then manipulate later with pointless before scaling and merging 
everything together?
Thanks in advance.
-Tom

Tom J. Brett, PhD
Assistant Professor of Medicine 
Division of Pulmonary and Critical Care
Washington University School of Medicine
Campus Box 8052, 660 S. Euclid
Saint Louis, MO 63110

Re: [ccp4bb] imosflm with multiple data sets

2009-08-17 Thread Andrew Leslie

Dear Tom,

There is a straightforward way to do what you want.
It is probably simplest to start by reading in only the images from the
first segment (0-180). Then do the indexing, cell refinement and
integration in the usual way.


Then read in the second segment of data. You will notice that in this
second segment, underneath the Sector name, there is a line starting
"Matrix" and this will be "Unknown". If you go to the "Matrix" line of
the first segment, the matrix will have a name (based on the image
template).  Double click on the name of the matrix. A popup window
("Matrix properties") will appear. Click on the "save matrix file" icon
(a blue disc) and save the matrix with an appropriate filename.


Now go to the "Matrix" line of the second segment, double click (on
"Unknown") as before and this time click on the "Open matrix file" icon
(a folder) and read in the matrix that you saved from the first segment.
You can now process the second segment using this matrix.


It would be even nicer if you could drag and drop the matrix; this is
on our to-do list.


Best wishes,

Andrew



Re: [ccp4bb] LINKR in refmac

2009-08-17 Thread Eugene Krissinel

If I were to deal with your example, I would look into distance
profile between residues in coordinate section, which then
gives answer to your question.


OK that's interesting, I hadn't realised that the interatomic distances
played an active role in determining connectivity: I had assumed the
whole point of the various connectivity specifiers was to avoid reliance
on distance calculations.


Yes, where distances are unreliable or where you cannot unambiguously
determine the link point, e.g. when it is a priori unclear which atom
is leaving upon linking, how many hydrogens are involved, etc.


But resolution of the kind of ambiguity I
illustrated would not occur automatically, or would it? - i.e. would the
need for it only become clear when the structure was checked manually?


I think this depends on the quality of your data. If you have a remote
model and poor resolution then difficulties arise. You do not LINKR
everything, or do you? If two atoms may be automatically referenced
in a residue, the same can be done for the main chain, so I would think
that one uses LINKR when things are not obvious from structural
positions.


Of course it makes sense to use distance information for fully-refined
structures such as those that are (hopefully!) deposited with the PDB,
but it would be more of an issue with refinement programs that have to
deal with all kinds of errant input geometry!  I suppose the fundamental
problem is that we're trying to use the same format for both purposes
and there are fundamentally conflicting requirements.


Probably that's why there are LINKs and LINKRs I guess.

Regards,

Eugene.








Re: [ccp4bb] imosflm with multiple data sets

2009-08-17 Thread hari jayaram
I didn't realize the following:

You read in images from the two wedges collected with the same
crystal orientation.

mydata_1_###.img
mydata_101_###.img


Now when you index, if you say use images from both datasets
mydata_1_###.img use image 1,90
mydata_101_###.img use image 30, 120

The matrix for the second wedge (mydata_101_###.img) is still marked unknown?
Isn't this different from the behaviour in the X- mosflm? Should the
matrices be the same, since the orientation was calculated using images
from both?

Now, if I did not force the second wedge to have the same matrix,
using the save-to-file and read-from-file method you just described,
does the new imosflm use the last calculated matrix from the running
session or calculate a new matrix? I guess I have to check some of
the data I processed with my erroneous assumption to make sure that
the matrices for the two wedges are the same.

Thanks for clarifying this.
hari




On Mon, Aug 17, 2009 at 9:13 AM, Andrew Leslie <and...@mrc-lmb.cam.ac.uk> wrote:
 Dear Tom,

                  There is a straightforward way to do what you want. It is
 probably simplest to start by reading in only the images from the first
 segment (0-180). Then do the indexing, cell refinement and integration in
 the usual way.

 Then read in the second segment of data. You will notice that in this second
 segment, underneath the Sector name, there is a line starting Matrix and
 this will be Unknown. If you go to the Matrix line of the first segment,
 the matrix will have a name (based on the image template).  Double click on
 the name of the matrix. A popup window (Matrix properties) will appear.
 Click on the save matrix file icon (a blue disc) and save the matrix with
 an appropriate filename.

 Now go to the Matrix line of the second segment, double click (on Unknown)
 as before and this time click on the Open matrix file icon (a folder) and
 read in the matrix that you saved from the first sector. You can now process
 the second segment using this matrix.

 It would be even nicer if you could drag and drop the matrix, this is on
 our to do list.

 Best wishes,

 Andrew

 On 17 Aug 2009, at 13:33, Brett, Thomas wrote:

 I am an imosflm novice and have a relatively simple question. I have a 360
 deg data set collected in two swathes of 180 deg (one with phi=0 and omega
 going 0-180 and the second with phi = 180 and omega going 0-180). What is
 the easiest way to process the two datasets using a matching orientation
 matrix (or one rotated by 180 deg as it were) so that all the data can be
 merged together. Is there an easy way to do it in imosflm or must one
 process the two sets separately and then manipulate later with pointless
 before scaling and merging everything together?
 Thanks in advance.
 -Tom

 Tom J. Brett, PhD
 Assistant Professor of Medicine
 Division of Pulmonary and Critical Care
 Washington University School of Medicine
 Campus Box 8052, 660 S. Euclid
 Saint Louis, MO 63110



Re: [ccp4bb] Questions about Phaser

2009-08-17 Thread Yuan Cheng

Peter Zwart wrote:

Hi,

This might well be due to the wrong space group.

I suggest fixing the space group by hand and running Phaser in each
of the possible space groups.
(P 31 2 1, P 32 2 1, P 3 2 1 are your options)
Did you run xtriage already?

P


2009/8/17 Yuan Cheng <ych...@email.unc.edu>:

Hey,
 I am trying to use Phaser to solve a protein structure. Matthews coefficient
analysis predicts 8 mol/asu. As a search model I am using a protein
that shares about 35% identity with my protein. Phaser
found the first five solutions and then failed to find the 6th. The LLG and
Z-scores are as follows. The possible loop in the chainsawed model has been
truncated.

SOLU SET
RFZ=5.2 TFZ=7.1 PAK=0 LLG=44
RFZ=5.3 TFZ=13.8 PAK=0 LLG=169
RFZ=4.3 TFZ=69.6 PAK=0 LLG=1209
RFZ=4.6 TFZ=60.6 PAK=0 LLG=2200
RFZ=4.3 TFZ=5.7 PAK=0 LLG=2163

  I used Coot to check the difference map made with the model including the
above five solutions. The first four solutions fit the density very well
(I didn't see much positive or negative difference density). The 5th solution didn't
fit the density at all. I saw a lot of unoccupied density in the map, indicating that I
still need to find more solutions. The space group I am using is P 3 2 1.
Could this be caused by a wrong space group?
 Could anyone give me some suggestions about this? Thanks a lot!

Yuan






Hi everyone,
Thanks a lot for your replies! Actually I am pretty confused
here about the use of different space groups.
The .mtz file I am using now as the input file for Phaser is in space
group P3. Phaser gave me the first four solutions as I mentioned in my
last email, but failed at the 5th one. Then I realized there might be
something wrong with the space group or the data. I used phenix.xtriage
to re-analyze my data (P3 space group). Merohedral twinning and
pseudo-translational symmetry were found, as follows:


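Statistics depending on twin laws
------------------------------------------------------------------
| Operator  | type | R obs. | Britton alpha | H alpha | ML alpha |
------------------------------------------------------------------
| -h,-k,l   |   M  | 0.738  | -0.137        | 0.000   | 0.022    |
| h,-h-k,-l |   M  | 0.047  | 0.390         | 0.429   | 0.478    |
| -k,-h,-l  |   M  | 0.746  | -0.143        | 0.000   | 0.022    |
------------------------------------------------------------------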

The analyses of the Patterson function reveals a significant off-origin
peak that is 82.25 % of the origin peak, indicating pseudo translational 
symmetry.


Also, the analysis indicates that P 3 2 1 or its alternatives P 31 2 1 and
P 32 2 1 might be the correct space group. I then used the Sort/reindex
MTZ files module to change the space group to P 3 2 1, and forced Phaser to
use P 3 2 1 as the space group to search for more solutions with the
four solutions already found fixed. But I didn't get any better result.
I used phenix.xtriage to analyze the data in P 3 2 1. It indicates there
might be one twinning operator, but different tests gave different
answers. The pseudo-translational symmetry is still present.


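Statistics depending on twin laws
-----------------------------------------------------------------
| Operator | type | R obs. | Britton alpha | H alpha | ML alpha |
-----------------------------------------------------------------
| -h,-k,l  |   M  | 0.759  | -0.152        | 0.000   | 0.022    |
-----------------------------------------------------------------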

The analyses of the Patterson function reveals a significant off-origin
peak that is 82.17 % of the origin peak, indicating pseudo translational 
symmetry.
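
The "R obs." column in the tables above can be read as an agreement factor between intensities related by a candidate twin operator: a low value (as for h,-h-k,-l in P3) means the twin-related intensities are nearly equal. A minimal Python sketch of that idea follows, assuming the commonly used definition R = sum|I(h) - I(h')| / sum(I(h) + I(h')). The function names are hypothetical, and this is not the actual xtriage implementation, which also has to handle symmetry equivalents and resolution limits.

```python
# Twin laws from the xtriage output, written as index transformations.
TWIN_LAWS = {
    "-h,-k,l": lambda h, k, l: (-h, -k, l),
    "h,-h-k,-l": lambda h, k, l: (h, -h - k, -l),
    "-k,-h,-l": lambda h, k, l: (-k, -h, -l),
}


def apply_twin_law(hkl, law):
    """Return the Miller index related to hkl by the twin law."""
    return law(*hkl)


def r_twin(intensities, law):
    """Agreement factor between twin-related intensities.

    intensities: dict mapping (h, k, l) -> I.
    Returns None when no twin-related pair is present in the data.
    """
    num = 0.0
    den = 0.0
    for hkl, i1 in intensities.items():
        mate = apply_twin_law(hkl, law)
        if mate == hkl or mate not in intensities:
            continue  # no usable twin-related partner for this reflection
        i2 = intensities[mate]
        num += abs(i1 - i2)
        den += i1 + i2
    return num / den if den > 0 else None
```

Each pair is visited from both sides, which doubles the numerator and the denominator alike and leaves the ratio unchanged.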


I have a couple of questions now:
1) Do I need to go back to HKL2000 and redo the indexing, integration and scaling,
since the .sca and .mtz I have now are in P3? I don't know whether the
unit cell dimensions are going to change if I redo it in P 3 2 1.


2) What does pseudo-translational symmetry actually mean? I don't
quite understand this concept. What should I do about it?


This is a really long email. I appreciate your attention very much.
Yuan
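
On the Matthews coefficient estimate quoted earlier in this thread: the expected number of copies per asymmetric unit depends on the space group assumed, because the number of symmetry operators enters the formula Vm = V / (n_symops * n_copies * MW). The sketch below is a minimal Python illustration. The function names are hypothetical, and the cell dimensions and molecular weight in the usage note are invented placeholders, since neither is given in the thread. The solvent fraction uses the usual approximation 1 - 1.23/Vm.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic Angstrom; angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def matthews_coefficient(volume, n_symops, n_copies, mw_da):
    """Vm in cubic Angstrom per Dalton."""
    return volume / (n_symops * n_copies * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction for a typical protein density."""
    return 1.0 - 1.23 / vm
```

For example, with a made-up cell of 100 x 100 x 300 A (gamma = 120) and a made-up 25 kDa protein, `matthews_coefficient(cell_volume(100, 100, 300, 90, 90, 120), 6, 8, 25000)` evaluates Vm for 8 copies in P 3 2 1 (6 operators), while passing 3 operators gives the P3 case. Going from P3 to P 3 2 1 halves the asymmetric unit, so the same packing corresponds to half as many copies.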


[ccp4bb] Summary of Extracting Amino Acid Sequence from PDB File

2009-08-17 Thread Buz Barstow

Dear All,

Thanks to everyone who replied to my query about extracting an amino  
acid sequence from a PDB file!


Here is a summary of the responses to my query:

1. Use SwissPDBViewer

2. Use Pymol 1.2

load $TUT/1hpv.pdb

save 1hpv.fasta

# or by selection

save 1hpv_A.fasta, chain A


3. With Phenix, you can use the phenix.print_sequence tool to output  
the sequence in FASTA format.


4. If the PDB file is in the PDB, you can download the primary  
sequence from here.


5. Use MOLEMAN2

6. Use PDBSET (part of CCP4)
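
For cases where none of the programs above is at hand, the same result can be approximated with a few lines of Python. This is a minimal sketch, not taken from any of the responses: it reads the fixed-column ATOM records, takes one CA atom per residue, and writes FASTA. The function names are hypothetical, non-standard residues are written as X, and residues missing from the coordinates are simply absent from the output (the SEQRES records would be needed for the full sequence).

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def sequences_from_pdb(lines):
    """Return {chain_id: one-letter sequence} from the CA atoms of ATOM records."""
    chains = {}
    seen = set()
    for line in lines:
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        chain = line[21]
        residue_id = (chain, line[22:27])  # residue number plus insertion code
        if residue_id in seen:
            continue  # alternate conformation of a residue already counted
        seen.add(residue_id)
        code = THREE_TO_ONE.get(line[17:20].strip(), "X")
        chains.setdefault(chain, []).append(code)
    return {chain: "".join(codes) for chain, codes in chains.items()}


def to_fasta(name, sequences, width=60):
    """Format the per-chain sequences as FASTA text."""
    out = []
    for chain, seq in sequences.items():
        out.append(">%s_%s" % (name, chain))
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

Typical use would be `print(to_fasta("1hpv", sequences_from_pdb(open("1hpv.pdb"))))`.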

All the best,

---Buz


[ccp4bb] space groups not supported by Refmac5

2009-08-17 Thread Arefeh Seyedarabi
Hi,

I have a question; recently I have encountered a few space groups which are
not supported by Refmac5. Examples include I2 and P21221. Both these space
groups have been identified as the best solutions for the two different
datasets I am working on using Pointless. However, I am faced with
difficulties in Refmac5, and the program fails to complete when I select
refinement cycles with Arp-waters...with the message saying 'space group not
supported'.

Any suggestions on how this problem could be overcome?

Regards,

Arefeh

Arefeh Seyedarabi, PhD
Postdoctoral research assistant
School of Biological and Chemical sciences
Queen Mary, University of London
Mile End road
London
E1 4NS
Based at Joseph Priestley Building G.35

020 78828480


Re: [ccp4bb] space groups not supported by Refmac5

2009-08-17 Thread David J. Schuller
For the latter, P21221, you could reindex to get P21212, which is
supported.

-  
===
You can't possibly be a scientist if you mind people
thinking that you're a fool. - Wonko the Sane
===
   David J. Schuller
   modern man in a post-modern world
   MacCHESS, Cornell University
   schul...@cornell.edu


On Mon, 2009-08-17 at 17:37 +0100, Arefeh Seyedarabi wrote:
 Hi,
 
 I have a question; recently I have encountered a few space groups which are
 not supported by Refmac5. Examples include I2 and P21221. Both these space
 groups have been identified as the best solutions for the two different
 datasets I am working on using Pointless. However, I am faced with
 difficulties in Refmac5, and the program fails to complete when I select
 refinement cycles with Arp-waters...with the message saying 'space group not
 supported'.
 
 Any suggestions on how this problem could be overcome?
 
 Regards,
 
 Arefeh
 
 Arefeh Seyedarabi, PhD
 Postdoctoral research assistant
 School of Biological and Chemical sciences
 Queen Mary, University of London
 Mile End road
 London
 E1 4NS
 Based at Joseph Priestley Building G.35
 
 020 78828480


Re: [ccp4bb] space groups not supported by Refmac5

2009-08-17 Thread Ethan Merritt
On Monday 17 August 2009 09:37:40 Arefeh Seyedarabi wrote:
 Hi,
 
 I have a question; recently I have encountered a few space groups which are
 not supported by Refmac5. Examples include I2 and P21221. 

This is not correct.  Refmac is perfectly happy with these settings,
although there are some glitches to be careful of when you eventually
get to the point of data deposition.  (See earlier thread about
mtz2various mangling the spacegroup description when converting to *.cif).

 Both these space 
 groups have been identified as the best solutions for the two different
 datasets I am working on using Pointless. However, I am faced with
 difficulties in Refmac5, and the program fails to complete when I select
 refinement cycles with Arp-waters...with the message saying 'space group not
 supported'.

Ah. Perhaps this is a message from Arp/wArp? I don't know.
But having recently refined several I2 structures in refmac,
I can attest that it works fine in recent versions.

Ethan

-- 
Ethan A Merritt
Biomolecular Structure Center
University of Washington, Seattle 98195-7742


Re: [ccp4bb] space groups not supported by Refmac5

2009-08-17 Thread S. Karthikeyan
Hi,

The following mail (retrieved from the ccp4bb archive) may help with interchanging the
cell axes.
HTH
S.Karthikeyan

==
I have a CCP4 script for this:

http://bl831.als.lbl.gov/~jamesh/elves/scripts/reindex.com

If you run the script like this:
reindex.com yourdata.mtz P22121

Then you will get a file called reindexed.mtz that has the space group P21212
with the cell permuted so that the shortest edge is b. Specifying P21221 will
give you P21212 with the shortest edge at c, and P21212 will give you the
shortest edge at a, like it is supposed to be. Like George said, this way you
are always working in a canonical space group so you will have far fewer
portability issues between programs.

-James Holton
MAD Scientist

=
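
What the reindexing amounts to can be sketched in a few lines of Python. This is only an illustration of the axis permutation, not the contents of the reindex.com script: in P 21 2 21 the pure twofold lies along b, and P 21 21 2 wants it along c, so the axes are relabelled cyclically (new a, b, c = old c, a, b) and the Miller indices follow the same permutation. The function names are hypothetical, and the cell lengths in the usage note are invented.

```python
def permute(values, perm):
    """Relabel three per-axis values; perm[i] is the old axis that becomes new axis i."""
    return tuple(values[i] for i in perm)


def is_even_permutation(perm):
    """True if the permutation keeps a right-handed axis system right-handed."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2 == 0


# (new a, new b, new c) = (old c, old a, old b): the old b twofold becomes new c.
P21221_TO_P21212 = (2, 0, 1)
```

With this permutation, `permute(("21", "2", "21"), P21221_TO_P21212)` gives the axis symbols in the order 21, 21, 2; a made-up cell (50, 60, 70) becomes (70, 50, 60); and a reflection (h, k, l) is relabelled (l, h, k). Because the permutation is cyclic (even), no change of hand is introduced.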


 For the latter, P21221, you could reindex to get P21212, which is
 supported.

 -
 ===
 You can't possibly be a scientist if you mind people
 thinking that you're a fool. - Wonko the Sane
 ===
David J. Schuller
modern man in a post-modern world
MacCHESS, Cornell University
schul...@cornell.edu


 On Mon, 2009-08-17 at 17:37 +0100, Arefeh Seyedarabi wrote:
 Hi,

 I have a question; recently I have encountered a few space groups which are
 not supported by Refmac5. Examples include I2 and P21221. Both these space
 groups have been identified as the best solutions for the two different
 datasets I am working on using Pointless. However, I am faced with
 difficulties in Refmac5, and the program fails to complete when I select
 refinement cycles with Arp-waters...with the message saying 'space group not
 supported'.

 Any suggestions on how this problem could be overcome?

 Regards,

 Arefeh

 Arefeh Seyedarabi, PhD
 Postdoctoral research assistant
 School of Biological and Chemical sciences
 Queen Mary, University of London
 Mile End road
 London
 E1 4NS
 Based at Joseph Priestley Building G.35

 020 78828480




Re: [ccp4bb] space groups not supported by Refmac5

2009-08-17 Thread Andreas Förster

Yeah, I2 works indeed just fine (for me, right now).


Andreas


Ethan Merritt wrote:

On Monday 17 August 2009 09:37:40 Arefeh Seyedarabi wrote:

Hi,

I have a question; recently I have encountered a few space groups which are
not supported by Refmac5. Examples include I2 and P21221. 


This is not correct.  Refmac is perfectly happy with these settings,
although there are some glitches to be careful of when you eventually
get to the point of data deposition.  (See earlier thread about
mtz2various mangling the spacegroup description when converting to *.cif).

Both these space 
groups have been identified as the best solutions for the two different

datasets I am working on using Pointless. However, I am faced with
difficulties in Refmac5, and the program fails to complete when I select
refinement cycles with Arp-waters...with the message saying 'space group not
supported'.


Ah. Perhaps this is a message from Arp/wArp? I don't know.
But having recently refined several I2 structures in refmac,
I can attest that it works fine in recent versions.

Ethan



--
Andreas Förster, Research Associate
Paul Freemont & Xiaodong Zhang Labs
Department of Biochemistry, Imperial College London
http://www.msf.bio.ic.ac.uk


[ccp4bb] Protein Protein interactions

2009-08-17 Thread protein.chemist protein.chemist
Hello All,

Can anyone tell me what are the programs used to find out the different
interactions in a protein.
I am talking about both intra and intermolecular interactions.

Thanks in advance.

Mariah



-- 
Mariah Jones
Department of Biochemistry
University of Florida


Re: [ccp4bb] Protein Protein interactions

2009-08-17 Thread Regina Kettering
CCP4 has a program called contact where you can specify the chains, distances
and atoms you want to explore for contacts, and it will give a list of all
contacts that fit those criteria, along with the distances. It works best if you
already have an idea of which chain interfaces you want to explore. A much
slower way is to use Coot to bring up the model and look at the residue
environments. Again, this is much better if you already know which residues to use.

A better predictor for unknown interfaces would be PISA, accessible either 
directly through EMBL-EBI or indirectly through ExPASy.
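
The kind of listing described above (pairs of atoms from two chosen chains within a distance cutoff) can be sketched in plain Python as follows. This is an illustration of the idea only, not the algorithm or the options of the CCP4 contact program: the function names and the default 4.0 A cutoff are assumptions, and symmetry-related molecules are not generated, so crystal contacts are not found.

```python
import math


def parse_atoms(lines):
    """Read ATOM/HETATM records from fixed-column PDB lines."""
    atoms = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            atoms.append({
                "name": line[12:16].strip(),
                "resname": line[17:20].strip(),
                "chain": line[21],
                "resseq": int(line[22:26]),
                "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
            })
    return atoms


def interchain_contacts(atoms, chain_a, chain_b, cutoff=4.0):
    """Return (atom_a, atom_b, distance) for pairs within cutoff, nearest first."""
    first = [a for a in atoms if a["chain"] == chain_a]
    second = [a for a in atoms if a["chain"] == chain_b]
    contacts = []
    for p in first:
        for q in second:
            d = math.dist(p["xyz"], q["xyz"])
            if d <= cutoff:
                contacts.append((p, q, d))
    return sorted(contacts, key=lambda c: c[2])
```

The all-against-all loop is adequate for a pair of chains; for whole-cell searches a grid or neighbour-list approach would be the usual choice.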

Regina

--- On Mon, 8/17/09, protein.chemist protein.chemist pp73...@gmail.com wrote:

 From: protein.chemist protein.chemist pp73...@gmail.com
 Subject: [ccp4bb] Protein Protein interactions
 To: CCP4BB@JISCMAIL.AC.UK
 Date: Monday, August 17, 2009, 1:13 PM
 Hello All,
 
 Can anyone tell me what are the programs used to find out
 the different interactions in a protein.
 I am talking about both intra and intermolecular
 interactions.
 
 Thanks in advance.
 
 Mariah
 
 
 
 
 -- 
 Mariah Jones
 Department of Biochemistry
 University of Florida
 
 


  


[ccp4bb] MX beamtime at CLS

2009-08-17 Thread Pawel Grochulski
Dear Crystallographers,

At the Canadian Light Source the Call for Proposals is currently open
AND accepting General User Proposals. 

*   Macromolecular Crystallography (08ID-1 beamline)

*   Deadline: September 2, 2009
*   Scheduling Period:  October 27 - December 18, 2009
*   A call for proposals is issued four times per year
*   Process to apply for beamtime
http://www.lightsource.ca/uso/macromolecular_crystallography.php 

Please visit http://ex.lightsource.ca/cmcf/ to gain updated information
about our experimental setup.

 

Regards,

 

Pawel

__
Pawel Grochulski, Ph.D., D.Sc.

Canadian Macromolecular Crystallography Facility

Canadian Light Source Inc.,

101 Perimeter Road, Saskatoon, 

SK S7N 0X4, Canada.

Phone: 306-657-3538;  Fax: 306-657-3535

http://ex.lightsource.ca/cmcf/



[ccp4bb] Scale datasets from different crystals

2009-08-17 Thread Donald Damian Raymond

Hello all,
	I have incomplete data from two different crystals. Is there a  
program in ccp4 that I can use to merge the two scaled datasets? I  
used CAD and scaleit, but it does not give any scaling statistics like  
completeness and Rmerge. Any help would be appreciated.

-Donald.


Re: [ccp4bb] Scale datasets from different crystals

2009-08-17 Thread Pete Meyer
If you want to treat the end result as a single dataset, you're probably 
 better off combining unmerged datasets (two different batches in 
scala, or two different scaling sets in scalepack).


As far as I understand it, the data model in the mtz files 
(project/crystal/dataset/column) isn't really set up to allow combining 
two datasets post-merging; so most likely you'd have to cook up a custom 
procedure (most likely involving some duct tape, or equivalently dumping 
to ascii after scaling, merging overlapping reflections, and 
reconversion to mtz).
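
The "dump to ascii, merge overlapping reflections" route could look roughly like the Python sketch below. It rests on strong assumptions that are not guaranteed by anything above: both reflection lists are already on a common scale, reduced to the same asymmetric unit and indexed consistently. The function names are hypothetical; overlapping observations are combined with inverse-variance weights, and an Rmerge-style statistic is computed over the reflections that occur more than once.

```python
from collections import defaultdict


def _group(datasets):
    """Collect all (I, sigma) observations per Miller index."""
    grouped = defaultdict(list)
    for dataset in datasets:
        for hkl, intensity, sigma in dataset:
            grouped[hkl].append((intensity, sigma))
    return grouped


def merge_reflections(*datasets):
    """Return {hkl: (weighted mean I, sigma of the mean, n observations)}."""
    merged = {}
    for hkl, obs in _group(datasets).items():
        weights = [1.0 / (sigma * sigma) for _, sigma in obs]
        wsum = sum(weights)
        mean = sum(w * i for w, (i, _) in zip(weights, obs)) / wsum
        merged[hkl] = (mean, wsum ** -0.5, len(obs))
    return merged


def r_merge(*datasets):
    """sum|I - <I>| / sum(I) over reflections measured more than once."""
    num = 0.0
    den = 0.0
    for obs in _group(datasets).values():
        if len(obs) < 2:
            continue
        values = [i for i, _ in obs]
        mean = sum(values) / len(values)
        num += sum(abs(i - mean) for i in values)
        den += sum(values)
    return num / den if den > 0 else None
```

Completeness would then be the number of keys in the merged dictionary divided by the number of unique reflections expected to the resolution limit, which has to come from elsewhere.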


Pete

Donald Damian Raymond wrote:

Hello all,
	I have incomplete data from two different crystals. Is there a  
program in ccp4 that I can use to merge the two scaled datasets? I  
used CAD and scaleit, but it does not give any scaling statistics like  
completeness and Rmerge. Any help would be appreciated.

-Donald.


[ccp4bb] How to determine water number?

2009-08-17 Thread AntonioLeung
Hi all,
  
 I am a novice working on protein structure. When I pick waters using Coot, too
many waters are picked, filling the whole cell. My question is: how can I
determine which waters are needed and which are not?

Re: [ccp4bb] How to determine water number?

2009-08-17 Thread Pavel Afonine

Hi Antonio,

My question is how can I determine which water is needed, which is not 
needed?


Maybe this will give you some clues. Here are the approximate criteria
for water picking that are used in automatic water picking and
refinement in phenix.refine:


1) peak at mFo-DFc map is higher than ~3sigma, and
2) peak center is within a hydrogen bond to another atom (water or 
macromolecule), and
3) peak has approximately the same shape as a water molecule would have 
at this resolution and local environment, and

4) peak at 2mFo-DFc is higher than ~1.5 sigma,

after a round of coordinate and B-factor refinement,

the criteria (2-3-4) are still ok, and

5) refined B-factor of newly placed water is meaningful (didn't jump to 
large value), otherwise a water is deleted.


There are a few other technical tricks to make this process robust and
efficient at high resolution, higher than ~1.2A or so, but this is, I
guess, beyond what you were asking about.
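
The criteria listed above can be written down as a simple two-stage filter. The Python sketch below is only an illustration of that logic and not the phenix.refine code: the class and function names are hypothetical, the hydrogen-bond distance window and the B-factor ceiling are assumed values chosen for the example, and criterion 3 (peak shape) is left out because it needs the map itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WaterCandidate:
    diff_peak_sigma: float            # mFo-DFc peak height, in sigma
    map_peak_sigma: float             # 2mFo-DFc peak height, in sigma
    nearest_partner_dist: float       # distance in A to nearest H-bond partner
    b_factor: Optional[float] = None  # refined B in A^2, unknown before refinement


# Assumed thresholds; only the 3 and 1.5 sigma levels come from the list above.
MIN_DIFF_SIGMA = 3.0
MIN_MAP_SIGMA = 1.5
HBOND_RANGE = (2.2, 3.4)
MAX_B = 80.0


def _hbond_ok(w):
    return HBOND_RANGE[0] <= w.nearest_partner_dist <= HBOND_RANGE[1]


def accept_peak(w):
    """Criteria 1, 2 and 4: decide whether to place a water at a map peak."""
    return (w.diff_peak_sigma >= MIN_DIFF_SIGMA
            and _hbond_ok(w)
            and w.map_peak_sigma >= MIN_MAP_SIGMA)


def keep_after_refinement(w):
    """Criteria 2, 4 and 5: re-check a placed water after refinement."""
    return (_hbond_ok(w)
            and w.map_peak_sigma >= MIN_MAP_SIGMA
            and w.b_factor is not None
            and w.b_factor <= MAX_B)
```

In practice one would alternate the two stages: place waters that pass `accept_peak`, refine coordinates and B-factors, then delete those that fail `keep_after_refinement`.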


Pavel.


Re: [ccp4bb] imosflm with multiple data sets

2009-08-17 Thread Frank von Delft

Actually, drag-and-drop DOES work, and is *dead* handy!

(But a considerable annoyance: you HAVE to open the sector to be able
to click on the matrix line -- and then you have to drag that matrix
past all the 300 (or whatever) images to get to the next sector. For
many images, this is really slow. It would be better to put the matrix and
images on separate sub-nodes.)



Andrew Leslie wrote:

Dear Tom,

  There is a straightforward way to do what you want. 
It is probably simplest to start by reading in only the images from 
the first segment (0-180). Then do the indexing, cell refinement and 
integration in the usual way.


Then read in the second segment of data. You will notice that in this 
second segment, underneath the Sector name, there is a line starting 
Matrix and this will be Unknown. If you go to the Matrix line of 
the first segment, the matrix will have a name (based on the image 
template).  Double click on the name of the matrix. A popup window 
(Matrix properties) will appear. Click on the save matrix file icon 
(a blue disc) and save the matrix with an appropriate filename.


Now go to the Matrix line of the second segment, double click (on 
Unknown) as before and this time click on the Open matrix file icon 
(a folder) and read in the matrix that you saved from the first 
sector. You can now process the second segment using this matrix.


It would be even nicer if you could drag and drop the matrix, this 
is on our to do list.


Best wishes,

Andrew

On 17 Aug 2009, at 13:33, Brett, Thomas wrote:

I am an imosflm novice and have a relatively simple question. I have 
a 360 deg data set collected in two swathes of 180 deg (one with 
phi=0 and omega going 0-180 and the second with phi = 180 and omega 
going 0-180). What is the easiest way to process the two datasets 
using a matching orientation matrix (or one rotated by 180 deg as it 
were) so that all the data can be merged together. Is there an easy 
way to do it in imosflm or must one process the two sets separately 
and then manipulate later with pointless before scaling and merging
everything together?

Thanks in advance.
-Tom

Tom J. Brett, PhD
Assistant Professor of Medicine
Division of Pulmonary and Critical Care
Washington University School of Medicine
Campus Box 8052, 660 S. Euclid
Saint Louis, MO 63110