[ccp4bb] more REFMAC problems

2008-12-23 Thread Anita Lewit-Bentley

Dear all,

just to add to the list of problems mentioned by several people:

I have been using the LSQ (least-squares) option in REFMAC (using a  
command file, as this option is not available via the GUI)  
just to tidy up my structure. This works fine (with all recent REFMAC  
versions) - but the output mtz file does not carry any structure  
factors!! (again for all recent versions). One has to go through some  
complicated acrobatics to recover the structure factors corresponding  
to one's refined structure.


Another problem I encountered concerns the output of Coot: the output  
coordinate file carries over the header from the input file, without  
any modifications to it. Thus if at the start of a refinement one has,  
say, cis-peptide bonds, they remain in the pdb file even after one has  
corrected them in Coot - and then REFMAC does odd things to them! In  
the end, one has to edit the pdb file output by Coot before running  
Refmac - not very automatic.


Best wishes for Christmas and the New Year, anyway!

Anita





Anita Lewit-Bentley
Unité d'Immunologie Structurale
CNRS URA 2185
Département de Biologie Structurale et Chimie
Institut Pasteur
25 rue du Dr. Roux
75724 Paris cedex 15
FRANCE

Tel: 33- (0)1 45 68 88 95
FAX: 33-(0)1 40 61 30 74
email: ale...@pasteur.fr



[ccp4bb] CCP4 cover over the Yuletide and Study Weekend periods

2008-12-23 Thread Ballard, CC (Charles)
Dear All

I am afraid that there will be no cover from the CCP4 droids at
Daresbury from 24 December until 2 January, with limited cover until 6
January.  

Here's wishing you all a Happy New Year

Charles

CCP4 core team



[ccp4bb] Mosflm cover was: CCP4 cover over the Yuletide and Study Weekend periods

2008-12-23 Thread Harry Powell

Hi folks

Just to let you all know - there will be no Mosflm cover during this  
period either.


However, Luke, Andrew and I will all be at the CCP4 Study Weekend in  
Nottingham; we will be ready, willing and able to deal with any  
enquiries you may wish to bring.



On 23 Dec 2008, at 13:01, Ballard, CC (Charles) wrote:


Dear All

I am afraid that there will be no cover from the CCP4 droids at  
Daresbury from 24 December until 2 January, with limited cover until  
6 January.


Here's wishing you all a Happy New Year

Charles

CCP4 core team



Harry
--
Dr Harry Powell, MRC Laboratory of Molecular Biology, MRC Centre,  
Hills Road, Cambridge, CB2 0QH






Re: [ccp4bb] LSQKAB, version 6.0 vs version 6.1

2008-12-23 Thread Clemens Vonrhein
Hi Dave,

On Mon, Dec 22, 2008 at 02:58:31PM -0500, Borhani, David wrote:
 I think the LSQKAB change at Line 291(old)/Line 300(new) DOES introduce
 new and possibly incorrect logic.

Very possible, but ...
 
 I haven't looked at all the code, but this one change does seem to
 substitute a check that chain, residue number, and atom name (only 3
 characters; incorrect) match [OLD] for a check that chain, residue
 number, atom name (4 chars, correct), insertion code (correct, assuming
 that the insertion codes and residues numbers in the two proteins are
 lined up correctly), AND ALT CODE match [NEW].

I read that slightly differently:

OLD: check on the first three characters of the atom name

NEW: check on the first three characters of the atom name
  AND
 check on alternate conformation
  AND
 check on insertion code

There is no (new) check on chain or residue number - which is correct,
since the LSQKAB syntax allows one to specify different chain identifiers
and different residue numbers for the work and reference PDB file.

The new code makes sense to me (but please double check and correct me
if I'm wrong): without it you get a complete mess in the match-up
(since atom name, chain and residue number are simply not enough to
pick one and _only_ one atom).

 The alt code match is, I suspect, a bug, in exactly the situation that
 Jose provided: one protein may have them, but the other may not (or may
 have different ones). One should perform the alignment such that the
 protein (residue) without alt codes aligns onto the other protein
 (residue) with the A alt code; to discard the residue pair simply
 because alt codes don't match is not correct.

I'm not sure about that: LSQKAB is intended to superimpose two sets
of atoms. For that the user needs to specify exactly (!) what atoms
belong in these two sets. Your suggestion of having LSQKAB pick
AltConf A instead of an atom without AltConf introduces new logic
into LSQKAB that wasn't there before. So I wouldn't classify that as a
bug, since it does the right thing: making sure that one and _only_
one atom will be picked (whereas before this wasn't guaranteed).

Please note that I don't say your suggestion doesn't make sense: I
like automatic superposition programs that make sensible structural
assumptions and decisions (some LSQMAN commands or SSM in Coot). Just
that LSQKAB isn't really intended that way: it does exactly what it
says on the tin.

As a comparison, I've run a test with four PDB files:

  a) just 5 residues

  b) same 5 residues, but one side-chain has two alternate
 conformations (A and B)

  c) same 5 residues, but one residue has insertion code instead of
 residue number increment

  d) same 5 residues, but now with the alternate conformation
 side-chain and the insertion code residue

Running this against 4 LSQKAB binaries:

  A) LSQKAB sources from 6.0.2, compiled against 6.0.2 libraries

  B) LSQKAB sources from 6.0.2, compiled against 6.1.0 libraries

  C) LSQKAB sources from 6.1.0, compiled against 6.0.2 libraries

  D) LSQKAB sources from 6.1.0, compiled against 6.1.0 libraries

shows some interesting items (assuming that LSQKAB should match-up
only identical atoms, i.e. leaving your suggestion of automatic
decisions aside).

 - the 6.0.2 LSQKAB source shows non-zero RMS values in a variety of
   cases

   This makes no sense, since for any pair of the above PDB files the
   common atoms are identical.

 - the 6.0.2 LSQKAB source gives different results when swapping the
   two PDB files one superposes (i.e. superposing PDB1 onto PDB2 gives
   a different result than superposing PDB2 onto PDB1)

   Again, this doesn't make sense.

   Both of these points are due to the missing checks introduced into
   the latest version (which make sure that only identical atoms are
   picked).

 - the 6.1.0 LSQKAB source always gives rms values of zero and the
   order of PDB files doesn't matter.

To me the 6.1.0 sources look correct ... ?

Anyway, getting back to the original question (most people reading the
CCP4bb will be bored by now anyway):

 If I do the same superposition (with a pdb file that contains 
 alternative conformations) with LSQKAB version 6.0 and 6.1:
 1) Version 6.0 reports 110 atoms to be refined and does not report any 
 error or warning. The loggraph contains data for the residues with 
 alternative conformations.
 2) Version 6.1 reports 97 atoms to be refined, and it reports 13 atoms 
 as no match for workcd atom [...]. The loggraph does NOT contain data 
 for the residues with alternative conformations.
 
 Based on that, I have assumed that version 6.0 does include atoms in 
 alternative conformations (in fact, it seems to take into account each 
 conformations independently).

I can understand that this looks like a regression in 6.1 (since 6.0
uses more atoms and shows residues with alternate conformations in the
loggraph). But I'm fairly certain that 6.0 did the wrong thing
nevertheless, 

Re: [ccp4bb] LSQKAB, version 6.0 vs version 6.1 - reposting (Sorry!)

2008-12-23 Thread Clemens Vonrhein
Dear all,

oops - due to some disk/network issues on my side, the final edits of
my email got lost. Sorry for reposting this again (corrected):

On Mon, Dec 22, 2008 at 02:58:31PM -0500, Borhani, David wrote:
 I think the LSQKAB change at Line 291(old)/Line 300(new) DOES introduce
 new and possibly incorrect logic.

Very possible, but ...
 
 I haven't looked at all the code, but this one change does seem to
 substitute a check that chain, residue number, and atom name (only 3
 characters; incorrect) match [OLD] for a check that chain, residue
 number, atom name (4 chars, correct), insertion code (correct, assuming
 that the insertion codes and residues numbers in the two proteins are
 lined up correctly), AND ALT CODE match [NEW].

I read that slightly differently:

OLD: check on the first three characters of the atom name

NEW: check on the first three characters of the atom name
  AND
 check on alternate conformation
  AND
 check on insertion code

There is no (new) check on chain or residue number - which is correct,
since the LSQKAB syntax allows one to specify different chain identifiers
and different residue numbers for the work and reference PDB file.

The new code makes sense to me (but please double check and correct me
if I'm wrong): without it you get a complete mess in the match-up
(since atom name, chain and residue number are simply not enough to
pick one and _only_ one atom).
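The OLD and NEW checks described above can be sketched in a few lines of Python. This is not LSQKAB's own (Fortran) code: the `Atom` record and the functions `old_key`, `new_key` and `ambiguous` are hypothetical names invented for this illustration. The sketch only shows why the first three characters of the atom name cannot pick out a single atom once alternate conformations are present, whereas adding the alt code and insertion code can.

```python
from collections import Counter
from typing import NamedTuple


class Atom(NamedTuple):
    name: str    # 4-character PDB atom name, e.g. " CB "
    altloc: str  # alternate-conformation code, " " if none
    icode: str   # insertion code, " " if none


def old_key(a: Atom) -> str:
    # Illustrative "OLD" check: first three characters of the atom name only.
    return a.name[:3]


def new_key(a: Atom) -> tuple:
    # Illustrative "NEW" check: the same three characters plus the
    # alternate-conformation code and the insertion code.
    return (a.name[:3], a.altloc, a.icode)


def ambiguous(atoms, key):
    # Keys shared by more than one atom of a single residue.
    counts = Counter(key(a) for a in atoms)
    return sorted(k for k, n in counts.items() if n > 1)


# One residue whose side chain has two alternate conformations (A and B).
residue = [
    Atom(" N  ", " ", " "),
    Atom(" CA ", " ", " "),
    Atom(" CB ", "A", " "),
    Atom(" CB ", "B", " "),
]

print(ambiguous(residue, old_key))  # the two CB atoms share one key
print(ambiguous(residue, new_key))  # every atom has its own key
```

Under the old-style key the two CB atoms are indistinguishable; under the new-style key each atom is unique.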

 The alt code match is, I suspect, a bug, in exactly the situation that
 Jose provided: one protein may have them, but the other may not (or may
 have different ones). One should perform the alignment such that the
 protein (residue) without alt codes aligns onto the other protein
 (residue) with the A alt code; to discard the residue pair simply
 because alt codes don't match is not correct.

I'm not sure about that: LSQKAB is intended to superimpose two sets
of atoms. For that the user needs to specify exactly (!) what atoms
belong in these two sets. Your suggestion of having LSQKAB pick
AltConf A instead of an atom without AltConf introduces new logic
into LSQKAB that wasn't there before. So I wouldn't classify that as a
bug, since it does the right thing: making sure that one and _only_
one atom will be picked (whereas before this wasn't guaranteed).

Please note that I don't say your suggestion doesn't make sense: I
like automatic superposition programs that make sensible structural
assumptions and decisions (some LSQMAN commands or SSM in Coot). Just
that LSQKAB isn't really intended that way (it does exactly what it
says on the tin). If one wanted such a feature (which would be
nice), it would need to be coded and controlled (on/off) with some
additional input cards, I guess.

 ---

As a comparison, I've run a test with four PDB files:

  a) just 5 residues

  b) same 5 residues, but one side-chain has two alternate
 conformations (A and B)

  c) same 5 residues, but one residue has insertion code instead of
 residue number increment

  d) same 5 residues, but now with the alternate conformation
 side-chain and the insertion code residue

Running this against 4 LSQKAB binaries:

  A) LSQKAB sources from 6.0.2, compiled against 6.0.2 libraries

  B) LSQKAB sources from 6.0.2, compiled against 6.1.0 libraries

  C) LSQKAB sources from 6.1.0, compiled against 6.0.2 libraries

  D) LSQKAB sources from 6.1.0, compiled against 6.1.0 libraries

shows some interesting items (assuming that LSQKAB should match-up
only identical atoms, i.e. leaving your suggestion of automatic
decisions aside).

 - the 6.0.2 LSQKAB source shows non-zero RMS values in a variety of
   cases

   This makes no sense, since for any pair of the above PDB files the
   common atoms are identical.

 - the 6.0.2 LSQKAB source gives different results when swapping the
   two PDB files one superposes (i.e. superposing PDB1 onto PDB2 gives
   a different result than superposing PDB2 onto PDB1)

   Again, this doesn't make sense.

   Both of these points are due to the missing checks introduced into
   the latest version (which make sure that only identical atoms are
   picked).

 - the 6.1.0 LSQKAB source always gives rms values of zero and the
   order of PDB files doesn't matter.

To me the 6.1.0 sources look correct ... ?
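The order dependence can be reproduced with a toy model. This is a sketch under a stated assumption: that the loose matching simply takes the first reference atom with the same name. That is a guess at the kind of behaviour described, not the actual 6.0.2 source. Atoms are plain (name, altloc) tuples, and both function names are made up for the example.

```python
def first_match_pairs(work, ref):
    # Hypothetical loose matching: each work atom is paired with the first
    # ref atom of the same name, ignoring alt codes (assumed, for illustration).
    pairs = []
    for w in work:
        for r in ref:
            if r[0] == w[0]:
                pairs.append((w, r))
                break
    return pairs


def exact_pairs(work, ref):
    # Strict matching: atom name and alt code must both agree.
    ref_set = set(ref)
    return [(w, w) for w in work if w in ref_set]


# Two toy "files": one without alt codes, one with conformations A and B on CB.
plain = [("CA", " "), ("CB", " ")]
alt = [("CA", " "), ("CB", "A"), ("CB", "B")]

# Loose matching: the number of pairs depends on which file is the work set.
print(len(first_match_pairs(plain, alt)), len(first_match_pairs(alt, plain)))

# Exact matching: the same atoms are paired in both directions.
print(len(exact_pairs(plain, alt)), len(exact_pairs(alt, plain)))
```

With these toy files the loose matcher pairs 2 atoms in one direction and 3 in the other, while the exact matcher pairs 1 atom both ways. Exact matching uses fewer atoms, but the result no longer depends on which file is work and which is reference.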

Anyway, getting back to the original question (most people reading the
CCP4bb will be bored by now anyway):

 If I do the same superposition (with a pdb file that contains 
 alternative conformations) with LSQKAB version 6.0 and 6.1:
 1) Version 6.0 reports 110 atoms to be refined and does not report any 
 error or warning. The loggraph contains data for the residues with 
 alternative conformations.
 2) Version 6.1 reports 97 atoms to be refined, and it reports 13 atoms 
 as no match for workcd atom [...]. The loggraph does NOT contain data 
 for the residues with alternative conformations.
 
 Based on that, I have assumed that version 6.0 

Re: [ccp4bb] LSQKAB, version 6.0 vs version 6.1 - reposting (Sorry!)

2008-12-23 Thread Borhani, David
Hi Clemens,

Thanks for all your tests; the scripts/keywords you used to run LSQKAB
with these test systems would help to clarify what may be going right
vs. going wrong.

A few points that I hope may be helpful:

1. Atom names are 4 characters. If a true match is desired, all 4
characters (even the first one, which is often a blank) must be compared.

2. I think the new code is not correct, as your examples show, and as
others have found in using the program. The logic, in those cases where
alt coded atoms are present, seems to be either wrong, unexpected, or
ill-defined (i.e., it matters which coord set is work vs. reference), or
perhaps even all of the above.

3. I agree that chain, residue number, and atom name do not by
themselves specify a unique atom; insertion code must also be used. Alt
code *may* be used, and that's where (IMHO) it gets tricky. Apparently
the older versions of LSQKAB just used an implicit logic (i.e., likely
matched the first alt-coded atom found, without any checks):
  A. Option one, no defined logic: just ignore the alt code; use
     any (random) atom you get (first).
  B. Option two, defined logic (what that logic should be is the
     key point to discuss, I think).

Rigid potential logic:
  1. User must explicitly specify what will constitute a match.
     Absent such a specification, program stops with an error if
     alt coded atoms are found (or if they don't meet the
     specification).
     (I don't recommend this!)

Flexible (intelligent?) potential logic:
  1. Match the atom without the alt code (i.e., blank), if it
     exists; else match an atom with an alt code.
  2. Now matching alt codes:
     A. Is there an atom with alt code A? Use it, else look
        for B, C, etc., in sort order.
        (FYI: documentation (6.0.2-03) says: If there are two or
        more conformations, the first (labelled A) is chosen for
        comparison.)
     B. ALTERNATIVELY, use the alt coded atom with the highest
        occupancy; use sort order to resolve ties (A, then B, then
        C... [usually, one tries, at least, to put the most
        significant atom as the blank alt code or the A atom]).
(I prefer using occupancy instead of B factor, because once one is
modeling alt conformations, occupancy receives some conscious attention;
the B will then just refine to where it needs to be given the
user-assigned occupancy [unless one is refining occupancies in SHELX].
So, in most cases, I suspect, occupancy trumps B factor. Others may
disagree.)

There also may need to be a new keyword/keyvalue to allow the user to
specify which of several potential alternative logics to use.

4. It appears to me that the new version (6.1.0) doesn't have any
changes to the FIT/MATCH keywords to handle insertion codes. If the user
specifies, for example:
FIT RESIDU SIDE 155 TO 156 CHAIN A 
MATCH RESIDU 155 TO 156 CHAIN A
then IF there exists a residue 155A in the working coords, there must
also be a residue 155A in the ref coords, else error.

There are good reasons to allow users to alter this behavior, e.g.
fitting immunoglobulin hypervariable regions, which often have (a
variable number of) insertions. Current LSQKAB logic would appear to
make this task difficult. To be more explicit, if I want to fit residues
25-40, knowing that there is a variable loop, with insertion codes after
residue 30, i.e. I want to fit 25-30 and 31-40, it would be nice to be
able to specify 25-40 and SKIP INSERTIONS or something similar.

5. Finally, ensuring that whatever logic is chosen works no matter which
coordinate set is specified as work or reference would be highly
desirable, as your examples clearly point out!
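To illustrate the request in point 4, here is a minimal Python sketch of what a "skip insertions" selection might do. Everything in it is hypothetical: LSQKAB has no such option as far as this thread shows, and the function name `select_residues` and the `skip_insertions` flag are invented for the example. Residues are (number, insertion code) tuples, with " " meaning no insertion code.

```python
def select_residues(residues, first, last, skip_insertions=True):
    # residues: list of (resseq, icode) tuples in file order.
    # Keeps residues numbered first..last; optionally drops any residue
    # that carries an insertion code (e.g. 30A, 30B in a variable loop).
    selected = []
    for resseq, icode in residues:
        if not first <= resseq <= last:
            continue
        if skip_insertions and icode != " ":
            continue
        selected.append((resseq, icode))
    return selected


# A variable loop with two inserted residues after residue 30.
loop = [(29, " "), (30, " "), (30, "A"), (30, "B"), (31, " ")]

print(select_residues(loop, 25, 40))                         # insertions dropped
print(select_residues(loop, 25, 40, skip_insertions=False))  # everything kept
```

Applied to both the work and the reference coordinates, a selection like this would leave only residues common to both numbering schemes, so a single 25-40 range could be fitted without spelling out 25-30 and 31-40 separately.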

Dave

P.S. - I'm not sure I understand the problem that Wangsa mentions, but
it may be related to the 3- vs. 4-character atom name match.


Re: [ccp4bb] LSQKAB, version 6.0 vs version 6.1 - reposting (Sorry!)

2008-12-23 Thread Clemens Vonrhein
Hi David,

On Tue, Dec 23, 2008 at 10:31:28AM -0500, Borhani, David wrote:
 Hi Clemens,
 
 Thanks for all your tests; the scripts/keywords you used to run LSQKAB
 with these test systems would help to clarify what may be going right
 vs. going wrong.

That was just a simple run with

  lsqkab workcd work.pdb refrcd ref.pdb << EOF
  FIT RESI ALL 1 TO 5
  MATCH 1 TO 5
  EOF

 A few points that I hope may be helpful:
 
 1. Atom names are 4 characters. If a true match is desired, all 4
 characters (even the first one, which is often a blank) must be compared.

I agree: the 3-character test is not something I'm involved with at
all. This is what LSQKAB is doing - and I _think_ it has to do partly
with MMDB (which does some shifting as far as I can remember).
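For reference, the identifiers under discussion sit in fixed columns of a PDB ATOM record: atom name in columns 13-16, alt code in column 17, chain in column 22, residue number in columns 23-26 and insertion code in column 27. The Python sketch below reads them by column and shows how a 3-character comparison can confuse two atoms that a 4-character comparison keeps apart. The helper names `pdb_atom_line` and `parse_atom_ids` are hypothetical, and nothing here reflects how LSQKAB or MMDB actually store atom names.

```python
def pdb_atom_line(serial, name, altloc, resname, chain, resseq, icode):
    # Build the first 27 columns of an ATOM record (hypothetical helper,
    # coordinates omitted because they are not needed here).
    return "ATOM  %5d %-4s%1s%3s %1s%4d%1s" % (
        serial, name, altloc, resname, chain, resseq, icode)


def parse_atom_ids(line):
    # Fixed-column fields of a PDB ATOM/HETATM record (1-based columns).
    return {
        "name": line[12:16],         # columns 13-16: all four characters
        "altloc": line[16],          # column 17: alternate location code
        "chain": line[21],           # column 22: chain identifier
        "resseq": int(line[22:26]),  # columns 23-26: residue number
        "icode": line[26],           # column 27: insertion code
    }


a = parse_atom_ids(pdb_atom_line(1, "HG11", " ", "VAL", "A", 5, " "))
b = parse_atom_ids(pdb_atom_line(2, "HG12", " ", "VAL", "A", 5, " "))

print(a["name"][:3] == b["name"][:3])  # 3-character comparison: same
print(a["name"] == b["name"])          # 4-character comparison: different
```

The two valine hydrogens HG11 and HG12 compare equal on their first three characters but differ on the full four-character name.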
 
 2. I think the new code is not correct, as your examples show, and as
 others have found in using the program. The logic, in those cases where
 alt coded atoms are present, seems to be either wrong, unexpected, or
 ill-defined (i.e., it matters which coord set is work vs. reference), or
 perhaps even all of the above.

Maybe my email wasn't quite clear: e.g. the problem with which coord
set is work vs. reference happens in the OLD code (CCP4 6.0.2) and is
fixed in the NEW code (6.1). All the unexpected behaviour is present
in the old 6.0.2 version of LSQKAB - the 6.1 version is fixed and
behaves as expected.

The original problem reported was that the output seems to suggest
that LSQKAB was using AltConf atoms in the superposition. And so it
was: but it did it wrongly (i.e. it mattered which PDB file was
defined as reference and which as work). The new code fixes that - but
if you want a feature like

  match up with AltConf A in case one PDB has no AltConf and the
  other has

then this needs to be newly introduced into LSQKAB: it never was in
there and the impression that it might have worked like that in the
old version was due to some wrong logic introducing this random
behaviour.

 3. I agree that chain, residue number, and atom name do not by
 themselves specify a unique atom; insertion code must also be used. Alt
 code *may* be used, and that's where (IMHO) it gets tricky. Apparently
 the older versions of LSQKAB just used an implicit logic (i.e., likely
 matched the first alt-coded atom found, without any checks):
   A. Option one, no defined logic: just ignore the alt code; use
      any (random) atom you get (first).

Yes, that is 6.0.2 behaviour.

   B. Option two, defined logic (what that logic should be is the
 key point to discuss, I think).

Yes, that is 6.1 behaviour: exact matching. But yes: there should be
some better message (one can deduce this from the number of atoms in
the working set compared to the number of atoms used - printed just
above the RMS message).

 Rigid potential logic:
   1. User must explicitly specify what will constitute a match.
 Absent such a specification, program stops with
   an error if alt coded atoms are found (or if they don't
 meet the specification).
 (I don't recommend this!)
 
 Flexible (intelligent?) potential logic:
   1. Match the atom without the alt code (i.e., blank), if it
      exists; else match an atom with an alt code.
   2. Now matching alt codes:
   A. Is there an atom with alt code A? Use it, else look
 for B, C, etc., in sort order.
   (FYI: documentation (6.0.2-03) says: If there
 are two or more conformations, the first (labelled A) 
   is chosen for comparison.)

Ah: that might have been true for pre-mmdb coordinate library use (but
even then it might have depended on the way the PDB file was
written). But with mmdb (as far as I understand) the atoms might be
stored in a different (random?) order.

   B. ALTERNATIVELY, use the alt coded atom with the highest
      occupancy; use sort order to resolve ties (A, then B, then
      C... [usually, one tries, at least, to put the most
      significant atom as the blank alt code or the A atom]).
 (I prefer using occupancy instead of B factor, because once one is
 modeling alt conformations, occupancy receives some conscious attention;
 the B will then just refine to where it needs to be given the
 user-assigned occupancy [unless one is refining occupancies in SHELX].
 So, in most cases, I suspect, occupancy trumps B factor. Others may
 disagree.)

That would be the nicest way.
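As a rough illustration of that "flexible" rule (blank alt code first, otherwise highest occupancy, ties broken by sort order), a minimal Python sketch follows. It is only a reading of the proposal quoted above, not anything that exists in LSQKAB; the function name `pick_atom` and the (altloc, occupancy) tuple layout are assumptions made for the example.

```python
def pick_atom(candidates):
    """Choose one atom among those sharing residue and atom name.

    candidates: non-empty list of (altloc, occupancy) tuples;
    altloc " " means the atom has no alt code.
    """
    # Rule 1: an atom without an alt code wins outright.
    for altloc, occupancy in candidates:
        if altloc == " ":
            return (altloc, occupancy)
    # Rule 2B: otherwise take the highest occupancy; on a tie the
    # alphabetically first alt code (A before B before C) is chosen.
    return min(candidates, key=lambda c: (-c[1], c[0]))


print(pick_atom([("B", 0.6), ("A", 0.4)]))  # higher occupancy wins
print(pick_atom([("B", 0.5), ("A", 0.5)]))  # tie resolved by sort order
print(pick_atom([("A", 0.5), (" ", 1.0)]))  # blank alt code preferred
```

Because the choice depends only on the candidates themselves and not on file order, the same atom would be selected whichever coordinate set is work or reference, which also addresses the ordering concern with mmdb.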

 There also may need to be a new keyword/keyvalue to allow the user to
 specify which of several potential alternative logics to use.
 
 4. It appears to me that the new version (6.1.0) doesn't have any
 changes to the FIT/MATCH keywords to handle insertion codes. If the user
 specifies, for example:
   FIT RESIDU SIDE 155 TO 156 CHAIN A 
   MATCH RESIDU 155 TO 156 CHAIN A
 then IF there exists a residue 155A in the working coords, there must
 also be a residue 155A in the ref coords, else error.

Possible: I'm not sure how the sequence 155 TO 156 is 

Re: [ccp4bb] Transferring a Free R set.

2008-12-23 Thread Gerard DVD Kleywegt
issue.  Could I just confirm whether or not you think it's necessary to 
transfer initial free R assignment to any new data sets or to isomorphous 
data sets such as substrate complexes.


well, i would always do a slow-cool at the start so then it would not be 
necessary to transfer them


ian says:


Slow cooling or pseudo-MD such as randomly shifting co-ordinates
actually moves you away from convergence, and as I said convergence is


but at the stage of the refinement we're talking about (you have just placed 
the model in the asu, and no refinement has taken place against the new data 
yet) you are far away from convergence anyway. a slow-cool then does several 
beneficial things: it decouples r and rfree, reduces memory/model bias, 
and refines your starting model with a larger radius of convergence than 
minimisation


see some of axel brunger's papers, reviews and book chapters from the mid 
1990s onward - he has done a lot of tests involving SA, rfree, etc.


by the way - if you *do* want to transfer test flags, or extend a test set to 
higher resolution, etc. - dataman has a fair number of options for doing so; 
see http://xray.bmc.uu.se/usf/dataman_man.html#H9


--dvd

**
Gerard J.  Kleywegt
[Research Fellow of the Royal  Swedish Academy of Sciences]
Dept. of Cell & Molecular Biology  University of Uppsala
Biomedical Centre  Box 596
SE-751 24 Uppsala  SWEDEN

http://xray.bmc.uu.se/gerard/  mailto:ger...@xray.bmc.uu.se
**
   The opinions in this message are fictional.  Any similarity
   to actual opinions, living or dead, is purely coincidental.
**


[ccp4bb] 6.1.0 burp: imosflm from ccp4i

2008-12-23 Thread Frank von Delft

Hi (for after Xmas/NY):

When I try to run imosflm from ccp4i, the following burps to console.  
I'm almost completely sure that it's the default package-downloaded 
bltwish that runs.


Cheers!
phx


Top level CCP4 directory is /usr/local/ccp4/6.1.0/ccp4-6.1.0
Using CCP4 programs from /usr/local/ccp4/6.1.0/ccp4-6.1.0/bin
MOSDIR is /work/PRGDF/10-proc
Error in startup script: unknown namespace in import pattern itcl::*
   while executing
namespace import itcl::*
   (file /usr/local/ccp4/6.1.0/ccp4-6.1.0/ccp4i/imosflm/src/imosflm.tcl line 91)
   invoked from within
source $env(IMOSFLM)
   (file /usr/local/ccp4/6.1.0/ccp4-6.1.0/ccp4i/imosflm/imosflm.tcl line 111)

-


[ccp4bb] information for study weekend attendees...

2008-12-23 Thread harry powell

Hi folks

I found some tourist advice for visitors to Nottingham in the New  
Year...


http://news.bbc.co.uk/1/hi/england/nottinghamshire/7798194.stm


Harry
--
Dr Harry Powell, MRC Laboratory of Molecular Biology, MRC Centre,  
Hills Road, Cambridge, CB2 2QH







[ccp4bb] MR- Problem-76 % sequence identity-no solution

2008-12-23 Thread Meetmr Ss
Dear all,

I am stuck with an MR problem. My target has 76% sequence identity with the model. I 
tried Phaser, AMoRe and Molrep. None of them gave me a satisfactory solution. If 
you have any suggestions and/or new programs, I would like to try them. 

Thanks
somu