Re: [ccp4bb] disordered helix

2013-05-13 Thread Dale Tronrud

   Sometimes a floppy bit of a protein is even more floppy in a
particular crystal form.  Your maps do not appear to support your
model of a helix in this location.  I would not build it unless
maps based on later refinement show something reasonable in the
omit map.  (Of course if you leave out the helix, all your maps
will be omit maps.)

   It is quite common to submit models to the PDB that do not
contain all of the amino acids expected based on the sequence.
If you can't see where the chain goes you certainly can't be
expected to build it.

Dale Tronrud

On 05/13/2013 04:23 AM, atul kumar wrote:

I have attached the map and omit map (after deleting the helix) images.

2Fo-Fc (1 sigma)

Fo-Fc (3 sigma)

On 5/13/13, Eleanor Dodson eleanor.dod...@york.ac.uk wrote:

Hard to say without seeing the maps and experimenting. My first check would
be to set the NTD occupancies to 0.0 - refine the CTD alone, then look at
the maps in COOT.

Or maybe let an automated model-building program such as Buccaneer try
to rebuild the NTD section, with starting phases from the CTD.

Eleanor




On 13 May 2013 09:04, atul kumar atulsingh21...@gmail.com wrote:


Dear all,

I have solved the structure of my protein by molecular replacement at
2.9 Å, with Rfactor and Rfree of 18 and 22 respectively. Overall everything
seems fine. It is a two-domain protein (NTD and CTD); the NTD has a high
average
B factor compared to the CTD. A helix of the NTD seems to be disordered. I tried
different geometric weights but the refined structure does not seem to
follow proper geometry for this helix. The B factor of this helix is very
high compared to the overall B factor for the NTD, and the omit map shows only some
partial density in this region (of course not conclusive). All the
homologous structures have a helix in this region, although with high
B factors.
Should I submit the current PDB entry or does it need more refinement?

thanks and regards

Atul Kumar





Re: [ccp4bb] ctruncate bug?

2013-06-20 Thread Dale Tronrud

   If you are refining against F's you have to find some way to avoid
calculating the square root of a negative number.  That is why people
have historically rejected negative I's and why Truncate and cTruncate
were invented.

   When refining against I, the calculation of (Iobs - Icalc)^2 couldn't
care less if Iobs happens to be negative.
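
   (As a toy illustration with made-up numbers and plain NumPy, not taken
from any refinement program: the intensity residual is perfectly well
defined for a negative Iobs, while the amplitude residual is not.)

    import numpy as np

    # Hypothetical observed intensities; the weak reflection came out
    # negative after background subtraction.
    i_obs = np.array([250.0, 12.0, -8.0])
    i_calc = np.array([240.0, 15.0, 3.0])

    # Least-squares residual against intensities: a negative Iobs is no problem.
    resid_i = np.sum((i_obs - i_calc) ** 2)

    # Refining against amplitudes first needs F = sqrt(I), which is undefined
    # (NaN) for the negative observation -- hence Truncate and cTruncate.
    with np.errstate(invalid="ignore"):
        f_obs = np.sqrt(i_obs)          # third element becomes NaN
    f_calc = np.sqrt(i_calc)
    resid_f = np.sum((f_obs - f_calc) ** 2)

    print(resid_i, f_obs, resid_f)      # resid_f is NaN because of sqrt(-8)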

   As for why people still refine against F...  When I was distributing
a refinement package it could refine against I but no one wanted to do
that.  The R values ended up higher, but they were looking at R
values calculated from F's.  Of course the F-based R values are lower
when you refine against F's; that means nothing.

   If we could get the PDB to report both the F and I based R values
for all models maybe we could get a start toward moving to intensity
refinement.

Dale Tronrud

On 06/20/2013 09:06 AM, Douglas Theobald wrote:

Just trying to understand the basic issues here.  How could refining directly 
against intensities solve the fundamental problem of negative intensity values?


On Jun 20, 2013, at 11:34 AM, Bernhard Rupp hofkristall...@gmail.com wrote:


As a maybe better alternative, we should (once again) consider refining 
against intensities (and I guess George Sheldrick would agree here).


I have a simple question - what exactly, short of some sort of historic inertia 
(or memory lapse), is the reason NOT to refine against intensities?

Best, BR


Re: [ccp4bb] Alternating positive and negative density

2013-06-24 Thread Dale Tronrud

   Based on eye-balling your map it looks to me that your grid spacing
is about 0.5 A.  The wavelength of your ripple is 4 grid spacings, and
the ripple runs right along the x axis.  My guess is that you have a rogue
reflection with index h00, where h corresponds to about 2 A resolution.

   How you are getting this in multiple data sets is a mystery to me,
but I would concentrate on finding that reflection and figuring out
why it is anomalously large.  Start with the Fourier coefficients that
went into calculating this map to find the exact value of h causing the
problem and then track that reflection back through your Fcalc's and
Fobs's.
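
   (If it helps, a rough NumPy sketch of that bookkeeping; the arrays of
indices and amplitudes are hypothetical and assumed to have been exported
from the MTZ by whatever tool you prefer.)

    import numpy as np

    # Hypothetical Miller indices and map-coefficient amplitudes, assumed to
    # have been exported from the MTZ that produced the rippled map.
    hkl = np.array([[1, 2, 3], [8, 0, 0], [2, 1, 4], [3, 3, 1]])
    amp = np.array([410.0, 9500.0, 380.0, 505.0])

    # Axial h00 reflections are the suspects for a ripple running along x.
    axial = (hkl[:, 1] == 0) & (hkl[:, 2] == 0)

    # Flag any coefficient wildly larger than the bulk of the data.
    cutoff = 10.0 * np.median(amp)
    for idx, f in zip(hkl[axial], amp[axial]):
        if f > cutoff:
            print("suspect reflection", tuple(idx), "with amplitude", f)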

Dale Tronrud

On 06/23/2013 09:57 PM, Peter Randolph wrote:

Short version:
Hi, I'm working on what should be a straightforward molecular
replacement problem (already solved protein in new space group), but my
Fo-Fc map contains a peculiar series of alternating positive and
negative peaks of difference density. I'm wondering if anyone has
seen this before? Sample images are attached and more background is below.

More background:
I had initially solved an apo structure of my protein (from previous
diffraction data in another crystal form), and more recently collected
diffraction data for crystals of the protein co-crystallized with
potential binding partners (small RNAs). All the datasets I've processed
so far have the same spacegroup (P2(1)2(1)2(1)) and cell dimensions as
the apo structure.

I have tried two general approaches, both with the same initial steps of
indexing / integrating / scaling in XDS, converting to MTZ format
without R-free flags, then importing R-free-flags from the (previous)
apo structure's MTZ.  I would then run phenix.refine for initial
rigid-body refinement using the apo-model and the new mtz to see if
there were signs of any new positive density corresponding to bound
ligands. While the 2Fo-Fc map fits the apo protein 3D model perfectly,
the Fo-Fc map shows bands of alternating positive and negative density
running throughout the structure.  What's odd is that these 'bands'
appear to be systematic rather than random (please see attached image),
and aren't located anywhere that a binding partner could bind, leading
me to suspect they may be artefactual (these bands actually run through
the body of the protein, so one possibility is that the beta-strands are
off-register by a multiple of a peptide unit?). If I use the same mtz
file and structural model, and instead do molecular replacement with
phaser, I see the same issue.  I've tried this workflow with a couple of
datasets and using P222 as well as P2(1)2(1)2(1), and each time I see
the same issue of spurious(?) bands. Any help or advice would be much
appreciated, especially if anyone has seen anything like this?

Thanks a lot,
Peter Randolph

--
Peter Randolph
PhD Candidate
Mura Laboratory
Department of Chemistry
University of Virginia
(434)924.7979


Re: [ccp4bb] modified amino acids in the PDB

2013-07-09 Thread Dale Tronrud

On 07/09/2013 07:23 AM, Mark J van Raaij wrote:

- really the only complicated case would be where a group is covalently linked 
to more than one amino acid, wouldn't it? Any case where only one covalent link 
with an amino acid is present could (should?) be treated as a special amino acid, i.e. 
like selenomethionine.
- groups without any covalent links to the protein are better kept separate I 
would think (but I guess this is stating the obvious).



   Let's consider one of your simple cases.  Imagine a heterodimer
(chains alpha and beta) with a single disulfide link between the
peptides.  Do you prefer to have an alpha chain with one residue
being a CYS with an entire beta chain attached as a single residue,
or a beta chain with one residue being a CYS with an entire alpha
chain attached as a single residue?  Either way you are going to have trouble
fitting into the PDB format's five columns all the unique atom names
for the bloated residue. ;-)

   The problem with this kind of topic is that the molecule is what
it is and it doesn't care how we describe it.  People break the molecule
up into parts to help in their understanding of it and different
people have different needs.  Do you prefer to think of rhodopsin as
containing a LYS residue linked by a Schiff base linkage to retinal
or as having a single, monster, residue with a name you have probably
never heard of?  There is value in both representations depending
on the context.

   A really nice feature of the geometry definitions that have come
out of the Refmac community is that one can define the monster residue
in terms of the LYS-(Schiff linkage)-retinal breakdown.  What hasn't
been done is to create the software that will convert a model from one
form to the other, as the user needs.

   I think this is the direction we should go.  Instead of arguing whether,
for example, the B factor column should contain the total isotropic
B factor or the residual B factor not fit by the overarching TLS model of
motion, the file supplied by the wwPDB should be a complete,
unambiguous, representation of the model and software should exist
that displays for the user whatever representation they want.  Then the
representation stored in the master repository would not be very
important.

Dale Tronrud


Mark J van Raaij
Lab 20B
Dpto de Estructura de Macromoleculas
Centro Nacional de Biotecnologia - CSIC
c/Darwin 3
E-28049 Madrid, Spain
tel. (+34) 91 585 4616
http://www.cnb.csic.es/~mjvanraaij





On 9 Jul 2013, at 12:49, Frances C. Bernstein wrote:


In trying to formulate a suggested policy on het groups
versus modified side chains one needs to think about the
various cases that have arisen.

Perhaps the earliest one I can think of is a heme group.
One could view it as a very large decoration on a side
chain but, as everyone knows, one heme group makes four
links to residues.  In the early days of the PDB we decided
that heme obviously had to be represented as a separate group.

I would also point out that nobody would seriously suggest that
selenomethionine should be represented as a methionine with a
missing sulfur and a selenium het group bound to it.

Unfortunately all the cases that fall between selenomethionine
and heme are more difficult.  Perhaps the best that one can
hope for is that whichever representation is chosen for a
particular case, it be consistent across all entries.

  Frances

P.S. One can also have similar discussions about the representation
of microheterogeneity and of sugar chains but we should leave those
for another day.

=
Bernstein + Sons
Information Systems Consultants
5 Brewster Lane, Bellport, NY 11713-2803
Frances C. Bernstein
f...@bernstein-plus-sons.com
1-631-286-1339    FAX: 1-631-286-1999
=

On Tue, 9 Jul 2013, MARTYN SYMMONS wrote:


Hi Clemens
I guess the reason you say 'arbitrary' is because there is no explanation of 
this rule decision?
   It would be nice if some rationalization were available alongside the values 
given - a sentence along the lines of 'we set the number owing to the following
considerations'?
   However a further layer of variation is that the rule does not seem to be
consistently applied
  - just browsing CYS modifications:
iodoacetamide treatment gives a CYS with only 4 additional atoms but it is 
split
off as  ACM.
However some ligands much larger than 10 residues have been kept with the 
cysteine
(for example CY7 in 2jiv and NPH in 1a18).
My betting is that it depends on whether something has been seen 'going 
solo' as a
non-covalent ligand previously so that it pops up as an atomic structural match 
with
a pre-defined three-letter code.
   This would explain for example the ACM case which you might expect to occur 
in a
modified Cys.  But it has also been observed as a non

Re: [ccp4bb] Does anyone see this ligand before?

2013-07-17 Thread Dale Tronrud

   Do you have any reason to expect either of these molecules would be in
your crystal?   The model you build has to fit the density, be consistent with
the surrounding environment (which you haven't shared with us) and you
have to have some story for how that molecule got in your crystal.  Personally
I would steer away from industrial compounds and focus more on biological
molecules and common additives used in purification and crystallization.

   The environment is critical to identifying this molecule.  What hydrogen
bonds does this molecule make?  What charges are nearby?  Certainly the presence
or absence of hydrogen bonds will distinguish between these two compounds
before you go to the trouble of building a model of either.

Dale Tronrud

On 7/17/2013 6:35 AM, Wei Feng wrote:

Dear all,
Thank you for your advice.
I had tried to use MPD and pyrophosphate etc. to fit the density map but all of 
them were too small.
We guess that the molecular formula should be C8H18O2. So we searched for this 
formula in Google and found two candidate molecules:
1: http://flyingexport.en.ecplaza.net/dhad-99-5--137042-689140.html
2: http://en.m.wikipedia.org/wiki/Di-tert-butyl_peroxide
Could you tell me how to get the coordinates of these molecules?
Thank you for your time!
Wei






Re: [ccp4bb] A case of perfect pseudomerohedral twinning?

2013-10-15 Thread Dale Tronrud
   Since Phil is no doubt in bed, I'll answer the easier part.  Your
second matrix is nearly the equivalent position (x,-y,-z).  This
is a two-fold rotation about the x axis.  You also have a translation
of about 31 A along x so if your A cell edge is about 62 A you have
a 2_1 screw.

Dale Tronrud
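
   (For what it's worth, a small NumPy sketch of that reading of the
operator, using the values quoted below and a hypothetical 62 A cell edge.)

    import numpy as np

    # NCS operator reported by the search program (rotation rows + translation).
    R = np.array([[ 0.9994, -0.0259,  0.0250],
                  [-0.0260, -0.9997,  0.0018],
                  [ 0.0249, -0.0025, -0.9997]])
    t = np.array([-30.8649, -11.9694, 166.9271])

    # Compare with the exact two-fold about x, i.e. the operation (x, -y, -z).
    two_fold_x = np.diag([1.0, -1.0, -1.0])
    print("deviation from (x,-y,-z):", np.abs(R - two_fold_x).max())

    # The translation component along the rotation (x) axis is what matters:
    # if it is close to half the a cell edge, the local symmetry is a 2_1 screw.
    a = 62.0                     # hypothetical cell edge, in Angstroms
    print("translation along x:", abs(t[0]), " versus a/2:", a / 2.0)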

On 10/15/2013 12:29 PM, Yarrow Madrona wrote:
 Hi Phil,
 
 Thanks for your help.
 
 I ran a Find-NCS routine in the phenix package. It came up with what I
 pasted below:
 I am assuming that the first rotation matrix is just the identity. I need
 to read more to understand rotation matrices but I think the second one
 should have only a single -1 to account for a possible perfect 2(1) screw
 axis between the two subunits in the P21 asymmetric unit. I am not sure why
 there are two -1 values. I may be way off in my interpretation, in which
 case I will go read some more. I will also try what you suggested. Thanks.
 
 -Yarrow
 
 NCS operator using PDB
 
 #1 new_operator
 rota_matrix    1.0000    0.0000    0.0000
 rota_matrix    0.0000    1.0000    0.0000
 rota_matrix    0.0000    0.0000    1.0000
 tran_orth      0.0000    0.0000    0.0000
 
 center_orth   17.72011.4604   71.4860
 RMSD = 0
 (Is this the identity?)
 
 #2 new_operator
 
 rota_matrix    0.9994   -0.0259    0.0250
 rota_matrix   -0.0260   -0.9997    0.0018
 rota_matrix    0.0249   -0.0025   -0.9997
 tran_orth   -30.8649  -11.9694  166.9271
 Hello Yarrow,

 Since you have a refined molecular replacement solution I recommend
 using that rather than global intensity statistics.

 Obviously if you solve in P21 and it's really P212121 you should have
 twice the number of molecules in the asymmetric unit and one half of the
 P21 asymmetric unit should be identical to the other half.

 Since you've got decent resolution I think you can determine the real
 situation for yourself: one approach would be to test to see if you can
 symmetrize the P21 asymmetric unit so that the two halves are identical.
   You could do this via stiff NCS restraints (cartesian would be better
 than dihedral).  After all the relative XYZs and even B-factors would be
 more or less identical if you've rescaled a P212121 crystal form in P21.
   If something violates the NCS then it can't really be P212121.

 Alternatively you can look for clear/obvious symmetry breaking between
 the two halves: different side-chain rotamers for surface side-chains
 for example.  If you've got an ordered, systematic, difference in
 electron density between the two halves of the asymmetric unit in P21
 then that's a basis for describing it as P21 rather than P212121.
 However if the two halves look nearly identical, down to equivalent
 water molecule densities, then you've got no experimental evidence that
 P21 with 2x molecules generates a better model than P212121 with 1x
 molecules.  An averaging program would show very high correlation
 between the two halves of the P21 asymmetric unit if it was really
 P212121 and you could overlap the maps corresponding to the different
 monomers using those programs.

 Phil Jeffrey
 Princeton


 
 


Re: [ccp4bb] A case of perfect pseudomerohedral twinning?

2013-10-15 Thread Dale Tronrud
   R factors cannot be used to detect twinning.  The traditional R
is calculated using structure factors (roughly the square root of
intensity) but you can't do that calculation in the presence of
twinning because each structure factor contributes to two intensities.
The formula for the R in the presence of twinning is very
different from the formula used in its absence.  It would
have been better to have used a different name and prevented the
confusion.

   If you are worried about your systematic absences you need to
figure out which images they were recorded on and judge the spot
for yourself.

   Everything you have said points to your crystal being P212121
(or very nearly P212121).

Dale Tronrud

On 10/15/2013 02:31 PM, Yarrow Madrona wrote:
 Thank you Dale,
 
 I will hit the books to better understand the rotation matrices. I am concluding
 from all of this that the space group is indeed P212121. So I still wonder
 why I have some outliers in the intensity stats for the two additional
 screw axes and why R and Rfree both drop by 5% when I apply a twin law to
 refinement in P21.
 
 Thanks for your help.
 
 -Yarrow
 
 


Re: [ccp4bb] Problematic PDBs

2013-10-17 Thread Dale Tronrud
   I would start with 1E4M (residue 361 of chain M) and 1QW9 (170 of
chain B).  First show the model and then reveal the electron density.
This promotes a healthy skepticism of PDB models and reinforces the
importance of always looking at a model in the context of the map.

   For model building I would recommend 2PWJ and 3SQK.  In 3SQK the
linker to the His tag in chain B was built using the wrong sequence.
It is fairly easy to build a sequence into the density and then
recognize what the linker actually is.  In 2PWJ the wrong sequence was
used up to residue 31.  I've never been able to figure out how this
error came to be.  Some horrible, horrible mistake was made when
sequencing the gene and the person who built the model believed the
sequence more than the density.  The model building required to correct
2PWJ is more challenging since a number of short cuts were made
cutting out loops.  If I recall, my model has about 10 more amino acids
than the PDB model.

In all of these cases the majority of the residues in each model are
fine.  3SQK has been replaced with a corrected model (4F4J).

Dale Tronrud

On 10/17/2013 06:51 AM, Lucas wrote:
 Dear all,
 
 I've been lecturing in a structural bioinformatics course where graduate
 students (always consisting of people without crystallography background
 to that point) are expected to understand the basics on how x-ray
 structures are obtained, so that they know what they are using in their
 bioinformatics projects. Practices include letting them manually build a
 segment from an excellent map and also using Coot to check problems in
 not so good structures.
 
 I wonder if there's a list of problematic structures somewhere that I
 could use for that practice? Apart from a few ones I'm aware of because
 of (bad) publicity, what I usually do is an advanced search on the PDB for
 entries with poor resolution and bound ligands, then checking them
 manually, hopefully finding some examples of creative map
 interpretation. But it would be nice to have specific examples of each
 thing that can go wrong in building a PDB entry.
 
 Best regards,
 Lucas


Re: [ccp4bb] AW: [ccp4bb] Fwd: undefined edensity blob at glutamine sidechain

2013-12-10 Thread Dale Tronrud
   It doesn't look like you left a CN on your gold atom. These things
are pretty much covalently bound.

Dale Tronrud

On 12/10/2013 08:13 AM, PriyankMaindola wrote:
 dear all:
 I. I am not able to fit trp,
 since
 1. trp doesn't fit well
 2. positive density appears in the fo-fc map after refinement
 3. this is a soaked crystal structure with a heavy atom solution; the native one has
 perfect density for gln, so mutation to trp is unlikely
 
 II. on increasing the contour level, the 2fo-fc map fades above 4.5 rmsd if I do
 not put anything in the blob and look at the refined map
 
 III. placing Au+ and refining (occ 1; B-fac 63 A2) gives figure 4.png
 (attached below);
 however the anomalous difference map does give positive density, but not a clear,
 round, spherical one.
 
 
 On 10 December 2013 19:29, herman.schreu...@sanofi.com wrote:
 
 My first guess was also a metal ion. However, a tryptophan as Fred
 suggested cannot be ruled out. A simple preliminary test is to
 scroll up the contouring level and look at when the contours of the
 blob disappear. If the contours quickly disappear, you have
 something disordered or light. If the contours of the blob disappear
 at the same moment as, or later than, e.g. sulfur atoms, you have something
 heavy like a metal ion. You still have to fit all possibilities and
 see what refines best.
 
 Best,
 
 Herman
 
 From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On behalf of Matthias Zebisch
 Sent: Tuesday, 10 December 2013 14:21
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] Fwd: undefined edensity blob at glutamine
 sidechain
 
 
 check an anomalous map!
 The obvious thing to do to rule out gold binding
 
 
 
 -
 
 Dr. Matthias Zebisch
 
 Division of Structural Biology,
 
 Wellcome Trust Centre for Human Genetics,
 
 University of Oxford,
 
 Roosevelt Drive,
 
 Oxford OX3 7BN, UK
 
 
 Phone (+44) 1865 287549; 
 
 Fax (+44) 1865 287547
 
 Email matth...@strubi.ox.ac.uk
 
 Website http://www.strubi.ox.ac.uk
 
 -
 
 On 12/10/2013 12:44 PM, PriyankMaindola wrote:
 
 dear members
 
 I am trying to solve this crystal structure but I am puzzled by an
 undefined blob that appeared at a glutamine residue after refinement.
 I have attached pics of that below.
 
 Is it a covalent modification of the acid-amide side chain,
 as there is no charged environment around and the density seems
 continuous?
 
 Please suggest.
 
 The following reagents were encountered by the protein
 during purification, crystallization and soaking:
 
 phenyl methyl sulfonyl fluoride
 benzamidine
 tris
 dtt (could it be cyclized dtt?)
 K[Au(CN)2]
 acidic pH
 isopropanol
 citrate
 sulfate
 phosphate
 K+, Na+, Cl-
 
 map contour:
 2fo-fc: 1 rmsd
 fo-fc (green): 3 rmsd
 
 -- 
 Priyank
 
 
 -- 
 Priyank Maindola


Re: [ccp4bb] resubmission of pdb

2014-01-31 Thread Dale Tronrud
   I would write back to the annotator who sent the processed files
to you and ask if you can restart the deposition.  The worst they
can say is no and you're back to ADIT.  On the other hand they
will probably be as happy as you to save the work that has already
been done.

Dale Tronrud

On 01/31/2014 01:04 PM, Faisal Tarique wrote:
 Dear all
 
 Dear Dr. PDB,
 
 Some time back I submitted a coordinate set to the PDB, but because the
 manuscript was not accepted we had to retract the submission. During
 this procedure I got a few zipped files from the annotator, such as 1.
 rcsb0.cif-public.gz,  2. rcsb0.pdb.gz and  3.
 rcsb0-sf.cif.gz.  Now I want to submit the same structure again. My question is: what
 is the best way to do it?
 Should we start from the beginning through the ADIT deposition tool and
 resubmit it with a new PDB id, or is there some way to submit again those
 zip files which the annotator sent us after retraction? Could you please
 suggest the easiest way to submit our structure to the PDB
 without much effort.
 
 
 -- 
 Regards
 
 Faisal
 School of Life Sciences
 JNU
 


[ccp4bb] Meeting Announcement: Northwest Crystallography Workshop

2014-02-07 Thread Dale Tronrud
  (Pacific) Northwest Crystallography Workshop
   http://oregonstate.edu/conferences/event/nwcw2014/
   June 20-22, 2014

   Registration is now open for this year's edition of the Northwest
Crystallography Workshop.  It is being hosted at Oregon State University
in Corvallis in the heart of the Willamette Valley surrounded by wine
country, wildlife refuges, and with both the Cascade Mountains and the
Pacific Ocean within easy driving distance.  It can be easily accessed
from either the Portland or Eugene airports.

   This biennial meeting has been held at various locations in the
Pacific Northwest since 1981.  It has always proven to be a great venue
to meet other researchers in the region who are interested both in using
macromolecular crystallography to solve structures and in developing and
enhancing methods.

   The workshop part of the name will be taken seriously.  We will
have talks and posters with priority for speaking slots given to
students and post-docs who focus on methodologies or interesting
structure determination stories, and/or how structural observations
provide insight into function and biology.

   There will be a reception on Friday evening.  On Saturday there will
be talks during the day and a banquet followed by a keynote address in
the evening.  Talks will continue on Sunday morning with the workshop
wrapping-up at noon.  A light lunch will be served on Saturday and a
boxed lunch will be available on Sunday.

   Register today and get your abstracts submitted anytime between now
and the April 30 abstract deadline. We've tried to keep registration
costs low and early registration (through April 30) is $75 for students
and $100 for others. Reasonably priced on-campus dormitory housing is
available and must be arranged at the time of registration. Special
rates at two local hotels have been arranged for those who want to book
their own lodgings.

   We look forward to a great meeting and celebration of the International
Year of Crystallography.

Dale and Andy

Dale E. Tronrud and P. Andrew Karplus
Department of Biochemistry & Biophysics
2011 ALS Bldg
Oregon State University
Corvallis, OR 97331
USA


Re: [ccp4bb] Can not see density map when I turn off normalization in PYMOL

2014-02-19 Thread Dale Tronrud
   When you don't normalize the map you have to specify your contour
level in whatever units the map came in.  Your output says the stdev
is 0.075 so I guess you need to contour at 0.225 to see the equivalent
image.

Dale Tronrud

P.S. I feel compelled to note that what the program is reporting as
the standard deviation is really the root mean square deviation from
zero.  The standard deviation of a map is a much more subtle quantity
as discussed recently in PNAS.
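
(A minimal sketch of that arithmetic through the PyMOL Python API, assuming
it is run inside a PyMOL session; the file and object names are made up.)

    from pymol import cmd

    # With normalization off, contour in the map's native units:
    # desired_sigma * rms reported at load time (here 3 x 0.075 = 0.225).
    rms = 0.075
    level = 3.0 * rms

    cmd.set("normalize_ccp4_maps", "off")       # must be set before loading
    cmd.load("bdligand002.ccp4", "omitmap")     # hypothetical map file
    cmd.isomesh("fo-fc_ligand", "omitmap", level, "ligand", carve=2.0)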

On 02/19/2014 09:30 AM, hongshi WANG wrote:
 Hello there,
 
  
 
 I am making a fo-fc map for one ligand using pymol. I strictly followed
 the pymol wiki protocol (Display CCP4 Maps). Finally, I can get the
 ligand map using command:
 
  isomesh fo-fc_ligand, omitmap, 3, ligand, carve=2.
 
 However, the problem is the map I got from pymol is smaller than the one
 I can see in coot at the same contour level (3.0). 
 
 So I made a second attempt, based on the assumption that it may be caused
 by mis-normalization.  I input the command: “unset
 normalize_ccp4_maps” to stop PyMOL from normalizing a ccp4 map. After
 that I loaded my ccp4 map file and tried to do the same things as I
 did the first time. But I could not see any mesh (density map)
 shown. I checked the command window.
 
 PyMOL> unset normalize_ccp4_maps
 Setting: normalize_ccp4_maps set to off.
 ObjectMapCCP4: Map Size 134 x 128 x 122
 ObjectMapCCP4: Map will not be normalized.
 ObjectMapCCP4: Current mean = -0.66 and stdev = 0.074981.
 ObjectMap: Map read.  Range: -0.511 to 0.616
 Crystal: Unit Cell  200  300  100
 Crystal: Alpha Beta Gamma  90.000  100.354  90.000
 Crystal: RealToFrac Matrix
 Crystal:  0.0060  -0.  0.0011
 Crystal:  0.  0.0045  -0.
 Crystal:  0.  0.  0.0053
 Crystal: FracToReal Matrix
 Crystal:  2000.  -34.5817
 Crystal:  0.  3000.
 Crystal:  0.  0.  100
 Crystal: Unit Cell Volume  6993536.
 ExecutiveLoad: E:/ bdligand002.ccp4 loaded as bdligand002, through
 state 1.
 PyMOL> isomesh fo-fc_ligand, bdligand002, 3, ligand, carve=2
 Executive: object fo-fc_ligand created.
 Isomesh: created fo-fc_ligand, setting level to 2
 ObjectMesh: updating fo-fc_ligand.
 
  
 
 It seems there is no error, but my ligand map, fo-fc_ligand, has no density
 shown. I also tried to show the whole mesh at level 2.0 for
 bdligand002. I still could not see the density map.
 
 My pymol is version 1.3 on the Windows 8 operating system.
 
  
 
  
 
 Any help will be greatly appreciated!
 
  
 
 Thanks in advance
 
  
 
 hongshi 
 


Re: [ccp4bb] minimum acceptable sigma level for very small ligand and more

2014-03-19 Thread Dale Tronrud

Hi,

   First, there is nothing magical about contouring a map at 1 rms.
The standard deviation of the electron density values really has no
relationship to the rms of those values, and appears to generally be
much smaller.  This is discussed quite brilliantly in the recent paper
http://www.ncbi.nlm.nih.gov/pubmed/24363322
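
   (Purely as a toy illustration of that distinction, and not the procedure
used in the paper: the rms of a map is dominated by the molecule's signal,
while the uncertainty of the individual density values, estimated here from
two independent noisy copies of the same fake map, is much smaller.)

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy 1-D "map": a strong density feature plus a small amount of noise.
    x = np.linspace(0.0, 10.0, 1000)
    signal = np.exp(-((x - 5.0) ** 2) / 0.1)            # the molecule
    map1 = signal + rng.normal(0.0, 0.02, x.size)       # two independent
    map2 = signal + rng.normal(0.0, 0.02, x.size)       # noisy realizations

    rms_of_map = np.sqrt(np.mean(map1 ** 2))            # what gets called "sigma"
    noise_sigma = np.std(map1 - map2) / np.sqrt(2.0)    # uncertainty of the values

    print(rms_of_map, noise_sigma)   # the rms is far larger than the uncertainty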

   If you have a ligand with low occupancy you expect you will have
to dial down the contour level to see it.  The question isn't how
low can you go but does your model fit ALL the available data and is
there any other model that will also fit those data.  Even if a ligand
has low occupancy it still must have good bond lengths and angles and
must make reasonable interactions with the rest of the molecule.

   One of your observations is that full occupancy cracks the crystal.
It would be good if your model explains this observation as well.

   If your ligand is present 60% of the time, what is there the other
40%?  Usually when there is a partially occupied ligand there is water
present the rest of the time.  The apparent superposition of the ligand
and the water will result in some density that is strong.  Those strong
bits will give clues about the minor conformation.  The low occupancy
water model must also make sense in terms of hydrogen bonds and bad
contacts.

   Remember, if you are looking at lower than usual rms contours in
the 2Fo-Fc style map you must evaluate your refined model by looking
at lower contour levels in your Fo-Fc style map.  You can't give your
model a free ride by excusing a weak density map but then blowing off
weak difference peaks.

   You must be very careful to consider alternative models and to
accept that sometimes you just can't figure these things out.  Just
because the density is weak does not mean that you can give it a pass
for not fitting the model.  The model has to fit the density, and
fit it better than any other model.

   You must also make clear to your readers what the occupancy of your
ligand is and the quality of the maps that lead you to this conclusion.

Dale Tronrud

P.S.  I have had great experiences with the maps produced by Buster for
looking at weak ligand density.  I have also published a model with
a 0.35 occupancy ligand although the resolution there was 1.3 A.

On 03/19/2014 07:39 AM, Amit Kumar wrote:


Hello,

My protein is 26 kDa and the resolution of the data is 1.90 Å. My
ligand is 174 Daltons, and it was soaked into the crystal. The ligand is
colored, and the crystal after soaking takes up intense color. However, if
we soak more than the optimum, the color deepens in intensity but the
crystal no longer diffracts. So perhaps the ligand's occupancy cannot be
1.00.

After model building I see ligand density, starting to appear at 0.7
sigma and clear at 0.5-0.6 sigma, close to the protein residue where it
should bind. Occupancy is ~0.6 after refinement and B factors for
the atoms of the ligand range from 30-80.

Questions I have
(1) What is the acceptable sigma level for very small ligands for peer
review/publication?
(2) I did refinement with Refmac and with phenix.refine, separately. The map
quality for the ligand is better after the Refmac refinement than after
the Phenix refinement. Why is there such a difference, and which one should I
trust? I used mostly default parameters for both (Phenix and Refmac)
before the refinement.

Thanks for your time.
Amit


[ccp4bb] Second announcement for the (Pacific) Northwest Crystallography Workshop

2014-04-09 Thread Dale Tronrud

 Second announcement for the
  (Pacific) Northwest Crystallography Workshop
   http://oregonstate.edu/conferences/event/nwcw2014/
   June 20-22, 2014

   Registration is continuing for this year's edition of the Northwest
Crystallography Workshop.  It is being hosted at Oregon State University
in Corvallis in the heart of the Willamette Valley surrounded by wine
country, wildlife refuges, and with both the Cascade Mountains and the
Pacific Ocean within easy driving distance.  It can be easily accessed
from either the Portland or Eugene airports.

   The workshop part of the name will be taken seriously.  We will
have talks and posters with priority for speaking slots given to
students and post-docs who focus on methodologies or interesting
structure determination stories, and/or how structural observations
provide insight into function and biology.

   Oregon State is the home of the Ava Helen and Linus Pauling Papers,
which is a fascinating collection that goes far beyond papers.  We
have arranged two tours of this collection for Friday afternoon before
the workshop.  If you can make it to Corvallis for either the 2 PM or
4 PM tour you will be amazed by this collection.  Let us know if you
plan to attend and we will reserve a spot.

   On May 1st the registration fee will increase by $25 from the
current $75 for students and $100 for others.

   And get those abstracts in!

   We look forward to a great meeting and celebration of the International
Year of Crystallography.

Dale and Andy

Dale E. Tronrud and P. Andrew Karplus
Department of Biochemistry & Biophysics
2011 ALS Bldg
Oregon State University
Corvallis, OR 97331
USA


Re: [ccp4bb] crystallographic confusion

2014-04-18 Thread Dale Tronrud


   I see no problem with saying that the model was refined against every
spot on the detector that the data reduction program said was observed
(and I realize there is argument about this) while declaring that the
resolution of the model is a number based on the traditional criteria.
This solution allows the best possible model to be constructed and
the buyer is still allowed to make quality judgements the same way
as always.

Dale Tronrud

On 4/18/2014 5:22 PM, Lavie, Arnon wrote:
 Dear Kay.
 
 Arguably, the resolution of a structure is the most important
 number to look at; it is definitely the first to be examined, and
 often the only one examined by non-structural biologists.
 
  Since this number conveys so much concerning the
  quality/reliability of the structure, it is not surprising that
  we need to get this one parameter right.
 
 Let us examine a hypothetical situation, in which a data set at
 the 2.2-2.0 resolution shell has 20% completeness. Is this a 2.0 A
 resolution structure?  While you make a sound argument that
 including that data may result in a better refined model (more
 observations, more restraints), I would not consider that model the
 same quality as one refined against a data set that has 90%
 completeness at that resolution shell.
 
 As I see it, there are two issues here: one, is whether to include
 such data in refinement?  I am not sure if low completeness
 (especially if not random) can be detrimental to a correct model,
 but I will let other weigh in on that.
 
  The second question is where to declare the resolution limit of a
  particular data set.  To my mind, here high completeness (the term
  high needs a precise definition) better describes the true
  resolution limit of the diffraction, and with it what I can
  conclude about the quality of the refined model.
 
 My two cents.
 
 Arnon Lavie
 
 On Fri, April 18, 2014 6:51 pm, Kay Diederichs wrote:
 Hi everybody,
 
 since we seem to have a little Easter discussion about
 crystallographic statistics anyway, I would like to bring up one
 more topic.
 
  A recent email sent to me said: "Another referee complained that
  the completeness in that bin was too low at 85%" - my answer was
  that I consider the referee's assertion as indicating a
  (unfortunately not untypical) case of severe statistical
  confusion. Actually, there is no reason at all to discard a
  resolution shell just because it is not complete - and what would
  the cutoff be, if there were one? What constitutes "too low"?
 
  The benefit of including also incomplete resolution shells is
  that every reflection constitutes a restraint in refinement (and
  thus reduces overfitting), and contributes its little bit of
  detail to the electron density map. Some people may be misled
  by a wrong understanding of the cats and ducks examples by
  Kevin Cowtan: omitting further data from maps makes Fourier
  ripples/artifacts worse, not better.
 
  The unfortunate consequence of the referee's opinion (and its
  enforcement and implementation in papers) is that the structures
  that result from the enforced re-refinement against truncated
  data are _worse_ than the originals refined against data that included the
  incomplete resolution shells.
 
 So could we as a community please abandon this inappropriate and 
 un-justified practice - of course after proper discussion here?
 
 Kay
 
 


[ccp4bb] NWCW 2014: Last day to register at early registration rates

2014-04-30 Thread Dale Tronrud


 Northwest Crystallography Workshop
   http://oregonstate.edu/conferences/event/nwcw2014/
   June 20-22, 2014

   The (Pacific) Northwest Crystallography Workshop is a regional
gathering of people who are interested in macromolecular structure
determination but folk from anywhere are welcome.  This year it will
be held at Oregon State University in Corvallis Oregon.

   Today is the last day to register at the early registration rates
of $75/$100 for students/others.  Tomorrow the prices will rise by
$25.  We encourage you to sign up today!

   We will continue to accept abstracts until May 16th, but please
try to get them in ASAP.  Registration is not linked to abstract
submission so you can register today and submit an abstract later.
Next week, however, we will begin to define the speaking schedule.

Dale E. Tronrud and P. Andrew Karplus

Department of Biochemistry and Biophysics
2011 ALS Bldg
Oregon State University
Corvallis, OR  97331


Re: [ccp4bb] stalled refinement after MR solution

2014-05-08 Thread Dale Tronrud


   Refinement of a model with only 50% completeness is problematic, but
you have four copies of a molecule (in P1) so your molecular
replacement is only looking for 24 parameters.  You should be able to
get a solution with 50% completeness.

Dale Tronrud

On 5/8/2014 1:43 PM, Yarrow Madrona wrote:
 Hi Jacob. I am worried that I would dramatically suffer in data 
 completeness. I am not sure how reliable the data are when you
 have 50% completeness. These crystals are also pretty much
 impossible to reproduce at the moment.
 
 
 On Thu, May 8, 2014 at 1:30 PM, Keller, Jacob
 kell...@janelia.hhmi.org wrote:
 
 Since your search model is so good, why not go down to P1 to see
 what's going on, then re-merge if necessary?
 
 JPK
 
 From: yarrowmadr...@gmail.com [mailto:yarrowmadr...@gmail.com]
 On Behalf Of Yarrow Madrona
 Sent: Thursday, May 08, 2014 4:29 PM
 To: Keller, Jacob
 Subject: Re: [ccp4bb] stalled refinement after MR solution
 
 I have had problems in the past with the a and c cell edges being equal and 
 having pseudo-merohedral twinning where the space group looked
 like C2221 but the true space group was P21 (near perfect 2-fold
 NCS). But I didn't think twinning was possible in this case.
 __ __
 
 On Thu, May 8, 2014 at 12:43 PM, Keller, Jacob 
 kell...@janelia.hhmi.org mailto:kell...@janelia.hhmi.org
 wrote:
 
 The b and c cell constants look remarkably similar
 
 JPK
 
 
 -Original Message- From: CCP4 bulletin board
 [mailto:CCP4BB@JISCMAIL.AC.UK] On
 Behalf Of Randy Read Sent: Thursday, May 08, 2014 3:41 PM To:
 CCP4BB@JISCMAIL.AC.UK Subject: Re:
 [ccp4bb] stalled refinement after MR solution
 
 Hi Yarrow,
 
 If Dale said that, he probably wasn't saying what he meant clearly 
 enough!  The NCS 2-fold axis has to be parallel to the 
 crystallographic 2-fold (screw) axis to generate tNCS.  In your 
 case, the NCS is a 2-fold approximately parallel to the y-axis,
 but it's nearly 9 degrees away from being parallel to y.  That
 explains why the Patterson peak is so small, and there will be very
 little disruption from the statistical effects of tNCS.
 
 The anisotropy could be an issue.  It might be interesting to look 
 at the R-factors for the stronger subset of the data.  It can make 
 sense to apply an elliptical cutoff of the data using the
 anisotropy server (though Garib says that having systematically
 incomplete data can create problems for Refmac), but I hope you're
 not using the anisotropically scaled data for refinement.  The
 determination of the anisotropic B-factors by Phaser without a
 model (underlying the anisotropy server) will not be as accurate as
 what Refmac or phenix.refine can do with a model.
 
 Finally, as Phil Evans always says, the space group is just a 
 hypothesis, so you should always be willing to go back and look at 
 the evidence for the space group if something doesn't work as
 expected.
 
 Best wishes,
 
 Randy Read
 
 ------
 Randy J. Read
 Department of Haematology, University of Cambridge
 Cambridge Institute for Medical Research     Tel: +44 1223 336500
 Wellcome Trust/MRC Building                  Fax: +44 1223 336827
 Hills Road                                   E-mail: rj...@cam.ac.uk
 Cambridge CB2 0XY, U.K.                      www-structmed.cimr.cam.ac.uk
 
 On 8 May 2014, at 18:11, Yarrow Madrona amadr...@uci.edu wrote:
 
 Hello CCP4 community,
 
 I am stumped and would love some help. I have a molecular
 replacement solution that has Rfree stuck around 40% while Rwork
 is around 30%. The model is actually the same enzyme with a
 similar inhibitor bound. Relevant information is below.
 
 -Yarrow
 
 I have solved a structure in a P21 spacegroup:
 
 51.53 88.91 89.65, beta = 97.1.
 
 Processing stats (XDS) are very good with low Rmerge (~5%
 overall)
 and good completeness.
 
 I don't think twinning is an option with these unit cell
 dimensions. My data were highly anisotropic. I ran the data
 through the UCLA anisotropy server to scale in the b direction 
 (http://services.mbi.ucla.edu/anisoscale/)
 
 I get a small (a little over 5) Patterson peak suggesting there
 is
 not much tNCS to worry about. However, the output structure does 
 have 2-fold symmetry (see below) and, as Dale Tronrud pointed out, 
 there is always tNCS in a P21 space group with two monomers
 related by a 2-fold axis.
 I calculated the translation to be unit cell fractions of 0.36,
 0.35, 0.32.
 
 rota_matrix   -0.9860   -0.1636   -0.0309
 rota_matrix   -0.1659    0.9511    0.2605
 rota_matrix   -0.0132    0.2620   -0.9650
 tran_orth      34.3310  -24.0033  107.0457
 
 center_orth   15.76077.2426   77.7512
 
 Phaser stats: SOLU SET  RFZ=20.3 TFZ

Re: [ccp4bb] PDB passes 100,000 structure milestone

2014-05-14 Thread Dale Tronrud


   The policy doesn't say you can supersede someone else's entry.
It says you can deposit your own version, if you have a publication.
Then there will be two bogus structures instead of one.  Pretty soon
the PDB will start to look like one of the crappy Matrix movies.

Dale Tronrud

On 5/14/2014 6:47 PM, James Holton wrote:
 
  A little loophole that might make everyone happy can be found
  here: http://www.wwpdb.org/policy.html - search for "A re-refined
  structure based on the data from a different research group"
 
 Apparently, anyone can supersede any PDB entry, even if they
 weren't the original depositor.  All they need is a citation.
 Presumably, someone could re-refine 2hr0 against the data that
 were deposited with it. Possibly showing how to get an R-factor of
 0% out of it.  I'd definitely cite that paper.
 
 -James Holton MAD Scientist
 
 On 5/14/2014 11:01 AM, Nat Echols wrote:
  On Wed, May 14, 2014 at 10:53 AM, Mark Wilson mwilso...@unl.edu wrote:
 
 As for the meaning of integrity, I'm using this word in place of 
 others that might be considered more legally actionable.  A
 franker conversation would likely more clearly draw the line that
 we're wrestling with here.
 
 
 The reference to integrity was Bernhard's - quoting the PDB
 mission statement; I just disagree with his interpretation of the
 meaning.  As far as 2hr0 is concerned, I think we're quite safe
 calling it fraudulent at this point, since (ironically) Nature
 itself has said as much:
 
 http://www.nature.com/news/2009/091222/full/462970a.html
 
 -Nat
 


[ccp4bb] Deadline Approaches for Registration to the Northwest Crystallographic Workshop

2014-06-05 Thread Dale Tronrud


  (Pacific) Northwest Crystallography Workshop
   http://oregonstate.edu/conferences/event/nwcw2014/
   June 20-22, 2014

   Summer is fast approaching and so is the Northwest Crystallography
Workshop to be held here in beautiful Corvallis, Oregon.
The last day to register is next Tuesday, June 10th.  If you are
planing to attend but have not registered, you'd best get to it!

   This workshop has been held at various locations in the
Pacific Northwest since 1981.  It has always proven to be a great
venue to meet other researchers in the region who are interested
both in using macromolecular crystallography to solve structures
and in developing and enhancing methods.  This will be a cozy
meeting with lots to learn and plenty of networking opportunities.

Dale and Andy

-- 
Dale E. Tronrud and P. Andrew Karplus
Department of Biochemistry and Biophysics
2011 ALS Bldg
Oregon State University
Corvallis, OR  97331


Re: [ccp4bb] Solvent channels

2014-06-27 Thread Dale Tronrud
On 06/27/2014 06:33 AM, Bernhard Rupp wrote:
 For small ion soaking for phasing purposes, partial occupancy is not a
 problem. For example, a few 1/2-occupied iodines can still phase quite well.
 1/2 a C is only 3 electrons, not that great. Add in higher displacement, and
 odds are that the ligand interpretation will become difficult. Particularly
 when the binding constants are poor, one will in principle never reach
 full occupancy, which further exacerbates the weak density problem.
 Patience is definitely a virtue here.

 BR

   Here you are starting to mix equilibrium arguments with the previous
kinetic arguments.  If you have a weak binder you can always get full
occupancy by adding enough of the compound - to determine how much, you
must consider not only the binding constant but the number of binding
sites in the crystal and the total volume of the drop containing your
crystal.  Time is not a factor.
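
   (A back-of-the-envelope version of that bookkeeping, with all numbers
hypothetical and kinetics ignored: a simple mass balance over the drop shows
that adding enough compound drives the occupancy toward 1 even for a weak
binder.)

    import numpy as np

    kd       = 1.0e-3   # binding constant, M (a weak, millimolar binder)
    sites    = 5.0e-3   # binding sites in the crystal, averaged over the drop, M
    drop_vol = 2.0e-6   # total volume of the drop, L

    for added in (1.0e-3, 5.0e-3, 2.0e-2):      # total compound added, M
        # Mass balance: bound = sites*free/(Kd + free) and free + bound = added,
        # which gives a quadratic in the free concentration.
        b = kd + sites - added
        free = (-b + np.sqrt(b * b + 4.0 * kd * added)) / 2.0
        occ = free / (kd + free)
        print(f"added {added:.3f} M -> occupancy {occ:.2f} "
              f"({added * drop_vol * 1e6:.3f} umol of compound in the drop)")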

   Halide ions and cryoprotectants are known to pervade crystals very
rapidly, but they are usually added with overwhelming force.  Much
more is added than is required to bind to every specific binding site in
the crystal.  The rate of diffusion, as mass flow, depends not only on
viscosity but on the concentration of unbound molecules inside the crystal.

   When I was soaking an inhibitor into a crystal of Thermolysin I was
having problems with the crystals falling apart.  My belief was that the
inhibitor caused a small change in cell constants and since the
inhibitor first bound in a shell around the surface of the crystal
strain was created and the crystal cracked.  My solution was to add
small aliquots of inhibitor with a long enough wait between to allow
each batch to diffuse throughout the crystal.  Despite waiting up to 6
hours between additions the crystals still cracked.

   This is when I realized that after the inhibitor bound in the outer
shell of the crystal the remaining concentration of free inhibitor was
one billionth (since the binding constant was nanomolar) that of the
concentration of active sites and the remaining mass flow within the
crystal was insignificant.  Of course the next aliquot would rapidly
diffuse through the occupied region of the crystal and be bound in the
shell just below it, becoming trapped itself and increasing the strain.

   Your movie doesn't include any details of concentration of your dye,
nor what its binding constant is to any sites in a protein nor any
mention of kon or koff.  The lack of information makes it very difficult
to draw any conclusions from the experiment, but I believe the
experience from many other molecules is that small molecules do move
very rapidly through protein crystals, until they are caught by a
binding site.  I don't believe your movie represents typical diffusion
of small molecules in a protein crystal.

   My interpretation of your movie is:

1) The dye rapidly diffuses into the crystal reaching a simple
equilibrium where the concentration in the bulk solvent matches that of
the outside solution.  Since the protein excludes about half of the
volume of the crystal the overall concentration is half that of the
mother liquor and the color of the crystal is 1/2 as dark as the
surrounding solution.

2) With a slow kon, the dye molecules within the crystal start binding
specifically to the protein.  Since the dye is aromatic it probably has
to dig deep into the protein to find a binding site and this takes time.
 As dye is removed from the bulk solvent it is rapidly replaced by
diffusion from outside the crystal, and the crystal begins to darken,
eventually becoming darker than the surrounding liquid.

   The speed of binding is controlling the kinetics not diffusion.

Dale Tronrud


 
 -Original Message-
 From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of
 Keller, Jacob
 Sent: Friday, June 27, 2014 3:07 PM
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] Solvent channels
 
 And yet halides--even iodide--permeate those same lysozyme crystals and
 others entirely in 30--60 sec.
 
 JPK
 
 -Original Message-
 From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of
 Bernhard Rupp
 Sent: Friday, June 27, 2014 9:00 AM
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] Solvent channels
 
 Just a remark: diffusion is a slow and random-walk process. Particularly
 large molecules in viscous media (PEG anybody?) move (diffuse) slowly in
 solution. To simply extrapolate from the fact that the ligand is smaller
 than the solvent channels to the odds of the presence of a ligand is a risky
 proposition. Positive omit difference density after 'shoot first' as Boaz
 indicated is a much better indication. And shoot you probably will a lot.
 
 The little movie below shows how slowly even a small aromatic dye molecule
 soaks into a crystal.  Total time 10 hrs.
 
 http://www.ruppweb.org/cryscam/lysozyme_dye_small.wmv
 
 The literally hundreds of empty ligand structures collected in Twilight

[ccp4bb] New Version of the Protein Geometry Database Now Available

2014-06-27 Thread Dale Tronrud

Protein Geometry Database Server V 1.0
 http://pgd.science.oregonstate.edu/
Developed by Andy Karplus' laboratory at Oregon State University

   We are pleased to announce the availability of an enhanced version
of the Protein Geometry Database (PGD) web service, originally
announced in Berkholz et al (2010) Nucleic Acids Research 38, D320-5.
This server allows you to explore the many backbone and side chain
conformations that exist in the PDB as well as the protein geometry
(lengths and angles) that occur in those conformations. This service
is ideal for finding instances of particular conformations or peculiar
bond lengths or angles.  It is also quite adept at identifying sets of
fragments that can then be examined for systematic variation in
ideal geometry. The expanded PGD now includes all conformational and
covalent geometry information not just for the backbone but also for
the sidechains.

   There are three basic operations available: selecting a set of
fragments via a delimited search, analyzing the geometry of those
fragments, and dumping the results to your computer for more
specialized analysis.

   To control bias in statistical analyses due to the variable number
of entries with the same or similar sequence, the database contains
only the highest quality model in each sequence cluster as identified
by the Pisces server from Roland Dunbrack's lab.  Two settings, 90%
and 25% sequence identity, are available.  Currently, at the 90%
sequence identity level there are 16,000 chains and at the 25% level
this drops to about 11,000 chains.

   You can filter a search based on the quality of the model as
indicated by resolution and R values.  A search can also be filtered
based on DSSP secondary structure, amino acid type, the phi/psi/omega
angles and bond lengths, angles, and chi angles.  For example, you can
find all cysteine residues in the center of three-residue peptide
fragments (i.e. not at a peptide terminus), in beta sheet, with both
peptide bonds trans, and CB-SG length greater than 1.85 A from models
with resolution better than 1.5 A.  By the way, in the no more than
25% sequence identity category there are 25 of them.

   Once you have a set of results, you can create 2D plots showing the
relationships of up to three features (i.e. bond lengths, bond angles,
or conformational angles).  For instance, you can look at how a given
feature varies with phi and psi using a phi(i)/psi(i) plot.  Or, you
can just as easily look at the variation with psi(i)/phi(i+1), or even
the relationships between any selected bond angles.  As one example,
it is instructive to perform a default search and plot NCaCb vs NCaC
colored based on CbCaC.  As this search is restricted to just the
highest resolution models, you can see the justification for chiral
volume restraints.

   Finally, all of your results can be downloaded for your own analysis.
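   If you want to recreate that kind of plot from a downloaded dump, a
minimal Python sketch follows; the file name and column headers are
assumptions for illustration, not the actual PGD export format.

# Hypothetical sketch: recreate an NCaCb vs NCaC plot, colored by CbCaC,
# from a PGD results dump.  The file name and column names are assumptions.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("pgd_results.csv")      # saved from the PGD download step
sc = plt.scatter(df["NCaCb"], df["NCaC"], c=df["CbCaC"], s=4, cmap="viridis")
plt.xlabel("N-CA-CB angle (deg)")
plt.ylabel("N-CA-C angle (deg)")
plt.colorbar(sc, label="CB-CA-C angle (deg)")
plt.show()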

   Development of the PGD continues.  If you have worked with the site
and have any ideas or suggestions for how to improve it, please
drop us a line.

   The publication describing the PGD is:

Berkholz, D.S., Krenesky, P.B., Davidson, J.R., & Karplus, P.A.
(2010) Protein Geometry Database: A flexible engine to explore
backbone conformations and their relationships with covalent geometry.
Nucleic Acids Res. 38, D320-5.

   Also, some examples of published analyses enabled by earlier
versions of the PGD are listed here:

Berkholz, D.S., Shapovalov, M.V., Dunbrack, R.L.J. & Karplus, P.A.
(2009). Conformation dependence of backbone geometry in proteins.
Structure 17, 1316-1325.

Hollingsworth, S.A., Berkholz, D.S. & Karplus, P.A. (2009). On the
occurrence of linear groups in proteins. Protein Science 18, 1321-1325

Hollingsworth, S.A. & Karplus, P.A. (2010). Review: A fresh look at
the Ramachandran plot and the occurrence of standard structures in
proteins. BioMolecular Concepts 1, 271-283.

Berkholz, D.S., Driggers, C.M., Shapovalov, M.V., Dunbrack, R.L., Jr.
& Karplus, P.A. (2012) Nonplanar peptide bonds in proteins are common
and conserved but not biased toward active sites. Proc Natl Acad Sci U
S A.  109, 449-53.

Dale Tronrud & P. Andrew Karplus
Department of Biochemistry and Biophysics
Oregon State University


Re: [ccp4bb] New Version of the Protein Geometry Database Now Available

2014-06-30 Thread Dale Tronrud


   The Protein Geometry Database looks at proteins as collections of
bond lengths, angles, and torsion angles.  It is not the place to go
when you want to know how a protein part is related in space to some
other (covalently) distant part.

   Andy tells me that Jacque Fetrow, who was at Wake Forest University,
has a database that might answer your query.  There is a paper at

J Mol Biol. 2003 Nov 28;334(3):387-401.

Structure-based active site profiles for genome analysis and functional
family subclassification.

   Neither one of us has used it.

Hope that helps,
Dale Tronrud

On 6/27/2014 1:49 PM, Keller, Jacob wrote:
 I have wanted for some time to search for catalytic-triad-like
 configurations by defining three Ca-Cb bonds from known catalytic
 triads, then searching the pdb for matches, but have not thought of
 a quick and/or easy way to do this--can your software do this sort
 of thing, or is there some other software which could be used for
 this?
 
 JPK
 
 -Original Message- From: CCP4 bulletin board
 [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Dale Tronrud Sent:
 Friday, June 27, 2014 4:27 PM To: CCP4BB@JISCMAIL.AC.UK Subject:
 [ccp4bb] New Version of the Protein Geometry Database Now
 Available
 

Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-03 Thread Dale Tronrud
   I'm not sure how encryption can solve a problem of truth or falsity.
Public key encryption only says that the message that is decrypted using
the public key must have been encrypted by someone who knows the private
key.  A person can use their private key to encrypt a lie as well as the
truth.

   I don't quite follow your prescription, but if you are saying that
the beamline gives the depositor a code when they collect data that
proves data were collected, how do the beamline personnel know the
contents of the crystal?  Couldn't one simply collect HEWL and then
deposit any model they like?

   The beamline could encrypt all images with their private key, and
the data integration program could decrypt the images using the public
key.  That way when a depositor presents a set of images it could be
proved that those images came, unmodified, from that beamline.  The
images would still have to be deposited, however.  (And this provides
no protection against forgeries of home source data sets.)
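   In modern terms what I am describing is a detached digital signature over
each image rather than literal encryption.  A minimal sketch of the idea in
Python, assuming the third-party cryptography package and a made-up file name:

# Sketch: the beamline signs each image with its private key; anyone can later
# verify the signature with the published public key.  File name is hypothetical.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

beamline_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = beamline_key.public_key()          # published by the facility

data = open("image_0001.cbf", "rb").read()
pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                  salt_length=padding.PSS.MAX_LENGTH)

signature = beamline_key.sign(data, pss, hashes.SHA256())   # done at the beamline

# done later by the *pdb or an integration program; raises InvalidSignature
# if the image was modified after it left the beamline
public_key.verify(signature, data, pss, hashes.SHA256())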

Dale Tronrud

On 04/03/12 13:19, Bryan Lepore wrote:
 On the topic of MX fraud : could not an encryption algorithm be
 applied to answer the question of truth or falsity of a pdb/wwpdb/pdbe
 entry? has anyone proposed such an idea before?
 
 for example (admittedly this is a mess):
 
 * a detector parameter - perhaps the serial number - is used as a
 public key. the detector parameter is shared among
 beamlines/companies/*pdb. specifically, the experimenter requests it
 at beamtime.
 
 * the experimenter voluntarily encrypts something, using GPLv3 programs,
 small but essential to the deposition materials, like the R-free set
 indices (or please suggest something better), using their private key.
 maybe symmetric cipher would work better for this. or the Free R set
 indices are used to generate a key.
 
 * at deposition time, the *pdb decrypts the relevant entry
 components using their private key related to the detector used.
 existing deposition methods pass or fail based on this (so maybe not
 the Free R set).
 
 * why do this : at deposition time, *pdb will have a yes-or-no result
 from a single string of characters. can be a stop-gap measure until
 images can be archived easily. all elements of the chain are required
 to be free and unencumbered by proprietary interests. importantly, it
 is voluntary. this will prevent entries such as Schwarzenbacher or
 Ajees getting past deposition - so admittedly, not many.
 
 references:
 http://en.wikipedia.org/wiki/RSA_(algorithm)
 http://en.wikipedia.org/wiki/Diffie-Hellman_key_exchange
 
 -Bryan


Re: [ccp4bb] Disorder or poor phases?

2012-04-10 Thread Dale Tronrud
Dear Gerard,

   No, the updated model (4BCL) was published in 1993 (although apparently
not deposited until 1998 - What was wrong with me?)  Both were refined
with that classic least-squares program TNT.  I hope there was some
improvement in the software between 1986 and 1993, and I always tried to
work with the most recent version, but there wasn't a switch in target
function.

   I agree that the distortions in these maps would have been less if
an ML approach had been used and perhaps the location of the disordered
residues would have been apparent earlier in the process.   Maybe this
sort of problem will not be seen again at 1.9 A resolution.  My goal was
simply to provide an example where errors due to model phases didn't
distribute evenly throughout the map but had greater consequence in some
locations.

Dale

On 04/10/12 13:45, Gerard Bricogne wrote:
 Dear Dale,
 
  There is perhaps a third factor in the progress you were able to make,
 namely the improvement in the refinement programs. Your first results were
 probably obtained with a least-squares-based program, while the more recent
 would have come from maximum-likelihood-based ones. The difference lies in
 the quality of the phase information produced from the model through
 comparison of Fo and Fc, with much greater bias-correction capabilities in
 the ML approach. Here, it removed the bias towards some regions being absent
 in the model, and made them no longer be absent in the maps. So it is a
 question of the quality of the phase information.
 
 
  With best wishes,
  
   Gerard.
 
 --
 On Tue, Apr 10, 2012 at 12:00:28PM -0700, Dale Tronrud wrote:
The phases do have effects all over the unit cell but that does not
 prevent them from constructively and destructively interfering with one
 another in particular locations.  Some years ago I refined a model of
 the bacteriochlorophyll containing protein to a 1.9 A data set when the
 sequence of that protein was unknown.  This is primarily a beta sheet
 protein and a number of the loops between the strands were disordered.
 Later the amino acid sequence was determined and I finished the refinement
 after building in these corrections.  The same data set was used, but
 a number of the loops had become ordered.  While the earlier model
 (3BCL) had 357 amino acids the final model (4BCL) had 366.

These nine amino acids didn't become ordered over the intervening
 years.  They were just as ordered when I was building w/o a sequence,
 it is just that I couldn't see how to build them based on the map's
 appearance.

One possibility is that the density for these residues was weak
 and the noise (that was uniform over the entire map) obliterated their
 signal where it only obscured the stronger density.  Another possibility
 is that the better model had a better match of the low resolution F's
 and less intense ripples radiating from the surface of the molecule,
 resulting in things sticking out being less affected.

Whatever the details, the density for these amino acids were too
 weak to model with the poorer model phases and became buildable with
 better phases.  The fact that they could not be seen in the early map
 was not an indication that they were disordered.

The first six amino acids of this protein have never been seen in
 any map, including the 1.3 A resolution model 3EOJ (which by all rights
 should have been called 5BCL ;-) ).  These residues appear to be truly
 disordered.  Going back to 3BCL - The map for this model is missing
 density for a number of residues of which we know some are disordered
 and some simply unmodelable because of the low quality of the phases.
 I don't know of a way, looking at that map alone, of deciding which
 is which.  Because of this observation I don't believe it is supportable
 to say I don't see density for these atoms therefore they must be
 disordered.  Additional evidence is required.

 Dale Tronrud



 On 04/10/12 08:38, Tim Gruene wrote:
 Dear Francis,

 the phases calculated from the model affect the whole unit cell hence it
 is more likely this is real(-space, local) disorder rather than poor
 phases.

 Regards,
 Tim

 P.S.: The author should not look at an 2fofc-map but a
 sigma-A-weighted map to reduce model bias.

 On 04/10/12 17:22, Francis E Reyes wrote:
 Hi all,

 Assume that the diffraction resolution is low (say 3.0A or worse)
 and the model (a high resolution homologue, from 2A xray data is 
 available) was docked into experimental phases (say 4A or worse)
 and extended to the 3.0A data using refinement (the high resolution
 model as a source of restraints). There are some conformational
 differences between the high resolution model and the target
 crystal.

 The author observes that in the 2fofc map at 3A, most of the model 
 shows reasonable density, but for a stretch of backbone the
 density is weak.

 Is the weakness of the density in this region because of disorder
 or bad model phases?


 Would

Re: [ccp4bb] Disorder or poor phases?

2012-04-11 Thread Dale Tronrud

On 4/10/2012 10:44 PM, Kay Diederichs wrote:

Hi Dale,

my experience is that high-B regions may become visible in maps only late in 
refinement. So my answer to the original poster would be - both global reciprocal-space 
(phase quality) and local real-space (high mobility) features contribute to a region not appearing 
ordered in the map. This would be supported by your experience if those residues that you 
could not model in 3BCL had high (or at least higher) B-factors compared to the rest of the model. 
Is that so?


   Actually the residues I couldn't model in 3BCL had no B's... :)

   Seriously, the residues that appeared for 4BCL did have B values much
higher than average.  Their density was weak in the best of circumstances
and more susceptible to obliteration by the distortions caused by
imprecision in the phases.  I don't really want to describe this as phase
error as that phrase conjures notions of large changes in phase.  The
R value only dropped from 18.9% to 17.8% from 3BCL to 4BCL.  I don't
expect there were huge differences in the phase angles, but the differences
were enough.

Dale

best,

Kay


Re: [ccp4bb] Criteria for Ligand fitting

2012-04-25 Thread Dale Tronrud
   While I'm quite happy with all the responses this question has provoked there
is an additional point I would like to contribute.  It is not enough to say that
you can interpret your map with a model based on what you expect.  You have to
also show that you can't interpret your map with any other reasonable model.
Saying that my map is consistent with my model is a very weak statement in the
absence of exclusivity.

   A recent example of this sort of problem can be read about at (warning: 
tooting
my own horn)

http://www.springerlink.com/content/b8h6lg138635380v/?MUD=MP

Dale Tronrud

On 04/23/12 21:02, Naveed A Nadvi wrote:
 Dear Crystallographers,
 
 We have obtained a 1.7 A dataset for a crystal harvested from crystallization 
 drop after 2 weeks of soaking with inhibitor. The inhibitor has an aromatic 
 ring and also an acidic tail derived from other known inhibitors. The active 
 site hydrophobic crown  had been reported to re-orient and a charged residue 
 is known to position for forming a salt-bridge with similar ligands. When 
 compared to apo structures, we can clearly see the re-orientation of these 
 protein residues. 
 
 However, there is no clear density visible for the ligand in the Fo-Fc map. 
 Some density is visible in the 2Fo-Fc map with default settings in COOT. We 
 were expecting covalent modifications between the inhibitor, co-factor and 
 protein residues. In fact, the Fo-Fc map suggested the protein residue is no 
 longer bonded to the co-factor (red negative density) and a green positive 
 density is observed nearby for the protein residue. These observations, along 
 with the extended soaking and the pre-determined potency convince us that the 
 inhibitor is present in the complex.
 
 When I lower the threshold of the blue 2Fo-Fc map (0.0779 e/A^3; 0.2 rmsd) we 
 can see the densities for the aromatic ring and the overall structural 
 features. These densities were observed without the cofactor and the inhibitor 
 in the initial MR search model. The R/Rfree for this dataset without 
 inhibitor was 0.20/0.24 (overall Bfactor 17.4 A^2). At 50% occupancy, 
 modeling the inhibitor showed no negative densities upon subsequent refinement. 
 With the inhibitor, the R/Rfree was 0.18/0.22 (overall Bfactor 18.8 A^2). The 
 temp factors of the inhibitor atoms (50% occ) were 15-26 A^2.
 
 My understanding is phase from the MR search model may influence Fo-Fc maps, 
 and the 2Fo-Fc map minimizes phase bias. Since the inhibitor was absent from 
 the MR search model, can these observations be used to justify the fitting of 
 the ligand in the map? Given the low map-level used to 'see' the ligand, 
 would this be considered noise? Can I justify the subsequent fall in R/Rfree 
 and the absence of negative density upon ligand fitting as proof of correct 
 inhibitor modeling? I would appreciate it if you could comment on this issue. Or 
 tell me that I'm dying to see the inhibitor and hence imagining things!
 
 Kind Regards,
 
 Naveed Nadvi.


Re: [ccp4bb] Anisotropic diffraction

2012-04-29 Thread Dale Tronrud

   If the data set had P6 symmetry before anisotropic scaling it would
keep that symmetry afterwards.  If it was only P2 symmetry before, it
certainly would not have P6 afterwards.  Any anisotropic scaling I've
seen constrains the anisotropy to the lattice symmetry so symmetry
cannot be degraded via its application.

   If your data set had, in principle, P6 symmetry but was expressed in
a lower symmetry asymmetric unit and contained nonsymmetry-conforming
noise before anisotropic scaling it would also contain broken symmetry
afterwards.  The higher symmetry was not lost, it was never there to
begin with.
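
   For concreteness, a sketch of the standard lattice-symmetry constraints
on the anisotropic scale tensor, written in the usual beta_ij convention
(conventional settings, with the 6-fold along c and the monoclinic 2-fold
along b; your scaling program's convention may differ):

\[ \exp\bigl[-(\beta_{11}h^{2}+\beta_{22}k^{2}+\beta_{33}l^{2}
      +2\beta_{12}hk+2\beta_{13}hl+2\beta_{23}kl)\bigr] \]

\[ \text{hexagonal: }\ \beta_{11}=\beta_{22},\quad
      \beta_{12}=\tfrac{1}{2}\beta_{11},\quad \beta_{13}=\beta_{23}=0 \]

\[ \text{monoclinic }(b\ \text{unique}):\ \beta_{12}=\beta_{23}=0,\quad
      \beta_{13}\ \text{unconstrained} \]

So a correctly assigned hexagonal lattice cannot be scaled anisotropically
in a way that breaks the 6-fold, while a monoclinic assignment leaves the
pseudo-hexagonal plane free to refine away from it.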

Dale Tronrud

On 4/28/2012 12:06 AM, Zhijie Li wrote:

Hi,

My first thought was the same as David's: the truncation won't change the crystal's 
space group. The symmetry of the crystal is
reflected by the symmetry of the amplitudes of many many reflections across all 
resolutions. Ellipsoidal truncation itself only
removes some very weak reflections from the outer shells. The remaining 
reflections will still have a good number of reflections
carrying the symmetry of the crystal.

However a second thought on the anisotropic scaling and B-factor correction led 
me to this scenario: suppose we have a crystal
that's really P6, but we have cowardly indexed it to a lower space group P2, 
with the 2-fold axis, b, coinciding with the real 6-fold
axis. By losing the a=c restraint, the anisotropic scaling along H and L now may 
not be strictly equal (for example, could be caused
by outliers that would have been identified and filtered out if indexed 
correctly as P6), resulting in the loss of the 6-fold
symmetry in the scaled dataset. Apparently this is an artifact due to an 
improper SG assignment before the anisotropic scaling and
B-factor correction.

Just some crazy thoughts. Please correct me if I am wrong.



BTW, to Theresa: a very informative introduction on ellipsoidal truncation and 
anisotropic scaling can be found here:

http://services.mbi.ucla.edu/anisoscale/



--
From: Theresa Hsu theresah...@live.com
Sent: Friday, April 27, 2012 3:18 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] Anisotropic diffraction


Dear crystallographers

A very basic question, for anisotropic diffraction, does data truncation with 
ellipsoidal method change the symmetry? For example,
if untruncated data is space group P6, will truncated data index as P622 or P2?

Thank you.

Theresa


Re: [ccp4bb] Strange Density

2012-05-15 Thread Dale Tronrud
   Your holo structure has a Ca++ and three water molecules that have not been
built into your low resolution apo map.  These atoms are not expected to be
resolved at 3 A resolution, so I would expect them to appear as a large, 
misshapen,
blob.  Your screenshot only shows one contour level.  It is quite possible that
the highest density value is not at the center of the blob.

   You might have a lower occupancy Ca++ atom at the site and the image is 
confused
by the low resolution.  Remember, even if the concentration of Ca++ is lower in
this mother liquor any Ca++ that binds will bind exactly as it does in the
fully occupied case.  A weakly binding Ca++ site will not bind before the 
strongly
binding site.

   I would first look to see what your holo map looks like when its resolution 
is
truncated to 3 A.  This will give you a sense of what a Ca++ binding in this 
site
would look like.  You could try refining a model with the Ca++ and water 
molecules,
with lower occupancy, and see what the residual difference map looks like.  You 
will,
of course, have to have strong restraints on the geometry to hold this model 
together
at 3 A resolution, but fortunately you have a higher resolution model to base 
these
restraints on.

   The PDB file is a statement of your belief of what is in the crystal.  Don't 
waste
your time refining models that don't make chemical sense.  An ion floating in 
space
with no ligands is not a reasonable model so even if it fits the density it 
can't
be correct.

   There are multiple ways of justifying the model of a crystal and others on the 
list
will likely have different ideas for the criteria that should be used.  My 
belief is
that you know the holo model and the most likely outcome of your Ca++ extraction
experiment (in a Bayesian prior sense) is a lower occupancy binding of the Ca++ 
and
its water molecules.  If you build and refine that model and the difference map 
is
acceptable you can say that this model is consistent with your experiment.  If 
there
is residual density then you can conclude that something is replacing the Ca++,
but untangling superimposed, partial occupancy, models at 3.1 A resolution is
extremely difficult.  I think all you will be able to say is that something
replaces the Ca++ but it cannot be identified.

   Not everything can be identified in a 3 A map.  Not everything can be 
identified
in a 1 A map.  Your job is to say these parts I understand and these parts I 
don't.

Dale Tronrud

On 05/15/12 07:51, RHYS GRINTER wrote:
 Dear Community,
 
 As I'm relatively new to protein crystallography this might turn out to be 
 an obvious question, however.
 
 I'm working on the structure of an enzyme requiring Ca2+ for activity and with 
 calcium coordinated in the active site by Asp and 2x backbone carbonyl 
 groups, in a crystal structure with Ca in the crystallisation conditions 
 (http://i1058.photobucket.com/albums/t401/__Rhys__/MDC_TD_15A.jpg). 
 When Ca is omitted from the crystallizing conditions and a divalent chelator 
 (EGTA) is added the crystals are of significantly lower resolution (3.13A). 
 Refinement of this data reveals density for a molecule coordinated by the Ca 
 coordinating Asp and backbone, however this density is significantly further 
 away (3.4-3.8A) too far away for water or a strongly coordinated divalent 
 cation(http://i1058.photobucket.com/albums/t401/__Rhys__/MDC_EGTA_315.jpg). 
 The density is also much weaker than for Ca in the previous model 
 disappearing at 3.5 sigma.
 
 The crystallisation conditions for the Ca free condition is:
 
 0.1M Tris/Bicine buffer [pH 8.5]
 8% PEG 8000
 30% Ethylene Glycol
 1mM EGTA
 
 The protein was purified by nickel affinity/SEC and dialysed into: 
 20mM NaCl 
 20mM Tris [pH 8.0]
 
 
 A colleague suggested that sulphate or phosphate could fit at these 
 distances, but these ions have not been added at any stage of the 
 crystallisation process. 
 
 
 Could anyone give me some insight into what this density might represent?
 
 Thanks in advance,
 
 Rhys Grinter
 PhD Candidate
 University of Glasgow


Re: [ccp4bb] Calculating ED Maps from structure factor files with no sigma

2012-05-23 Thread Dale Tronrud
On 05/23/12 08:06, Nicholas M Glykos wrote:
 Hi Ed,
 
 
 I may be wrong here (and please by all means correct me), but I think
 it's not entirely true that experimental errors are not used in modern
 map calculation algorithm.  At the very least, the 2mFo-DFc maps are
 calibrated to the model error (which can be ideologically seen as the
 error of experiment if you include model inaccuracies into that).
 
 This is an amplitude modification. It does not change the fact that the 
 sigmas are not being used in the inversion procedure [and also does not 
 change the (non) treatment of missing data]. A more direct and relevant 
 example to discuss (with respect to Francisco's question) would be the 
 calculation of a Patterson synthesis (where the phases are known and 
 fixed).
 
 
 I have not done extensive (or any for that matter) testing, but my 
 evidence-devoid gut feeling is that maps not using experimental errors 
 (which in REFMAC can be done either via a GUI button or by excluding SIGFP 
 from LABIN in a script) will for a practicing crystallographer be 
 essentially indistinguishable.
 
 It seems that although you are not doubting the importance of maximum 
 likelihood for refinement, you do seem to doubt the importance of closely 
 related probabilistic methods (such as maximum entropy methods) for map 
 calculation. I think you can't have it both ways ... :-)
 
 
 
 The reason for this is that model errors as estimated by various
 maximum likelihood algorithms tend to exceed experimental errors.  It
 may be that these estimates are inflated (heretical thought but when you
 think about it uniform inflation of the SIGMA_wc may have only
 proportional impact on the log-likelihood or even less so when they
 correlate with experimental errors).  Or it may be that the experimental
 errors are underestimated (another heretical thought).
 
 My experience from comparing conventional (FFT-based) and maximum-entropy- 
 related maps is that the main source of differences between the two maps 
 has more to do with missing data (especially low resolution overloaded 
 reflections) and putative outliers (for difference Patterson maps), but in 
 certain cases (with very accurate or inaccurate data) standard deviations 
 do matter.

   In a continuation of this torturous diversion from the original question...

   Since your concern is not how the sigma(Fo) plays out in refinement but
how uncertainties are dealt with in the map calculation itself (where an
FFT calculates the most probable density values and maximum entropy would
calculate the best, or centroid, density values) I believe the most
relevant measure of the uncertainty of the Fourier coefficients would be
sigma(2mFo-DFc).  This would be estimated from a complex calculation of
sigma(sigmaA), sigma(Fo), sigma(Fc) and sigma(Phic).  I expect that the
contribution of sigma(Fo) would be one of the smallest contributors to this
calculation, as long as Fo is observed.  I wouldn't expect the loss of
sigma(Fo) to be catastrophic.
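
   As a rough first-order sketch of that propagation (treating m, D, Fo and
Fc as independent and ignoring the phase term entirely, so for illustration
only):

\[ \sigma^{2}(2mF_{o}-DF_{c}) \approx (2F_{o})^{2}\sigma^{2}(m)
      + (2m)^{2}\sigma^{2}(F_{o}) + F_{c}^{2}\sigma^{2}(D)
      + D^{2}\sigma^{2}(F_{c}) \]

Both m and D are functions of sigmaA, so their uncertainties trace back to
sigma(sigmaA).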

   Wouldn't sigma(sigmaA) be the largest component since sigmaA is a function
of resolution and based only on the test set?

Dale Tronrud


 
 
 All the best,
 Nicholas
 
 


Re: [ccp4bb] Fwd: [ccp4bb] Death of Rmerge

2012-05-31 Thread Dale Tronrud
On 05/31/12 12:07, Jacob Keller wrote:
 Alas, how many lines like the following from a recent Science paper
 (PMID: 22605777), probably reviewer-incited, could have been avoided!

 Here, we present three high-resolution crystal structures of the
 Thermus thermophilus (Tth) 70S ribosome in complex withRMF, HPF, or
 YfiA that were refined by using data extending to 3.0 Å (I/sI = 1),
 3.1 Å (I/sI = 1), and 2.75 Å (I/sI = 1) resolution, respectively. The
 resolutions at which I/sI = 2 are 3.2 Å, 3.4 Å, and 2.9 Å,
 respectively.


   I don't see how you can avoid something like this.  With the new,
higher, resolution limits for data (which are good things) people will
tend to assume that a 2.6 A resolution model will have roughly the
same quality as a 2.6 A resolution model from five years ago when
the old criteria were used.  K&K show that the weak high resolution
data contain useful information but certainly not as much information
as the data with stronger intensity.

   The resolution limit of the data set has been such an important
indicator of the quality of the resulting model (rightly or wrongly)
that it often is included in the title of the paper itself.  Despite
the fact that we now want to include more, weak, data than before
we need to continue to have a quality indicator that readers can
use to assess the models they are reading about.  While cumbersome,
one solution is to state what the resolution limit would have been
had the old criteria been used, as was done in the paper you quote.
This simply gives the reader a measure they can compare to their
previous experiences.

   Now would be a good time to break with tradition and institute
a new measure of quality of diffraction data sets.  I believe several
have been proposed over the years, but have simply not caught on.
SFCHECK produces an optical resolution.  Could this be used in
the title of papers?  I don't believe it is sensitive to the cutoff
resolution and it produces values that are consistent with what the
readers are used to.  With this solution people could include whatever
noisy data they want and not be guilty of overstating the quality of
their model.

Dale Tronrud

 JPK
 
 
 
 On Thu, May 31, 2012 at 1:59 PM, Edward A. Berry ber...@upstate.edu wrote:
 Yes! I want a copy of this program RESCUT.

 REMARK 200  R SYM FOR SHELL(I) : 1.21700
 I noticed structure 3RKO reported Rmerge in the last shell greater
 than 1, suggesting the police who were defending R-merge were fighting
 a losing battle. And this provides a lot of ammunition to those
 they are fighting.

 Jacob Keller wrote:

 Dear Crystallographers,

 in case you have not heard, it would appear that the Rmerge statistic
 has died as of the publication of  PMID: 22628654. Ding Dong...?

 JPK

 --
 ***
 Jacob Pearson Keller
 Northwestern University
 Medical Scientist Training Program
 email: j-kell...@northwestern.edu
 ***


 
 
 
 --
 ***
 Jacob Pearson Keller
 Northwestern University
 Medical Scientist Training Program
 email: j-kell...@northwestern.edu
 ***
 
 


Re: [ccp4bb] sigma levels of averaged maps in coot ( or e/A3)

2012-06-06 Thread Dale Tronrud

   I'm afraid I seriously mistrust the sigma and e/A^3 numbers reported
by Coot for ncs averaged maps.  I work with a crystal with near perfect
6-fold ncs and the e/A^3 numbers make no sense.  For a 2Fo-Fc style map
the e/A^3 values should be nearly the same after as before.  They are not.

   The sigma of an averaged map has a definitional problem - what is the
volume to normalize over.  With a map with crystal symmetry the answer is
pretty clear, use the asymmetric unit.  The asymmetric unit of an averaged
map will be the least-common-multiple of the rotated unit cells and could
easily measure in hundreds if not thousands of unit cells.  Not very
practical and not very useful.  Paul says that he normalizes over a box,
which is the easy way out, but I don't believe it has any statistical meaning.
The box will contain some parts of the ncs asymmetric unit multiple times,
and include some cs related regions.

   My opinion is that the e/A^3 calculation for ncs averaged density in Coot
is broken.  (I have not used the daily-build version, just the stable releases,
but none have worked in my hands for years.)  I usually contour an unaveraged
map at my desired level, and then adjust the averaged map so that it mostly
matches those contours.  If your ncs is less perfect this will not work as
well for you.
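
   A toy numpy illustration of why the normalization volume matters; the
grid, the blob, and the mask are all invented:

# The "sigma" of a map depends entirely on the region you normalize over.
import numpy as np

rng = np.random.default_rng(0)
rho = rng.normal(0.0, 0.05, size=(64, 64, 64))   # flat, solvent-like noise
rho[10:20, 10:20, 10:20] += 0.5                  # one ordered blob

box_sigma = rho.std()                            # normalize over the whole box
mask = np.zeros(rho.shape, dtype=bool)           # stand-in for the ncs asymmetric unit
mask[:32, :32, :32] = True
asu_sigma = rho[mask].std()

print(f"1 'sigma' over the box:  {box_sigma:.4f}")
print(f"1 'sigma' over the mask: {asu_sigma:.4f}")
# the same absolute contour level therefore corresponds to different "sigma"
# values depending on which region the program chose to normalize over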

Dale Tronrud

On 6/6/2012 2:20 PM, Paul Emsley wrote:

On 06/06/12 21:47, Ursula Schulze-Gahmen wrote:

I calculated threefold averaged omit maps in coot. These maps look nice and 
clean, but I am having trouble making sense of the displayed sigma levels or 
e/A3 values. When I display the unaveraged and averaged maps at a similar 
density level for the protein the unaveraged map is at 0.024 e/A3 and 2.7 
sigma, while the averaged map is displayed at 0.0016e/A3 and 7.6 sigma. I read 
the previous discussion about this issue where it was recommended to rely on 
the e/A3 values for comparison, but even that doesn't seem to work in this case.



Don't forget that in one case you are looking at a whole map and in the other 
(an average of) maps generated from a box encapsulating each chain. I wouldn't 
stress if I were you...

Paul.



Re: [ccp4bb] Chiral volume outliers SO4

2012-07-12 Thread Dale Tronrud
   While this change has made your symptom go away it is stretching it a bit to
call this a fix.  You have not corrected the root problem that the names you
have given your atoms do not match the convention which is being applied for SO4
groups.  Changing the cif means that you don't have to worry about it, but 
people
who study such details will be forced to deal with the incorrect labels of your
model in the future.

   Wouldn't it just be easier to swap the names of two oxygen atoms in each SO4,
leaving the cif alone?  Your difficulties will go away and people using your 
model
in the future will also have a simpler life.

   This labeling problem is not new.  The fight to standardize the labeling of
the methyl groups in Valine and Leucine was raging in the 1980's.  Standardizing
the labels on the PO4 groups in DNA/RNA was much more recent.  It helps everyone
when you know you can overlay two models and have a logical solution without
a rotation matrix with a determinant of -1.

   Besides, you will continue to be bitten by this problem as you use other
programs, until you actually swap some labels.
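
   If it helps, a minimal Python sketch of the relabeling is below; which
pair of oxygens to swap depends on your actual model, so O1/O2 here is
purely illustrative, as are the file names.

# Swap the O1/O2 labels of every SO4 group in a PDB file so the atom naming
# matches the convention assumed by the restraint dictionary.
def swap_so4_oxygens(pdb_in, pdb_out, a="O1", b="O2"):
    with open(pdb_in) as fin, open(pdb_out, "w") as fout:
        for line in fin:
            if line.startswith(("ATOM", "HETATM")) and line[17:20].strip() == "SO4":
                name = line[12:16].strip()      # atom name field, columns 13-16
                if name == a:
                    line = line[:12] + f" {b:<3}" + line[16:]
                elif name == b:
                    line = line[:12] + f" {a:<3}" + line[16:]
            fout.write(line)

swap_so4_oxygens("model.pdb", "model_swapped.pdb")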

Dale Tronrud

On 07/12/12 15:00, Joel Tyndall wrote:
 Hi all,
 
 Thanks very much to all who responded so quickly. The fix is a one liner in 
 the SO4.cif file (last line)
 
 SO4  chir_01  S  O1  O2  O3  both 
 
 which I believe is now in the 6.3.0 release.
 
 Interestingly the chirality parameters were not in the SO4.cif file in 6.1.3 
 but then appeared in 6.2.0.
 
 Once again I'm very happy to get to the bottom of this and get it fixed. I do 
 wonder if it had become over parametrised.
 
 Cheers
 
 Joel
 
 
 
 -Original Message-
 From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Robbie 
 Joosten
 Sent: Thursday, 12 July 2012 12:16 a.m.
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] Chiral volume outliers SO4
 
 Hi Ian,
 
  
 @Ian: You'd be surprised how well Refmac can flatten sulfates if you 
 have a chiral volume outlier (see Figure 1d in Acta Cryst. D68: 
 484-496
 (2012)).
 But this is only because the 'negative' volume sign was erroneously 
 used
 in
 the chiral restraint instead of 'both' (or better still IMO no chiral
 restraint at
 all), right?  If so I don't find it surprising at all that Refmac 
 tried to
 flip the
 sulphate and ended up flattening it.
  Seems to be a good illustration of the GIGO (garbage in - garbage
 out) principle.  Just because the garbage input in this case is in the
 official
 CCP4 distribution and not (as is of course more commonly the
 case) perpetrated by the user doesn't make it any less garbage.
 The problem is that in the creation of chiral volume targets chemically 
 equivalent (groups of) atoms are not recognized as such. So any new or 
 recreated restraint files will have either 'positiv' or 'negativ' and the 
 problem starts all over again. That is why it is better to stay consistent 
 and choose one chirality (the same one as in the 'ideal' coordinates in the 
 PDB ligand descriptions). This will also make it easier compare ligands after 
 aligning them (this applies to ligands more complex than sulfate).
 Obviously, users should not be forced to deal with these things. Programs 
 like Refmac and COOT should fix chiral volume inversions for the user, 
 because it is only relevant inside the computer. That is the idea of chiron, 
 just fix these 'problems' automatically by swapping equivalent atoms whenever 
 Refmac gives a chiral volume inversion warning.  It should make life a bit 
 easier.
   
 
 The point I was making is that in this and similar cases you don't 
 need a
 chiral
 restraint at all: surely 4 bond lengths and 6 bond angles define the
 chiral
 volume pretty well already?  Or are there cases where without a chiral 
 restraint the refinement still tries to flip the chirality (I would 
 fine
 that hard to
 believe).
 I agree with you for sulfate, and also for phosphate ;). I don't know what 
 happens in other compounds at poor resolution, when bond and angle targets 
 (and their SDs) are not equivalent. I guess that some angle might 'give way'
 before others. That is something that should be tested. I have a growing list 
 of chiral centers that have this problem if you are interested.
 
 Cheers,
 Robbie


Re: [ccp4bb] Chiral volume outliers SO4

2012-07-15 Thread Dale Tronrud

On 7/13/2012 1:58 AM, Tim Gruene wrote:


Dear all,

I am surprised by the discussion about chirality of an utterly
centrosymmetric molecule. Shouldn't the four oxygen atoms be, at least
from a QM point of view, indistinguishable? What reason is there to
maintain a certain 'order' in the human-induced numbering scheme?


   There are good reasons for maintaining order in this human-induced
numbering scheme.  A common operation is to superimpose two molecules
and calculate the rmsd of the positional differences.  This calculation
is not useful when the Val CG1 and CG2 are swapped in one molecule relative
to the other.  Suddenly you have, maybe a handful, of atoms that differ
in position by about 3.5 A when most of us would consider this to be
nonsense.  We want the rmsd between equivalent atoms regardless of the
human-induced numbering scheme.  There are two ways this can come about.
1) The overlay program could swap the labels on one to match the other or
2) The labels can be defined to be consistent from the start.

   Neither 1) nor 2) is objectively better in any absolute sense. The
Powers that Be, however, have decided that for Val, Leu, Phe, Tyr, and
the PO2 in DNA, RNA, and many co-enzymes models should be adjusted to
conform to a standard.  If we are doing this for these groups in order
to make comparison of models simpler, why stop there?  If we say there
are standards for some groups but not others we have the worst of both
worlds - We have to both modify models and write complicated comparison
programs.
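
   A toy example of the silent inflation, using a single valine with its
CG1/CG2 labels swapped in the second model (coordinates invented):

# Naive name-matched rmsd is inflated by a label swap; allowing the chemically
# equivalent pairing recovers the real answer.
import numpy as np

model_a = {"CB": [0.0, 0.0, 0.0], "CG1": [1.25, 1.25, 0.0], "CG2": [1.25, -1.25, 0.0]}
model_b = {"CB": [0.0, 0.0, 0.0], "CG1": [1.25, -1.25, 0.0], "CG2": [1.25, 1.25, 0.0]}

def rmsd(pairing):
    d = np.array([np.subtract(model_a[p], model_b[q]) for p, q in pairing])
    return np.sqrt((d ** 2).sum(axis=1).mean())

print("name-matched rmsd :", rmsd([("CB", "CB"), ("CG1", "CG1"), ("CG2", "CG2")]))
print("swap-aware rmsd   :", rmsd([("CB", "CB"), ("CG1", "CG2"), ("CG2", "CG1")]))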

   The failure of comparison programs to correct for labeling differences
is generally a silent error - A handful of 3.5 A differences mixed into
thousands of very small differences will not likely cause an increase in
the rmsd that would be noticed.  Only if the individual differences are
plotted, or the biggest differences are listed will the user notice the
problem.  Silent errors are the worst errors since they are the most
likely to make it all the way to publication.

   As I see it, the problem here lies in the program that created the
original poster's SO4 group.  If it had matched the convention now present
in the CCP4 cif there would be none of these problems.  That program
should be tracked down and updated.

   The problem of labeling groups that have symmetry along a rotatable
torsion angle is a persistent problem that, I'm afraid, has no permanent
solution other than CONSTANT VIGILANCE.  I see that the newer versions
of Coot have taken up this burden, at least for Phe and Tyr.  (I guess
we need a picture of a coot with one big roving eye.)

   Since we are already unambiguously defining the labeling for a number
of the groups we use, I think it is up to you to justify why this group
should be treated differently.

Dale Tronrud

P.S. On a only slightly off topic note - I'm quite afraid of using both
as the definition for chirality.  I've noticed that this keyword is used
as an excuse to not figure out what the real chirality of an atom is and
as a result people build models with bad chiral centers that are not flagged
by their software (another silent error that makes it to publication.)  The
PDB is littered with cofactors and ligands that have inverted chiral centers
(Even centers that Pasteur would approve).  I would prefer that both
was not a legal value and researchers would be required to think about
chirality.



Cheers,
Tim


Re: [ccp4bb] refining large region with multiple conformers

2012-08-07 Thread Dale Tronrud
   You have to build the model you actually believe matches what is in
the crystal.  Do you believe that each amino acid is occupying two
conformations independent of its neighbors?  I wouldn't go that way.
I would start with an apo conformation, labeled with 'g', and a holo
conformation including the ligand, labeled with 'h', and allow all of
'g's one occupancy value, and the 'h's another, and insist that they
sum to 1.0.  You may have to build water molecules in the binding site
of 'g' that are displaced by the ligand in 'h'.

   If this, simplest of models, doesn't do the trick you have to be
lead by your difference maps and chemical intuition to devise more
complex models.  Select the simplest model that makes sense and fits
your data.

   It would be interesting to see if you can find a program that would
allow you to restrain the ncs of the second protein chain and the 'h'
conformation of your mixed model, leaving the 'g' conformation unrestrained
by ncs.

Dale Tronrud

P.S. I'm avoiding the use of 'A' and 'B' alt locs because these are
routinely used when splitting side chains but are almost never intended
to imply that all 'A's are coordinated with each other and all 'B's are
likewise.  To be proper, the reuse of alt loc codes for unrelated
conformations should not be allowed, but there are simply not enough
letters to allow the rule to be enforced.

On 08/07/12 07:59, Kendall Nettles wrote:
 Hi,
 We have a structure with the ligand showing two overlapping conformers.
 When we refine it with both conformers separately, it is pretty clear
 that there are substantial differences in the protein as a result, for
 about a third of the protein chain. My question is, would it be better
 to try to define alternate conformers for those specific regions, or
 would it be OK to refine with two entire alternate protein chains? There
 is also a second protein chain that shows only a single binding mode for
 the ligand.  It's a 2.0 angstrom structure. The yellow 2Fo-Fc map goes
 with the green model in the attached pic. Also, do we want to let each
 amino acid have its own occupancy? or should one ligand copy and one
 chain all have the same occupancy? I'm leaning towards the latter since
 the differences should be directly tied to the ligand binding mode. 
 Kendall Nettles


Re: [ccp4bb] Unexplainable Density

2012-08-08 Thread Dale Tronrud
   It is hard for me to visualize density with just screenshots - I like
to rotate the image to see the 3d.  This is really clear density, however,
and you should be able to figure out what it is.  As far as I can tell
from your images it looks like half of an EDTA with the other half trailing
off into the space above.  Was your protein exposed to that compound somewhere
along the way?

Dale Tronrud

On 08/08/12 04:56, Mario Sniady wrote:
 
 
 While building our crystal structure model we encountered density which
 we weren't able to assign. The unknown molecule/molecules seems to
 coordinate a Ni^2+ ion. This ion is also coordinated by a histidine and
 probably one H_2 O molecule. The crystals have been grown from a
 protein complex including the carotenoid peridinin, the lipid DGDG and
 Chlorophyll. Besides this the solution in which they have been grown
 contains Tris pH 8.5, NiCl_2 and PEG 2000 MME.
 
 The linked images show a rotation around the density in 90°-steps (1.1A,
 1.2sigma contour level). The above mentioned H_2 O-molecule has been
 removed. It is supposed to fill the density that is below the Ni^2+ -Ion
 in the first picture.
 Images:  http://www.bioxtal.rub.de/myst.html.en
 
 Any hints are welcome =)
 
 Mario
 


Re: [ccp4bb] loading maps in coot using EDS

2012-08-08 Thread Dale Tronrud
   It appears that the Electron Density Server could not calculate a map
for 3TVN.  These cryptic messages are what you get from Coot when there
is no map on the server.  I can see from the RCSB web page that there is
no EDS link in the Experimental Details section, which also happens
when the EDS comes up empty.

   When the EDS fails to calculate a reasonable map for an entry they do
not tell us why.  If they knew what the problem was they would fix it
themselves.  They remain silent hoping that the authors of the entry
will contact them and give them some help.  It is absolutely amazing that
they can calculate as many maps as they do.

Dale Tronrud

On 08/08/12 13:39, Shya Biswas wrote:
 Hi all,
 
 I was trying to get maps using the *fetch PDB and Map using EDS option*
 in coot, however the map would not open. I am using coot version 0.6.2 and
 was wondering if anybody else had similar problems and how to fix this;
 the following is the error message I get. It used to work fine with a
 previous version of coot.
 
 
 
 CCP4MTZfile: open_read - File missing or corrupted:
 coot-download/3tvn_sigmaa.mtz
 INFO:: not an mtz file: coot-download/3tvn_sigmaa.mtz
 ERROR: no f_cols!
 ERROR: no phi_cols!
 valid_labels(coot-download/3tvn_sigmaa.mtz,FOFCWT,PHFOFCWT,,0) returns 0
 CCP4 library signal library_file:End of File (Error)
  raised in ccp4_file_raw_read 
 System signal 0:Success (Error)
  raised in ccp4_file_rarch 
 CCP4 library signal library_file:End of File (Error)
  raised in ccp4_file_raw_read 
 System signal 0:Success (Error)
  raised in ccp4_file_readchar 
 CCP4 library signal mtz:Read failed (Error)
  raised in MtzGet 
 CCP4MTZfile: open_read - File missing or corrupted:
 coot-download/3tvn_sigmaa.mtz
 INFO:: not an mtz file: coot-download/3tvn_sigmaa.mtz
 ERROR: no f_cols!
 ERROR: no phi_cols!
 WARNING:: label(s) not found in mtz file coot-download/3tvn_sigmaa.mtz
 FOFCWT PHFOFCWT
 WARNING:: -1 is not a valid molecule in set_scrollable_map
 
 thanks,
 Shya
 


Re: [ccp4bb] loading maps in coot using EDS

2012-08-08 Thread Dale Tronrud
On 08/08/12 14:31, Katherine Sippel wrote:
 The 3tvn coordinates/SF were released today. I'm not sure what the lag
 time is between the PDB and EDS but you'd probably need to download the
 structure factors and generate the map yourself.

   A very good point.  I saw the deposition date of 2011 but didn't read
down to the release date.  The EDS does not get an advance look at entries
in the PDB.  The data has to be released to the public before it can
begin the calculations.  This can take a couple weeks.

   In addition, the server, itself, appears to be down at the moment.  I
don't think you could download the map even if it existed.

Dale Tronrud

 
 If you're not in a super rush I know the person who refined that
 specific PDB and I may be able to get you a copy of her final maps to
 send you off-board once she gets back from vacation.
 
 Cheers,
 Katherine
 
 
 


Re: [ccp4bb] comparing differences across multiple structures of the same protein

2012-09-10 Thread Dale Tronrud

   I believe that the definition of significant for crystallographic
data should be based on the difference map.  If a shift of that magnitude
causes a feature to appear in the map, then the crystal data is driving
the shift.  If you can have a shift that large, for the particular atoms
in question, and the difference map remains flat then the crystal data
doesn't care.

   A refinement program will move an atom for lots of reasons in
addition to the diffraction data, sometimes for no reason at all (simulated
annealing, for example).  The difference map is a pure expression of the
will of the diffraction data.

   The most sensitive calculation is the F(holo)-F(apo) map, but this
requires isomorphous crystals.  It might be possible to paste into the
holo model a couple residues from the apo model, refine all parameters
except the position of these atoms, and see if the Fo-Fc map objects.

   Remember, a lysine on the surface can probably be built in twenty
different conformations and the difference map flat in every case while
a couple atoms elsewhere could have a shift of 0.1 A that lights up the
map.   There are no generic cut-offs or thresholds that work.

Dale Tronrud

On 9/10/2012 9:01 PM, Michael Murphy wrote:

I am trying to compare structures of the same protein in the apo form and when 
bound to several different ligands. There are
differences, but they are subtle and I am unsure whether they are actually 
significant or just due to coordinate error or something
similar. Is there a theoretical minimum (in Angstroms maybe?) that a side chain 
or secondary structure element needs to be displaced
by between structures to be considered to be real? This may depend on 
resolution/B-factors as well?  Phenix reports overall
coordinate error for each structure, but this must vary for at least a bit for 
certain amino acid residues just like B-factors do.



Re: [ccp4bb] Strange density

2012-11-28 Thread Dale Tronrud

Hi,

   These sorts of questions are always difficult, particularly in the
absence of any information about the protein or the contents of the
mother liquor.  If the carbonyl you are talking about is the little
magenta dot visible through the hole in your blob, this could be a
metal atom with some long chelating molecule around the equator. In
the extreme it could be some sort of porphyrin, although the density
would be very poor if it was.

Dale Tronrud

On 11/28/2012 7:48 AM, Read, Jon wrote:

Anyone see anything like this before? The data is 1.7Angstrom data with good 
statistics. The picture shows the solid FoFc density contoured at 3  Sigma in 
light brown and -3 Sigma in purple. The density is odd as it appears to be 
bound to a peptide carbonyl with no other obvious interactions with the 
protein. There is a characteristic tail at one end.






Re: [ccp4bb] Strange density

2012-11-28 Thread Dale Tronrud
   Actually one can make a lot of sense of e/A^3 in the absence of
F(000).  You simply think of the density as the difference from the
average rather than an absolute measurement.

   For an Fo-Fc style map the F(000) term is simply the difference
between the number of missing electrons in the model and the number
of extra electrons.  Since we are probably missing all data of
resolution lower than about 20 A because of the beamstop the model
defects are only counted if they are within about 20 A or so of
the point you are looking at.  In the latter stages of refinement,
when one is trying to identify strange density, the rest of the
model should be pretty good and the expected mean value of the
difference map very near zero.  Of course your model is missing
atoms for the blob itself so the difference density will tend to
sink, resulting in somewhat lower peaks and negative density around
the edges but this effect is usually not huge.

   On the other hand, contouring based on rmsd (i.e. sigma ack!)
causes huge differences depending on the other things that are going
on in your map.  The rmsd of your first difference map can be many
times larger than it is in your last.  The density for a missing
water molecule contoured at 3 rmsd in the first map will look very
different than the same water molecule contoured at 3 rmsd in the
last map.  That water molecule contoured at, say, 0.18 e/A^3 would
look pretty much the same.

   In the first difference map that water molecule will be surrounded
by a huge number of other features when you contour at 0.18 e/A^3
and by very few in the last map, but isn't that as it should be?
The map is supposed to be flatter at the end.

Dale Tronrud

On 11/28/12 12:30, Pavel Afonine wrote:
 For map in e-/A^3 units to make sense one needs to obtain F000, which
 may be more tricky than one may think. Interesting, how Coot does this
 given just a set of Fourier map coefficients?
 
 Pavel
 
 On Wed, Nov 28, 2012 at 12:21 PM, Greg Costakes gcost...@purdue.edu wrote:
 
 You stated that the map is set to 3 sigma, but what is the e-/A^3? 
 In Coot I often find that my fo-fc map needs to be maxed out (max
 sigma) in order to get to an acceptable e-/A^3. It is possible that
 your fo-fc map at 3 sigma has an e-/A^3 of 0.04 or something low
 like that.
 
 
 ---
 Greg Costakes
 PhD Candidate
 Department of Structural Biology
 Purdue University
 Hockmeyer Hall, Room 320
 240 S. Martin Jischke Drive, West Lafayette, IN 47907
 
 
 
 
 
 
  *From: *Jon Read jon.r...@astrazeneca.com
  *To: *CCP4BB@JISCMAIL.AC.UK
 *Sent: *Wednesday, November 28, 2012 10:48:04 AM
 *Subject: *[ccp4bb] Strange density
 
 
  Anyone see anything like this before? The data is 1.7 Angstrom data
  with good statistics. The picture shows the solid Fo-Fc density
  contoured at 3 Sigma in light brown and -3 Sigma in purple. The
 density is odd as it appears to be bound to a peptide carbonyl with
 no other obvious interactions with the protein. There is a
 characteristic tail at one end.
 
  
 
  
 
  
 
 
 
 
 


Re: [ccp4bb] Strange density

2012-11-28 Thread Dale Tronrud
   No such luck!  If one calculated the Root Mean Square Deviation from
the Mean then F(000) makes no difference, but everyone I know calculates
the Deviation from 0.0.  I guess that makes it an rms and not an
rmsd.
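
   To make that distinction concrete, here is a minimal numpy sketch (all of
the numbers are invented for the illustration; "f000_over_v" just stands in
for the constant level the F(000) term would add):

    import numpy as np

    rng = np.random.default_rng(0)

    # fake "map" values in e/A^3, fluctuating about zero (no F(000) term)
    rho = rng.normal(0.0, 0.3, size=100_000)

    # adding the F(000)/V constant shifts every grid point by the same amount
    f000_over_v = 0.4                     # arbitrary illustrative value
    rho_shifted = rho + f000_over_v

    def rms_from_zero(x):                 # the "sigma" most programs quote (deviation from 0.0)
        return np.sqrt(np.mean(x**2))

    def rmsd_from_mean(x):                # a true r.m.s. *deviation*
        return np.sqrt(np.mean((x - x.mean())**2))

    print(rms_from_zero(rho), rms_from_zero(rho_shifted))    # changes with F(000)
    print(rmsd_from_mean(rho), rmsd_from_mean(rho_shifted))  # identical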

   We can use maps calculated w/o the F(000) because we are generally
more interested in the shape of the density than its height.  We use
the shape to come up with interpretations, which is the hard part.
The height can give us a clue about the occupancy an atom would have
if we refined one there - It is a shortcut to avoid building and
refining models that are destined to be nonsense.  If the molecule
you are building will refine to an occupancy of 0.1 you could spend
your time better by doing something else.

Dale Tronrud

On 11/28/12 15:13, Lijun Liu wrote:
 F000 contributes to the whole map as a constant level (F000/V).  If two maps
 are calculated with the only difference being with or without F000, shouldn't
 the sigma levels of the two maps be the same?   That is why we can rely on maps
 for modeling that are calculated without the F000 term.   Lijun
 
 
 On Nov 28, 2012, at 2:30 PM, Pavel Afonine wrote:
 
 For map in e-/A^3 units to make sense one needs to obtain F000, which
 may be more tricky than one may think. Interesting, how Coot does this
 given just a set of Fourier map coefficients?

 Pavel

  On Wed, Nov 28, 2012 at 12:21 PM, Greg Costakes gcost...@purdue.edu wrote:

 You stated that the map is set to 3 sigma, but what is the
 e-/A^3?  In Coot I often find that my fo-fc map needs to be maxed
 out (max sigma) in order to get to an acceptable e-/A^3. It is
 possible that your fo-fc map at 3 sigma has an e-/A^3 of 0.04 or
 something low like that.

 
 ---
 Greg Costakes
 PhD Candidate
 Department of Structural Biology
 Purdue University
 Hockmeyer Hall, Room 320
 240 S. Martin Jischke Drive, West Lafayette, IN 47907

 
 


 
 *From: *Jon Read jon.r...@astrazeneca.com
 *To: *CCP4BB@JISCMAIL.AC.UK
 *Sent: *Wednesday, November 28, 2012 10:48:04 AM
 *Subject: *[ccp4bb] Strange density



  Anyone see anything like this before? The data is 1.7 Angstrom data
  with good statistics. The picture shows the solid Fo-Fc density
  contoured at 3 Sigma in light brown and -3 Sigma in purple. The
 density is odd as it appears to be bound to a peptide carbonyl
 with no other obvious interactions with the protein. There is a
 characteristic tail at one end.

  

  
  



 
















 


Re: [ccp4bb] archival memory?

2012-12-12 Thread Dale Tronrud

   Good luck on your search in 100 years for a computer with a
USB port.  You will also need software that can read a FAT32
file system.

Dale Glad I didn't buy a lot of disk drives with Firewire Tronrud

On 12/12/2012 1:02 PM, Richard Gillilan wrote:

SanDisk advertises a Memory Vault disk for archival storage of photos that 
they claim will last 100 years.

(note: they do have a scheme for estimating lifetime of the memory, Arrhenius 
Equation ... interesting. Check it out: 
www.sandisk.com/products/usb/memory-vault/ and click the Chronolock tab.).

Has anyone here looked into this or seen similar products?

Richard Gillilan
MacCHESS



Re: [ccp4bb] archival memory?

2012-12-12 Thread Dale Tronrud

   I don't believe there is a solution that does not involve active
management.  You can't write your data and pick up those media 25
years later and expect to get your data back -- not without some
heroic effort involving the construction of your own hardware.

   I have data from Brian Matthews' lab going back to the mid-1970's
and those data started life on 7-track mag tapes.  I've moved them
from there to 9-track 1600 bpi tapes, to 9-track 6250 bpi tapes, to
just about every density of Exabyte tape, to DVD, and most recently
to external magnetic hard drives (each with USB, Firewire, and eSATA
interfaces).  The hard drives are about five years old and so far
are holding up.  Last time I checked I could still read the 10 year
old DVD's.  I'm having real trouble reading Exabyte tapes.

   Write your data to some medium that you expect to last for at least
five years but anticipate that you will then have to move them to
something else.

   Instead of spending time working on the 100 year solution you should
spend your time annotating your data so that someone other than you
can figure out what it is.  Lack of annotation and editing is the
biggest problem with old data.

Dale Tronrud

P.S. If someone needs the intensities for heavy atom derivatives of
Thermolysin written in VENUS format, I'm your man.



On 12/12/2012 1:57 PM, Richard Gillilan wrote:

Better option? Certainly not TAPE or electromechanical disk drive. CD's and 
DVD's don't last nearly that long, as James Holton has pointed out.

I suppose there might be a cloud solution where you rely upon data just 
floating around out there in cyberspace with a life of its own.

Richard

On Dec 12, 2012, at 4:41 PM, Dale Tronrud wrote:



   Good luck on your search in 100 years for a computer with a
USB port.  You will also need software that can read a FAT32
file system.

Dale Glad I didn't buy a lot of disk drives with Firewire Tronrud

On 12/12/2012 1:02 PM, Richard Gillilan wrote:

SanDisk advertises a Memory Vault disk for archival storage of photos that 
they claim will last 100 years.

(note: they do have a scheme for estimating lifetime of the memory, Arrhenius 
Equation ... interesting. Check it out: 
www.sandisk.com/products/usb/memory-vault/ and click the Chronolock tab.).

Has anyone here looked into this or seen similar products?

Richard Gillilan
MacCHESS



Re: [ccp4bb] archival memory?

2012-12-12 Thread Dale Tronrud

On 12/12/2012 3:19 PM, Bosch, Juergen wrote:

Hey Dale,

you really should get your personal RAID with hot swappable discs, since you 
don't like Firewire, how about Thunderbolt and a
Pegasus RAID with 6 bays ? If a drive fails you replace it with a new one.


   Last summer someone in the lab above ours decided they needed a full
sink of water.  Before this task was complete they decided they needed
to go home.  The resulting flood destroyed the contents of the desks of
two of our lab members.  That was a lot of paper that didn't make 100
years - including a Handbook of Chemistry and Physics that had almost
made 60.

   If the lab RAID had been under the waterfall it would have lost all
of its drives in one go.  I don't know how big a RAID number you have
to have to survive that, but RAID-5 isn't going to do it.

   I have run a flash drive through my washing machine a couple times
and it is still going strong so I have high hopes for solid-state
memory.  It will be several years before 1 TB SSD's drop in price
enough for the next move of my little archive.  The SanDisk Memory
Vault that started this thread maxes out at 16 GB.

Dale Tronrud



By the way if anybody has a functional DAT4 tape drive, could I send you one to 
read out a tape with some data ? If so, then off
list reply would be nice, thanks.

Jürgen

On Dec 12, 2012, at 5:22 PM, Dale Tronrud wrote:


   I don't believe there is a solution that does not involve active
management.  You can't write your data and pick up those media 25
years later and expect to get your data back -- not without some
heroic effort involving the construction of your own hardware.

   I have data from Brian Matthews' lab going back to the mid-1970's
and those data started life on 7-track mag tapes.  I've moved them
from there to 9-track 1600 bpi tapes, to 9-track 6250 bpi tapes, to
just about every density of Exabyte tape, to DVD, and most recently
to external magnetic hard drives (each with USB, Firewire, and eSATA
interfaces).  The hard drives are about five years old and so far
are holding up.  Last time I checked I could still read the 10 year
old DVD's.  I'm having real trouble reading Exabyte tapes.

   Write your data to some medium that you expect to last for at least
five years but anticipate that you will then have to move them to
something else.

   Instead of spending time working on the 100 year solution you should
spend your time annotating your data so that someone other than you
can figure out what it is.  Lack of annotation and editing is the
biggest problem with old data.

Dale Tronrud

P.S. If someone needs the intensities for heavy atom derivatives of
Thermolysin written in VENUS format, I'm your man.



On 12/12/2012 1:57 PM, Richard Gillilan wrote:

Better option? Certainly not TAPE or electromechanical disk drive. CD's and 
DVD's don't last nearly that long, as James Holton
has pointed out.

I suppose there might be a cloud solution where you rely upon data just 
floating around out there in cyberspace with a life of
its own.

Richard

On Dec 12, 2012, at 4:41 PM, Dale Tronrud wrote:



  Good luck on your search in 100 years for a computer with a
USB port.  You will also need software that can read a FAT32
file system.

Dale Glad I didn't buy a lot of disk drives with Firewire Tronrud

On 12/12/2012 1:02 PM, Richard Gillilan wrote:

SanDisk advertises a Memory Vault disk for archival storage of photos that 
they claim will last 100 years.

(note: they do have a scheme for estimating lifetime of the memory, Arrhenius 
Equation ... interesting. Check it out:
www.sandisk.com/products/usb/memory-vault/ and click the Chronolock tab.).

Has anyone here looked into this or seen similar products?

Richard Gillilan
MacCHESS



..
Jürgen Bosch
Johns Hopkins University
Bloomberg School of Public Health
Department of Biochemistry & Molecular Biology
Johns Hopkins Malaria Research Institute
615 North Wolfe Street, W8708
Baltimore, MD 21205
Office: +1-410-614-4742
Lab:  +1-410-614-4894
Fax:  +1-410-955-2926
http://lupo.jhsph.edu






Re: [ccp4bb] engh huber

2013-01-14 Thread Dale Tronrud
There was an update by EH in 2001 in the International Tables Vol F.
There are a small number of modifications to the 1991 values in the
update as well as the addition of several conformational variabilities.
If I understand correctly, Refmac and Phenix use the 2001 values,
with the only conformational variability being some changes with
cis-peptide bonds.  Shelxl still uses EH 1991.

Dale Tronrud

On 01/14/13 09:54, Ed Pozharski wrote:
 To what extent have modern geometric restraints been upgraded over the
 original Engh & Huber set?  And where can I find a consensus set of values
 (with variances)?  
 
 For example, Fisher et al., Acta D68:800 discusses how histidine angles
 change with protonation, and refers to Engh & Huber when it says that
 ND1-CE1-NE2 goes from 111.2 to 107.5 when histidine acquires positive
 charge (Fig.6).  But the angle table (Table 3) in the original Engh & Huber from
 1991 does not have any 107.5 value and seems to suggest that the numbers
 should rather be 111.7+-1.3 and 108.4+-1.0, respectively.
 
 I understand that these values are derived from structural databases and
 thus can be frequently updated.  Is there some resource where most
 current values would be listed?
 
 Cheers,
 
 Ed.
 


Re: [ccp4bb] how many metal sites

2013-01-16 Thread Dale Tronrud
   Zn is a very electron rich atom so a 2.3 A resolution data set should
be a fine experiment to determine the number of fully occupied metal
sites.  It is always hard to be sure about screen shots of density, but
it looks to me that you only have evidence for one zinc here.

   In my opinion, it is not useful to build models that don't make
sense.  Your zinc cluster does not make chemical sense to me and
the atoms are not in the density.  I suspect that you built this
cluster, and not the obvious model with fewer zinc atoms, simply
because you wanted to match the magic number of four.  Use the
things you know with confidence as your guide.

Dale Tronrud

On 01/16/13 11:15, ruisher hu wrote:
 Hi, Dear All,
 
 I recently got a dataset at about 2.3 A resolution; however, I have some
 trouble assigning the metal sites. It is supposed to have multiple binding
 sites (possibly four) around those four glu residues in the center (see
 the attached figure); however, it shows a huge single positive
 density, clustered in the binding center. The signal is pretty strong
 and I think Zn is definitely there. When I tried to put four Zns around,
 the geometry doesn't look very good, there is still some positive
 density in the center (although it gets weaker), and the B factors of the
 metals are high, like 100. Does anyone know what's going on? Does it mean
 only one single site in the middle? Or maybe the metals are just too mobile?
 What's the best way to tell how many metal sites are actually
 there? Which experiment can I use to test? Thanks very much.
 
 Best,
 
 R
 
  On Wed, Nov 7, 2012 at 9:29 AM, SD Y ccp4...@hotmail.com wrote:
 
 Dear all,
 
 I have a related question to the one I have posted low resolution
 and SG, on which I am still working based on the suggestions I have
 got.
 
  The model I have used has Zn well co-ordinated in tetrahedral
  fashion by 3 Cys and 1 His residues. They had added Zn in their
  experiment.
 In my 3.4 A structure  (I am still working on right SG), initial
 maps  show very strong positive density (sigma=6.5) at the place of
 Zn
 ( https://www.dropbox.com/s/4jd6gdor87ab9lj/Zn-coordination.png). I
 have not used Zn in my experiment. I could only suspect Tryptone and
 yeast extract which I used to make media.
 
  I would like to know how likely it is that this positive density belongs to
  Zn. How can I justify the presence of Zn when it has not been used?
  Is there any way to confirm if it is Zn? If this is not Zn, what
  else could it be? Anything I could try to rule Zn or other ions in
  or out?
 I appreciate your help and suggestions.
 
 Sincerely,
 SDY
 
 


Re: [ccp4bb] CCP4 Update victim of own success

2013-04-11 Thread Dale Tronrud

FYI

   I have a small herd of computers here and find it cumbersome to ssh
to each and fire up ccp4i just to update the systems.  ccp4i takes a
while to draw all those boxes (particularly over ssh) and leaves files
behind in my disk areas on computers on which I'm not likely, personally, to
run crystallographic computations.  I much prefer to simply run ccp4um
from the command line.

   In fact, I would rather put it in cron and forget about it -- and
I expect that is what --check-silent is for.  The usage statement,
however, doesn't explicitly say that this installs the new updates it
finds.  I'll have to experiment a bit.

Dale Tronrud

On 04/11/2013 05:17 AM, eugene.krissi...@stfc.ac.uk wrote:

Sorry that this was unclear. We assume that updater is used primarily from 
ccp4i, where nothing changed (and why it should be used from command line at 
all ?:)). The name was changed because it is reserved in Windows, which caused 
lots of troubles. Now it will stay as is.

Eugene

On 11 Apr 2013, at 05:16, James Stroud wrote:


On Apr 10, 2013, at 9:30 PM, eugene.krissi...@stfc.ac.uk wrote:

No, it got renamed to ccp4um :) That should have been written in update 
descriptions, was it not?


There was only one mention of ccp4um that I could find in all update 
descriptions that I found (6.3.0-020). I only figured out what information was trying to 
be communicated because of your message (see attachment).

James


um-what.png



On 11 Apr 2013, at 03:54, James Stroud wrote:

Hello All,

I downloaded a crispy new version of CCP4 and ran update until the 
update script disappeared. Is the reason that CCP4 has reached its final update?

James






Re: [ccp4bb] problem with anisotropic refinement using refmac

2007-01-31 Thread Dale Tronrud

   As I see it, the size of the test set is a question of the desired
precision of the free R.  At the point of test set selection there is
variability between the many possible choices: you could happen to pick
a test set with a spuriously low free R or one with an unfortunately
high free R.  These variations don't indicate anything about the quality
of your model because you haven't created one yet.  It is just statistical
fluctuations.

   To investigate this I selected several structures I had worked on,
where I had an unrefined starting model.  For each structure I picked a
percentage for the size of the test set and started a loop of test set
selection and free R calculations.  (Since there had been no refinement
yet, the R value for the whole data set is the true free R: all free R's
calculated from subsets are just estimates.)  For each percentage of each
structure I selected 900 test sets.

   The result is that the variance of the free R estimate is not a function
of the size of the protein, the space group, the solvent content, the
magnitude of free R (only checked between about 35% and 55%), nor the
size of the test set measured as a percent.  It is simply a function of
the number of reflections in the test set.  As Axel said in his paper, a
test set of 1000 reflections has a precision of about 1% (and it varies
as counting statistics would suggest: 1/sqrt(n)).

   If you have a test set of 1000 reflections and your free R estimate is
40% you have a 95% confidence that the true free R is between 43% and 37%,
if I recall my confidence intervals correctly.
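
   For anyone who wants to get a feel for this without an unrefined model at
hand, the experiment is easy to mimic with synthetic numbers (a throw-away
numpy sketch; the "data" are invented and only the 1/sqrt(n) trend is the
point, not the absolute values):

    import numpy as np

    rng = np.random.default_rng(1)

    # a synthetic data set: |Fo|, and a model |Fc| that disagrees with it by a
    # fixed fractional amount, giving an overall R value of roughly 35%
    n_refl = 50_000
    fo = rng.gamma(2.0, 100.0, size=n_refl)
    fc = np.abs(fo * (1.0 + rng.normal(0.0, 0.45, size=n_refl)))

    def r_value(sel):
        return np.sum(np.abs(fo[sel] - fc[sel])) / np.sum(fo[sel])

    print("R over all reflections:", round(r_value(np.arange(n_refl)), 4))

    for n_test in (250, 1000, 4000):
        estimates = [r_value(rng.choice(n_refl, n_test, replace=False))
                     for _ in range(900)]      # 900 random test-set selections
        spread = np.std(estimates)
        # spread * sqrt(n) stays roughly constant if precision goes as 1/sqrt(n)
        print(f"n_test = {n_test:4d}:  spread = {spread:.4f},"
              f"  spread * sqrt(n) = {spread * np.sqrt(n_test):.3f}")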

   The open question is how these deviant selections track through refinement.  If you
luck out and happen to pick a test set with a particularly low free R
(estimate) does this mean that all your future free R's will look,
inappropriately, too good?  I suspect so, but I have not done the test
of performing 900 independent refinements with differing test sets.

   My short answer to the original question: The precision of an R free
estimate is determined by the number of reflections, not the percent of
the total data set.  Your 0.3% test set is as precise as a 10% test set
in HEWL.  (Even though the effect of leaving these reflections out of
the refinement will be quite different, of course.)

Dale Tronrud


Andreas Forster wrote:

Hey all,

let me give this discussion a little kick and see if it spins into outer 
space.


How many reflections do people use for cross-validation?  Five per cent 
is a value that I read often in papers.  Georg Zocher started with 5% 
but lowered that to 1.5% in the course of refinement.  We've had 
problems with reviewers once complaining that the 0.3% of reflections we 
used were not enough.  However, Axel Brünger's initial publication deems 
1000 reflections sufficient, and that's exactly what 0.3% of reflections 
corresponded to in our data set.


I would think the fewer observations are discarded, the better.  Can one 
lower this number further by picking reflections smartly, eg. avoiding 
symmetry-related reflections as was discussed on the ccp4bb a little 
while back?  Should one agonize at all, given that one should do a last 
run of refinement without any reflections excluded?




Andreas


 On 1/31/07, *Georg Zocher* [EMAIL PROTECTED] wrote:


First of all, I would like to thank you for your comments.

After consideration of all your comments, I conclude that there are
three possibilities.

1.) search for some particularly poorly-behaved regions using
parvati-server
   a.) refining the occupancy of that atoms and/or
   b.) tightening the restraints

 Problems which have already been mentioned:
If I tighten the restraints, the anisotropic model may not be
statistically justified, which seems to be the case.

Using all reflections may not help that much, because I chose a set
of 1.5% for Rfree (~1300 reflections) to get as much data as
possible for the refinement. For my first tries of anisotropic
refinement I used 5% of the reflections for Rfree but the same
problem arose, so that I decided to cut the Rfree to 1.5%.

2.) Using shelxl

3.) TLS with multi-groups
   Should be the safe way!?

 I will try all the possibilities, but especially the TLS refinement
 seems to be a good option worth trying.

 Thanks for your helpful advice,

georg



[ccp4bb] Converting TNT HKL Files to MTZ using CCP4I in CCP4 6.0.2

2007-02-08 Thread Dale Tronrud

Hi,

   I have a basic operational problem.  I am trying to convert a pair of
TNT HKL format files into an MTZ file using the CCP4I interface. (I realize
this is probably not the most heavily exercised code in the package.)
When I run the task I get the following output.

#CCP4I VERSION CCP4Interface 1.4.4.2
#CCP4I SCRIPT LOG import
#CCP4I DATE 08 Feb 2007  22:59:11
#CCP4I USER dale
#CCP4I PROJECT TST
#CCP4I JOB_ID 17
#CCP4I SCRATCH /tmp/dale
#CCP4I HOSTNAME terbium.uoregon.edu
#CCP4I PID 18342


#CCP4I TERMINATION STATUS 0 Error from script 
/usr/local/ccp4-6.0.2/ccp4i/scripts/import.script: can't read fo: no such 
variable
#CCP4I TERMINATION TIME 08 Feb 2007  22:59:11
#CCP4I MESSAGE Task failed

   I'm afraid I'm not fluent in TCL/TK.  Can anyone tell me what
I should do to get this working?

Thanks,
Dale Tronrud


Re: [ccp4bb] Summary - Valid to stop Refmac after TLS refinement?

2007-04-04 Thread Dale Tronrud

Bernhard Rupp wrote:

People also felt that the RMSD bond/angle of 0.016/1.6 was still a little

high.

This was subject of a discussion before on the board and I still don't 
understand it:


If I recall correctly, even in highly accurate and precise
small molecule structures, the rmsd of corresponding
bonds and angles are ~0.014A and 1.8deg. 


It always seems to me that getting these values much below is not a sign
of crystallographic prowess but over-restraining them?

Is it just that - given good resolution in the first place - the balance 
of restraints (matrix weight) vs low R (i.e., Xray data) gives the best 
Rfree or lowest gap at (artificially?) lower rmsd?


Is that then the best model?

I understand that even thermal vibration accounts for about 1.7 deg 
angle deviation -  are lower rmsd deviations then a manifestation

of low temp? But that does not seem to be much of an effect, if
one looks at the tables from the CSD small mol data (shown 
nicely in comparison to the 91 Engh/Huber data in Tables F, pp385). 
 


   This is an on-going topic of discussion so let me put in my two cents.

   We calculate libraries of ideal geometry based on precise, small
molecule structures.  When these small molecule crystal structures are
compared to our derived libraries they are found to contain deviations.
These deviations are larger than the uncertainty in these models and
are presumed to reflect real features of the molecule; perturbations
due to the local environment in the crystal.

   These same perturbations are present in our crystals and we should
expect to find deviations from ideal geometry on the same scale as
that seen in the precise models.  This expectation led to the practice
in the 1980's of setting r.m.s. targets of 0.02A and 3 degrees for
agreement to bond length and angle libraries.

   While this seems quite reasonable, we are left with the question:
Are the deviations from ideal geometry we see in a particular model
in any way related to the actual deviations of the molecule in the
crystal?  The uncertainties (su's) of the bond lengths in a model based
on 4A diffraction data are huge compared to the absolute value of the
true deviation.  For example, if the model had a deviation from ideal
geometry of 0.02A but the uncertainty of the distance is 0.2A can we
say that we have detected a signal that is significantly different than
zero, the null hypothesis?
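
   To put numbers on that particular example (a back-of-the-envelope sketch
using only the figures quoted above, and treating the refined distance as a
normal variate with that uncertainty -- nothing more sophisticated than that):

    import math

    deviation = 0.02   # apparent deviation from the library value, in Angstrom
    sigma     = 0.20   # uncertainty (su) of that distance in a 4 A model

    z = deviation / sigma                   # 0.1 standard deviations
    # two-sided probability of a deviation at least this large by chance alone
    p = math.erfc(z / math.sqrt(2.0))
    print(f"z = {z:.2f}, p = {p:.2f}")      # z = 0.10, p = 0.92

A deviation you would see by chance 92% of the time is no detection at all.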

   If we have a model with a collection of deviations from ideal geometry
but we have no expectation that those deviations are indicative of the
true deviations of the molecule in the crystal, are those deviations
serving any purpose?  If they do not reflect any property of the crystal
they are noise and should be filtered out.

   By this argument a model based on 4A resolution diffraction data should
have no deviation from ideal geometry while one based on 0.9A diffraction
data should have no restraints on ideal geometry since the deviations
are probably all real and significant (except for specific regions of
the molecule that have problems).

   The problem we all face is the vast area between these extremes,
compounded by our inability to calculate proper uncertainties for the
parameters of our models.  The free R is our current tool-of-choice when
it comes to attempting to judge the statistical significance of aspects
of our model, without performing proper statistical tests which we don't
know how to do.  If we allow our model the freedom to deviate from our
library and the free R improves a significant (??) amount then the
resulting deviations must have some similarity to the true deviations
in the crystal, but if the free R does not improve then the deviations
must not be related to reality and should be suppressed.  This is the
type of assumption we make whenever we use the free R to make a choice.

   What we end up doing is not making a yes/no decision but instead we
variably suppress the amplitude of the deviations from ideal geometry
and that is harder to justify.  I think a reasonable argument can be
made, but I have already written too many words in this letter.  It doesn't
really matter because we left the road of mathematical rigor when we took
the R free path.

   Unfortunately, many people have ignored what Brunger said in Methods
in Enzymology about choosing your X-ray/geometry weight based on the
free R and just started saying the rms bond length deviation must
be 0.007A.  The deviations from ideal geometry of your model should be
no more and no less than what you can justifiably claim is a reflection
of the true state of the molecule in your crystal.

Dale Tronrud


Re: [ccp4bb] DANO from PDB

2007-06-09 Thread Dale Tronrud

   Wouldn't it be reasonable to use the sigma one calculates from the
sigmaA?  That sigma would reflect the uncertainty in the calculated
structure factor amplitude due to the uncertainty in the parameters
in your model.  Of course, one then realizes that you should down
weight your structure factor amplitudes with sigmaA too.  Then you
would have a set of structure factor amplitudes and sigmas that
reflects the uncertainties of your model.

   If you don't believe in the idea of sigmaA's cloud of possible
atoms and just want the structure factors of your PDB file, as though
you know all the parameters to infinite precision, your sigma would
only be non-zero because of uncertainties due to numerical problems
in the Fourier Transform.  These sigmas would be very small, in most
cases, and be determined by the method you used to perform the
calculation.  This is probably not a useful solution.

Dale Tronrud


Peter Adrian Meyer wrote:

I add a fake sigma column for each data column because so many

programs require one.

This is slightly tangential, but does anyone know of a good way to
generate semi-realistic sigma values for calculated/simulated data?

The best I've been able to do is borrow from an experimental dataset of
the same protein (after scaling), but that doesn't work unless you've got
an experimental dataset corresponding to your simulated one.  I also tried
a least-squares fit (following a reference I don't have in front of
me...this was a while ago), which didn't result in a good fit for our
data.

Pete

Pete Meyer
Fu Lab
BMCB grad student
Cornell University


Re: [ccp4bb] Ligand fitting in COOT and SHELX refinement

2007-06-22 Thread Dale Tronrud

U Sam wrote:

Hi
I would like to ask about the following issue for a ligand.
A ligand with a long alkyl chain can have multiple conformations.
In Coot, in order to fit any protein residue into difference density, we can select a 
specific rotamer conformation and refine.
For fitting a ligand of the above kind, how does this work?


   For amino acids there are tens of thousands of examples from which
one can derive rotamer libraries.  There is no such luck with most
other compounds.  This is why Coot has special case code for handling
amino acids that does not understand your (or my) favorite molecules.

   Fortunately, Coot does not require such information to run its real
space refinement.  You do need a cif definition that includes, amongst
other things, the ideal bond lengths and bond angles.  You can work
with Coot to build the conformation of your molecule that fits your
density.  All conformations will be consistent with the same bonds
and angles, unless you have a very strange molecule.

Taking the PDB file with the ligand into SHELX for refinement, how are restraints like DFIX, DANG etc. specified for such a ligand, which can have multiple conformations (particularly the long alkyl chain), given that during refinement the values can deviate a lot from a particular value taken from the literature? 


SHELXL will take whatever conformation you build and come up with a
model that is consistent with the values on the DFIX and DANG statements.
It should never produce a model that deviates from the literature values,
if you put those values on your DFIX and DANG statements.  The final
model will have a configuration similar to what you built in Coot.
Use Coot to make the big changes and SHELXL to fine tune.

   I have been refining some long chain hydrocarbons along with my
protein and have had no problems, once I was able to create the proper
definitions for Coot and SHELXL.  SHELXL is certainly easier to create
a library for, but you need both if you want to model build and refine.

   Building a cif with ideal geometry for Coot/Refmac is not an easy
task.  You need to understand your chemistry and the file format,
which is not well documented.

   You have several options:

1) You can sit down and figure out how to create a cif definition.
   This is hard to do but a valuable skill to acquire.

2) You can find a compound similar to yours for which there is a
   definition built into Coot and modify it for your needs.  You
   still need to understand the file format, but you can get away
   with less understanding because you are starting with something
   that works.

3) You can use web resources to find/create a file for you.  A number
   of options are available, none of which I would trust completely.
   The HIC-UP website is perhaps the most popular, but the values
   are quite unreliable.  These files can be used as a starting point
   but always verify...  The Elbow builder in Phenix is quite
   reasonable, but takes a bit of study to understand, and again,
   don't trust it.

   Remember the quote from the Harry Potter books: "Never trust
anything you can't see where it keeps its brains."


I appreciate suggestion and comments.
Many Thanks
Sam


Re: [ccp4bb] The importance of USING our validation tools

2007-08-23 Thread Dale Tronrud

   In the cases you list, it is clearly recognized that the fault lies
with the investigator and not the method.  In most of the cases where
serious problems have been identified in published models the authors
have stonewalled by saying that the method failed them.

   "The methods of crystallography are so weak that we could not detect
(for years) that our program was swapping F+ and F-."

   "The scattering of X-rays by bulk solvent is a contentious topic."

   "We should have pointed out that the B factors of the peptide are
higher than those of the protein."

   It appears that the problems occurred because these authors were not
following established procedures in this field.  They are, as near as
I can tell, somehow immune from the consequences of their errors.
Usually the paper isn't even retracted, when the model is clearly
wrong.  They can dump blame on the technique and escape personal
responsibility.  This is what upsets so many of us.

   It would be so refreshing to read in one of these responses: "We
were under a great deal of pressure to get our results out before our
competitors and cut corners that we shouldn't have, and that choice
resulted in our failure to detect the obvious errors in our model."

   If we did see papers retracted, if we did see nonrenewal of grants,
if we did see people get fired, if we did see prison time (when the
line between carelessness and fraud is crossed), then we could be
comforted that there is practical incentive to perform quality work.


Dale Tronrud


Edwin Pozharski wrote:

Mischa,

I don't think that the field of nanotechnology crumbled when allegations 
against  Jan Hendrik Schon  (21 papers withdrawn, 15 in Science/Nature) 
turned out to be true.  I don't think that nobody trusts biologists 
anymore because of Eric Poehlman (17 falsified grants, 10 papers with 
fabricated data, 12 month in prison).  We are still excited to hear 
about stem cell research despite what Hwang Woo-suk did or didn't 
do.  What recent events demonstrate is that in macromolecular 
crystallography (and in science in general) mistakes, deliberate or not, 
will be discovered. Ed.


Mischa Machius wrote:
Due to these recent, highly publicized irregularities and ample 
(snide) remarks I hear about them from non-crystallographers, I am 
wondering if the trust in macromolecular crystallography is beginning 
to erode. It is often very difficult even for experts to distinguish 
fake or wishful thinking from reality. Non-crystallographers will have 
no chance at all and will consequently not rely on our results as much 
as we are convinced they could and should. If that is indeed the case, 
something needs to be done, and rather sooner than later.  Best - MM


 


Mischa Machius, PhD
Associate Professor
UT Southwestern Medical Center at Dallas
5323 Harry Hines Blvd.; ND10.214A
Dallas, TX 75390-8816; U.S.A.
Tel: +1 214 645 6381
Fax: +1 214 645 6353




Re: [ccp4bb] Questions about diffraction

2007-08-24 Thread Dale Tronrud

Michel Fodje wrote:

Dear Crystallographers,
Here are a few paradoxes about diffraction I would like to get some
answers about:


...


3. What happens to the photon energy when waves destructively interfere
as mentioned in the text books. Doesn't 'destructive interference'
appear to violate the first and second laws of thermodynamics? Besides,
since the sources are non-coherent, how come the photon 'waves' don't
annihilate each other before reaching the sample? If they were coherent,
would we just end up with a single wave any how? With what will it
interfere to cause diffraction?



   For every direction where there is destructive interference and a
loss of energy there is a direction where there is constructive
interference that piles up energy.  If you integrate over all directions
energy is conserved.

   I'm not sure what your concern is about the second law.  The radiation
is spreading out into space and so entropy increases.

Dale Tronrud


Re: [ccp4bb] Questions about diffraction

2007-08-24 Thread Dale Tronrud

Michel Fodje wrote:

For every direction where there is destructive interference and a
loss of energy there is a direction where there is constructive
interference that piles up energy.  If you integrate over all directions
energy is conserved.


For the total integrated energy to be conserved, energy will have to be
created in certain directions to compensate for the loss in other
directions. So in a direction in which the condition is met, the total
will have to be more than the sum of the waves in that direction.

How about considering the possibility that all photons coming into the
sample are diffracted -- just in different directions. So that what is
happening is not constructive and destructive interference but a kind
sorting of the photons based on a certain property of the photons, maybe
the phase.


   You seem to be operating under the impression that there are two diffracting
waves that later destructively interfere.  All constructive and destructive
interference occurs at the point of scattering.  There is no energy that
heads off in a direction that later disappears - Nothing ever went in
that direction.

   You have the same problem with your idea of two waves, out of phase
but identical in wavelength, that scatter off an electron.  The two waves,
if they are coherent, would interfere with each other long before they
reach the electron and become a single wave.  If they are not coherent
they will interact with the scatterer independently and produce incoherent
diffraction waves, which will sum by intensity independent of phase.

   I can get into deep trouble with this next point so I hope a physicist
jumps on me where I'm wrong.  All light sources are coherent to a degree.
A laser is pretty much 100% coherent and my pocket flashlight is hardly
coherent at all.  I seem to recall that there is a parameter called
the coherence length that measures the distance within a beam that
the light is coherent.  The coherence length of a rotating anode
X-ray generator is small but unit cells are smaller so there are plenty
of unit cells to form a nice diffraction pattern.

   Your second paragraph is just the Copenhagen Interpretation of the
wave function.  If you want to think of photons then the diffraction
wave we are talking about is the wave function and that function maps
the probability of finding a photon.   Wave/particle duality says we
can look at the experiment either way.

Dale Tronrud


Re: [ccp4bb] alternating strong/weak intensities in reciprocal planes - P622

2007-08-27 Thread Dale Tronrud

   One possibility for #5, the B factors all dropping to the lower limit
during refinement.  If you are including all of your low resolution data
(which you should) but have not used a model for the bulk solvent scattering
of X-rays (which would be bad) then you will observe this result.  The
refinement program will attempt to overestimate the amplitudes of the
high resolution Fc's to match the overestimated low resolution Fc's.

   Check your log files to ensure your bulk solvent correction is operating
correctly.

Dale Tronrud

Jorge Iulek wrote:

Dear all,

Please, maybe you could give some suggestions to the problem below.

1) Images show smeared spots, but xds did a good job integrating them. 
The cell is 229, 229, 72, trigonal, and we see alternating strong and 
weak rows of spots in the images (spots near each other, but rows more 
separated, must be by c*). They were scaled with xscale, P622 (no 
systematic abscences), R_symm = 5.3 (15.1), I/sigI = 34 (14) and 
redundancy = 7.3 (6.8), resolution 2.8 A. Reciprocal space show strong 
spots at h, k, l=2n and weak spots at h, k, l=2n+1 (I mean, l=2n 
intensities are practically all higher than l=2n+1 intensities, as 
expected from visual inspection of the images). Within planes h, k, 
l=2n+1, the average intensity is clearly and much *higher at high 
resolution than at low resolution*. Also, within planes h, k, l=2n, a 
subjective observation is that average intensity apparently does not 
decay much from low to high resolution. The data were trucated with 
truncate, which calculated Wilson B factor to be 35 A**2.


2) Xtriage points a high (66 % of the origin) off-origin Patterson peak. 
Also, ML estimate of overall B value of F,SIGF = 25.26 A**2.


3) I suspect to have a 2-fold NCS parallel to a (or b), halfway the c 
parameter, which is almost crystallographic.


4) I submitted the data to the Balbes server which using 
pseudo-translational symmetry suggested some solutions, one with a good 
contrast to others, with a 222 tetramer, built from a structure with 40 
% identity and 58% positives, of a well conserved fold.


5) I cannot refine below 49 % with either refmac5, phenix.refine or CNS. 
Maps are messy, except for rather few residues and short stretches near 
the active site, almost impossible for rebuilding from thereby. Strange, 
to me, is that all programs freeze all B-factors, taking them the 
program minimum (CNS lowers to almost its minimum). Might this be due to 
by what I observed in the reciprocal space as related in 1 ? If so, 
might my (intensity) scaling procedure have messed the intensities due 
to their intrinsic property to be stronger in alternating planes ? How 
to overcome this ?


6) I tried some different scaling strategies *in the refinement step*, 
no success at all.


7) A Patterson of the solution from Balbes also shows an off-origin 
Patteron at the same position of the native data, although a little lower.


8) Processed in P6, P312 and P321, all of course suggest twinning.

I would appreciate suggestions, pointers to similar cases, etc... In fact, 
I currently wonder why refinement programs take the B-factors to such low 
values.


Many thanks,

Jorge


Re: [ccp4bb] How to number atoms in a ligand

2007-10-08 Thread Dale Tronrud

Dear Joe,

   Atom labels are, in principle, arbitrary.  The molecule doesn't care
what we call its atoms.  To make the PDB more useful, it is handy if
all the people working with a particular compound use the same names for
their atoms.  If you find that someone has already deposited a structure
containing your compound you are expected to use the same names they did.
There are lists of compounds and naming conventions on the PDB web site.

   The small molecule literature doesn't count as precedence for naming
conventions, only what is in the PDB.  No one will hold you to the names
used in the small molecule structure paper.

   If you are the first to work with this compound in the macromolecular
world you are free to choose whatever names you want.  Please choose
something sensible as the rest of us will be stuck with your choice
forever.  Consistency is king here.  If a similar compound is in the
PDB, using atom names based on it would simplify comparisons.

Dale Tronrud


Zheng Zhou wrote:

Hi, all

I am a rookie in crystallography. I know this may be a little bit off 
topic. I have cocrystallized several compounds with my favorite protein. 
I found crystal structures for some of these chemicals. But the 
numbering systems are different in those original papers for the small 
molecules. Some numbering system has all the atoms from 1 to the end 
(C1-O3-O8-N9-C15), while others have numbers for each individual 
element. (C1-C12, O1-O2, N1). I was trying to search a unified theme for 
ligands in pdb. I even emailed [EMAIL PROTECTED] 
mailto:[EMAIL PROTECTED], but so far I haven't heard anything 
back. Could anyone give me some suggestions? Any help would be greatly 
appreciated.
 
Thanks,


Sorry to bother others

Joe



Re: [ccp4bb] SFALL grid

2007-10-11 Thread Dale Tronrud

   SFALL is calculating structure factors from the map you supplied,
so there is only one grid, the one you used when you created the
map in NCSMASK.

   The choice of sampling rates for maps to be Fourier transformed is
a deep topic.  The mathematical law is that you have to sample the
map at, at least, twice the frequency of the highest Fourier component
in the map.  This is, unfortunately, often misinterpreted as twice
the frequency of the highest component you are interested in.

   The fact that you are interested in, say, 2A structure factors
has nothing to do with the calculation of the Fourier transform of
your map.  All that matters is the frequencies that were present in
your map before you sampled it on the grid.
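
   A one-dimensional toy example shows what actually happens (a small numpy
sketch; the cell length and Fourier indices are invented).  A component finer
than the Nyquist limit of the grid does not politely vanish when you sample
too coarsely -- it folds back onto a lower-resolution coefficient:

    import numpy as np

    cell = 100.0                      # 1-D "unit cell" length, in Angstrom
    h_low, h_high = 5, 45             # Fourier indices present in the true density

    def density(x):
        # the true, continuous density: one low- and one high-frequency component
        return (np.cos(2 * np.pi * h_low  * x / cell)
              + np.cos(2 * np.pi * h_high * x / cell))

    for n_grid in (128, 64):          # 128 points resolve h=45; 64 (Nyquist 32) do not
        x = np.arange(n_grid) * cell / n_grid
        coeff = np.fft.rfft(density(x)) / n_grid
        big = np.nonzero(np.abs(coeff) > 0.1)[0]
        print(f"{n_grid:3d} grid points -> significant indices {big.tolist()}")

    # 128 points: indices [5, 45], as they should be.
    # 64 points:  index 45 aliases onto 64 - 45 = 19, a spurious low-resolution term.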

   Ten Eyck, (1977) Acta Cryst A33, 800-804 has a discussion of this
and provided the classic solution to this problem when the map to be
transformed is a calculated electron density map.  I presume you have
an NCS averaged map and the required interpolations introduce needs
of their own that are significant.  Gerard Bricogne has written on that
topic, also back in the 1970's, but I don't have the reference at hand.

   The manual for your NCS averaging program should give you guidance
on the choice of sampling rate based on its interpolation method.  If
you are not even sampling at twice the resolution you are interested
in you are sampling way too coarsely.

   All FFT based structure factor programs require that the sampling
rates along each axis be even.  They may have other required factors
depending on the space group, but they will be happy to inform you
if you make a choice it doesn't like.  They are also more efficient
when the prime factors of the sampling rates are small numbers.  Try
to stick with multiples of 2,3, and 5 if possible.
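
   For what it is worth, here is a tiny helper for picking such a grid (just a
sketch, not anything taken from SFALL or CCP4; the axis length and resolution
in the example comment are invented): given the minimum number of points
needed along an axis, find the next even number whose only prime factors are
2, 3 and 5.

    def is_235_smooth(n):
        # strip out factors of 2, 3 and 5; anything left is a larger prime factor
        for p in (2, 3, 5):
            while n % p == 0:
                n //= p
        return n == 1

    def next_grid(minimum):
        n = minimum + (minimum % 2)          # FFT codes want an even sampling rate
        while not is_235_smooth(n):
            n += 2
        return n

    # e.g. a 73.2 A axis sampled every d_min/3 at 2.0 A resolution needs >= 110 points
    print(next_grid(110))                    # -> 120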

   Since the program has no way of knowing the highest resolution
component actually in the map before you sampled it on your grid,
it assumes that the map contains no components of higher resolution
than you asked it to produce.  All FFT programs will fail if you
sample your map more coarsely than twice that frequency, as SFALL did for
you.

   That does NOT mean that twice the frequency you are interested in
is sufficient.  You MUST read your NCS averaging program's documentation
and if that doesn't tell you, complain to the program's author, and
read Gerard's papers on the matter.  NCS averaging a map that is only
sampled at twice the rate you are interested in will not be a useful
way to spend your time.

Dale Tronrud

whittle wrote:

Hi:

I am trying to generate structure factors from a mask/map that I made
with NCSMASK. I get the following error message in SFALL:

The program run with command: sfall HKLOUT
/tmp/whittle/109_5_7_1_mtz.tmp MAPIN
/home/whittle/projects/109_5/CCP4/center_50.msk 
has failed with error message

 SFALL:   Grid too small- NZ must be > 2*Lmax+1

Which grid is this referring to? The grid used by SFALL or by NCSMASK
when I initially generated the map? How does one choose an appropriate
grid and extent for these programs?

Thanks for your help!
--James


Re: [ccp4bb] Unidentified ligand (electron density) found at active site

2007-12-06 Thread Dale Tronrud

Hi,

   Why is it that you are so reluctant to identify this compound
as Ser-Gly?  Its fit to the density is great.

   To wax historical, we saw unexpected density in the active site
of apo Thermolysin.  It appeared to be Val-Ala, but with refinement
it developed into Val-Lys.  It happens that Val-Lys are the last
two residues of the protein.  Residues 315 and 316 were present at
full occupancy in the crystal so I presume the peptide was clipped
off molecules that didn't crystallize.

   Of course, proving that the density actually represents Ser-Gly,
or any other compound you decide upon, is much harder than building
a model to fit the density.  What is harder than identifying a bit
of density is coming up with an experiment to prove it.

Dale Tronrud

Ronaldo Alves Pinto Nagem wrote:

Dear CCP4bb users,

As suggested by some users, I am attaching to this email the electron
density of the unidentified ligand. As I mentioned before it looks like a
dipeptide GlySer, but we are still in doubt. Attempts to correlate with
the protein function are being done. One might see in the pictures that
the ligand coordinates a metal ion.

Cheers

Ronaldo.



-
Prof. Dr. Ronaldo Alves Pinto Nagem
Universidade Federal de Minas Gerais
Instituto de Ciências Biológicas
Departamento de Bioquímica e Imunologia
Av. Antônio Carlos, 6627 - Caixa Postal 486
Bairro Pampulha - CEP: 31270-901
Belo Horizonte, MG - Brasil
Tel: +55 31 3499-2626
Fax: +55 31 3499-2614
E-mail: [EMAIL PROTECTED]









Re: [ccp4bb] Sulfate ion on 2-fold axis

2008-01-09 Thread Dale Tronrud
Dear Jie,

   It also depends on whether you believe the SO4 sits
with its internal two-fold along the crystal's two-fold
axis.  If it does you should probably have a 0.5 occ
sulfur and two 1.0 occ oxygen atoms.  If the symmetry
is not obeyed you will have to have four 0.5 occ oxygen
atoms.

   Be careful, some refinement programs will not be able
to handle the bond length and angle restraints if you
only supply two oxygen atoms.  They will not allow
bonds between atoms and symmetry images of atoms.  If
you are using such a program you will have to supply
four oxygen atoms even if this is not what you would
otherwise do.
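
   As a sanity check on the bookkeeping (a trivial sketch -- just occupancy
times atomic number, summed over the asymmetric unit): the two ways of writing
the file put exactly the same number of electrons in the asymmetric unit, so
the choice is about the symmetry and the restraints, not about scattering
power.

    Z = {"S": 16, "O": 8}

    # option 1: S on the axis at occupancy 0.5, plus two fully occupied oxygens
    option1 = 0.5 * Z["S"] + 2 * 1.0 * Z["O"]

    # option 2: the whole SO4 written out, every atom at occupancy 0.5
    option2 = 0.5 * Z["S"] + 4 * 0.5 * Z["O"]

    print(option1, option2)     # 24.0 electrons per asymmetric unit either way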

Dale Tronrud


Charlie Bond wrote:
 Hi Jie,
  Depending on your resolution, you may be forced to use S(0.5) and
 4x(0.5) in order to restrain the SO4 to stay tetrahedral in refinement.
 Cheers,
 Charlie
 
 
 Jie Liu wrote:
 Dear all

 I have a sulfate ion sitting on a 2-fold axis. Should I put in pdb file
 one S atom with occu=0.5 and two O atoms with occu=1, or should
 I put one S and four O atoms all with occu=0.5?

 Thanks for your inputs.

 Jie

 .

 


Re: [ccp4bb] unbiased electron density map

2008-01-09 Thread Dale Tronrud

   A 2Fo-Fc map is simply an Fc map with two times the Fo-Fc map added
to it.  ( Fc + 2(Fo-Fc) = Fc + 2Fo - 2Fc = 2Fo - Fc )  The phase comes
from the Fc's.  The basic formulation is biased toward the model used
to calculate the Fc's.  You did, after all, start with a pure Fc map!
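
   The identity is easy to check on the map coefficients themselves (a small
numpy sketch with invented amplitudes and phases; the only point is that every
term carries the model phase, which is where the bias comes from):

    import numpy as np

    rng = np.random.default_rng(2)
    n = 1000
    fo   = rng.gamma(2.0, 50.0, size=n)              # "observed" amplitudes
    fc   = fo * (1 + rng.normal(0, 0.1, size=n))     # "calculated" amplitudes
    phic = rng.uniform(0, 2 * np.pi, size=n)         # model phases, used throughout

    fc_map    = fc * np.exp(1j * phic)               # Fc map coefficients
    diff_map  = (fo - fc) * np.exp(1j * phic)        # Fo-Fc map coefficients
    two_fo_fc = (2 * fo - fc) * np.exp(1j * phic)    # 2Fo-Fc map coefficients

    print(np.allclose(fc_map + 2 * diff_map, two_fo_fc))   # True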

   Various techniques are used to reduce bias in these maps.  Usually
a technique that reduces bias in one kind of map reduces the bias in
the other, since they are so closely related.  The procedures I know
of work by changing the calculation of Fc (and the weights on the
individual reflections, which aren't mentioned when using the simple
name Fo-Fc but are there none-the-less) and since the Fc is in both
maps both maps are debiased.

   These methods reduce bias.  "Unbiased" is a stronger claim and if
you use that word you should state clearly how you know the bias is
gone.

   Your quote brings up another matter.  An initial map, i.e. before
refinement, is unbiased if it is an omit map.  If you have done no
refinement and you leave the interesting part out of the calculation
of Fc there can not be any bias in either the Fo-Fc or the 2Fo-Fc map.
This is why an initial map is more reliable for proving binding of
a compound, for example, than a bias reduced map after refinement.

Dale Tronrud

michael nelson wrote:
In my understanding, "unbiased electron density map" usually refers to the 
Fo-Fc map. But I have seen in some papers sentences like "The 
initial, unbiased 2F_o-F_c map is contoured at...". I was a bit confused 
since I was told by my instructor that the 2Fo-Fc map was usually biased.


Can anyone clear up this concept for me?
Mike





Re: [ccp4bb] unbiased electron density map

2008-01-10 Thread Dale Tronrud

Dirk Kostrewa wrote:


On 10.01.2008, at 01:53, Dale Tronrud wrote:
...

   Your quote bring up another matter.  An initial map, i.e. before
refinement, is unbiased if it is an omit map.  If you have done no
refinement and you leave the interesting part out of the calculation
of Fc there can not be any bias in either the Fo-Fc or the 2Fo-Fc map.
This is why an initial map is more reliable for proving binding of
a compound, for example, than a bias reduced map after refinement.

...

But isn't this only true, if the model that is put in was not refined 
against a related data set before? If the new Fobs are related to the 
old Fobs (against which the model was refined before) then you carry any 
model bias over to the new data, because a simple omit-map using the old 
data will have model bias.




  Right you are, and it's an important point too.  My only defense
is that I said "If you have done no refinement..."  I don't think
of the refinement of several models to data from isomorphous crystals
as separate refinements, but sometimes forget to mention this when
talking to others.  I should be more careful.

Dale Tronrud


Re: [ccp4bb] an over refined structure

2008-02-08 Thread Dale Tronrud

   I'm afraid I have to disagree with summary point (i): that
crystallographic and noncrystallographic symmetry are incomparable.
Crystallographic symmetry is a special case of ncs where the symmetry
happens to synchronize with the lattice symmetry.  There are plenty
of cases where this synchronization is not perfect and the ncs is
nearly crystallographic.

   For some reason this situation seems to be particularly popular
with P21 space group crystals with a dimer in the asymmetric unit.
Quite often the two-fold of the dimer is nearly parallel to the
screw axis resulting in a nearly C2 space group crystal.  These
crystals form a bridging case in the continuum between ncs, where
the symmetry is unrelated to the lattice symmetry, and those cases
where the unit cell symmetry is perfectly compatible with the
lattice.

   The only saving grace of the nearly centered ncs crystals is
that the combination of the crystal and noncrystallographic symmetry
brings the potential contamination of a reflection in the working
set back to itself.  Unless you have a very high copy number, and
a correspondingly large G function, you can't have any feedback from
a working set reflection to a test reflection.

   Crystallographic symmetry is just a special case of noncrystallographic
symmetry, but our computational methods treat them in very different
ways.  This choice of ours creates a discontinuity in the treatment
of symmetry that is quite artificial, and I believe, is the root
cause of many of the problems we have with ncs in refinement and
structure solution.

Dale Tronrud

Dirk Kostrewa wrote:

Dear Dean and others,

Peter Zwart gave me a similar reply. This is very interesting 
discussion, and I would like to have a somewhat closer look to this to 
maybe make things a little bit clearer (please, excuse the general 
explanations - this might be interesting for beginners as well):


1). Crystallographic symmetry can be applied to the whole crystal and 
results in symmetry-equivalent intensities in reciprocal space. If you 
refine your model in a lower space group, there will be reflections in 
the test-set that are symmetry-equivalent in the higher space group 
to reflections in the working set. If you refine the 
(symmetry-equivalent) copies in your crystal independently, they will 
diverge due to resolution and data quality, and R-work and R-free will 
diverge to some extent due to this. If you force the copies to be 
identical, the R-work and R-free will still be different due to 
observational errors. In both cases, however, the R-free will be very 
close to the R-work.


2). In case of NCS, the continuous molecular transform will reflect this 
internal symmetry, but because it is only a local symmetry, the observed 
reflections sample the continuous transform at different points and 
their corresponding intensities are generally different. It might, 
however, happen that a test-set reflection comes _very_ close in 
reciprocal space to a NCS-related working-set reflection, and in such 
a case their intensities will be very similar and this will make the 
R-free closer to the R-work. If you do not apply NCS-averaging in form 
of restraints/constraints, these accidentally close reflections will be 
the only cases where R-free might be too close to R-work. If you apply 
NCS-averaging, then in real space you multiply the electron density with 
a mask and average the NCS-related copies within this mask at all 
NCS-related positions. In reciprocal space, you then convolute the 
Fourier-transform of that mask with your observed intensities in all 
NCS-related positions. This forces test-set reflections to become more 
similar to NCS-related working-set reflections, and thus the R-free will 
be heavily biased towards R-work. The range of this influence in 
reciprocal space can be approximated by replacing the mask with a sphere 
and calculating the Fourier-transform of this sphere. This will give the 
so-called G-function, whose radius of the first zero-value determines 
its radius of influence in reciprocal space.
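
A minimal numeric sketch of that last point: the G-function of a sphere of
radius R is 3[sin(x) - x cos(x)]/x^3 with x = 2*pi*s*R, and its first zero
falls near x = 4.49, i.e. s of roughly 0.72/R (the 20 A radius below is purely
illustrative, not taken from any particular mask):

    import numpy as np
    from scipy.optimize import brentq

    def g_sphere(s, radius):
        """Fourier transform of a uniform sphere of the given radius (the G-function)."""
        x = 2.0 * np.pi * s * radius
        return 3.0 * (np.sin(x) - x * np.cos(x)) / x**3

    R = 20.0                                                          # sphere radius in A (illustrative)
    s_first_zero = brentq(lambda s: g_sphere(s, R), 1e-6, 4.6 / (2.0 * np.pi * R))
    print(s_first_zero, 1.0 / s_first_zero)                           # ~0.036 1/A, i.e. ~28 A for R = 20 A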


To summarize: 
(i) One can't directly compare crystallographic and non-crystallographic 
symmetry
(ii) In case of NCS, I have to admit, that even if you do not apply 
NCS-restraints/constraints, there will be some effect on the R-free by 
chance. So, my original statement was too strict in this respect. But 
only if you really apply NCS-restraints/constraints do you bias 
the R-free towards the R-work, within approximately the radius of the 
G-function in reciprocal space.


What an interesting discussion!

Best regards,

Dirk.

On 07.02.2008 at 18:57, Dean Madden wrote:


Hi Dirk,

I disagree with your final sentence. Even if you don't apply NCS 
restraints/constraints during refinement, there is a serious risk of 
NCS contaminating your Rfree. Consider the limiting case in which 
the NCS is produced simply by working in an artificially low 
symmetry space-group (e.g. P1, when the true symmetry is P2

Re: [ccp4bb] an over refined structure

2008-02-08 Thread Dale Tronrud

[EMAIL PROTECTED] wrote:
 Rotational near-crystallographic ncs is easy to handle this way, but
 what about translational pseudo-symmetry (or should that be
 pseudo-translational symmetry)? In such cases one whole set of spots is
 systematically weaker than the other set.  Then what is the
 theoretically correct way to calculate Rfree?  Write one's own code to
 sort the spots into two piles?
 Phoebe


Dear Phoebe,

   I've always been a fan of splitting the test set in these situations.
The weak set of reflections provide information about the differences
between the ncs mates (and the deviation of the ncs operator from a
true crystallography operator) while the strong reflections provide
information about the average of the ncs mates.  If you mix the two
sets in your Rfree calculation the strong set will tend to dominate
and will obscure the consequences of allowing your ncs mates too much
freedom to differ.
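
   As a concrete sketch of such a split (assuming, purely for illustration, a
pseudo C-centering translation of (1/2, 1/2, 0), so that reflections with
h+k odd form the systematically weak set; the real parity rule depends on the
actual pseudo-translation vector):

    import numpy as np

    def split_strong_weak(hkl):
        """Partition reflections into systematically strong (h+k even) and weak (h+k odd) classes."""
        hkl = np.asarray(hkl)
        weak = (hkl[:, 0] + hkl[:, 1]) % 2 == 1
        return hkl[~weak], hkl[weak]

    def r_factor(fobs, fcalc):
        """Conventional R = sum|Fo - Fc| / sum Fo, to be reported separately for each class."""
        fobs, fcalc = np.asarray(fobs, float), np.asarray(fcalc, float)
        return float(np.sum(np.abs(fobs - fcalc)) / np.sum(fobs))

    # strong_hkl, weak_hkl = split_strong_weak(test_set_hkl)  # then quote R(strong) and R(weak) separately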

   Let's say you have a pseudo C2 crystal with the dimer as the ncs
pair and you are starting with a perfect C2 symmetry model.  The
initial rigid body refinement will cause the Rfree(weak) to drop
because the initial model had Fc's equal to zero for all these
reflections and the deviation from crystal symmetry allows nonzero
values to arise.

   Now you want to test if there are real differences between the
two copies.  If you allow variation between the two copies but
monitor the Rfree(strong) you are actually monitoring the quality
of the average of the two copies, and you basically have a two-fold
multimodel.  It is the same as putting two molecules at each site
in the crystal and forcing both models to have perfect ncs.

   Axel Brunger's Methods in Enzymology chapter indicates that a
two-fold multimodel is expected to have a lower Rfree than a single
model and we would expect in our imaginary crystal that the
Rfree(strong) will drop even if there is no real difference between
the ncs mates.  When you allow differences between the ncs mates
the Rfree(strong) will tend to drop even if those differences are
not real.

   The Rfree(weak) is a different story, however.  It is controlled
specifically by the differences between the two ncs mates and will
drop only if the refinement creates differences that are significant.
This is the statistic that can be used to determine the ncs weight.
(Or probably the log likelihood gain (weak))

   If you insist on mixing the strong and weak reflections in your
test set you have to design your null hypothesis test differently.
First you should do a refinement where you have two models at
each site, with exact ncs imposed.  The you do a refinement with one
copy at each site but allow differences between the ncs mates.
Compare the Rfree of each model to decide which is the better model.
There are exactly the same number of parameters in each model but
one allows the ncs to be violated and the other does not.

   Even so, the signal in the Rfree is mixed unless you split the
systematically weak from the systematically strong.

   If you have a general ncs and don't have weak and strong subsets
of reflections you still have to worry about the multimodel effect.
If a refinement that allows ncs violations does not drop the Rfree
by more that a two-fold multimodel with perfect ncs you cannot
justify breaking your ncs.  A drop in Rfree when you break ncs does
not necessarily mean that breaking ncs is a good idea.  You always
have to perform the proper null hypothesis test.

Dale Tronrud


At 01:05 PM 2/8/2008, Axel Brunger wrote:

In such cases, we always define the test set first in the high-symmetry
space group choice.  Then, if it is warranted to lower the 
crystallographic

symmetry and replace with NCS symmetry, we expand the test set
to the lower symmetry space group.  In other words, the test set itself
will be invariant upon applying any of the crystallographic or NCS 
operators,

so will be maximally free in these cases.   It is then also possible to
directly compare the free R between the high and low crystallographic
space group choices. 

Our recent Neuroligin structure is such an example (Arac et al., 
Neuron 56, 992-, 2007).
 


Axel




On Feb 8, 2008, at 10:48 AM, Ronald E Stenkamp wrote:

I've looked at about 10 cases where structures have been refined in 
lower
symmetry space groups.  When you make the NCS operators into 
crystallographic

operators, you don't change the refinement much, at least in terms of
structural changes.  That's the case whether NCS restraints have been 
applied
or not. In the cases I've re-done, changing the refinement program 
and dealing
with test set choices makes some difference in the R and Rfree 
values.  One
effect of changing the space group is whether you realize the copies 
of the
molecule in the lower symmetry asymmetric unit are identical or 
not.  (Where
identical means crystallographically identical, i.e., in the same 
packing
environments, subject to all the caveats about accuracy, precision, 
thermal

Re: [ccp4bb] an over refined structure

2008-02-08 Thread Dale Tronrud

Bart Hazes wrote:

Dale Tronrud wrote:

[EMAIL PROTECTED] wrote:
  Rotational near-crystallographic ncs is easy to handle this way, but
  what about translational pseudo-symmetry (or should that be
  pseudo-translational symmetry)? In such cases one whole set of 
spots is

  systematically weaker than the other set.  Then what is the
  theoretically correct way to calculate Rfree?  Write one's own 
code to

  sort the spots into two piles?
  Phoebe
 

Dear Phoebe,

   I've always been a fan of splitting the test set in these situations.
The weak set of reflections provide information about the differences
between the ncs mates (and the deviation of the ncs operator from a
true crystallography operator) while the strong reflections provide
information about the average of the ncs mates.  If you mix the two
sets in your Rfree calculation the strong set will tend to dominate
and will obscure the consequences of allowing your ncs mates too much
freedom to differ.


I haven't had to deal with this situation but my first impression is to 
use the strong reflections for Rfree. For the strong reflections, and 
any normal data, Rwork and Rfree are dominated by model errors and not 
measurement errors. For the weak reflections measurement errors become 
more significant if not dominant. In that case Rwork and Rfree will not be 
a sensitive measure to judge model improvement and refinement strategy.


A second and possibly more important issue arises with determination of 
Sigmaa values for maximum likelihood refinement. Sigmaa values are 
related to the correlation between Fc and Fo amplitudes. When half of 
your observed data is systematically weakened then this correlation is 
going to be very high, even if the model is poor or completely wrong, as 
long as it obeys the same pseudo-translation. If you only use the strong 
reflections for Rfree I expect that should get around some of the issue.


Of course it can be valuable to also monitor the weak reflections to 
optimize NCS restraints but probably not to drive maximum likelihood 
refinement or to make general refinement strategy choices.


Bart


Dear Bart,

   I agree that the way one uses the test set depends critically on the
question you are asking.  In my letter I was focusing on that aspect
of the pseudo centered crystal problem where the strong/weak divide can
be used to particular advantage.

   I have not thought as much about the matter of using the test set
to estimate the level of uncertainty in the parameters of a given model.
My gut response is that the strong/weak distinction is still significant.
Since the weak reflections contain information about the differences
between the two, ncs related, copies I suspect that a great many systematic
errors are subtracted out.

   For example, if your model contains isotropic B's when, of course,
the atoms move anisotropically, your maps will contain difference features
due to these unmodeled motions.  Since the anisotropic motions are
probably common to the two molecules, these features will be present in
the average structure described by the strong reflections but will be
subtracted out in the difference structure described by the weak
reflections.  This argument implies to me that the strong reflections
need to be judged by the Sigma A derived from the strong test set and
the weak reflections judged by the weak test set.

Dale Tronrud


Re: [ccp4bb] an over refined structure

2008-02-12 Thread Dale Tronrud
 of whether that attempt is appropriate or inappropriate,
every symmetry image of that atom will be pulled in the corresponding
way.  The symmetry related structure factors, both crystallographic
and noncrystallographic, will be affected in the same way and a
reflection in the test set will be tied to its mate in the working
set.

   In summary, this argument depends on two assertions that you can
argue with me about:

   1) When a parameter is being used to fit the signal it was designed
for, the resulting model develops predictive power and can lower
both the working and free R.  When a signal is perturbing the value
of a parameter for which is was not designed, it is unlikely to improve
its predictive power and the working R will tend to drop, but the free
R will not (and may rise).

   2) If the unmodeled signal in the data set is a property in real
space and has the same symmetry as the molecule in the unit cell,
the inappropriate fitting of parameters will be systematic with
respect to that symmetry and the presence of a reflection in the
working set will tend to cause its symmetry mate in the test set
to be better predicted despite the fact that this predictive power
does not extend to reflections that are unrelated by symmetry.
This bias will occur for any kind of error as long as that
error obeys the symmetry of the unit cell in real space.

   I'm sorry for the long winded post, but sometimes I get these
things stuck in my head and I can't get any work done until I get
it out.  I hope it helps, or at least is not complete nonsense.

Dale Tronrud


[ccp4bb] Calculating R-factor and maps from a Refmac model containing TLS downloaded from the PDB

2008-03-12 Thread Dale Tronrud

Hi,

   I am looking over a number of models from the PDB but have been
unable to reproduce the R-factors for any model that was refined
with Refmac and contains TLS parameters.  I usually can't get within
5% of the reported value.  On the other hand, I usually do pretty
well for models w/o TLS.

   An example is the model 1nkz.  The PDB header gives an R value
of 17% but even when I use tlsanal in CCP4i to generate a PDB with
anisotropic B's that mimic the TLS parameters I get an R value of
22.4% using SFCheck.  (I'm not implying that I suspect any problem
with 1nkz, in fact I have every reason to believe this is the great
model its published stats indicate.)

   I've found a CCP4 BB letter that stated that SFCheck does not
pay attention to anisotropic B's but that letter was dated 2002.
I hope this limitation has been removed, or at least the output
would mention this limitation.

   Setting up a refinement in Refmac involves a large overhead,
since even for zero cycles of refinement the program insists on
a complete stereochemical definition for the strange and wondrous
groups in this model.  I would just like to verify the R factor
and calculate a proper map for inspection in Coot.  Since I have
many models I would like to look at, I would like a simple procedure.

   I did set up a Refmac run for another model, for which I do
have all the .cif's required, but even after refinement I was not
close to the reported R.

   I see that the models I'm interested in are not present in the
Electron Density Server, so I suspect I'm not alone in fighting
this battle.

Any advice would be appreciated,
Dale Tronrud


Re: [ccp4bb] Summary: Calculating R-factor and maps from a Refmac model containing TLS downloaded from the PDB

2008-03-17 Thread Dale Tronrud

Hi again,

   I guess this is only a partial summary, since I still don't understand
all the issues this question raises.

Pavel Afonine reported that his extensive tests of the PDB reveal that
reproducing R values from models with TLS ADP's is a wide-spread and
serious problem.  The principal problems (IMHO) are

   1) Incorrect or illegal TLS definitions in the REMARK.

   2) Some files list in the ATOM B column the residual B after TLS
  has been accounted for while others list the total B (TLS and
  residual).  There is no clear indication in the PDB file which
  interpretation is being used.
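
   For anyone trying to reconcile the two conventions, the TLS contribution to
each atom's anisotropic U can be reconstructed with the standard
Schomaker-Trueblood relation and then added to (or compared with) whatever is
in the B column.  A minimal sketch, not the exact code path of Refmac or
TLSANL; note that PDB REMARK 3 lists L in deg^2 and S in deg*A, which must be
converted to radians first:

    import numpy as np

    def tls_to_u(T, L, S, xyz, origin):
        """Schomaker-Trueblood: anisotropic U (A^2) contributed by a TLS group at one atom.

        T in A^2, L in rad^2, S in rad*A (3x3 numpy arrays); xyz and origin in A.
        """
        x, y, z = np.asarray(xyz, float) - np.asarray(origin, float)
        A = np.array([[0.0,   z,  -y],          # A @ lambda = lambda x r
                      [ -z, 0.0,   x],
                      [  y,  -x, 0.0]])
        return T + A @ L @ A.T + A @ S + S.T @ A.T

    # Equivalent isotropic B of the TLS part, to compare against "total minus residual":
    # B_tls = 8.0 * np.pi**2 * np.trace(U) / 3.0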

Tassos, Eleanor, and others recommended taking the TLS definition from
the PDB header and running zero cycles of unrestrained refinement in
Refmac to get it to calculate R factors and Maps w/o the need to define
ideal geometry for co-factors.  I have yet to see this work, however
(See below)

Ulrich Baumann wrote to tell me of two of his PDB's that he knows will
give back the reported R values.  They are 2qua and 2qub.

I grabbed 2qua from the RCSB server, extracted the TLS groups with CCP4i,
and found that the TLS definitions were incorrect.  There is one polypeptide
in this model and three TLS groups.  The first and third group did not
have a residue range, while the second group defined a residue range in
the middle of the peptide.  I made the assumption that the first and
third TLS groups were intended to cover the beginning and end of the
peptide and corrected the .tls file.

I loaded this into Refmac and asked for zero cycles of unrestrained
refinement and got an R value of 19.4%.  The PDB file says it should
be 17.3%.  I then asked Refmac to run 10 cycles of TLS and 10 cycles
of restrained refinement and got an R value of 17.5%.  Good enough.

From this result I infer that Refmac is unable to calculate the original
ADP's given this PDB file and TLS definition.  It can reconstruct them
via refinement, basically ignoring the B values of the PDB file.

This particular PDB entry appears to contain in its B column the
residual B's.

I also tried entry 2qub, but with less luck.  This model has seven
peptides and 30 TLS groups.  The first seven TLS groups defined in
the header of the PDB cover each of the seven chains, while the other
23 groups had no residue range.  I can guess that the intention was
to have five TLS groups for each of the seven chains, but without
additional information from Dr. Baumann, I'm unable to even start
trying to reproduce R values and calculate maps.

So...  1) Pavel is correct, there are many clear errors in the TLS
REMARKs of PDB entries.  2) It seems necessary to ask Refmac to
recreate the ADP description for a PDB entry from scratch, assuming
the TLS group definition can be deduced from the PDB header.  This,
currently, requires refinement which requires .cif's for the unusual
groups.

If CCP4I could ask Refmac to perform only TLS/B refinement, holding
positions fixed, the need for detailed .cifs would be greatly reduced.
I have no desire to move the atoms anyway.

Better yet, if someone could find out what Refmac is expecting to find
in its starting PDB (what it wants in the B column) one could add
a tool to CCP4I that could convert a PDB entry to what Refmac wants
w/o refinement.  Since there appear to be two varieties of entries
one could try both possibilities and choose the one with the lowest
R value.

I have to close with additional problems, I'm afraid.  I can't run
the required refinement on 1nkz to test TLS/B refinement but
I have tried it on 3bsd, where I have a good .cif for the Bchl-a
groups.  When I pull out the TLS definition, and perform 10 cycles
of TLS and 10 cycles of restrained refinement I get an R value of
20.2% while the entry asserts that the correct value is 17.8%.  The
final TLS parameters look, by eye, pretty similar to the deposited
ones, so I don't know what is going on here.

Dale Tronrud



Dale Tronrud wrote:

Hi,

   I am looking over a number of models from the PDB but have been
unable to reproduce the R-factors for any model that was refined
with Refmac and contains TLS parameters.  I usually can't get within
5% of the reported value.  On the other hand, I usually do pretty
well for models w/o TLS.

   An example is the model 1nkz.  The PDB header gives an R value
of 17% but even when I use tlsanal in CCP4i to generate a PDB with
anisotropic B's that mimic the TLS parameters I get an R value of
22.4% using SFCheck.  (I'm not implying that I suspect any problem
with 1nkz, in fact I have every reason to believe this is the great
model its published stats indicate.)

   I've found a CCP4 BB letter that stated that SFCheck does not
pay attention to anisotropic B's but that letter was dated 2002.
I hope this limitation has been removed, or at least the output
would mention this limitation.

   Setting up a refinement in Refmac involves a large overhead,
since even for zero cycles of refinement the program insists on
a complete

Re: [ccp4bb] Friedel vs Bijvoet

2008-06-26 Thread Dale Tronrud

   There was a mistake in the letter that listed the Bijvoet pairs
for a monoclinic space group and that is confusing you.  Let me
try.

   The equivalent positions for a B setting monoclinic are

h,k,l; -h,k,-l.

   The Friedel mate of the general reflection (h,k,l) is (-h,-k,-l).
This means that the symmetry-equivalent reflection (-h,k,-l) has its
Friedel mate at h,-k,l.

   The Bijvoet mates of h,k,l are therefore, according to the
definitions given in previous letters, -h,-k,-l; and h,-k,l.
There are more Bijvoet mates to a reflection than Friedel mates.
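
   A minimal Python sketch that generates these mates for the b-axis
monoclinic case (with these diagonal operators the distinction between a
rotation and its reciprocal-space transpose does not matter):

    import numpy as np

    def bijvoet_mates(hkl, rotations):
        """Bijvoet mates of hkl: the symmetry images of its Friedel mate -h,-k,-l."""
        neg = -np.asarray(hkl)
        return {tuple(int(i) for i in neg @ R) for R in rotations}

    P2_ROTATIONS = [np.diag([1, 1, 1]), np.diag([-1, 1, -1])]   # h,k,l and -h,k,-l
    print(bijvoet_mates((3, 5, 2), P2_ROTATIONS))               # {(-3, -5, -2), (3, -5, 2)}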

   A centric reflection is a reflection that is BOTH a symmetry equivalent
reflection AND a Bijvoet mate to some other reflection.  This is a
very small subset of all reflections.

   Every reflection has one Friedel mate and has N Bijvoet mates,
where N is the number of equivalent positions.  Only a small number
of reflections are centric (with the limiting case of only F000).

Dale Tronrud

Bernhard Rupp wrote:

Let's try this again, with definitions, and pls scream if I am wrong:

a) Any reflection pair hR = h forms a symmetry related pair.
   R is any one of G point group operators of the SG. 
   This is a set of reflections (S). Their amplitudes

   are invariably the same. They do not even show up
   as individual pairs in the asymmetric unit of the reciprocal 
   space.

   NB: their phases are restricted but not the same.

b) a set h=-h (set F) exist where reflections may or may not
   carry anomalous signal. They form the centrosymmetrically related wedge
   of the asymmetric unit of reciprocal space.

c) a centric reflection (set C) is defined as
   hR=-h 
   and cannot carry anomalous signal. Example zone h0l in PG 2.

   As Ian Tickle pointed out, the CCP4 wiki is wrong:
   Centric reflections in space group P2 and P21 are thus 
those with 0,k,0. Not so; an example listing is attached at the end. 
   
d) therefore, some e:F exist that carry AS (F.ne.C) 
   and some that do not carry AS (F.el.C).


I hope we can agree on those facts.

Now for the name calling:

(S) is simply the set of symmetry related reflections, defined as hR=h.
(F) is the set of Friedel pairs, defined as h=-h.
(C) are centric reflections, defined as hR=-h.

Thus, only if (F.ne.C), anomalous signal. I thought those 
are Bijvoet pairs. They are, but it may not be the definition

of a Bijvoet pair.

Try 1:

Bijvoet pair is F(h) and any mate that is symmetry-related to F(-h), 
e.g., F(hkl) and F(-h,k,-l) in monoclinic.


hkl is not related to -hk-l via h = -h. Only h0l is, and those are (e:C).
So, I cannot quite follow that; probably try 1 is not a good definition.

Try 2:

I've always thought that a Bijvoet pair is any pair for which an 
anomalous difference could be observed. 


Good start. I subscribe to that.


This includes Friedel pairs (h & h-bar)


Good. That's the definition of F.


but it also includes pairs of the form h & h', where h'
is symmetry-related to h-bar. 


Ooops. That is the definition of a centric reflection.


Thus Friedel pairs are a subset of all possible Bijvoet pairs.


Cannot see that. I still maintain that Bijvoet pairs are
a subset of Friedel pairs (which does include Pat's definition). 
I fail to see anything else but Friedel pairs in my list 
of reflections - some of them carry AS (F.ne.C) and some 
don't (F.el.C).


B = F.ne.C. 


Seems to be a necessary and sufficient condition,
in agreement with Pat's definition (though not the explanation).

But - isn't that exactly what I said from the beginning?

A Bijvoet pair is an acentric Friedel pair...

Or - where are any other Bijvoet pairs hiding? Where did I miss them?

(NB: Absence of anisotropic AS assumed  -let's not go there)

See reflection list P2 (hkl |F| fom phi 2theta stol2) 
last 3 items: centric flag, epsilon, m(h)


   0   0   1   993.54  1.00  179.99  65.61  0.581      1   1   2
   0   0  -1   993.54  1.00  179.99  65.61  0.581      1   1   2
   1   0   0  1412.58  1.00    0.14  38.22  0.0001711  1   1   2
  -1   0   0  1412.58  1.00    0.14  38.22  0.0001711  1   1   2
   0   0   2  3279.49  1.00  180.31  32.80  0.0002323  1   1   2
   0   0  -2  3279.49  1.00  180.31  32.80  0.0002323  1   1   2
   1   0   1   379.89  1.00  180.25  30.36  0.0002712  1   1   2
  -1   0  -1   379.89  1.00  180.25  30.36  0.0002712  1   1   2
  -1   0   2  1355.06  1.00    0.13  27.97  0.0003195  1   1   2
   1   0  -2  1355.06  1.00    0.13  27.97  0.0003195  1   1   2
   0   1   0  2432.85  1.00   21.09  24.35  0.0004216  0   2   1
   0  -1   0  2434.14  1.00  339.65  24.35  0.0004216  0   2   1
   0   1   1   621.36  1.00  101.67  22.83  0.0004797  0   1   2
   0  -1  -1   623.27  1.00  258.49  22.83  0.0004797  0   1   2
   1   0   2   319.68  1.00  359.98  22.65  0.0004874  1   1   2
  -1   0  -2   319.68  1.00  359.98  22.65  0.0004874  1   1   2
   0   0   3   426.17  1.00  180.99  21.87  0.0005227  1

Re: [ccp4bb] Friedel vs Bijvoet

2008-06-26 Thread Dale Tronrud

Bernhard Rupp wrote:

I quote from these pages:

Bijvoet pairs are Bragg reflections which are true symmetry 
equivalents to a Friedel pair. These true symmetry equivalents 
have *equal amplitudes, even in the presence of anomalous scattering*.


   This is poorly worded.  I would change it to

A Bijvoet MATE IS A Bragg reflection which IS A true symmetry
equivalent to THE Friedel MATE OF SOME OTHER REFLECTION. These true symmetry 
equivalents
have *equal amplitudes, even in the presence of anomalous scattering*.

   Note that the Bijvoet mate is symmetry related to the Friedel mate
not the original reflection.

Dale Tronrud



Sounds more like centric or perhaps simply symmetry related to me.

A few lines below:

 A Bijvoet difference refers to the difference in measured 
 amplitude for a Bijvoet pair


I don't think you can have it both ways ??

BR


Re: [ccp4bb] D-Amino acids to L-Amino acids

2008-08-02 Thread Dale Tronrud

   First you should look into why your chiral centers flipped.  In
my experience the most common cause is that the neighboring peptide
bond needs to be flipped.

   If you just want to flip a chiral center in Coot, I think the
easiest way is to real space refine the residue and before accepting
the result, drag the CA to the side you want.  You may have to over-drag
to get it to stay.  It only takes a moment.

   But don't assume that your refinement program is just doing something
stupid.  Look for the primal cause.

Dale Tronrud

Yusuf Akhter wrote:

Hi Everybody,

I am refining structure of a protein at 3 Angstrom. I am doing model building in
Coot.
After several rounds of refinement using Refmac when I tried to run PROCHECK on
my partially build model I found that some of the residues are D-amino acids.

How to change these D-amino acids to L-amino acids??

Is there any option in Coot for that??


Thanks in advance,
yusuf



Re: [ccp4bb] Refmac5 and dual conformation of a dual conformation

2009-02-02 Thread Dale Tronrud
   I have battled a similar piece of structure in a recent project.  There
is no software in crystallography that can handle hierarchical alternative
conformations w/o tricks.  I was refining in Shelxl but the same trick
will work elsewhere.  I had to define a new residue type, with two heads.
The A conformation had atoms for both heads.  The B conformation simply
left out the atoms that couldn't be seen.  I was missing the entire side chain
in that conformation, so the side chain had no atoms.

   Besides creating the new geometry restraint library, you will have to
ensure that no bad contacts are flagged between the two heads.

   In Shelxl you can tie all the occupancies together properly, with
a couple annoyances.  In other programs you are on your own.

   Another possibility is to create four conformations for the entire
stretch but then you'll have to have the program keep many pairs of
atoms superimposed.  I don't know if Refmac has that feature.

   You will, of course, have difficulties when you deposit this thing
with the PDB.  There is no way to correctly represent your model in
a PDB file, nor I believe in mmCIF.  It may be reasonable to switch
to the four conformation model for deposition, since you will not
have to worry about enforcing the superposition of atoms any longer.
That may be clearer to people who use the model at a later date.

Dale Tronrud

Andy Millston wrote:
 I am currently trying to refine a structure where a 5 residue stretch of
 a chain is in 2 conformations. Oddly enough, 1 of these 5 residues is in
 dual conformations in both the conformations! Is there a conventional
 nomenclature for defining such dual-dual conformations?
 
 Refmac5 does not accept the intuitive way of naming such an atom.
 
 For example: A normal dual conformer GLY would be named as AGLY and BGLY
 in PDB file and this is acceptable to Refmac5
 
 When I name a dual-dual GLY as AAGLY and BAGLY, Refmac5 fails! Any
 idea, WHY??
 
 Thank you!
 
 Here is the error log:
 
 Logical name: ATOMSF, Filename: /programs/ccp4-6.0.2/lib/data/atomsf.lib
 
 ***
 * Information from CCP4Interface script
 ***
 The program run with command: refmac5 XYZIN /home/../myfile.pdb XYZOUT
 ...tmp HKLIN mtz HKLOUT tmp LIBOUT ..._56_lib.cif
 has failed with error message
 fmt: read unexpected character
 apparent state: internal I/O
 last format: (I4)
 lately reading sequential formatted internal IO
 ***
 
 
 #CCP4I TERMINATION STATUS 0 fmt: read unexpected character apparent
 state: internal I/O last format: (I4) lately reading sequential
 formatted internal IO
 
 


Re: [ccp4bb] comparison of maps, intensities and other basics

2009-02-05 Thread Dale Tronrud
   A map file stores a density value for each point on a grid.  The
units and nature of that item is not defined in the format of the map.
A map can store any number of things.  The actual values are defined
by the process that created the map file.

   For electron density maps you will find that some contain values
measured in e/A^3, others contain values that are normalized Z scores
(The standard deviation of the variation about the mean is set to
1.0), or just a bunch of numbers with arbitrary and mysterious units.

   One tends to use e/A^3 when trying to relate the map to expected
electron density or to compare one map to another.  A normalized map
is useful if you are interested in the frequency that a density value
of that magnitude appears in the map. (Is this value common or rare?)
One uses arbitrary values if one has an attachment to honesty.

   Calculating an electron density map in units of e/A^3 is not an
easy task.  The diffracted intensities are not measured, themselves,
in real units.  Their magnitude only has meaning as intensities
relative to the other intensities in the same dataset.   For the map
to be expressed in units of e/A^3 the diffraction intensities must
be expressed in units of e/Unit Cell (at least that is the convention).
This is a hard problem and many papers have been written on the topic.

   If you have a well refined and complete model for the contents of
the crystal you can use the calculated diffraction pattern as a template
to scale the observed intensities and calculate maps in e/A^3, but
this is an approximation as no model is complete or completely correct.

   The other big issue is that we cannot measure the one reflection
that defines the average of the electron density in the crystal.  It
happens to always hit the beamstop.  Because of this problem our maps
usually have an average value of zero, which is of course wrong.  Even
when the density values are expressed in e/A^3 the intention is that
each value in the map must have a number added to it to achieve the
true value at that point.  At least it's the same number everywhere
in the map, although we don't know its value.

   Because of these issues and uncertainties, when maps are compared
they are usually compared using a correlation coefficient.  The
correlation coefficient is relatively unaffected by these scaling
problems and will usually give the same answer when given any of
the kinds of maps I described.
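
   A minimal numpy sketch of such a comparison, assuming the two maps have
already been sampled on the same grid:

    import numpy as np

    def map_correlation(map_a, map_b):
        """Pearson correlation between two maps on a common grid.

        Subtracting the means makes the result insensitive to the missing F000
        term, and the normalisation removes any overall scale difference.
        """
        a = np.ravel(map_a) - np.mean(map_a)
        b = np.ravel(map_b) - np.mean(map_b)
        return float(a @ b / np.sqrt((a @ a) * (b @ b)))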

   If you want a more detailed comparison of electron density values
you really have to get into the details of each of the datasets and
scaling that was applied to ensure that your results are meaningful.

   Estimating the error bars of an electron density map is another
enormous problem. As you would expect, it depends critically on the
origin of the map.  The error analysis of a map calculated from MAD
phasing is quite different than that of a map calculated using a
refined model as a reference.

   One complication is that the error level is not necessarily the
same everywhere in the map. In addition the errors at different
regions of the map are not independent.  The correlation of deviations
at different regions of the map is likely more important to any
analysis than any simple overall error bar.

   However, if you insist on an error level, my best guess would be
to identify the regions of bulk solvent and calculate the rms deviation
from the mean there.  Since these regions should be flat, deviations
from the mean must be due to something that does not represent electron
density, so we might as well call it error.

Dale Tronrud


Peter Schmidtke wrote:
 Dear CCP4BB List Members,
 
 first of all I am not a crystallographer, but I would like to get some
 things clear, things I did not find in Crystallography Made Crystal Clear
 and on the internet for now. 
 
 I am trying to read electron density maps in the EZD format. These maps
 contain scaled values of electron density and size and shape of the unit
 cell. How can I convert the values of intensities (what is the unit of
 these values?) to the probabilities you can see in coot for example (1.03
 electron / A^3), 
 Once I have achieved this conversion, can I compare densities of different
 maps of different proteins? If not directly, is there a way to do so?
 
 Last, is there a way to know the experimental error made on intensity
 values of a map?
 
 Thanks in advance.
 
 


Re: [ccp4bb] 3ftt and gremlins

2009-03-12 Thread Dale Tronrud
   This thread has evolved into two different topics.  Just to
clarify:

1)  There is a need for additional validation of structure factor
depositions.

   My recollection is that the output of SF Check is available to
the depositor via ADIT on the RCSB site.  I have found that report
to be quite helpful in checking for gross errors in my structure
factor files.

   The Electron Density Server performs similar checks.  It shows
that the R value for 3ftt is 6.4% with a correlation coefficient
between Fo and Fc of 0.996.

   The EDS flags entries as interesting if the calculated R value
is more than 5% higher than the reported R value.  Maybe it should
also note when the R value is more than 5% lower.

   The tools for validating structure factors exist but perhaps could
be put more in the face of the depositor to more strongly encourage
that they be looked at.

2) It would be useful to have a central repository of raw diffraction
   images.

   Most of the discussion on this point concerns the technical difficulty of
storing this quantity of data.  What has not been mentioned is the
much greater difficulty of validating these images.  You may think
the images for an entry have been deposited only to find out that
the investigator's wedding photos were accidentally deposited instead.

   Validating that the images correspond to the claimed structure
will be an enormous task;  probably more difficult than coming up
with enough hard drives to store them all.

Dale Tronrud

Frank von Delft wrote:
 Gerard Bricogne wrote:
  Looking forward to the archiving of the REAL data ... i.e. the
 images.
 Using any other form of data is like having to eat out of someone
 else's
 dirty plate!
   
 That may be so -- but if I'm hungry now, I just pop it in the sink -- I
 don't publish a call for tenders on an industrial-scale dish-washer,
 call up the architects and engineers to redesign the room, re-lay the
 plumbing, vamp up my electricity transformer and install a new drainage
 system.
 
 Which doesn't mean the industrial-scale washer isn't necessary;  but
 honestly, can't we start by just washing the plate??
 
 phx.


Re: [ccp4bb] Is it possible to mutate a reversible epimerase into an inreversible one?

2010-05-18 Thread Dale Tronrud
Hi,

   I'm more of a Fourier coefficient kind of guy, but I thought that a
ΔG of zero simply corresponded to an equilibrium constant of one.  You
can certainly have reversible reactions with other equilibrium constants.
In fact I think irreversible reactions are simply ones where the
equilibrium constant is so far to one side that, in practice, the reaction
always goes all the way to product.
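
   In numbers (a trivial sketch; R and T are the gas constant and an assumed
room temperature):

    import math

    R, T = 8.314, 298.0                        # J/(mol K), K

    def equilibrium_constant(delta_g0):
        """K from the standard free-energy change: dG0 = -RT ln K."""
        return math.exp(-delta_g0 / (R * T))

    print(equilibrium_constant(0.0))           # 1.0   -- dG0 of zero just means K = 1
    print(equilibrium_constant(-20e3))         # ~3e3  -- far enough to one side to look irreversible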

   As Randy pointed out the enzyme cannot change the ΔG (or the equilibrium
constant).  You could drive a reaction out of equilibrium by coupling it
to some other reaction which itself is way out of equilibrium (such as
ATP hydrolysis in the cell) but I don't think that's a simple mutation of
your enzyme.  ;-)

Dale Tronrud

On 05/18/10 00:31, Vinson LIANG wrote:
 Dear all,
  
 Sorry for this silly biochemistory question.  Thing is that I have a
 reversible epimerase and I want to mutate it into an inreversible one.
 However, I have been told that the ΔG of a reversible reaction is zero.
 Which direction the reaction goes depends only on the concentration of
 the substrate.  So the conclusion is,
  
 A: I can mutate the epimerase into an inreversible one. But it has no
 influence on the reaction direction, and hence it has little mean.
  
 B: There is no way to change a reversible epimerase into an inversible one.
  
 Could somebody please give me some comment on the two conclution?
  
 Thank you all for your time.
  
 Best,
  
 Vinson
 
 
  


Re: [ccp4bb] Is it possible to mutate a reversible epimerase into an inreversible one?

2010-05-21 Thread Dale Tronrud
   I think we are having a problem with the definition of reversible
and irreversible.  By Lijun's definition the reaction is irreversible
because it proceeds from far from equilibrium toward equilibrium.  That
situation is more a property of the system than the enzyme.  If you make
the enzyme 1000 times faster the reaction will proceed more quickly toward
equilibrium despite the fact that the reverse reaction is also 1000 times
faster.  The reverse reaction doesn't matter when there is no product to
act upon.

  The original question was about having a reversible epimerase and wanting
to mutate it into an inreversible (sic) one.  Clearly the poster
is talking about a property of the enzyme.  I interpreted this question
to be a request to differentially change the forward and backward reaction
rates, but I could be mistaken.  Maybe the original poster could clarify
the question.

Dale Tronrud

On 05/21/10 13:53, Lijun Liu wrote:
 If I understand what you are saying, I think it is too.

 You imply that asymmetry in the enzyme results in two isomerase
 pathways. This may be true, but it has no consequence on the prospects
 for irreversibility. To avoid confusion, let's call these pathways D
 and S. Both the D and S pathways would have their own kf and kr
 kinetic constants such that kf_D/kr_D = kr_S/kf_S = Keq, which
 reflects the dG of the reaction. When the dG is close to zero for the 
 isomerase reaction (which I assume here), then you can't make it
 irreversible.
 
 This is not the case, at least in part.  Such kind of enzymes, if no
 cofactor-needed, use the identical intermediate for the mirror symmetric
 reaction.  For the D -- Intermediate -- S reaction, the enzyme uses
 the same pathway.  Enzymes, for example, glutamate racemase and
 aspartate racemase, use a kind of psudo-mirror symmetric alignment at
 the active site to adapt the binding of D or S isomer in the half A.S.,
 respectively.  Other 3 atoms associated to the chiral center keeps fixed
 relative conformation during the inversion.
 
 Standard dG(0) of such a reaction is 0.  However, at the time when
 enzyme works (for example, cell needs D-ASP in an almost pure L-ASP
 environment), the racemase moves L-Asp to D-Asp, in this regard, the dG
 of the reaction (not standard) is not 0.
 
 Your last sentence means:  for a reaction (assuming dG(0) = 0 like
 racemic reaction) almost reaches EQ (dG ~ 0), you cannot make it
 irreversiblethis is true.   Just please do not forget: such kind of
 enzymes work when the D -- S EQ is highly broken by nature (dG  0)
 [not dG(0)].
 
 Hopefully I explained clearly!
 
 Lijun
 
 

 James


 All natural epimerases, isomerase and racemases use a mechanism based
 on L-amino acids to deal with a mirror-symmetric (quasi-, sometimes)
 reaction.  In another word, these enzymes use a non-mirror symmetric
 structure to deal with a mirror-symmetric reaction, which itself
 causes the asymmetric kinetics for different direction, though the dG
 is 0.  The Arrhenius Law k = A*exp(-dE/RT) should be understood like
 this: a mutation's effect to dE will be symmetric as Dale pointed
 out.  However, the effects on A are asymmetric.   A is related to
 intramolecular diffusion, substrate- and product-binding affinity,
 etc.  That is why with mutation these enzymes changed their kinetics
 on two directions differently.  Please check glutamate racemase,
 alanine racemase, aspartate racemase, DAPE epimerase, if you are
 interested.  Never a 1000 to 1000 relation!

 Thus, mutation is possible to make one direction more favored---the
 point is you need the correct hit.  Of course, such an experiment is
 never a Maxwell's demon.

 Lijun




 On May 19, 2010, at 8:51 AM, Maia Cherney wrote:

 You absolutely right, I thought about it.

 Maia

 Marius Schmidt wrote:
 Interestingly, Maxwell's demon pops up here, wh... ,
 don't do it.




 If you change the reaction rate in one direction 1000  times slower
 than
 in the other direction, then the reaction becomes practically
 irreversible. And the system might not be at equilibrium.

 Maia

 R. M. Garavito wrote:

 Vinson,

 As Dale and Randy pointed out, you cannot change the ΔG of a
 reaction
 by mutation: an enzyme, which is a catalyst, affects only the
 activation
 barrier (ΔE double-dagger).  You can just make it a better (or
 worse) catalyst which would allow the reaction to flow faster (or
 slower) towards equilibrium.  Nature solves this problem very
 elegantly by taking a readily reversible enzyme, like an
 epimerase or
 isomerase, and coupling it to a much less reversible reaction which
 removes product quickly.  Hence, the mass action is only in one
 direction.  An example of such an arrangement is the triose
 phosphate
 isomerase (TIM)-glyceraldehyde 3-phosphate dehydrogenase (GAPDH)
 reaction pair.  TIM is readily reversible (DHA = G3P), but G3P is
 rapidly converted to 1,3-diphosphoglycerate by GAPDH.   The
 oxidation
 and phosphorylation reactions

Re: [ccp4bb] Far to good r-factors

2010-06-01 Thread Dale Tronrud
   This would be a possible explanation, and certainly is a problem with
low resolution refinements, but the free R indicates that overfitting
is not the problem here.  (I'm assuming that the proper choice of test
set has been made in this case.)  In my experience, for very isomorphous
pairs of structures, when a high resolution model is used as the starting
point for a low resolution refinement, even the R values before refinement
will be very good and that means fitting the noise can't be the cause.

   Our methods today are simply not as good at fitting low resolution data
in the absence of high resolution data as they are in its presence.

Dale Tronrud

On 06/01/10 04:51, Ian Tickle wrote:
 On Mon, May 31, 2010 at 9:15 PM, Dale Tronrud det...@uoxray.uoregon.edu 
 wrote:
   One of the great mysteries of refinement is that a model created using
 high resolution data will fit a low resolution data set much better than
 a model created only using the low resolution data.  It appears that there
 are many types of errors that degrade the fit to low resolution data that
 can only be identified and fixed by using the information from high
 resolution data.
 
 Is it such a mystery?  Isn't it just a case of overfitting to the
 experimental errors in the low res data if you tried to use the same
 parameterization  restraint weighting as for the high res refinement?
  Consequently you are forced to use fewer parameters and/or higher
 restraint weighting at low res which obviously is not going to give as
 good a fit.
 
 Cheers
 
 -- Ian


Re: [ccp4bb] MR on low resolution soaking data.

2010-06-07 Thread Dale Tronrud
   You haven't given much detail to work with so I can only
guess about your problem.  A Wilson B of 20 for a 4 A data set
is ridiculous, but the uncertainty in the Wilson B calculation
at 4 A is enormous, so what might be a more reasonable statement
would be to say your Wilson B calculates to 20 ± 300 A^2 and
the true value would be in that range.  I don't think a precise
Wilson B is important for MR so I wouldn't worry about it.

   An R value of .7 after MR is very large.  Its size implies
a systematic problem with your model - I would be looking for
a second monomer.  You haven't said anything about the structure
of your monomer.  Often a ligand will bind in the cleft between
two domains, and the domains move relative to each other upon
binding.  You may have to perform separate searches for each
domain or construct a range of trial models with different
angles between the domains.

   Don't worry about the ligand until you solve the protein
structure.  Whether you see it in the end will depend on how
big it is and how good your 4 A data are.  Of course, it's
possible that it doesn't bind at all.

Dale Tronrud

On 06/07/10 12:17, yang li wrote:
 Dear colleagues,
  
   We are now trying to soak some ligands into a protein, which is
 about 60kd in size and the structure has been solved
 before. But the  molecular replacement cannot give a right solution.
 Below is some contrast of the data:
 
 Native  2A   P212121   monomer
 Soaked  4A   F222      monomer (more than 70% solvent) or
 dimer (more possible)
 
 I wonder if it is possible to find the ligand in the case of such low
 resolution, provided the ligand is not so small. What facts
 could probably lead to the failure of MR? Molrep gave a model of monomer
 but the rfree is as high as 0.7, while phaser could
 get no result. I tried phenix.explore_metric_symmetry to find the two
 spacegroups are not compatible, and the Rmerge of the 
 data seems reasonable. 
 One more question is: the wilson B of the data is lower than 20 from
 ccp4. Is it common for a 4A data? Since I donnot have 
 the experience of handling this low resolution data yet.  
 By the way, any suggestions about refinement methods in low resolution
 will be appreciated!
 
 Best wishes
 Yang 


[ccp4bb] New Version of the Protein Geometry Database Operational

2010-06-07 Thread Dale Tronrud
A new version of the Protein Geometry Database (PGD) has just been released.
This version includes

 - The ability to compose queries and analyze the behavior of side chain
   chi angles.

 - Structures released in the wwPDB up to April 8, 2010 consisting of
   roughly 18,000 nonredundant protein chains from crystal structures.
   That's over 1.8 million residues!

The PGD enables users to easily and flexibly query information about the
conformation alone, the backbone geometry alone, and the relationships
between them. The capabilities the PGD provides are valuable for assessing
the uniqueness of observed conformational or geometric features in protein
structure as well as discovering novel features and principles of protein
structure.  So if you observe a certain conformation or geometric feature
and wonder how unusual it is, the PGD may be able to provide the answer.

Queries can be based on amino acid type, secondary structure,
phi/psi/omega/chi angles, B factors and main chain bond lengths and
angles.  Queries for motifs of up to 10 residues in length can be made.
Once a query has been made, plots can be drawn to show the relationship
between any pair of conformational angles and/or main chain bond lengths
or angles.  In addition, the results of the query can be downloaded for
local analysis.

The PGD server is available at http://pgd.science.oregonstate.edu/

For more information please read
http://pgd.science.oregonstate.edu/static/pdf/Berkholz-PGD-2010.pdf

Happy hunting


Re: [ccp4bb] Odd loop stabilised by an cation

2010-07-02 Thread Dale Tronrud
   You can look for similar loops in other structures in the PDB using
the Protein Geometry Database (http://pgd.science.oregonstate.edu/).
The search page allows you to specify phi/psi ranges for loops up to
ten amino acids long and the Browse Results page will list out the
ID codes and residues of any matches found.

   If you need any detailed assistance using this server, I'd be
happy to help.

Dale Tronrud

On 07/02/10 05:44, Domen Zafred wrote:
 Dear all,
 
 There is an odd loop on the surface of my structure. Three back-bone
 oxygen atoms are turned in the same direction; the structure is
 stabilized by a cation and water molecules.
 Also, the ion is probably partly occupied (as discussed in the recent
 post of Ivan Xaravich). The pictures in crossed-eye stereo are in the
 attachment. Electron densities are at 1.8 and 3.5 sigma.
 I have two problems regarding this loop:
 
 Is such a loop something known or common, or is it unique? How could I
 find structure with a similar feature?
 
 Is there a smart guess for finding out the right ion? Mg is the smallest
 of all and there is still some red density. Ca on the other hand is more
 common in cells and the puzzle is whether it is a small ion or is it
 bigger, but with lover occupancy.
 
 Any suggestions, comments or answers will be greatly appreciated.
 
 Regards,
 
 Domen Zafred


Re: [ccp4bb] How to make fft-map more physically meaningful?

2010-07-08 Thread Dale Tronrud

Edward A. Berry wrote:

Hailiang Zhang wrote:

Hi there:

I found that the grid values in the map file generated by CCP4-fft
generally has a mean value of ~0, and of course there will be lots of
negative values. This apparently is not the real physics, since the
electron density has to be positive everywhere (hope I am right). Can
somebody give me any hint how to convert the fft map file which has mean
value of 0, to a more physically meaningful map which has positive
densities everywhere? (I thought about offsetting the whole map by the
minimum negative values to make everything positive, but I doubt it is
right).

Best Regards, Hailiang


Actually taking the minimum value as zero might be a good approximation,
as long as the resolution is high enough so there are gaps in the protein
too small to be solvent-filled but large enough to be resolved from
surrounding density.

Maps from FFT will always have average value zero unless you include the 
0,0,0 reflection: the transform is a sum of sin and cos terms, all of 
which have

zero value when integrated over the unit cell, except the cos(0.X) term.
So any linear combination of these terms will average to zero if it doesn't
include the zero order term.

The 0,0,0 reflection is hard or impossible to measure because it gets
mixed up with the undiffracted beam. But it is easy to calculate, because
the integral of unity against the electron density is just the average
electron density times the volume, or the total number of electrons.
So if you know the total number of electrons in the unit cell,
you can divide by the unit cell volume to get the average
electron density (OK, I guess that is obvious) and add it to the
zero-average FFT map. This assumes the map is on an absolute scale,
which won't be quite true, so your idea of offsetting the minimum
to zero may be more satisfactory.

Ed


   Ed is right, of course.  Just remember to include ALL the electrons
in the unit cell - both those of the protein and those of the solvent,
ordered and disordered.
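
   A minimal sketch of that correction (assuming the map is already on an
absolute e/A^3 scale; the electron count and cell volume below are
placeholders):

    import numpy as np

    def add_f000_offset(grid, n_electrons, cell_volume):
        """Shift a mean-zero FFT map by F000/V = (total electrons)/(unit-cell volume)."""
        return np.asarray(grid, float) + n_electrons / cell_volume

    # e.g. roughly 40000 electrons in a 100000 A^3 cell adds about 0.4 e/A^3 everywhere
    rho = add_f000_offset(np.zeros((4, 4, 4)), 40000.0, 100000.0)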

Dale Tronrud


Re: [ccp4bb] Mysterious density

2010-07-09 Thread Dale Tronrud
   Cyclized DTT can look similar to this blob.  Of course the sulfur
atoms would make one end of the blob more dense than the other.

Dale Tronrud

On 07/09/10 05:12, Nick Quade wrote:
 Dear CCP4 community,
 
 I have solved the structure of a protein in complex with DNA. But,
 inside the protein there seems to be a ligand binding pocket with some
 strong density
 (*http://picasaweb.google.de/113264696790316881054/Desktop#). *The
 protein was in Tris buffer, with some NaCl, MgCl2 and DTT and
 crystallized in Li2SO4 with MES. What could this density be? I can
 exclude MES as crystals grown with citrate buffer have the same density.
 So I guess it might be something I co-purified or perhaps some
 degradation product of the DNA? The electron density in the pictures is
 at 1.5sigma.
 
 Thanks in advance.
 
 Nick


Re: [ccp4bb] Deposition of riding H

2010-09-15 Thread Dale Tronrud
   While I am sympathetic to Ethan's and George's arguments, what
is missing in the world as it stands is a section in PDB files that
encode the parameters and rules used to generate the riding hydrogen
atoms for that particular model.  George has his favorite hydrogen
atoms to build, his favorite bond lengths for placing them (and good
arguments for his selections) and one could, I suppose, look them
up in the documentation for Shelxl, but they should be encoded in
the PDB file to allow automatic regeneration of the hydrogen atoms.

   An explicit listing of the rules for generation is particularly
needed since all these matters can, and often are, modified by the
user.  I know that in my refinements I manually move the hydrogen
from one nitrogen to the other in a couple Histidine side chains,
and have created my own rules for hydrogen generation in co-factors.

   CIF tags will have to be agreed upon (and that's always a fun
job) that would allow the description of the details of the various
hydrogen atom generation schemes that are in use, or may be used
in the future.   It would also be handy to have a reference implementation,
available under some forgiving license, that would materialize the
hydrogen atoms given the PDB header information, and would reproduce
the exact model refined, for any of the refinement programs.

   This is a worthwhile goal, but a tall order.  Until this
infrastructure is in place I think the hydrogen atoms have to be
included in the PDB file.  Otherwise it's the same as saying that
I've refined TLS ADP's but not saying what the TLS parameters were
nor listing the atoms in each TLS group.

Dale Tronrud

P.S. George: Do you think hydrogen atoms generated by the HFIX 137
command should be deposited?  They are placed based on the electron
density map with the dihedral angle of the methyl group becoming a
parameter of the model -- a parameter not recorded anywhere other
than in the hydrogen atom locations.


On 09/14/10 12:41, George M. Sheldrick wrote:
 
 Even though SHELXL refinements often involve resolutions of 1.5A or 
 better, I discourage SHELXL users from depositing their hydrogen 
 coordinates. There are three reasons:
 
 1. The C-H, N-H and O-H distances required to give the best fit to 
 the electron density are significantly shorter than those required 
 for molecular modeling and tests on non-bonded interactions (or
 located by neutron diffraction). It is ESSENTIAL to recalculate 
 the hydrogens at longer distances before using MolProbity and other 
 validation software. 
 
 2. There is considerable confusion concerning the names to be assigned
 to the hydrogens. This is not made easier by the application of a
 chirality test to -CH2- groups!
 
 3. O-H hydrogens are particularly difficult to 'see' and the geometrical
 calculation of their positions is often ambiguous. The same applies
 to the protonation states of histidines and carboxylic acids. In 
 addition such hydrogen positions are often disordered.
 
 For refinement I recommend including C-H and N-H but not O-H hydrogens.
 For very high resolution structures this reduces Rfree by 0.5-1.0% and
 clearly improves the model. At all resolutions the antibumping 
 restraints involving hydrogens are useful. 
 
 George
 
 Prof. George M. Sheldrick FRS
 Dept. Structural Chemistry,
 University of Goettingen,
 Tammannstr. 4,
 D37077 Goettingen, Germany
 Tel. +49-551-39-3021 or -3068
 Fax. +49-551-39-22582
 
 
 On Tue, 14 Sep 2010, Dr. Mark Mayer wrote:
 
 Here's one for the community, which I'll post to both Phenix and CCP4 BBs.

 Where does the crystallographic community stand on deposition of coordinates
 with riding hydrogens?
 Explicit H are required for calculating all atom clash scores with 
 Molprobity,
 and their use frequently gives better geometry (especially at low 
 resolution).
 Phenix uses explicit riding H for refinement, and outputs these in the 
 refined
 PDB. Refmac also uses riding H but does not output H coordinates.

 While depositing a series of structures refined at 1.4 - 2.75 A with Phenix
 I got the following email from the RCSB, who asked I resupply coordinates
 without H for two of the structures. Since we can't see H even at 1.4 Å I
 don't understand why an arbitrary cut off of 1.5 Å was chosen, and also why
 explicit H atoms used in refinement and geometry validation should be 
 stripped
 from the file.

 FROM RCSB

 We encourage depositors not to use hydrogens in the final PDB file for
 the low resolution structures (> 1.5 A). Please provide an updated PDB
 file. We request you to use processed PDB file as a starting point for
 making any corrections to the coordinates and/or re-refinement.
 --

 Mark



Re: [ccp4bb] Map density level

2010-09-16 Thread Dale Tronrud
   The main advantage of contouring in absolute units is consistency.
The density for a water molecule with a B factor of 20 A^2 will look
about the same even if the noise level of one map is higher than
another.  (Within limits, of course)  This means that the actual value
you contour at isn't as important as the choice of the same value all
the time.  You want to train your eye for what good density looks
like at the level you use.

   I've picked 0.36 e/A^3 (don't get me started on units!) for density
maps and 0.18 e/A^3 for difference maps as my personal values.  The
former is usually about 1 sigma and the latter about 3 sigma but
of course the sigmas float based on other factors.  I will look at
the map contoured at lower levels when looking for atoms at lower
than full occupancy, but I always start at the same place.
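
For concreteness, the conversion between an absolute contour and the equivalent
"sigma" level is just a division by the map RMS; a minimal sketch, assuming the
map is already on an absolute scale in e/A^3 and stored as a numpy array (the
synthetic grid below is only there to make the script run):

    import numpy as np

    def contour_in_sigma(map_values, absolute_level):
        # Express an absolute contour (e/A^3) as a multiple of the map RMS.
        rms = np.sqrt(np.mean((map_values - map_values.mean()) ** 2))
        return absolute_level / rms

    # Synthetic stand-in for a real map grid read from a CCP4 map file.
    rho = np.random.normal(0.0, 0.36, size=(64, 64, 64))
    print(contour_in_sigma(rho, 0.36))   # the density-map level above, in "sigma"
    print(contour_in_sigma(rho, 0.18))   # the difference-map level, in "sigma"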

   When first learning model building it is useful to leave out a
water molecule.  That way you can see what a good water molecule
looks like in your current difference map and can compare other
potential water molecules to it.

Dale Tronrud

On 09/16/10 10:13, Nathaniel Clark wrote:
 Hi,
 It can, just do
 fm-mode
 select rmsd
 
 I am curious though, I have heard that it is 'better' to build in
 units of absolute density, but I couldn't find any values.  Does any
 one have a suggestion as to what absolute electron density setting is
 'correct' for an Fo-Fc difference map?  Or do you just eyeball it?
 Nat
 
 On Thu, Sep 16, 2010 at 1:03 PM, Hailiang Zhang zhan...@umbc.edu wrote:
 Hi,

 I generated a map using FFT, and tried to display it in O. By comparing
 with coot, I found that the level in O seems to be the absolute electron
 density instead of the sigma level. I am sorry I ask a question more
 related to O: can O draw the map by a given sigma level instead of the
 absolute density, just like coot?

 Thanks!

 Best Regards, Hailiang



Re: [ccp4bb] embarrassingly simple MAD phasing question (another)

2010-10-14 Thread Dale Tronrud
   Just to throw a monkey wrench in here (and not really relevant to
the original question)...

   I've understood that, just as the real part of F(000) is the sum
of all the normal scattering in the unit cell, the imaginary part
is the sum of all the anomalous scattering.  This means that in the
presence of anomalous scattering the phase of F(000) is not zero.

   It is also the only reflection whose phase is not affected by
the choice of origin.

Dale Tronrud
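
A small sketch of that bookkeeping (scattering factors are invented; the only
point is that at h = k = l = 0 every phase factor is 1, so F(000) is the plain
sum of the complex scattering factors and the f'' terms do not cancel):

    import cmath, math

    # Hypothetical per-atom scattering factors at the working wavelength.
    atoms = [
        {"f0": 26.0, "fp": -1.1, "fpp": 3.2},   # e.g. an Fe near its edge
        {"f0":  6.0, "fp":  0.0, "fpp": 0.0},   # a carbon
        {"f0":  8.0, "fp":  0.0, "fpp": 0.0},   # an oxygen
    ]

    # Real part: all the normal scattering; imaginary part: all the f''.
    f000 = sum(complex(a["f0"] + a["fp"], a["fpp"]) for a in atoms)

    print("F(000) =", f000)
    print("phase  = %.2f degrees" % math.degrees(cmath.phase(f000)))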

On 10/13/10 22:38, James Holton wrote:
 An interesting guide to doing phasing by hand is to look at direct
 methods (I recommend Stout & Jensen's chapter on this).  In general
 there are several choices for the origin in any given space group, so
 for the first reflection you set about trying to phase you get to
 resolve the phase ambiguity arbitrarily.  In some cases, like P1, you
 can assign the origin to be anywhere in the unit cell.  So, in general,
 you do get to phase one or two reflections essentially for free, but
 after that, things get a lot more complicated.
 
 Although for x-ray diffraction F000 may appear to be mythical (like the
 sound a tree makes when it falls in the woods), it actually plays a very
 important role in other kinds of optics: the kind where the wavelength
 gets very much longer than the size of the atoms, and the scattering
 cross section gets to be very very high.  A familiar example of this is
 water or glass, which do not absorb visible light very much, but do
 scatter it very strongly.  So strongly, in fact, that the incident beam
 is rapidly replaced by the F000 reflection, which looks the same as
 the incident beam, except it lags by 180 degrees in phase, giving the
 impression that the incident beam has slowed down.  This is the origin
 of the index of refraction.
 
 It is also easy to see why the phase of F000 is zero if you just look at
 a diagram for Bragg's law.  For theta=0, there is no change in direction
 from the incident to the scattered beam, so the path from source to atom
 to direct-beam-spot is the same for every atom in the unit cell,
 including our reference electron at the origin.  Since the structure
 factor is defined as the ratio of the total wave scattered by a
 structure to that of a single electron at the origin, the phase of the
 structure factor in the case of F000 is always no change or zero.
 
 Now, of course, in reality the distance from source to pixel via an atom
 that is not on the origin will be _slightly_ longer than if you just
 went straight through the origin, but Bragg assumed that the source and
 detector were VERY far away from the crystal (relative to the
 wavelength).  This is called the far field, and it is very convenient
 to assume this for diffraction.
 
 However, looking at the near field can give you a feeling for exactly
 what a Fourier transform looks like.  That is, not just the before-
 and after- photos, but the during.  It is also a very pretty movie,
 which I have placed here:
 
 http://bl831.als.lbl.gov/~jamesh/nearBragg/near2far.html
 
 -James Holton
 MAD Scientist
 
 On 10/13/2010 7:42 PM, Jacob Keller wrote:
 So let's say I am back in the good old days before computers,
 hand-calculating the MIR phase of my first reflection--would I just
 set that phase to zero, and go from there, i.e. that wave will
 define/emanate from the origin? And why should I choose f000 over f010
 or whatever else? Since I have no access to f000 experimentally, isn't
 it strange to define its phase as 0 rather than some other reflection?

 JPK

 On Wed, Oct 13, 2010 at 7:27 PM, Lijun Liulijun@ucsf.edu  wrote:
 When talking about the reflection phase:

 While we are on embarrassingly simple questions, I have wondered for
 a long
 time what is the reference phase for reflections? I.e. a given phase
 of say
 45deg is 45deg relative to what?

 =
 Relative to a defined 0.

 Is it the centrosymmetric phases?

 =
 Yes.  It is that of F(000).

 Or a  theoretical wave from the origin?

 =
 No, it is a real one, detectable but not measurable.
 Lijun


 Jacob Keller

 - Original Message -
 From: William Scottwgsc...@chemistry.ucsc.edu
 To:CCP4BB@JISCMAIL.AC.UK
 Sent: Wednesday, October 13, 2010 3:58 PM
 Subject: [ccp4bb] Summary : [ccp4bb] embarrassingly simple MAD phasing
 question


 Thanks for the overwhelming response.  I think I probably didn't
 phrase the
 question quite right, but I pieced together an answer to the question I
 wanted to ask, which hopefully is right.


 On Oct 13, 2010, at 1:14 PM, SHEPARD William wrote:

 It is very simple, the structure factor for the anomalous scatterer is

 FA = FN + F'A + iF''A (vector addition)
 
 The vector F''A is by definition always +i (90 degrees anti-clockwise) with
 respect to the vector FN (normal scattering), and it represents the phase
 lag in the scattered wave.



 So I guess I should have started by saying I knew f'' was imaginary, the
 absorption term, and always needs to be 90 degrees in phase ahead

Re: [ccp4bb] quantum diffraction

2010-10-15 Thread Dale Tronrud
On 10/15/10 12:38, Bart Hazes wrote:
 The photon moves through the crystal in finite time and most of the time
 it keeps going without interacting with the crystal, i.e. no
 diffraction. However, if diffraction occurs it is instantaneous, or at
 least so fast as to consider it instantaneous. In some cases a
 diffracted photon diffracts another time while passing through the
 remainder of the crystal. Or in Ruppian terms, a poof-pop-poof-pop
 event. If you listen carefully you may be able to hear it.
 

   The photon both diffracts and doesn't diffract as it passes through
the crystal and it diffracts into all the directions that match the Bragg
condition.  The wave function doesn't collapse to a single outcome until
the detector measures something - which in the scheme of things occurs
long after the photon left the crystal.

   The photon also interacts with the electrons for as long as the
wave functions overlap.  You have to solve the time-dependent Schrodinger
equation to get the details.  In all the QM classes I've had they
start by writing the time-dependent equation and then immediately
erasing it - never to be mentioned again.  All the rest of the term was
spent with the time-independent equation and the approximation of the
instantaneous quantum jump.  If you assume that nothing changes with
time the only way to model changes is with discontinuities.

Dale

 Bart
 
 On 10-10-15 12:43 PM, Jacob Keller wrote:
 but yes, each photon really does interact with
 EVERY ELECTRON IN THE CRYSTAL at once.

 A minor point: the interaction is not really at once, is it? The
 photon does have to move through the crystal over a finite time.

 JPK
 


[ccp4bb] Enforcing ncs on water molecules

2010-11-15 Thread Dale Tronrud
Hi

   I'm refining my first structure with a significant amount
of ncs and am not looking forward to my usual, manual, editing
of the water model. Could someone point me in the direction of
a program that will encourage my water to obey the ncs?

   What I have in mind is to, first, find each cluster of water
molecules related by the ncs.  Then if some threshold is not
reached, say only 1/3rd of the sites are occupied for a
particular cluster, kill that cluster.  If more than some
threshold are occupied but less than 100%, fill in the missing
water molecules.

   I would also like to reset the waters in each cluster to
the average location.

   Has someone already written something along these lines?
If so, I would rather not duplicate their effort.

Thanks in advance,
Dale Tronrud
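
For what it is worth, the bookkeeping is simple enough to sketch. A minimal
outline of the scheme described above (cluster NCS-related waters, kill sparse
clusters, fill in and average the rest), assuming the operators are available
as rotation/translation pairs and the waters as a plain numpy coordinate array;
every name here is made up rather than taken from an existing package:

    import numpy as np

    def ncs_water_clusters(waters, ncs_ops, tol=1.0, keep_fraction=1.0 / 3.0):
        # waters  : (N, 3) array of water oxygen positions (orthogonal A)
        # ncs_ops : list of (R, t) pairs mapping the reference copy onto
        #           each NCS copy; the first pair should be the identity
        # Returns one entry per cluster: None if the cluster should be
        # deleted, otherwise (member indices, averaged reference position,
        # positions regenerated for every NCS copy from that average).
        waters = np.asarray(waters, dtype=float)
        used = np.zeros(len(waters), dtype=bool)
        clusters = []
        for i in range(len(waters)):
            if used[i]:
                continue
            members, ref_positions = [], []
            for R, t in ncs_ops:
                expected = R.dot(waters[i]) + t          # where the mate should sit
                dist = np.linalg.norm(waters - expected, axis=1)
                j = int(np.argmin(dist))
                if dist[j] < tol and not used[j]:
                    members.append(j)
                    used[j] = True
                    # map the mate back onto the reference copy for averaging
                    ref_positions.append(np.linalg.inv(R).dot(waters[j] - t))
                else:
                    members.append(None)                 # this copy is missing
            occupied = sum(m is not None for m in members)
            if occupied <= keep_fraction * len(ncs_ops):
                clusters.append(None)                    # too sparse: kill it
            else:
                avg = np.mean(ref_positions, axis=0)     # averaged reference site
                filled = [R.dot(avg) + t for R, t in ncs_ops]
                clusters.append((members, avg, filled))
        return clusters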


Re: [ccp4bb] Space group and R/Rfree value

2010-12-01 Thread Dale Tronrud
   It is not at all unusual for a biological homodimer to sit on a
crystallographic two-fold symmetry axis.  It is also not unusual
for such a dimer to sit entirely in the asymmetric unit.  This
cannot be used to identify the space group.

   The space group is determined by the diffraction data.  The
difference between C2221 and P212121 is that many of the reflections
predicted for P212121 have intensity equal to zero in C2221.  Since
you have a confusion between these two, I presume the P212121 model
has a pseudotranslational symmetry of (0.5,0.5,0.0).  This pseudo-
translational symmetry should be reported by xtriage, and will
mislead the twin detection tests.

   To determine which of these choices is the correct space group
you do not perform refinement, you look at the diffraction pattern
to see if there are non-zero intensities for the spots that must
be zero if the space group were C2221.  In P212121 with pseudo-C
centering these spots will be weak but observable.

   I am not surprised that your refinement in P212121 gives higher
R values than C2221.  In P212121 with pseudo-C symmetry half of
the reflections are weak and will have low signal/noise ratio.
With the assumption of C centering these weak reflections are
discarded and the R values will go down.  Your goal is not to
reduce the R values, but to fit the data.  If these reflections
have non-zero intensities you must integrate them and add them
to your refinement.

Dale Tronrud
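
A minimal sketch of that check, assuming the integrated intensities are
available as plain h, k, I arrays (toy numbers below): in a C-centred setting
the reflections with h + k odd are systematically absent, so the mean intensity
of that class relative to the h + k even class is the diagnostic.

    import numpy as np

    def c_centering_ratio(h, k, intensity):
        # Mean intensity of the h+k odd class relative to the h+k even class.
        # Near zero: consistent with true C centering.  Small but clearly
        # non-zero: suggests P212121 with a pseudo-translation of (1/2,1/2,0).
        h = np.asarray(h)
        k = np.asarray(k)
        intensity = np.asarray(intensity, dtype=float)
        odd = (h + k) % 2 == 1
        return intensity[odd].mean() / intensity[~odd].mean()

    # Toy data: the h+k odd reflections are weak but not zero.
    h = np.array([1, 2, 3, 2, 1, 3])
    k = np.array([0, 0, 2, 2, 1, 2])
    I = np.array([12.0, 900.0, 8.0, 750.0, 640.0, 15.0])
    print("odd/even intensity ratio: %.3f" % c_centering_ratio(h, k, I))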


On 12/01/10 08:31, Xiaopeng Hu wrote:
 Dear Dr. Kelly Daughtry,
 
 Thanks for your help. The enzyme I am working on now functions as a dimer 
 and the active site is located at the interface. In previously published 
 homology structures, there is one dimer in the ASU and the dimer has a 
 tight NCS.  With C2221, the dimer formed by symmetry mates fits the homology 
 dimer very well. It is hard for me to understand how an enzyme can have 
 such a crystallographic dimer.
 
 I am not good with Phenix, so I only tried xtriage to check the data set. 
 With C2221, the twin test gives a good Z score which is much smaller than the 
 critical value of 3.5, while with P212121, the Z score is high (10). I didn't go 
 further.
 
 The maps look just the same between the two space groups.
 
 - Original Message - From: Kelly Daughtry kddau...@bu.edu To: Xiaopeng Hu 
 huxp...@mail.sysu.edu.cn Sent: Thursday, December 02, 2010 12:05:08 AM Subject: Re: 
 [ccp4bb] Space group and R/Rfree value
 
 Sorry, I meant with the P212121 refinement. You mentioned that it is probably 
 twinned. Including the twin law in refinement with the P212121 data should 
 help lower your R and Rfree values.
 
 
 If you have already included the twin law in your phenix refinement for the 
 P212121 data, and R and Rfree can not be lowered, I would suggest that your 
 data probably is C2221. Also, the fact that C2221 is not twinned while 
 P212121 is twinned is an indicator to me that C2221 is probably correct as 
 well.
 
 
 I wouldn't exclude C2221 as the real space group for not having the desired 
 dimer. I have had tetrameric proteins crystallize with one mol / ASU, trimers 
 with one mol/ ASU. If you turn on symmetry mates, do you see your intended 
 dimer with the C2221 data?
 
 
 Last question/suggestion: Do the maps look the same between the two space 
 groups? I would assume that the P212121 calculated maps are somewhat worse 
 than the C2221 maps.
 
 
 With space group identity problems like these, you have to let the data tell 
 you what is the correct space group. And from the looks of it, C2221 is the 
 way to go.
 
 
 Best of luck, Kelly *** 
 Kelly Daughtry, Ph.D. Post-Doctoral Fellow, Raetz Lab Biochemistry Department 
 Duke University Alex H. Sands, Jr. Building 303 Research Drive RM 250 Durham, 
 NC 27710 P: 919-684-5178 
 ***
 
 
 
 2010/12/1 Xiaopeng Hu  huxp...@mail.sysu.edu.cn 
 
 
 1: No, the data reduction software didn't find twinning and C2221 works well, so I 
 never tried a twin law in refinement. 2: C2221 gives a monomer in the ASU.
 
 - Original Message - From: Kelly Daughtry  kddau...@bu.edu  To: Xiaopeng Hu 
  huxp...@mail.sysu.edu.cn  Sent: Wednesday, December 01, 2010 11:32:42 PM Subject: Re: 
 [ccp4bb] Space group and R/Rfree value
 
 
 
 
 Just to clarify, did you use the twin law in the phenix refinement? Also, is 
 the C2221 solution a monomer or dimer in the ASU?
 
 
 
 *** Kelly Daughtry, Ph.D. 
 Post-Doctoral Fellow, Raetz Lab Biochemistry Department Duke University Alex 
 H. Sands, Jr. Building 303 Research Drive RM 250 Durham, NC 27710 P: 
 919-684-5178 ***
 
 
 
 2010/12/1 Xiaopeng Hu  huxp...@mail.sysu.edu.cn 
 
 
 Dear all,
 
 I am working on a data-set (2.3A) and the space group problem bothers me a 
 lot. The space group of the data-set could be C2221 or P212121, since our 
 protein functions

Re: [ccp4bb] Fwd: [ccp4bb] Wyckoff positions and protein atoms

2010-12-10 Thread Dale Tronrud
   The proper occupancy for an atom on a special position depends on
how one defines the meaning of the number in that column.  In the
past, refinement programs, at least I know mine did, simply expanded
all atoms in the coordinate file by the symmetry operators to determine
the contents of the unit cell.  With that operation the occupancy of
the atoms on special positions had to be reduced.  It is certainly
true that there are 1/3 the number of atoms in the unit cell represented
by ZN D  31 than, for example, the CA of residue 50.

   Most modern refinement programs try to handle this automatically,
since users proved unreliable at detecting this condition and modifying
their coordinate files.  They use the interpretation that the site
is fully occupied but there are only 1/3 the number of these sites
than sites at general positions.

   Personally I find it disturbing to have the occupancy of B  31
set to 0.33 and that of D  31 set to 1.00 simply because of an
insignificant shift in the position of the atom.

Dale Tronrud
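
A minimal sketch of the "expand everything by symmetry" bookkeeping that leads
to the 1/n convention, using toy operators for a three-fold axis along z
(fractional coordinates throughout):

    import numpy as np

    def special_position_factor(frac_xyz, sym_ops, tol=0.01):
        # Count the symmetry operators that map a site onto itself (allowing
        # unit-cell translations).  A site on an n-fold special position
        # returns n; writing the occupancy as 1/n keeps the electron count
        # right if the file is later expanded by every operator.
        xyz = np.asarray(frac_xyz, dtype=float)
        n_self = 0
        for R, t in sym_ops:
            delta = R.dot(xyz) + t - xyz
            delta -= np.round(delta)               # remove lattice translations
            if np.all(np.abs(delta) < tol):
                n_self += 1
        return n_self

    # Toy three-fold along z (as for the insulin Zn sites discussed above).
    ops = [
        (np.eye(3), np.zeros(3)),
        (np.array([[0., -1., 0.], [1., -1., 0.], [0., 0., 1.]]), np.zeros(3)),
        (np.array([[-1., 1., 0.], [-1., 0., 0.], [0., 0., 1.]]), np.zeros(3)),
    ]
    n = special_position_factor([0.0, 0.0, 0.27], ops)
    print("multiplicity factor:", n, "-> occupancy to write:", 1.0 / n)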

On 12/10/10 13:53, Ian Tickle wrote:
 Good point Colin!  2-Zn insulin is of course a classic example of
 this, where the two independent Zn2+ ions both sit on the
 crystallographic 3-fold in R3.  It doesn't matter whether you count
 the metal ion as part of the protein or not: if I understand Gloria's
 original question correctly, all that matters is that the atom/ion is
 present in the crystal structure.
 
 In fact here are some extracts from the PDB entry (4INS):
 
 REMARK 375 ZNZN B  31  LIES ON A SPECIAL POSITION.
 REMARK 375 ZNZN D  31  LIES ON A SPECIAL POSITION.
 REMARK 375  HOH B 251  LIES ON A SPECIAL POSITION.
 REMARK 375  HOH D  44  LIES ON A SPECIAL POSITION.
 REMARK 375  HOH D 134  LIES ON A SPECIAL POSITION.
 REMARK 375  HOH D 215  LIES ON A SPECIAL POSITION.
 REMARK 375  HOH D 269  LIES ON A SPECIAL POSITION.
 
 HETATM  835 ZNZN B  31  -0.002  -0.004   7.891  0.33 10.40  ZN
 HETATM  836 ZNZN D  31   0.000   0.000  -8.039  0.33 11.00  ZN
 HETATM  885  O   HOH B 251  -0.023  -0.033  11.206  0.33 21.05   O
 etc
 
 Hmmm - but shouldn't the occupancy of the Zn be 1.00 if it's on the
 special position (assuming it's not disordered), though the first Zn
 above and the water do appear to be disordered since they're not
 actually on the special position.  Fractional occupancy always implies
 some kind of disorder: occupancy = 1/3 of an atom on a special
 position would imply occupancy disorder, i.e. it's randomly present in
 only 1/3 of the unit cells.
 
 -- Ian
 
 
 On Fri, Dec 10, 2010 at 1:11 PM, Colin Nave colin.n...@diamond.ac.uk wrote:
 Does one regard the metal atom in a metalloprotein as being part of the 
 protein?

 If so, a shared metal could occupy a special position in a dimer for example.

 In Acta Cryst. (2008). D64, 257-263 Metals in proteins: correlation between 
 the metal-ion type, coordination number and the amino-acid residues involved 
 in the coordination I. Dokmanic, M. Sikic and S. Tomic ( 
 http://scripts.iucr.org/cgi-bin/paper?S090744490706595X ) it says there are 
 25 cases of metal atoms in special positions.

 Also Acta Cryst. (2002). D58, 29-38 The 2.6 Å resolution structure of 
 Rhodobacter capsulatus bacterioferritin with metal-free dinuclear site and 
 heme iron in a crystallographic `special position' D. Cobessi, L.-S. Huang, 
 M. Ban, N. G. Pon, F. Daldal and E. A. Berry ( 
 http://scripts.iucr.org/cgi-bin/paper?S0907444901017267 ) though the 
 'special position' is justifiably in quotation marks in this example as 
 disorder is present.

 Colin



Re: [ccp4bb] Fwd: [ccp4bb] Wyckoff positions and protein atoms

2010-12-15 Thread Dale Tronrud
Dear Ian,

   I think you are putting too much importance on the numerical
instability of an atom's position when refining with full matrix
refinement.  When developing TNT's code for calculating second
derivatives I found that building into the calculation the effects
of such an atom overlapping its own, symmetry related, electron
density eliminated the instability and no constraints to special
positions were required.  I was only working with block diagonal
second derivatives with one block per atom but I don't see any
reason the proper calculation would not work with the full matrix.
The electron density of an atom near a special position is nearly
that of one far away.  It is not reasonable that a proper
calculation would blow up for one and not the other.  The key is
doing the proper calculation.

   It's true that the proper calculation of the atomic block for
an atom near a special position took more time than the calculation
for all the other atoms in the model.  You can't just calculate
generic look-up tables that apply to all atoms.  The reward of the
full calculation is that all the complications you describe disappear.
An atom that sits 0.001 A from a special position is not unstable
in the least.  It does, of course, have to have an occupancy of
1/n.  I always avoid programming tests of a == b for real numbers
because the round-off errors will always bite you at some point.
This means that a test of an atom exactly on a special position
can't be done reliably in floating point math.

   Your preferred assumption is that any atom near enough to
a special position is really on the special position and should
have an occupancy of one.  My assumption is that no atom is ever
EXACTLY on the special position and if they are close enough to
their symmetry image to forbid coexistence the occupancy should
be 1/n.  I think either assumption is reasonable but, of course,
prefer mine for what I consider practical reasons.  It helps that
I have the code to make mine work.

Dale Tronrud

On 12/15/10 08:54, Ian Tickle wrote:
 Hi Herman
 
 What makes an atom on a special position is that it is literally ON
 the s.p.: it can't be 'almost on' the s.p. because then if you tried
 to refine the co-ordinates perpendicular to the axis you would find
 that the matrix would be singular or at least so badly conditioned
 that the solution would be poorly defined.  The only solution to that
 problem is to constrain (i.e. fix) these co-ordinates to be exactly on
 the axis and not attempt to refine them.  The data are telling you
 that you have insufficient resolution so you are not justified in
 placing the atom very close to the axis; the best you can do is place
 the atom with unit occupancy exactly _on_ the axis.  It's only once
 the atom is a 'significant' distance (i.e. relative to the resolution)
 away from the axis that these co-ordinates can be independently
 refined.  Then the data are telling you that the atom is disordered.
 If you collected higher resolution data you might well be able to
 detect and successfully refine disordered atoms closer to the axis than
 with low resolution data.  So it has nothing to do with the programmer
 setting an arbitrary threshold.  This would have to be some
 complicated function of atom type, occupancy, B factor, resolution,
 data quality etc to work properly anyway so I doubt that it would be
 feasible.  Instead it's determined completely by what the data are
 capable of telling you about the structure, as indeed it should be.
 
 My main concern was the conflict between some program implementations
 and the PDB and mmCIF format descriptions on this issue.  For example
 the PDB documentation says that the ATOM record contains the occupancy
 (where this is defined in the CIF/mmCIF documentation).  If it had
 intended that it should contain multiplicity*occupancy instead then
 presumably it would have said so.
 
 Cheers
 
 -- Ian
 
 On Wed, Dec 15, 2010 at 4:01 PM,  herman.schreu...@sanofi-aventis.com wrote:
 Dear Ian,

 In my view, the confusion arises by NOT including the multiplicity into the 
 occupancy. If we make the gedanken experiment and look at a range of crystal 
 structures with a rotationally disordered water molecule near a symmetry 
 axis (they do exist!) then as long as the water molecule is sufficiently far 
 from the axis, it is clear that the occupancy should be 1/2 or 1/3 or 
 whatever is the multiplicity. However, as the molecule approaches the axis 
 at a certain moment at a certain treshold set by the programmer of the 
 refinement program, the molecule suddenly becomes special and the occupancy 
 is set to 1.0. So depending on rounding errors, different thresholds etc. 
 different programs may make different decisions on whether a water is 
 special or not.

 For me, this is confusing.

 Best regards,
 Herman

 -Original Message-
 From: CCP4 bulletin board [mailto:ccp...@jiscmail.ac.uk] On Behalf Of Ian 
 Tickle
 Sent: Wednesday, December 15, 2010 3

Re: [ccp4bb] Fwd: [ccp4bb] Wyckoff positions and protein atoms

2010-12-16 Thread Dale Tronrud
On 12/16/10 03:06, Ian Tickle wrote:
 Dale
 
 The reward of the
 full calculation is that all the complications you describe disappear.
 An atom that sits 0.001 A from a special position is not unstable
 in the least.
 
 That's indeed a very interesting observation, I have to admit that I
 didn't think that would be achievable.  But there must still be some
 threshold of distance at which even that fails?  Presumably within
 rounding error?  Or are you saying (I assume you aren't!) that you can
 even refine all co-ordinates of an atom exactly on a special position?
  Say the x and z co-ordinates of an atom at (0,y,0) in monoclinic?
 Presumably the atom would have to be given a random push one way or
 the other (random number generators are generally not a feature of
 crystallographic refinement programs, with the obvious exception of
 simulated annealing!)?
 

   To be frank, I wrote this code about 15 years ago, it works, and
I've not given any thought to atoms on special positions since.  I'll
have to go back to my notes and code to dig up the exact method.
Anyone with a copy of TNT can look up the code.  I am, however, not
in the least concerned about what happens when an atom falls exactly
on a special position because I just don't think that any part of a
protein model can be considered exact.  If I have a model with two
atoms, of occ=1/2 each, sitting 0.0001 A apart - it fits the density
and I think everyone knows what that model means, or at least they
should. If you decide to shove them each 0.5 A and call them a
single atom with occ=1, your model will fit the density just as well
and I have no problem with that either.

   By the way, the refinement issue has nothing to do with special
positions.  The instability you observe occurs any time you build two
atoms into the same bit of density.  If your model has two atoms, at
a general position, with exactly the same coordinates the Normal
matrix will have a singularity.  The problem doesn't come up much
because we normally choose not to build such models.  It can be an
issue in models with disorder where different conformations interpenetrate
each other but the stereochemical restraints usually come to the
rescue then.

 ?  I always avoid programing tests of a == b for real numbers
 because the round-off errors will always bite you at some point.
 This means that a test of an atom exactly on a special position
 can't be done reliably in floating point math.
 
 Obviously common sense has to be applied here and tests for strict
 floating-point equality studiously avoided.  But this is very easily
 remedied, my optimisation programs are full of tests like
 IF(ABS(X-Y).LT.1E-6) THEN ... and I'm certain so are yours (assuming
 of course you still program in Fortran!).  This implies that in the
 case that an atom is off-axis and disordered you have to take care not
 to place it within say a few multiples of rounding error of the axis,
 since then it might be indeed be confused with one 'on' the special
 position.  However if someone claims that an atom sits within say
 10*rounding error of an axis as distinct from being on the axis, then
 a) there's no way that can be proved, and b) it would be
 indistinguishable from being on the s.p. and the difference in terms
 of structure factors and maps would be insignificant anyway, so it may
 as well be on-axis.

   If the difference is insignificant, it may as well be off-axis.  I
guess if the difference is insignificant it just comes down to personal
preferences.

Dale Tronrud

 
 I think this is how the Oxford CRYSTALS software (
 http://www.xtl.ox.ac.uk/crystals.html ), which has been around for at
 least 30 years, deals with this issue, so I can't accept that it can't
 be made to work, even if I haven't got all the precise details
 straight of how it's done in practice.
 
   Your preferred assumption is that any atom near enough to
 a special position is really on the special position and should
 have an occupancy of one.  My assumption is that no atom is ever
 EXACTLY on the special position and if they are close enough to
 their symmetry image to forbid coexistence the occupancy should
 be 1/n.  I think either assumption is reasonable but, of course,
 prefer mine for what I consider practical reasons.  It helps that
 I have the code to make mine work.
 
 Whichever way it's done is only a matter of convention (clearly both
 ways work just as well), however I would reiterate that my main
 concern here is that convention and practice appear to have parted
 company in this particular instance!
 
 Cheers
 
 -- Ian


Re: [ccp4bb] Fwd: [ccp4bb] Wyckoff positions and protein atoms

2010-12-16 Thread Dale Tronrud
On 12/16/10 06:47, Ian Tickle wrote:
 
 For the sake of argument let's say that 0.02 Ang is too big to be
 rounding error.  So if you see that big a shift then the intention of
 the refinement program (or rather the programmer) which allowed such a
 value to be appear in the output should be that it's real.  If the
 intention of the user was that the atom is actually on axis then the
 program should not have allowed such a large shift, since it will be
 interpreted as 'much bigger than rounding error' and therefore
 'significantly off-axis'. 

   I would certainly hope that no one believes that the precision of
the parameters in a PDB file is significant to the level of round-off
error!  It's bad enough that a small number of people take the three
decimal points of precision in the PDB file seriously.  When a person
places an atom in a model they aren't stating a belief that that is
the EXACT location of the atom, only that they believe the center of
the locus of all equivalent atoms in the crystal falls near that spot.
If it's 0.02 A from a special position (and the SU of the position is
larger than that) then it might be on the special position and it might
not.

   If I come across one of your models and you have an atom exactly on
a special position (assuming you're able to do that with three decimal
points in a PDB file) I'd still assume that you only intend that there
is an atom in the vicinity of that point and it might be exactly on
the axis but it might be a little off.  All structural models are fuzzy.

Dale Tronrud


Re: [ccp4bb] Noisy difference maps with high solvent content?

2011-01-28 Thread Dale Tronrud
Hi,

   This sort of problem can occur if you are missing your lowest resolution
data and/or your model for the bulk solvent is inappropriate.  You might
want to double check these issues.

   With 80% solvent you have to be careful when choosing your contour level.
If you are a fan of normalized maps and contouring based on sigma (and it's
not a sigma by the way) you should be aware that the normalization factor is
calculated over the protein and all that empty space and will be smaller than
one calculated for equivalent protein density in a low solvent crystal.
Plus/minus 3 contours will be lower and the significance of features will
be inflated.
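
A toy numerical illustration of that point (entirely synthetic numbers): if the
"protein" part of a map has some fixed spread of density values and the solvent
region is nearly featureless, then the RMS taken over the whole cell shrinks as
the solvent fraction grows, so the same "n sigma" contour sits at a lower
absolute density.

    import numpy as np

    rng = np.random.default_rng(0)

    def whole_cell_rms(solvent_fraction, n_points=100000):
        # Toy map: unit-spread density over the protein region, an almost
        # flat solvent region; the normalization is taken over everything.
        n_solv = int(n_points * solvent_fraction)
        protein = rng.normal(0.0, 1.0, n_points - n_solv)
        solvent = rng.normal(0.0, 0.05, n_solv)
        rho = np.concatenate([protein, solvent])
        return np.sqrt(np.mean(rho * rho))

    for sf in (0.4, 0.8):
        print("solvent %.0f%%: 1 'sigma' = %.2f (arbitrary density units)"
              % (100 * sf, whole_cell_rms(sf)))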

   One way to calibrate a contour level would be to leave out a known good
bit of model and calculate your difference map.  Then select a contour level
that shows the understood omission well.  Other peaks that show up at that
level are errors as significant as the one you created.

Dale Tronrud

On 01/28/11 12:29, Todd Geders wrote:
 Greetings CCP4bb,
 
 Short version:
 
 Very noisy difference maps from a crystal with extremely high solvent
 content, seeking advice on how best to handle such high solvent content
 to eliminate noise in difference maps.
 http://strayxray.com/images/coot.jpg
 
 Long version:
 
 I'm having trouble with a 3.0Å dataset from a crystal with 80% solvent
 content.  The space group is P4132 and I'm quite confident the high
 solvent content is real (there is a species-specific set of helices
 extending into the solvent channels that appears to prevent tighter
 packing).
 
 I was able to get a MR solution using a structurally related enzyme, but
 the difference maps are terribly noisy (see link).  There is a lot of
 negative density in empty spaces between well-defined 2Fo-Fc electron
 density.  
 
 http://strayxray.com/images/coot.jpg
 
 The 2Fo-Fc density actually looks fairly good.  The initial MR maps had
 clear density correlating to the sequence differences between the MR
 model and the crystallized protein.  After fixing the model as best I
 could, the refinement statistics are R/Rfree of 27.5/30.3 with a
 data/parameters ratio of 1.7.
 
 The mosaicity ranges from 0.15-0.3, data were collected with 0.5°
 oscillations and 180° of data were collected.  
 
 http://strayxray.com/images/diffraction.jpg
 
 Since the crystals appeared to suffer from radiation decay (based on
 scaling statistics), I only use the first 40° of data (which still gives
 around 8-fold redundancy).  Using more minimal wedges of data or more
 data does not noticeably make better or noisier maps.
 
 Any advice on improving the maps?  Could the noisy maps be due to the
 extraordinarily high solvent content?
 
 I'd appreciate any advice or comments.
 
 ~Todd Geders


[ccp4bb] Ken Olsen, Founder of Digital Equipment Corporation, Died Sunday

2011-02-08 Thread Dale Tronrud

   I see in the news that Ken Olsen has died.  Although he was
not a crystallographer I think we should stop for a moment to
remember the profound impact the company that this man founded
had on our field.

   My first experience in a crystallography lab was as an undergraduate
in M. Sundaralingam's lab in Madison Wisconsin.  While I never had
the opportunity to use them, his two diffractometers were controlled
by the ubiquitous PDP-8 computers.  I had more experience with his
main computer, which was either a PDP-11/34 or 35 (Ethan help me out!).
This was connected to a Vector General graphics display running software
called UWVG.  Having the least stature in the lab I got the midnight
to 4am time slot for model building.  The computer took about 10
minutes to compute and contour each block of map, covering about
three residues.  While waiting I would crawl under the DECwriter and
nap.  The computer would stop rattling when the map was up and that
would wake me.

   When I joined the Matthews lab in Oregon they had a VAX 11/780.
What an amazing machine!  It had 1 MB of RAM and could run a million
instructions in a second.  It only took 48 hours to run a cycle of
refinement with PROLSQ, that is, if no one else used the computer.
These specs don't sound like much but this computer was really a
revolution for computational crystallography.  That a single lab
could own a computer of such power was unheard of before this.
It wasn't just that the computer had so much RAM (We later got it
up to its max of 4 MB.) but the advent of virtual memory made
program design so much easier.  You could simply define an array
of 100,000 elements and not have to worry about finding where in
memory, mixed in with the operating system, utility programs, and
other users' software, you could find an unused block that big.

   Digital didn't invent virtual memory, but the VAX made it
achievable for regular crystallographers.  Through most of the 1980's
you didn't have to worry about getting your code to run on other
computers - Everyone had access to a VAX.

   In the 1990's DEC came out with the alpha CPU chip which really
broke ground for performance.  These things screamed when it came
to running crystallographic software.  In 1999 the lab bought
several of the 666 MHz models.  It was about four years before
Intel came out with a chip that would match these alphas on my
crystallography benchmark and they had to be clocked at over 2 GHz
to do it.

   Yes, Digital lost out in the competition of the marketplace, and
Ken Olsen was pushed out of the company well before the end.  But
what a ride it was.  What great computers they were and what great
science was done on them!

Dale Tronrud


Re: [ccp4bb] what to do with disordered side chains

2011-03-30 Thread Dale Tronrud
   Standardization is great!  That is why the way we describe positions,
occupancy, and B factors has already been standardized. The core of this
discussion is that some people want to use these parameters to describe
details other than position, occupancy and motion.  Since all the parameters
on ATOM/HETATM records are already defined with great specificity, if
you want the model to contain additional information you will have to
define new parameters, or some way to specify the information you want
to include using other, existing, records more adequate to the task
(e.g. SIGATM).

Dale Tronrud

On 03/30/11 11:32, Frank von Delft wrote:
 
 I'm amazed at the pedestal people put their precious coordinates on --
 isn't the first thing you learn about MX that our models are rubbish
 parametrizations of the actual content of the crystal?  And thus they
 will remain as long as we have the R-factor gap, and no amount of
 coordinate-sigmas or dark-density will change that.
 
 What we *are* trying to do is communicate something, and the bedrock of
 communication is /convention/ - also known as standardization.  What
 is science other than one large standardization exercise?   So yes,
 standardization is *exactly* what is needed:  when the same phenomenon
 is described in so many different ways by different people, what that
 indicates is not that they have different opinions, it indicates only
 that everybody has to second-guess what their audience will understand. 
 But once we've laid down a convention, the guessing stops and both
 speaker and listener know what the hell is being said.
 
 phx.
 
 
 
 
 
 On 30/03/2011 19:04, James Holton wrote:

 I'm afraid this is not a problem that can be solved by
 standardization. 

 Fundamentally, if you are a scientist who has collected some data (be
 it diffraction spot intensities, cell counts, or substrate
 concentration vs time), and you have built a model to explain that
 data (be it a constellation of atoms in a unit cell, exponential
 population growth, or a microscopic reaction mechanism), I think it is
 generally expected that your model explain the data to within
 experimental error.  Unfortunately, this is never the case in
 macromolecular crystallography, where the model-data disagreement
 (Fobs-Fcalc) is ~4-5x bigger than the error bars (sigma(F)).

 Now, there is nothing shameful about an incomplete model, especially
 when thousands of very intelligent people working over half a century
 have not been able to come up with a better way to build one.  In
 fact, perhaps a better name for the disordered side chain problem
 would be dark density?  This name would place it properly amongst
 dark matter, dark energy and other fudge factors introduced to try
 and explain why our standard model is not consistent with
 observation?  That is, dark density is the stuff we can't see, but
 nonetheless must be there somewhere.

 Whatever it is, I personally do hold a vain belief that perhaps
 someday soon the problem of dark density will be solved, and that
 presently instituting a policy requiring that all macromolecular
 models from this day forward remain at least as incomplete as
 yesterday's models is not a very good idea.  I say: if you think there
 is something there then you should build it in, especially if it is
 important to the conclusions you are trying to make.  You can defend
 your model the same way you would defend any other scientific model:
 by using established statistics to show that it agrees with the data
 better than an alternative model (like leaving it out).  It is YOUR
 model, after all!  Only you are responsible for how right it is.

 I do appreciate that students and other novices may have a harder time
 defining surfaces and measuring hydrogen bond lengths in these pesky
 floppy regions, but perhaps their education would be served better
 by learning the truth sooner than later?

 -James Holton
 MAD Scientist


 On 3/30/2011 9:26 AM, Filip Van Petegem wrote:
 Hello Mark,

 I absolutely agree with this.  The worst thing is when everybody is
 following their own personal rules, and there are no major guidelines
 for end-users to figure out how to interpret those parts.  I assume
 there are no absolute guidelines simply because there isn't any
 consensus among crystallographers... (from what we can gather from
 this set of emails...). On the other hand, this discussion has flared
 up many times in the past, and maybe it's time for a powerful
 dictator at the PDB to create the law...

 Filip Van Petegem



 On Wed, Mar 30, 2011 at 8:37 AM, Mark J van Raaij
 mjvanra...@cnb.csic.es mailto:mjvanra...@cnb.csic.es wrote:

 perhaps the IUCr and/or PDB (Gerard K?) should issue some
 guidelines along these lines?
 And oblige us all to follow them?
 Mark J van Raaij
 Laboratorio M-4
 Dpto de Estructura de Macromoleculas
 Centro Nacional de Biotecnologia - CSIC
 c/Darwin 3, Campus Cantoblanco
 E-28049 Madrid, Spain

Re: [ccp4bb] The meaning of B-factor, was Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Dale Tronrud

   While what you say here is quite true and is useful for us to
remember, your list is quite short.  I can add another

3) The systematic error introduced by assuming full occupancy for all sites.

There are, of course, many other factors that we don't account for
that our refinement programs tend to dump into the B factors.

   The definition of that number in the PDB file, as listed in the mmCIF
dictionary, only includes your first factor --

http://mmcif.rcsb.org/dictionaries/mmcif_std.dic/Items/_atom_site.B_iso_or_equiv.html

and these numbers are routinely interpreted as though that definition is
the law.  Certainly the whole basis of TLS refinement is that the B factors
are really Atomic Displacement Parameters.   In addition the stereochemical
restraints on B factors are derived from the assumption that these parameters
are ADPs.  Convoluting all these other factors with the ADPs causes serious
problems for those who analyze B factors as measures of motion.

   The fact that current refinement programs mix all these factors with the
ADP for an atom to produce a vaguely defined B factor should be considered
a flaw to be corrected and not an opportunity to pile even more factors into
this field in the PDB file.

Dale Tronrud


On 3/31/2011 9:06 AM, Zbyszek Otwinowski wrote:

The B-factor in crystallography represents the convolution (sum) of two
types of uncertainties about the atom (electron cloud) position:

1) dispersion of atom positions in crystal lattice
2) uncertainty of the experimenter's knowledge  about the atom position.

In general, uncertainty need not be described by a Gaussian function.
However, communicating uncertainty using the second moment of its
distribution is a widely accepted practice, with frequently implied
meaning that it corresponds to a Gaussian probability function. B-factor
is simply a scaled (by 8 times pi squared) second moment of uncertainty
distribution.
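
Written out, that scaling is B = 8 * pi^2 * <u^2>, so the r.m.s. displacement
implied by a given isotropic B is sqrt(B / (8 * pi^2)); a small helper makes
the numbers concrete:

    import math

    def b_to_rms_displacement(b_factor):
        # Convert an isotropic B (A^2) to the r.m.s. displacement (A),
        # using B = 8 * pi^2 * <u^2>.
        return math.sqrt(b_factor / (8.0 * math.pi ** 2))

    for b in (20.0, 60.0, 120.0):
        print("B = %5.1f A^2  ->  rms displacement = %.2f A"
              % (b, b_to_rms_displacement(b)))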

In the previous, long thread, confusion was generated by the additional
assumption that B-factor also corresponds to a Gaussian probability
distribution and not just to a second moment of any probability
distribution. Crystallographic literature often implies the Gaussian
shape, so there is some justification for such an interpretation, where
the more complex probability distribution is represented by the sum of
displaced Gaussians, where the area under each Gaussian component
corresponds to the occupancy of an alternative conformation.

For data with a typical resolution for macromolecular crystallography,
such multi-Gaussian description of the atom position's uncertainty is not
practical, as it would lead to instability in the refinement and/or
overfitting. Due to this, a simplified description of the atom's position
uncertainty by just the second moment of probability distribution is the
right approach. For this reason, the PDB format is highly suitable for the
description of positional uncertainties,  the only difference with other
fields being the unusual form of squaring and then scaling up the standard
uncertainty. As this calculation can be easily inverted, there is no loss
of information. However, in teaching one should probably stress more this
unusual form of presenting the standard deviation.

A separate issue is the use of restraints on B-factor values, a subject
that probably needs a longer discussion.

With respect to the previous thread, representing poorly-ordered (so
called 'disordered') side chains by the most likely conformer with
appropriately high B-factors is fully justifiable, and currently is
probably the best solution to a difficult problem.

Zbyszek Otwinowski




- they all know what B is and how to look for regions of high B
(with, say, pymol) and they know not to make firm conclusions about
H-bonds to flaming red side chains.


But this knowledge may be quite wrong.  If the flaming red really indicates
large vibrational motion then yes, one would not bet on stable H-bonds.
But if the flaming red indicates that a well-ordered sidechain was incorrectly
modeled at full occupancy when in fact it is only present at half-occupancy
then no, the H-bond could be strong but only present in that half-occupancy
conformation.  One presumes that the other half-occupancy location (perhaps
missing from the model) would have its own H-bonding network.



I beg to differ.  If a side chain has 2 or more positions, one should be a
bit careful about making firm conclusions based on only one of those, even
if it isn't clear exactly why one should use caution.  Also, isn't the
isotropic B we fit at medium resolution more of a spherical cow
approximation to physical reality anyway?

   Phoebe






Zbyszek Otwinowski
UT Southwestern Medical Center at Dallas
5323 Harry Hines Blvd.
Dallas, TX 75390-8816
Tel. 214-645-6385
Fax. 214-645-6353


Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Dale Tronrud

On 3/31/2011 10:14 AM, Jacob Keller wrote:
 What do we gain? As Dale pointed out, we are already abusing occupancy or B-factor, or deleting the side chain, to compensate 
for our inability to tell the user that the side chain is disordered. With your proposal, we would fudge both occupancy and 
B-factor, which in my eyes is even worse than fudging just one of the two.



 We gain clarity to the non-crystallographer user: a b-factor of 278.9
 sounds like possibly something real. A b-factor of exactly 1000 does
 not. Both probably have the same believability, viz., ~zero. Also,
 setting occupancy = zero is not fudging but rather respectfully
 declining to comment based on lack of data. I think it is exactly the
 same as omitting residues one cannot see in the density.


   These things are never clear unless there is a solid definition of
the terms you are using.  I don't think you can come up with an out of
band value for the B factor that doesn't have a legitimate meaning as
an atomic displacement parameter for someone.  How large a B factor you
can meaningfully define depends on your lower resolution limit.  People
working with electron microscopy or small angle X-ray scattering could
easily build models with ADPs far larger than anything we normally
encounter.

   In addition, you can't define 1000 as a magic value since the PDB
format will only allow values up to 999.99, and I presume maintaining
the PDB format is one of your goals.  Of course, you could choose -99.99
as the magic value but that would break all of our existing software
and I presume you don't want that either.  Actually defining any value
for the B factor as the magic value would break all of our software.
The only advantage of a large, positive, number is that it would create
bugs that are more subtle.

   The fundamental problem with your solution is that you are trying to
cram two pieces of information into a single number.  Such density always
causes problems.  Each concept needs its own value.

   You could implement your solution easily in mmCIF.  Just create a new
tag, say _atom_site.imaginary_site, which is either true or false
for every atom.  Then everyone would be able to either filter out the fake
atoms or leave them in, without ambiguity or confusion.
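
A hypothetical fragment of what that might look like (the flag tag is the one
suggested above and is not part of the existing mmCIF dictionary; the other
items are standard _atom_site names):

    loop_
    _atom_site.group_PDB
    _atom_site.id
    _atom_site.label_atom_id
    _atom_site.label_comp_id
    _atom_site.occupancy
    _atom_site.B_iso_or_equiv
    _atom_site.imaginary_site    # hypothetical: true for atoms placed only to complete the residue
    ATOM 101 CB LYS 1.00 35.2 no
    ATOM 102 CG LYS 1.00 48.7 no
    ATOM 103 CD LYS 1.00 62.1 yes
    ATOM 104 CE LYS 1.00 70.4 yes
    ATOM 105 NZ LYS 1.00 75.0 yes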

   If you object that the naive user of structural models wouldn't know
to check this tag - they aren't going to know about your magic B factor
either.  You can't out-think someone who's not paying attention.  At
some point you have to assume that people being paid to perform research
will learn the basics of the data they are using, even if you know that
assumption is not 100% true.

Dale Tronrud


Re: [ccp4bb] what to do with disordered side chains

2011-03-31 Thread Dale Tronrud
 and
reads it back in at the start of the next cycle.  There can be no
difference between the meaning of the parameters in memory and on disk.)

   A great deal of what we do with our models depends on the details of
the definitions of these parameters.  Adding extra meanings and special
cases causes all sorts of problems at all levels of use.




either.  You can't out-think someone who's not paying attention.  At
some point you have to assume that people being paid to perform research
will learn the basics of the data they are using, even if you know that
assumption is not 100% true.


Well, the problem is not *should* but *do*. Should we print bilingual
danger signs in the US? Shouldn't we assume that people know English?
But there is danger, and we care about sparing lives. Here too, if we
care about the truth being abused or missed, it seems we should go out
of our way.


   I've not advocated doing nothing.  I've advocated that the solution
we choose should be clearly defined and that definition be consistent
with past definitions (as much as possible) and consistent with the
principles of data structures created by the people who study such things.

   We *should* go out of our way to make a solution to this common problem.
The solution we choose should be one that actually solves the problem
and not simply creates more confusion.

Dale Tronrud

P.S. I just Googled occupancy zero.  The top hit is a letter from
Bernhard Rupp recommending that occupancies not be set to zero :-)


Re: [ccp4bb] what to do with disordered side chains

2011-04-03 Thread Dale Tronrud

   Clearly there are strong feelings held by the advocates of the
several solutions to the problem of what to do about atoms that
cannot be reliably placed based on the electron density map.  I
certainly understand since I passionately support my own favorite
solution.

   Why is it that a community of generally reasonable people keep
coming back to this same issue and yet fail to find a solution
that can reach some kind of consensus?

   My 2 cents on this, more fundamental, issue:

   A model created by someone who believes that all atoms (for a
residue with any atoms) must be built will contain two kinds of
atoms.  Those placed based on the appearance of the electron
density and those placed in some convenient location simply to
fill out the atom count.  I think most everyone agrees that a
full residue is a convenience for some users of our models.  What
those of us who favor partial models want is an absolutely clear
distinction between the two classes of atoms.

   All this needs is a bit.  Literally, one bit of data that flags
those atoms added to the model simply to complete the set.

   Why can't we come to a solution that satisfies?  Because we
continue to use a non-extensible file format that does not allow
us a place to put this bit.

   Some people want to put the bit in the occupancy column by
defining a special value (occ=0) that would be the flag.  Some
people want to put it in the B factor column by defining a special
value there (a couple possibilities here, B=1000.00, B=500.00,
B varying but larger than that of any atom built into density).

   The B factor and occupancy columns in the PDB file have been
precisely defined back when the mmCIF dictionary was created and
to change their definitions now would require opening that process
again.  I am pretty sure that committee in charge will never allow
a definition for these items that includes the phrase ... except when
the value is equal to ...  You can't run a database that way.

   Each piece of information has to have its own tag and definition.
Once it is defined we can embrace the task of educating software
developers and our collaborators who use our models in its meaning.

   There is just no place to put this bit in a PDB format file.
mmCIF - it's trivial.  PDB format - no way.  As long as we insist that
this format is the preferred means of distributing our models we
will continue to return to this argument again and again with no
possibility of coming to a solution.

Dale Tronrud

P.S. I've even thought about using the model of the REMARK statement,
where all sorts of information have been added by the hack of
standardized remarks.  I thought that one could create a
standardized footnote that would mark the atoms as imaginary.
I found that, unfortunately, footnotes were removed from the PDB
format many years ago.



On 4/3/2011 11:01 AM, Boaz Shaanan wrote:

The original posting that started this thread referred to side-chains, as the 
subject still suggests. Do you propose to omit only side-chain atoms, in which 
case you end up with different residues, as pointed out by quite a few 
people, or do you suggest also to omit the main-chain atoms of the problematic 
residues ?

Besides, as mentioned by Phoebe and others, many users (non-crystallographers) 
of PDB's know already  the meaning of the B-factor and will know how to 
interpret a very high B. It is our task (the crystallographers) to enlighten 
those who don't know what the B column in a PDB entry stands for. I certainly 
do and I'm sure many of us do so too. I voted for high B and would vote for it 
again, if asked.

 Cheers,

Boaz


Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

Phone: 972-8-647-2220  Skype: boaz.shaanan
Fax:   972-8-647-2992 or 972-8-646-1710




From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] On Behalf Of Bernhard Rupp 
(Hofkristallrat a.D.) [hofkristall...@gmail.com]
Sent: Sunday, April 03, 2011 7:42 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] what to do with disordered side chains

Thus my feeling is that if one does NOT see the coords in the electron
density, they should NOT be included, and let someone else try to model
them in, but they should be aware that they are modeling them.
Joel L. Sussman

Concur.  BMC p 680 ‘How to handle missing parts’

Best wishes, BR

On 3 Apr 2011, at 06:15, Frances C. Bernstein wrote:

Doing something sensible in the major software packages, both
for graphics and for other analysis of the structure, could
solve the problem for most users.

But nobody knows what other software is out there being used by
individuals or small groups.  And the more remote the authors
of that software are from protein structure solution the more
likely it is that they have not/will not properly handle atoms
with zero occupancy or high B values, for example.

I am absolutely positive that there is software

Re: [ccp4bb] what to do with disordered side chains

2011-04-04 Thread Dale Tronrud

   The definition of _atom_site.occupancy is:

 The fraction of the atom type present at this site.
  The sum of the occupancies of all the atom types at this site
  may not significantly exceed 1.0 unless it is a dummy site.

When an atom has an occupancy equal to zero that means that the
atom is NEVER present at that site - and that is not what you
intend to say.  Setting the occupancy to zero does not mean that
a full atom is located somewhere in this area.  Quite the opposite.

   (The reference to a dummy site is interesting and implies to
me that mmCIF already has the mechanism you wish for.)

   Having some experience with refining low-occupancy atoms and
working with dummy marker atoms, I'm quite confident that you can
never define a B factor cutoff that would work.  No matter what
value you choose you will find some atoms in density that refine
to values greater than the cutoff, or the limit you choose is so
high that you will find marker atoms that refine to less than the
limit.  A B factor cutoff cannot work - no matter the value you
choose you will always be plagued with false positives or false
negatives.

   If you really want to stuff this bit into one of these fields
you have to go all out.  Set the occupancy of a marker atom to -99.99.
This will unambiguously mark the atom as an imaginary one.  This
will, of course, break every program that reads PDB format files,
but that is what should happen in any case.  If you change the
definition of the columns in the file you must mandate that all
programs be upgraded to recognize the new definitions.  I don't
know how you can do that other than ensuring that the change will
cause programs to cough.  To try to slide it by with a magic value
that will be silently accepted by existing programs is to beg for
bugs and subtle side-effects.

   Good luck getting the maintainers of the mmCIF standard to accept
a magic value in either of these fields.

   How about this: We already have the keywords ATOM and HETATM
(and don't ask me why we have two).  How about we create a new
record in the PDB format, say IMGATM, that would have all the
fields of an ATOM record but would be recognized as whatever the
marker is for dummy atoms in the current mmCIF?  Existing programs
would completely ignore these atoms, as they should until they are
modified to do something reasonable with them.  Those of us who
have no use for them can either use a switch in the program to
ignore them or just grep them out of the file.  Someone could write
a program that would take a model with only ATOM and HETATM records
and fill out all the desired IMGATM records (Let's call that program
WASNIAHC; everyone would remember that!).
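
   As a sketch of how cheap that filtering would be (IMGATM being only
the hypothetical record type proposed above), stripping the imaginary
atoms is a one-liner with grep -v '^IMGATM', or a few lines of Python:

      # Drop the hypothetical IMGATM records from a PDB-format file,
      # keeping only the atoms that were actually placed in density.
      # File names are placeholders.
      with open("deposited.pdb") as fin, open("density_only.pdb", "w") as fout:
          for line in fin:
              if not line.startswith("IMGATM"):
                  fout.write(line)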

   This solution is unambiguous.  It can be represented in current
mmCIF, I think.  The PDB could run WASNIAHC itself after deposition
but before acceptance by the depositor, so people like me would not
have to deal with these atoms during refinement yet would still see
them before our precious works of art are unleashed on the world.

   Seems like a win-win solution to me.

Dale Tronrud


On 4/3/2011 9:17 PM, Jacob Keller wrote:

Well, what about getting the default settings on the major molecular
viewers to hide atoms with either occ=0 or B > cutoff (novice mode?)?
While the B cutoff would still be tricky, I assume we could eventually
come to a consensus on some reasonable cutoff (2 sigma from the mean?),
and then this approach would allow each free-spirited crystallographer
to keep his own preferred method of dealing with these troublesome
side chains, and nary a novice would be led astray.

JPK
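
   A minimal sketch of such a filter, assuming the standard PDB fixed
columns (occupancy in columns 55-60, B factor in columns 61-66 of
ATOM/HETATM records) and an arbitrary example cutoff - picking that
cutoff being exactly the sticking point in this thread:

      # List atoms a "novice mode" might hide, using the occ=0 and
      # high-B conventions discussed above; 80.0 is only an example.
      B_CUTOFF = 80.0
      with open("model.pdb") as fh:
          for line in fh:
              if line.startswith(("ATOM", "HETATM")):
                  occ = float(line[54:60])
                  bfac = float(line[60:66])
                  if occ == 0.0 or bfac > B_CUTOFF:
                      print("would hide:", line[12:27].strip(), occ, bfac)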

On Sun, Apr 3, 2011 at 2:58 PM, Eric Bennett er...@pobox.com wrote:

Most non-structural users are familiar with the sequence of the proteins they 
are studying, and most software does at least display residue identity if you 
select an atom in a residue, so usually it is not necessary to do any cross 
checking besides selecting an atom in the residue and seeing what its residue 
name is.  The chance of somebody misinterpreting a truncated Lys as Ala is, in 
my experience, much much lower than the chance they will trust the xyz 
coordinates of atoms with zero occupancy or high B factors.

What worries me the most is somebody designing a whole biological experiment around an 
over-interpretation of details that are implied by xyz coordinates of atoms, even if 
those atoms were not resolved in the maps.  When this sort of error occurs it is a level 
of pain and wasted effort that makes the pain associated with having to build 
back in missing side chains look completely trivial.

As long as the PDB file format is the way users get structural data, there is really no
good way to communicate "atom exists, but with no reliable coordinates" to the user,
given the diversity of software packages out there for reading PDB files and the
historical lack of any standard way of dealing with this issue.  Even if the file format
is hacked there is no way to force all the existing
