Re: [ccp4bb] extracting PHENIX structures

2015-01-25 Thread Nat Echols
On Sun, Jan 25, 2015 at 12:25 AM, Kay Diederichs 
kay.diederi...@uni-konstanz.de wrote:

 Here, I subtracted the Phenix and Refmac entries from the total Phenix
 count, because it seems likely that Phenix was used for other purposes than
 refinement.


Actually, based on my skimming of methods sections in recent publications,
it is relatively common for users to run multiple refinement programs over
the course of a project.  In fact it's far more common than the
combinations in the "software" field indicate, because the PDB deposition
process will automatically use whatever is indicated by the submitted PDB
file - i.e. the last program run - and users usually won't update this to
include previously run programs.  One consequence of this is that CNS ends
up being underrepresented in the "software" field, because it tends to be
used earlier in refinement (DEN, for example), with Phenix or Refmac
substituted when the model is closer to convergence.  The other consequence
is that you can't assume that entries tagged as just Phenix OR Refmac were
really refined with only one program rather than a combination.

The moral: when you deposit your structures, please indicate all software
used, not just the very last program that you ran.  (And, of course,
include the relevant citations in the actual publication, which sadly
doesn't always happen.)

-Nat


Re: [ccp4bb] chloride or water

2015-01-21 Thread Nat Echols
On Wed, Jan 21, 2015 at 10:17 AM, Keller, Jacob kell...@janelia.hhmi.org
wrote:

  I see your point about not knowing that it’s a chloride, but I think you
 would agree that it is certainly more likely a chloride than map-noise, and
 perhaps more likely than water as well. Would you agree that chloride is
 the best guess, at least?


No, I think "I don't know" is the most honest and scientifically robust
answer.  For those who insist on annotating every density blob, UNX atoms
are the PDB's officially supported method for doing so (unless this has
changed recently), or UNK/UNL for unknown amino acids and ligands.  These
are not without their own problems but they at least make both the presence
of an atom and the uncertainty about its identity explicit.

Since the PDB is certainly tainted by structures modeled in accordance with
 the “most likely” outlook, one now has to be cautious about all structures.


This is true, but "everyone else is just as sloppy" is a poor excuse for
further polluting the database.

-Nat


Re: [ccp4bb] chloride or water

2015-01-21 Thread Nat Echols
On Wed, Jan 21, 2015 at 12:16 AM, Engin Özkan eoz...@uchicago.edu wrote:

  Carbon in chloride's coordination sphere? To me, it looks like you have
 serious vdW violations, and neither water nor chloride could go there.


Halides can interact with carbon too - discussed in Dauter & Dauter (2001)
- although I think this is more common with iodide than chloride.  But this
instance is totally unconvincing without anomalous data.  It would be
better to leave it entirely empty than to put in something wildly
speculative - there are far too many spurious chlorides in the PDB already,
which of course makes it even more difficult to come up with general rules
about binding patterns.

-Nat


Re: [ccp4bb] chloride or water

2015-01-21 Thread Nat Echols
On Wed, Jan 21, 2015 at 9:05 AM, Keller, Jacob kell...@janelia.hhmi.org
wrote:

  Not sure why there is this level of suspicion about the poor halide when
 waters generally get assigned so haphazardly. I would say that there are
 probably more “wrong” waters in the PDB than wrong chlorides, but there’s
 not much fuss about that.


Great, so leave it empty instead of just making something up.  Perhaps
future generations will figure out a more rigorous and quantitative method
for handling such features than guessing based on screenshots posted to a
mailing list.  At this resolution water placement is difficult to justify
anyway - and since neither the scattering properties nor the coordination
distances are especially accurate, trying to assign chemical identity in
the absence of any supporting information (for example anomalous data) is
especially futile.

(Although at least in this case the resolution is an obvious red flag - to
a crystallographer, anyway - indicating that any lighter ions shouldn't be
taken very seriously.  Other biologists, of course, may be more trusting.)

-Nat


Re: [ccp4bb] chloride or water

2015-01-21 Thread Nat Echols
On Wed, Jan 21, 2015 at 12:24 PM, Keller, Jacob kell...@janelia.hhmi.org
wrote:

I think this will probably never happen, but maybe there could be a
 confidence value associated with each atom in structures a posteriori,
 although it might be difficult to find the right automatable criteria for
 this value. The element would be assigned by being the most likely one, and
 confidence assigned thereafter. Too much of a pain to implement, probably,
 and maybe not worth the trouble. Perhaps, though, Nat’s program could be
 used to do this [doi:10.1107/S1399004714001308].


Unfortunately that implementation isn't really quantitative in the sense
you describe - not for lack of interest on our part, but rather because
coming up with a unified score based on many disparate and arguably
incomplete criteria is quite difficult.  The initial goal was to go after
the low-hanging fruit of ions that are relatively obvious and/or whose
identity is well-known, which would otherwise need to be placed manually.
But as we tried to make clear in the paper, this isn't a substitute for
common sense and prior knowledge.

(By the way, for what it's worth, I think both Na and Cl are simultaneously
over- and under-represented in the PDB - there are many spurious atoms, but
at least as many that were overlooked and labeled as water.  From the
perspective of a methods developer, however, the false positives are much
more of a pain to deal with in an automated workflow.)

-Nat


Re: [ccp4bb] 3 letter code for pyridoxine

2014-12-31 Thread Nat Echols
On Wed, Dec 31, 2014 at 10:11 AM, Faisal Tarique faisaltari...@gmail.com
wrote:

 I request you to please tell me the three-letter code for pyridoxine.


You can find the appropriate residue code for any molecule previously
deposited in the PDB - which includes pyridoxine - by simply searching for
the conventional name on the RCSB PDB web site (and probably the other PDB
sites too, but I'm less familiar with them).

-Nat


Re: [ccp4bb] unknown densities

2014-12-08 Thread Nat Echols
On Mon, Dec 8, 2014 at 7:24 PM, Keller, Jacob kell...@janelia.hhmi.org
wrote:

  FYI, sometimes native nucleotides can make it through protein
 purifications if binding is tight.


This is especially true for G-proteins, since tight binding to GDP is an
essential part of their function.  I don't know what to expect from E. coli
proteins, but human Ras co-purifies with GDP at close to 1:1.

At low resolution, it might be easiest to superimpose the closest
nucleotide-bound homolog and see how well the density aligns with the
nucleotide.
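
For example (a sketch - phenix.superpose_pdbs is just one of several tools
that will do this, and the file names here are placeholders):

phenix.superpose_pdbs your_model.pdb homolog_with_gdp.pdb

Then inspect the superposed nucleotide against your density in Coot.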

-Nat


Re: [ccp4bb] Consensus on modeling unidentifiable lipids and detergents in membrane protein structures?

2014-11-06 Thread Nat Echols
On Thu, Nov 6, 2014 at 5:20 PM, Oliver Clarke oc2...@columbia.edu wrote:

 I wonder if a solution might be to create new residues containing alkyl
 chains of various lengths, named something like U01, U02, U(n), where N is
 the length of the alkyl chain that fits the density. Sort of similar to the
 way one might leave UNK residues in a peptide of unidentified sequence. Or
 maybe there is a better way of doing this? Would appreciate any suggestions.


This would be UNL, which is a catch-all for unidentified ligands. The
best example I can think of is PDB ID 3arc, which is a relatively
high-resolution structure of Photosystem II containing a number of
partially ordered mystery lipids.  This is a bit of a mess because they're
a mix of unbranched alkyl chains and branched esters, so the chemistry is
not internally consistent, but at least the scattering types are defined
and the UNL designation makes the ambiguous identity explicit.  I think
this is the generally accepted convention.  I wouldn't worry too much about
nomenclature; you can just generate restraints for a generic ligand of this
type with as many carbon atoms as you need - for instance, using a SMILES
string like this:

CCC

and hopefully that will be sufficient for all of the fragments in your
structure.
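
For example, with eLBOW (flag names from memory - check the help output for
your version; "UNL" here is the catch-all code mentioned above):

phenix.elbow --smiles="CCC" --id=UNL

Other restraint generators will accept the same SMILES string.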

-Nat


Re: [ccp4bb] water at the same exactly position

2014-10-29 Thread Nat Echols
On Wed, Oct 29, 2014 at 8:53 PM, luzuok luzuo...@126.com wrote:

 I think it is better for COOT to solve this issue.


Coot can already be used to solve this issue - I think the automation is
somewhat lacking, but it's vastly preferable to anything involving a text
editor or shell commands.

1. Load molecule and electron density maps in Coot
2. From the Validate menu, select "Check/Delete waters..."
3. Select waters with very close distances, for example 0.2 Å; I've
attached a screenshot of what it should look like.
4. This will give you a list of overlapping waters - then you just need to
delete one of each pair.  (It doesn't matter which one - the waters will be
renumbered later anyway.)

Alternatively, you can set "Action" to "Delete", which is much less effort,
but that will delete both copies.  If you are just going to run a program
(or Coot function) to place more waters automatically (my preference), this
won't matter, but if they're atoms you really care about, you should delete
them manually.
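
If you'd rather sanity-check the result afterwards, a throwaway sketch
(plain Python, column positions per the PDB format spec; it only reports
overlapping pairs, it doesn't delete anything):

import math

waters = []  # (chain, resseq, x, y, z) for each water oxygen
with open("model.pdb") as handle:
    for line in handle:
        if (line.startswith(("ATOM", "HETATM")) and line[17:20] == "HOH"
                and line[12:16].strip() == "O"):
            waters.append((line[21], int(line[22:26]),
                           float(line[30:38]), float(line[38:46]),
                           float(line[46:54])))
for i in range(len(waters)):
    for j in range(i + 1, len(waters)):
        d = math.sqrt(sum((a - b) ** 2
                          for a, b in zip(waters[i][2:], waters[j][2:])))
        if d < 0.2:  # same 0.2 Å cutoff as in the dialog above
            print("overlap:", waters[i][:2], waters[j][:2], round(d, 3))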

-Nat

[image: Inline image 1]


Re: [ccp4bb] R/Rfree gap with pseudotranslational symmetry

2014-09-26 Thread Nat Echols
On Fri, Sep 26, 2014 at 8:17 PM, Kimberly Stanek kas...@virginia.edu
wrote:

  Before refinement in phenix the R/Rfree gap is rather small, however even
 after one round of refinement I am finding that this gap increases to
 almost 0.06. I have a feeling that the high symmetry present has something
 to with this R/Rfree gap but was hoping some of you may have some helpful
 suggestions for how to deal with it.


It's normal for the R/R-free gap to increase during the first round of
refinement in molecular replacement - in fact, unless you are solving a
near-identical crystal form and keeping the original R-free flags, this is
almost guaranteed to happen.  MR will use all reflections and the limited
refinement Phaser does uses very coarse parameterization (rigid-body and
group B-factor), so the R-free will usually be quite low and sometimes even
lower than R-work.  Restrained refinement will immediately start to open
the gap, but if it's working properly, it won't keep expanding throughout
refinement.  At this resolution a gap in the range of 0.02-0.04 would be
normal - less than this is unusual.

My guess is you just need to change the relative weights of the X-ray
target and geometry restraints so that the latter are stronger.  Also, use
NCS restraints if you aren't already.
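
For instance, in phenix.refine the relevant knob is something like this
(parameter names from memory - check the documentation for your version):

phenix.refine model.pdb data.mtz optimize_xyz_weight=True

or an explicit wxc_scale value, plus whichever NCS restraint option your
version provides.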

-Nat


Re: [ccp4bb] Problem regarding twin detection and refinement

2014-09-18 Thread Nat Echols
The tests that output twinning fractions are *not* diagnostic for twinning;
they merely estimate what the twin fraction would be if the data were in
fact twinned, which can only be decided on the basis of abnormal intensity
statistics.  (Any version of Xtriage since July should state this more
clearly, since we've seen so many users make this mistake in the past.)
I'm not sure what you mean by the L-test plot being sigmoidal - usually
this is a diagnostic feature of the NZ plot.  If you could post images of
these plots (Xtriage will let you save them, probably ccp4i loggraph will
too), that might help.

Given the fact that it scales in P622 despite the ASU being too small, it
may be the case that it really is twinned (I'm not exactly sure how to
interpret the Refmac results), but you need to absolutely rule out other
possibilities before resorting to twinned refinement.  It is certainly
possible to solve such a structure but very tricky to refine without
fooling yourself.  At 4Å resolution you will already have a difficult time
coping with model bias in the 2Fo-Fc map, and twin refinement will make
this even worse (whether or not you actually have twinning).

-Nat




On Wed, Sep 17, 2014 at 11:23 PM, Sudipta Bhattacharyya 
sudiptabhattacharyya.iit...@gmail.com wrote:

 Dear Community,

 Recently, we could solve a structure of DNA/protein complex through MR
 phasing. The data was initially indexed and scaled in P622 space group,
 however owing to the incompatibility of a single DNA/protein complex to fit
 in the ASU, it could not be solved in that space group (according to
 solvent content analysis). Since this was an indication of a possible
 twinning event, we tried MR in the P321, P312 and P6 space groups and
 finally got a very good solution in P65. According to phenix xtriage
 analysis, the data may be a near-perfect merohedral twin (twinning
 fraction 0.425, Britton analysis;  0.468 H test; 0.478 ML method; with a
 possible twin operator h,-h-k,-l); however, the L test rather suggests no
 such twinning (but the "acentric observed" curve appeared slightly
 sigmoidal compared to the straight "acentric theoretical" in the L test). The
 same thing happened when we checked the data for possible twinning in
 Truncate (twin fraction: L test: No; H test, 0.42, Murray Rust, 0.35;
 Britton ML, 0.47; possible twin operator: h+k,-k,-l).  On the other hand,
 while refining the data in Refmac5 with intensity based twin option ON,
 Refmac5 suggested perfect merohedral twinning with a fraction of
 0.49/0.50.

 In the context of this confusing situation, my questions are:

 1. Is the data twinned or not?
 2. With such a high twinning fraction, is it solvable?
 3. What refinement programs will be the best choices for refinement of
 such a twinned data?
 4. In one of the tutorials of Refmac5 it has been suggested, in such twin
 refinement cases, to choose Rfree set in higher space group (in our case
 P6522) then expand it to lower space group (in our case P65), could anybody
 please let me know how to do that in CCP4 or else?
 5. The data is at 4 Å resolution - could anyone tell me what final R/Rfree
 one could expect from 4 Å data (although it may sound a dumb question...)

 Any help will be highly appreciated.

 With my best regards,
 Sudipta.

 Sudipta Bhattacharyya,
 Postdoctoral Research Fellow,
 Colorado State University, Fort Collins,
 Colorado, USA.






Re: [ccp4bb] Coot - way to move the center pointer to a specific x,y,z coordinate?

2014-09-09 Thread Nat Echols
In Python scripting (Calculate menu):

set_rotation_centre(x, y, z)
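
e.g., to jump to a specific position (made-up coordinates):

set_rotation_centre(12.5, 30.2, 8.9)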

I assume there's a Scheme equivalent.

-Nat

On Tue, Sep 9, 2014 at 1:17 PM, Alejandro Virrueta 
alejandro.virru...@yale.edu wrote:

 Does anyone know how to move the center pointer to a specific x,y,z
 coordinate? Or to place some kind of marker at a specific x,y,z coordinate?

 Thanks,
 Alex



Re: [ccp4bb] Reliable criteria to tell Anomalous or not?

2014-09-04 Thread Nat Echols
On Thu, Sep 4, 2014 at 4:05 PM, CPMAS Chen cpmas...@gmail.com wrote:

 Do you guys have some recommendation of the criteria? phenix reported
 anomalous measurability, CCP4/aimless has RCRanom. Sometimes, they are not
 consistent.


The measurability isn't always useful - it's definitely correlated with
how easy it will be to find sites and experimentally phase the data, but
it's very dependent on the sigmas being estimated accurately.  I'm not sure
what RCRanom is, but both Aimless and the new version of Xtriage (if using
unmerged data as input) will report CC(anom), which is like CC1/2 for
anomalous differences, and that should be pretty reliable.

-Nat


Re: [ccp4bb] random half data sets

2014-08-13 Thread Nat Echols
On Tue, Aug 12, 2014 at 10:28 PM, Keller, Jacob kell...@janelia.hhmi.org
wrote:

 A somewhat similar question, with a quick answer I hope: when programs
 output CC's of 1/2 datasets, are several random halvings compared/averaged,
 and if not, does this make a difference, or are the scores so similar
 there's no point?


The latter, I think.  It probably only matters for data where you have a
lot of erroneous observations (like ice rings) or at the fringes where
there's almost no signal anyway.  In my hands, a dataset with multiplicity
of 3.9 has an outer shell with a CC1/2 between 0.497 and 0.503 depending on
random seed, which isn't worth worrying about.
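
A toy illustration of why the random seed barely matters once there are
enough reflections (synthetic data, not any particular program's
implementation):

import numpy as np

rng = np.random.default_rng(0)
true_I = rng.gamma(2.0, 50.0, 2000)                   # "true" intensities
obs = true_I[:, None] + rng.normal(0, 15, (2000, 4))  # four noisy repeats each

def cc_half(seed):
    r = np.random.default_rng(seed)
    half1, half2 = [], []
    for row in obs:
        idx = r.permutation(4)               # random halving of the repeats
        half1.append(row[idx[:2]].mean())
        half2.append(row[idx[2:]].mean())
    return np.corrcoef(half1, half2)[0, 1]

print([round(cc_half(s), 3) for s in range(3)])  # three nearly identical CCs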

-Nat


Re: [ccp4bb] correlated alternate confs - validation?

2014-07-23 Thread Nat Echols
On Wed, Jul 23, 2014 at 3:25 AM, MARTYN SYMMONS 
martainn_oshioma...@btinternet.com wrote:

 The practice at the PDB after deposition used to be to remove water
 alternate position indicators - although obviously to keep their partial
 occupancies.


This has not been my experience - see for example:

http://www.rcsb.org/pdb/files/2GKG.pdb

which was deposited as a PDB file, not mmCIF.

-Nat


Re: [ccp4bb] Proper detwinning?

2014-07-09 Thread Nat Echols
On Wed, Jul 9, 2014 at 5:14 PM, Chris Fage cdf...@gmail.com wrote:

 Despite modelling completely into great electron density, Rwork/Rfree
 stalled at ~38%/44% during refinement of my 2.0-angstrom structure
 (P212121, 4 monomers per asymmetric unit). Xtriage suggested twinning,
 with |L| = 0.419, L^2 = 0.245, and twin fraction = 0.415-0.447.
 However, there are no twin laws in this space group. I reprocessed the
 dataset in P21 (8 monomers/AU), which did not alter Rwork/Rfree, and
 in P1 (16 monomers/AU), which dropped Rwork/Rfree to ~27%/32%. Xtriage
 reported the pseudo-merohedral twin laws below.
 ...
 Performing intensity-based twin refinement in Refmac5 dropped
 Rwork/Rfree to ~27%/34% (P21) and ~18%/22% (P1). Would it be
 appropriate to continue with twin refinement in space group P1?


It sounds like you have pseudo-symmetry and over-merged your data in
P212121.  I would try different indexing for P21 before giving up and using
P1 (you may be able to just re-scale without integrating again, but I'm
very out of date); the choice of 'b' axis will be important.  If none of
the alternatives work P1 may be it, but I'm curious whether the intensity
statistics still indicate twinning for P1.
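
(For reference, the theoretical L-test values from Padilla & Yeates (2003)
are <|L|> = 1/2 and <L^2> = 1/3 for untwinned data, versus <|L|> = 3/8 and
<L^2> = 1/5 for a perfect merohedral twin - so the 0.419/0.245 quoted above
really does sit between the two, and closer to the twinned end.)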

-Nat


Re: [ccp4bb] twin or untwinned

2014-07-04 Thread Nat Echols
On Thu, Jul 3, 2014 at 7:50 AM, Nat Echols nathaniel.ech...@gmail.com
wrote:

 On Thu, Jul 3, 2014 at 6:53 AM, Dirk Kostrewa kostr...@genzentrum.lmu.de
 wrote:

 yes - unfortunately, in my hands, phenix.xtriage reads the XDS_ASCII.HKL
 intensities as amplitudes, producing very different output statistics,
 compared both to the XDS statistics and to an mtz file with amplitudes
 created from that XDS file.


 This is incorrect.  It does read it correctly as intensities - the
 confusion probably arises from the fact that Xtriage internally converts
 everything to amplitudes immediately, so that when it reports the summary
 of file information, it will say "xray.amplitude" no matter what the input
 type was (the same will also be true for Scalepack and MTZ formats).
 However, the data will be converted back to intensities as needed for the
 individual analyses.  Obviously this isn't quite ideal either since the
 original intensities are preferable but for the purpose of detecting
 twinning I hope it will be okay.  In any case the incorrect feedback
 confused several other users so it's gone as of a few weeks ago, and the
 current nightly builds will report the true input data type.  (The actual
 results are unchanged.)

 Tim: I have no reason to think we handle unmerged data poorly; I'm not
 sure who would have told you that.  In most cases they will be merged as
 needed upon reading the file.  I'm a little concerned that you're getting
 such different results from Xtriage and pointless/aimless, however.  Could
 you please send me the input and log files off-list?  Dirk, same thing: if
 you have an example where XDS and Xtriage are significantly in
 disagreement, the inputs (and logs) would be very helpful.  In both cases,
 I suspect the difference is in the use of resolution cutoffs and
 absolute-scaled intensities in Xtriage versus other programs, but I'd like
 to be certain that there's not something broken.


I stand corrected: unmerged XDS files (but not other formats) were not
being handled appropriately in Xtriage; this was fixed several weeks ago,
so the nightly builds should behave as expected.

-Nat


Re: [ccp4bb] twin or untwinned

2014-07-03 Thread Nat Echols
On Thu, Jul 3, 2014 at 6:53 AM, Dirk Kostrewa kostr...@genzentrum.lmu.de
wrote:

 yes - unfortunately, in my hands, phenix.xtriage reads the XDS_ASCII.HKL
 intensities as amplitudes, producing very different output statistics,
 compared both to the XDS statistics and to an mtz file with amplitudes
 created from that XDS file.


This is incorrect.  It does read it correctly as intensities - the
confusion probably arises from the fact that Xtriage internally converts
everything to amplitudes immediately, so that when it reports the summary
of file information, it will say "xray.amplitude" no matter what the input
type was (the same will also be true for Scalepack and MTZ formats).
However, the data will be converted back to intensities as needed for the
individual analyses.  Obviously this isn't quite ideal either since the
original intensities are preferable but for the purpose of detecting
twinning I hope it will be okay.  In any case the incorrect feedback
confused several other users so it's gone as of a few weeks ago, and the
current nightly builds will report the true input data type.  (The actual
results are unchanged.)

Tim: I have no reason to think we handle unmerged data poorly; I'm not sure
who would have told you that.  In most cases they will be merged as needed
upon reading the file.  I'm a little concerned that you're getting such
different results from Xtriage and pointless/aimless, however.  Could you
please send me the input and log files off-list?  Dirk, same thing: if you
have an example where XDS and Xtriage are significantly in disagreement,
the inputs (and logs) would be very helpful.  In both cases, I suspect the
difference is in the use of resolution cutoffs and absolute-scaled
intensities in Xtriage versus other programs, but I'd like to be certain
that there's not something broken.

thanks,
Nat


Re: [ccp4bb] Lysine coordinated ions

2014-07-01 Thread Nat Echols
On Tue, Jul 1, 2014 at 3:10 PM, Katherine Sippel katherine.sip...@gmail.com
 wrote:

 My google-fu has failed me once again so I am turning to the collective
 knowledge of the bb. I'm working on a blobology challenge at the moment and
 have hit a wall. Is anyone aware of an ion that coordinates to lysine and
 prefers octahedral geometry? The mystery ion seems to have perfect
 octahedral geometry with bond distances of ~2.1 angstrom, but the only
 direct side chain interaction is to a lysine NZ, the rest are waters.


Lysine can coordinate a cation if the chemical environment is favorable -
usually this means a high-pH buffer (what was the pH of your crystals?).
The same is true for N-termini; I may be able to dig up a published example
of this.  (I think it is effectively impossible for Arg, however.)  These
interactions are certainly exceedingly rare (and I doubt they are ever
present in vivo), but if the nitrogen loses a proton the lone pair will be
able to coordinate a compatible ion.  Since magnesium can be coordinated
by a histidine nitrogen, it seems like the most likely candidate - but I
would still be very, very careful before assigning it, especially if the
only other coordinating atoms are waters.

-Nat


Re: [ccp4bb] PDB Storage of Diffraction Images

2014-05-16 Thread Nat Echols
On Fri, May 16, 2014 at 7:12 AM, esse...@helix.nih.gov wrote:

 Short of storing images, which is the ultimate preservation of primary
 information, I have always been puzzled by the fact that the PDB only
 stores
 unique reflections, i.e. no Friedel pairs even when provided. Is this
 outdated
 perhaps? I remember that my deposited SFs in the past were reduced to not
 contain Friedel pairs. If there had been a concern about increasing the
 storage space by actually less than twice the space for unique SFs, this
 may
 be invalid today and is still far less than the space required for images.
 However, it is possible that the information content in Friedel pairs is
 deemed insignificant compared to their extra costs. I for one would
 appreciate
 having access to Friedel pairs very much.


They definitely store Friedel pairs!  Maybe you're confused by the layout
of the mmCIF file, which (like MTZ) usually lists just the unique
(non-anomalous) indices, but with separate values for F+/F- when they are
available.  I've been making extensive use of anomalous data depositions -
unfortunately there aren't as many as we would like, either because many
people do not realize that this is useful information even when the
experiment was not specifically looking for anomalous signal, or because
the complexity of PDB deposition discourages providing the most complete
data.
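
The items to look for (names from memory - check the PDBx/mmCIF dictionary)
are along these lines, listed per unique non-anomalous index:

_refln.pdbx_F_plus
_refln.pdbx_F_plus_sigma
_refln.pdbx_F_minus
_refln.pdbx_F_minus_sigma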

An even more useful improvement would be to make deposition of unmerged
intensities straightforward - the JCSG does this somehow but it is
non-trivial for the average user.  Hopefully this will also change soon.

-Nat


Re: [ccp4bb] PDB passes 100,000 structure milestone

2014-05-15 Thread Nat Echols
On Thu, May 15, 2014 at 9:53 AM, Patrick Shaw Stewart patr...@douglas.co.uk
 wrote:

 It seems to me that the Wikipedia mechanism works wonderfully well.  One
 rule is that you can't make assertions yourself, only report pre-existing
 material that is attributable to a reliable published source.


This rule would be a little problematic for annotating the PDB.  It
requires a significant amount of effort to publish a peer-reviewed article
or even just a letter to the editor, and none of us are being paid to write
rebuttals to dodgy structures.

-Nat


Re: [ccp4bb] PDB passes 100,000 structure milestone

2014-05-15 Thread Nat Echols
That is an extraordinary case, and it certainly took a huge amount of
work.  What about structures that are obviously wrong based on inspection
of the density, but no one has bothered to challenge yet?  The TWILIGHT
database helps some, if that counts, but it doesn't catch everything.

-Nat



On Thu, May 15, 2014 at 10:48 AM, Patrick Shaw Stewart 
patr...@douglas.co.uk wrote:


 I may be missing something here, but I don't think you have to rebut
 anything.  You simply report that someone else has rebutted it.  Along the
 lines of

 "Many scientists regard this published structure as unreliable since a
 misconduct investigation by the University of Alabama at Birmingham has
 concluded that it
 was, more likely than not, faked" [1]

 [1] http://www.nature.com/news/2009/091222/full/462970a.html






 On 15 May 2014 18:00, Nat Echols nathaniel.ech...@gmail.com wrote:

 On Thu, May 15, 2014 at 9:53 AM, Patrick Shaw Stewart 
 patr...@douglas.co.uk wrote:

 It seems to me that the Wikipedia mechanism works wonderfully well.  One
 rule is that you can't make assertions yourself, only report pre-existing
 material that is attributable to a reliable published source.


 This rule would be a little problematic for annotating the PDB.  It
 requires a significant amount of effort to publish a peer-reviewed article
 or even just a letter to the editor, and none of us are being paid to write
 rebuttals to dodgy structures.

 -Nat




 --
  patr...@douglas.co.uk    Douglas Instruments Ltd.
  Douglas House, East Garston, Hungerford, Berkshire, RG17 7HD, UK
  Directors: Peter Baldock, Patrick Shaw Stewart

  http://www.douglas.co.uk
  Tel: +44 (0) 148-864-9090    US toll-free 1-877-225-2034
  Regd. England 2177994, VAT Reg. GB 480 7371 36



Re: [ccp4bb] PDB passes 100,000 structure milestone

2014-05-14 Thread Nat Echols
On Wed, May 14, 2014 at 10:26 AM, Mark Wilson mwilso...@unl.edu wrote:

 Getting to Eric's point about an impasse, if the PDB will not claim the
 authority to safeguard the integrity of their holdings (as per their
 quoted statement in Bernhard's message below), then who can?


I think this may in part boil down to a semantic dispute over the meaning
of "integrity".  I interpreted it to mean integrity (and public
availability) of the data as deposited by the authors, which by itself is
quite a lot of work.  Safeguarding the integrity of the peer-review process
is supposed to be the job of the journals, some of which - unlike the PDB -
are making a tidy profit from our efforts.  Since they justify this profit
based on the value they supposedly add as gatekeepers, I don't think it's
unreasonable for us to expect them to do their job, rather than leave it to
the PDB annotators, who surely have enough to deal with.

I do share some of the concern about 2hr0, but I am curious where the line
should be drawn.  This is an extraordinary case where the researcher's
institution requested retraction, but I think everyone who's been in this
field for a while has a list of dodgy structures that they think should be
retracted - not always with justification.

-Nat


Re: [ccp4bb] TER in PDB file

2014-05-13 Thread Nat Echols
On Tue, May 13, 2014 at 9:20 PM, Felix Frolow mbfro...@post.tau.ac.il wrote:

 Phenix does even more, it adds TER after ions and ligands, so again manual
 messing is needed.
 However they may have a jiffy to fix it.


phenix.sort_hetatms will remove them for you, although why this problem
was apparently beyond the capability of the PDB itself to handle is a
mystery.  When I encountered this a couple of years ago I spent longer
arguing with the annotator than writing the code.  Fortunately mmCIF
deposition does indeed seem to work more smoothly.

-Nat


Re: [ccp4bb] PyMol and Schrodinger

2014-04-23 Thread Nat Echols
On Wed, Apr 23, 2014 at 8:43 AM, Cygler, Miroslaw
miroslaw.cyg...@usask.ca wrote:

 I have inquired at Schrodinger about the licensing for PyMol. I was
 surprised by their answer. The access to PyMol is only through a yearly
 licence. They do not offer the option of purchasing the software and using
 the obtained version without time limitation. This policy is very different
 from many other software packages, which one can use without continuing
 licensing fees and additional fees are only when an upgrade is needed. At
 least I believe that Office, EndNote, Photoshop and others are distributed
 this way.
 I also remember very vividly Warren's reason for developing PyMol, and
 that was the free access to the source code. He later implemented fees for
 downloading binary code specific for one’s operating system but there were
 no time restrictions on its use.
 As far as I recollect, Schrodinger took over PyMol distribution and
 development promising to continue in the same spirit.  Please correct me
 if I am wrong.
 I find the constant yearly licensing policy disturbing and will be looking
 for alternatives. I would like to hear if you have had the same experience
 and what you think about the Schrodinger policy.


This is no different than the licenses that for-profit companies are
required to purchase for most crystallography software.  In fact, it's
actually considerably more liberal than most software*, because as Jim
notes you can still obtain (and redistribute) most of the source code for
free.  From what I can tell Schrodinger has continued to make improvements
to the open-source core; some of the newer features (and the native Mac
GUI) are proprietary, but that was true ten years ago.

-Nat

(* although I believe ccp4mg is truly open-source like Coot, and unlike
CCP4 etc. which still require a license for commercial use.  Or am I
misinformed?)


Re: [ccp4bb] metal ion coordination

2014-04-23 Thread Nat Echols
On Wed, Apr 23, 2014 at 6:15 AM, World light bsub...@btk.fi wrote:

 This discussion is very informative to fresher like me. Moreover, with
 most of the reading suggested in this discussion I read about the
 positively charged metal ions like Na, Ca, Mg and many more. I am curious
 about Cl specifically, which could occur as a result of salts used in
 different crystallization conditions. Any information on Cl ion
 coordination?


From Dauter & Dauter (2001) (http://www.ncbi.nlm.nih.gov/pubmed/11250204):

"The coordination geometry of halide ions is not specific. . . Halide ions
usually accept hydrogen bonds from various donor groups from the protein
and neighboring water molecules. In addition, they make van der Waals
contacts with non-polar protein atoms. . . The halide anions are monoatomic
and polarizable, and consequently able to engage in both polar and
hydrophobic interactions. . . Of the sites that are best for phasing, most
contain halide ions that are hydrogen-bonded to amide nitrogen atoms,
either from the protein mainchain or asparagine and glutamine sidechains.
In addition, good sites often make ionic pairs with arginine or lysine
residues. Sometimes, hydrogen bonds to the hydroxyl groups of threonine or
serine residues can also be observed. All halide ions are in contact with
water molecules, which can be ordered or in the bulk solvent region."

I believe chloride is a little more predictable in this respect than the
other halides (especially iodine).  Also worth quoting:

"All halide ions share their sites with water molecules. Their
coordination, appearance in electron-density maps and behavior during
structure refinement is almost identical to that of fully occupied water
molecules and only rarely is it possible to differentiate bromide or
chloride ions from waters, especially if the sites are partially occupied.
These ions can, however, be easily identified by their anomalous scattering
signal."

-Nat


Re: [ccp4bb] anomalous signal for Mg and Calcium

2014-04-21 Thread Nat Echols
On Mon, Apr 21, 2014 at 3:36 PM, Faisal Tarique faisaltari...@gmail.com wrote:

 In continuation of my previous mail, I want to ask a few more
 questions about metalloproteins. Apart from factors like occupancy, B
 factor, coordination sphere and metal ion-ligand distances to distinguish
 Mg or calcium, can the anomalous signal tell the identity and the type of
 metal ion bound to the protein, specifically in the case of Mg and calcium?


Short answer: if you see a peak in the anomalous difference map, it's
almost certainly calcium, but if you don't see a peak, you still can't rule
out calcium.

Longer answer: magnesium almost never has observable anomalous signal at
the wavelengths we normally use for data collection.  The exception is if
you collect extremely redundant data; Wayne Hendrickson has a very
convincing example of this (I saw it in a talk, but I'll see if I can find
a reference).  Calcium anomalous signal depends on the data quality, but
with good data and full occupancy it can show up in the anomalous
difference map even at the SeMet K edge (~0.9794Å).  However, this is not
guaranteed, especially if it's not very tightly bound.  At 2.6Å resolution
it may be more difficult to distinguish, especially if you have other
stronger anomalous scatterers.  Collecting very redundant data will help a
lot.
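
If you want actual numbers, the tabulated f'' values are easy to look up
with cctbx (a sketch assuming a cctbx/Phenix installation):

from cctbx.eltbx import sasaki

for element in ("Mg", "Ca"):
    fdp = sasaki.table(element).at_angstrom(0.9794).fdp()
    print(element, round(fdp, 2))

which should come out to roughly 0.07 e for Mg versus 0.6 e for Ca, if
memory serves - an order-of-magnitude difference.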

An anomalous dataset analyzed through Xtriage (Phenix) gives a signal of
 0.097 with magnesium, while the same gives a signal of 0.1062 with calcium
 (both data sets showing the anomalous flag as true). Can anybody shed some
 light on which is more true?


I don't understand this - what exactly is the difference between the
datasets?  Anyway, that number is really not intended to be interpreted
this way.


 The data has a maximum resolution of 2.6 Å and I had placed an Mg atom at
 the active site (the protein was incubated with 5 mM MgCl2). Just because
 it does not match the typical octahedral geometry and exact metal
 ion-oxygen distances given by the Cambridge Structural Database (CSD), my
 reviewer has asked me to check the anomalous signal for both Mg and Ca (he
 is expecting the scattering metal ion to be Ca) and to give an appropriate
 reason for putting Mg there. Please give suggestions.


In addition to the anomalous maps, check the difference map (Fo-Fc) and
B-factors after refinement with either element at full occupancy.  If it is
correctly identified, the difference map should be relatively flat and the
B-factor should be similar to the coordinating atoms.  Negative difference
map peaks and/or a high B-factor suggest that the element is too heavy;
positive peaks and/or low B-factors indicate the opposite.

-Nat


Re: [ccp4bb] EDS server - R-value

2014-04-04 Thread Nat Echols
On Fri, Apr 4, 2014 at 1:57 AM, Tim Gruene t...@shelx.uni-ac.gwdg.de wrote:

 A more up-to-date reason is that programs calculate R values very
 differently. If you take a PDB file refined with program X and put it
 into program Y you easily get discrepancies greater than 5%.


This is actually pretty rare - usually it's only 1-2% at most.
Discrepancies like 16.5% versus 30.9% usually indicate that there's
something wrong or misleading in the annotation of the entry, and often
mean that you can't even reproduce the R-factor with the specified program.

-Nat


Re: [ccp4bb] EDS server - R-value

2014-04-04 Thread Nat Echols
On Fri, Apr 4, 2014 at 9:36 AM, Alastair Fyfe af...@ucsc.edu wrote:

 The topic brings up a question that I've been wondering about for some
 time, perhaps someone can enlighten me. Why is it not standard practice to
 deposit  map coefficients along with structure factors ? Unlike image
 deposit map coefficients along with structure factors? Unlike image
 deposition there are no significant storage or file format issues. This
 would preserve a record of the final refinement used for publication,
 version and options used.


There *are* file format issues, they're just very silly.  I think the
problem is that the PDB deposition service ignores most columns in MTZ
files, even with standard labels that have not changed for years.  If you
deposit the reflections as mmCIF instead, and use the designated mmCIF
dictionary items for your map coefficients (or Fcalc, phases, etc.), it
will preserve them.  For instance:

http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=structfact&structureId=4OW3
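
The map coefficient items in that file (names from memory) look like:

_refln.pdbx_FWT
_refln.pdbx_PHWT
_refln.pdbx_DELFWT
_refln.pdbx_DELPHWT

i.e. amplitudes and phases for the 2mFo-DFc and mFo-DFc maps.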

I still don't think this solves the problem of faithfully recording the
refinement protocol - how do you know what method was used to calculate the
maps?

-Nat


Re: [ccp4bb] EDS server - R-value

2014-04-04 Thread Nat Echols
On Fri, Apr 4, 2014 at 10:39 AM, Alastair Fyfe af...@ucsc.edu wrote:

 Reconstructing the refinement may be necessary in some cases but  there
 are other applications (pdb-wide map statistics, development of map
 analysis tools, quick model vs map checks) where access to the depositor's
 final map would be sufficient.


I think these kinds of bulk analyses will be less effective if the maps are
not calculated consistently.  For instance, the question of how to handle
missing reflections can make a big difference.

Perhaps the coefficients are in fact included in many of the available
 mmCIF files? I should check...


No, because most of those mmCIF files were probably converted from MTZ
format by the deposition server(s).

-Nat


Re: [ccp4bb] Add an atom in Coot

2014-03-18 Thread Nat Echols
On Tue, Mar 18, 2014 at 6:59 PM, Remie Fawaz-Touma remiefa...@gmail.com wrote:

 how do you place the pointer if there is no bond there? (just density) I
 am trying to connect 2 sugars creating 2 bonds to one oxygen that I have to
 add (oxygen does not exist now).


On my Mac, I can change the pointer position by holding down the Control
key and left-dragging the mouse.  I would be surprised if this didn't work
on Linux too - not sure about Windows.

Editing the PDB file by hand is very risky and far too much work for what
you want to accomplish.

-Nat


Re: [ccp4bb] Validity of Ion Sites in PDB

2014-03-06 Thread Nat Echols
On Thu, Mar 6, 2014 at 11:45 AM, Keller, Jacob kell...@janelia.hhmi.org wrote:

 I was curious whether there has been a rigorous evaluation of ion binding
 sites in the structures in the pdb, by PDB-REDO or otherwise. I imagine
 that there is a considerably broad spectrum of habits and rigor in
 assigning solute blobs to ion X or water, and in fact it would be difficult
 in many cases to determine which ion a given blob really is, but there
 should be at least some fraction of ions/waters which can be shown from the
 x-ray data and known geometry to be X and not Y. This could be by small
 anomalous signals (Cl and H2O for example), geometric considerations, or
 something else. Maybe this does not even matter in most cases, but it might
 be important in others...


A couple of references:

http://www.ncbi.nlm.nih.gov/pubmed/18614239
http://www.ncbi.nlm.nih.gov/pubmed/24356774

Anecdotally, it is not difficult to find incorrect structures; in fact, one
of mine has magnesium ions at crystal contacts with big Fo-Fc and
anomalous map peaks, and I know it's not the only such structure in the
PDB.  However, while there are plenty of examples that are clearly wrong,
it is difficult to come up with strict rules that apply to typical MX data
- at 3Å resolution (or even better), a native Zn-binding site might have
bond valences that are just awful.  This doesn't mean the metal assignment
is wrong.  The placement of waters alone has a huge impact on such
calculations.
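
(For anyone unfamiliar with the calculation, bond valence sums are easy to
check by hand - a minimal sketch with made-up Zn-O distances:

import math

R0, b = 1.704, 0.37  # Zn(2+)-O parameters from Brese & O'Keeffe (1991)
distances = [2.05, 2.10, 2.15, 2.20]  # hypothetical Zn-O distances, in Å
bvs = sum(math.exp((R0 - d) / b) for d in distances)
print(round(bvs, 2))  # ~1.3, well short of the ideal +2 for zinc

which is exactly the sort of "awful" number a perfectly genuine zinc site
can give when limited resolution stretches the refined distances.)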

-Nat


Re: [ccp4bb] Table in NSMB

2014-02-18 Thread Nat Echols
On Tue, Feb 18, 2014 at 8:19 AM, Jan van Agthoven janc...@gmail.com wrote:

 I'm filling out my table for NSMB, about a structure of protein ligand
 bound to a receptor. They ask for 3 different lines regarding number
 of atoms & B-factor: 1) Protein, 2) Ligand/Ion, 3) Water.
 Does my protein ligand belong to "Protein" or "Ligand/Ion"?


Why not list them each explicitly?  In my experience the recommended table
of crystallography statistics for most journals is just a suggestion, not a
strict format.  If you leave out information they might complain, but
surely they won't object if you include additional details.  (They usually
just exile it to the unformatted supplementary materials anyway.)

-Nat


Re: [ccp4bb] Sister CCPs

2014-02-13 Thread Nat Echols
One comment (not a complaint) on all this: it seems like the same questions
get asked over and over again.  If there is a good place for a general
crystallography FAQ list it is well past time for one to be put together -
or maybe it just needs to be better advertised?  At a minimum, for instance:

- what cryoprotectant should I use?
- how do I get big single crystals?
- how do I improve diffraction?
- how can I tell if I've solved my structure?
- why is my R-free stuck?
- is [pick a random statistic] suitable for publication?

Some of the other common queries ("name my blob!") still need to be handled
on a case-by-case basis, but it would be much more efficient for everyone
if the standard answers were collected somewhere permanent.

-Nat



On Thu, Feb 13, 2014 at 7:05 AM, Eugene Valkov eugene.val...@gmail.com wrote:

 I absolutely agree with Juergen.

 Leaving aside methods developers, who are a completely different breed,
 there is no such thing as a crystallographer sitting in a dark room
 solving structures all day. If there are, these are anachronisms destined
 for evolutionary demise.

 More and more cell biologists, immunologists and all other kinds of
 biologists are having a go at doing structural work with their molecules of
 interest themselves without involving the professionals. Typically, they
 learn on the job and they need advice with all kinds of things ranging from
 cloning and protein preps through to issues with tetartohedrally-twinned
 data and interpreting their structures.

 So, a modern structural biologist is one who is equipped for the wet lab
 and has some idea of how to go about solving structures. CCP4BB is a
 wonderful resource that is great for both the quality of the advice offered
 to those that seek it and for the variety of topics that are addressed in
 the scope of structural biology. I have learnt greatly from reading posts
 from very skilled and knowledgeable scientists at this forum and then
 implemented these insights into my own research. I am very grateful for
 this.

 In short, please do not discourage your colleagues, particularly very
 junior ones, from posting to the CCP4BB. Some of the questions may appear
 quaint or irrelevant but it is easy to simply ignore topics that are of no
 interest!

 Eugene


 On 13 February 2014 14:41, Bosch, Juergen jubo...@jhsph.edu wrote:

 Let me pick up Eleanor's comment:
 is there something like a crystallographer today? I mean in the true
 sense?
 I think as a crystallographer you won't be able to survive the next
 decade, you need to diversify your toolset of techniques as pointed out in
 this article
 http://www.nature.com/naturejobs/science/articles/10.1038/nj7485-711a

 And I'm not quite sure how software developers see themselves, as I would
 argue they are typically maybe not doing so much wet lab stuff related to
 crystallography (I may be wrong here) but rather code these days.

 What type of crystallographer is a software developer ?

 I think like our beloved crystals we come in different flavors. And we
 need to train the next generation of students with that perspective in mind.

 Just my two cents on a snowy day (30cm over night)

 Jürgen
 ..
 Jürgen Bosch
 Johns Hopkins University
 Bloomberg School of Public Health
  Department of Biochemistry & Molecular Biology
 Johns Hopkins Malaria Research Institute
 615 North Wolfe Street, W8708
 Baltimore, MD 21205
 Office: +1-410-614-4742
 Lab:  +1-410-614-4894
 Fax:  +1-410-955-2926
 http://lupo.jhsph.edu

 On Feb 13, 2014, at 6:41 AM, Eleanor Dodson eleanor.dod...@york.ac.uk
 wrote:

 I agree with Frank - it keeps crystallographers modest to know how
 challenging wet lab stuff still is..
 Eleanor

 On 12 February 2014 19:23, Robbie Joosten robbie_joos...@hotmail.com
 wrote:

 It's not an e-mail bulletin board, but Researchgate seems to be quite
 popular for wet lab questions. IMO the Q&A section of the social network
 is
 a bit messy. That said, the quality seems to improve gradually.

 Cheers,
 Robbie

 Sent from my Windows Phone
 
 From: Paul Emsley
 Sent: 12-2-2014 19:23
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: Re: [ccp4bb] Sister CCPs


 On 12/02/14 15:59, George Sheldrick wrote:

 It would be so nice to have a 'sister CCP' for questions about wet-lab
 problems that have nothing to do with CCP4 or crystallographic
 computing. There is clearly a big need for it, and those of us who try
 to keep out of wet-labs would not have to wade through it all.



 FWIW, the remit of CCP4BB, held at jiscmail-central, is described as:

 "The CCP4BB mailing list is for discussions on the use of the CCP4
 suite, and macromolecular crystallography in general."



 Thus wet-lab questions are not off-topic (not that anyone recently
 described them as such).

 Having said that, Jiscmail mailing lists are easy to set-up (providing
 that you can reasonably expect that the mailing list will improve
 knowledge sharing within the UK 

Re: [ccp4bb] Cryo solution for crystals grown in magnesium formate

2013-12-16 Thread Nat Echols
On Mon, Dec 16, 2013 at 1:36 PM, Xiao, Junyu jx...@mail.ucsd.edu wrote:

  Dear all, sorry if this topic does not interest you. I wonder whether
 anyone has experience with freezing crystals grown in ~0.2 M Magnesium
 Formate. Garman and Mitchell suggested that A major anomaly is solution
 44, 0.2 M magnesium formate, which requires 50% glycerol for
 cryoprotection in their 1996 paper (J Appl. Cryst.  29, 584-587).  Since
 50% glycerol is kind of harsh, I wonder whether anyone has tried
 alternative cryo protectant. Your kind help will be highly appreciated.


Another good reference:

http://journals.iucr.org/j/issues/2002/05/00/do0015/index.html

It suggests 35% PEG 400, 30% ethylene glycol, or 30% of whatever "PG" means
(based on the rest of the paper I suspect propanediol, but the abbreviation
doesn't really make sense - perhaps Eddie Snell can clarify).  There are of
course many other good cryoprotectants beyond those evaluated in the paper;
personally, I'm a big fan of xylitol (which I believe will work in lower
concentrations - at least with some conditions), but what really matters is
what the crystals can tolerate.

Note that these estimates are using very strict criteria - you can often
get away with less cryoprotection if you are very good at freezing crystals
and/or willing to tolerate some increased background.  But I wouldn't try
this until you've determined that your crystals can't handle the
recommended amounts.

-Nat


Re: [ccp4bb] Comparison of Water Positions across PDBs

2013-11-06 Thread Nat Echols
On Wed, Nov 6, 2013 at 12:39 AM, Bernhard Rupp hofkristall...@gmail.com wrote:

 Hmmm….does that mean that the journals are now the ultimate authority of
 what stays in the PDB?

 I find this slightly irritating and worthy of change.


http://www.wwpdb.org/UAB.html

It is the current wwPDB (Worldwide PDB) policy that entries can be made
obsolete following a request from the people responsible for publishing them
(be it the principal author or journal editors).

I'm not sure I understand why things should be any different; the PDB is
not advertising itself as anything other than an archival service, unlike
the journals which are supposed to be our primary mechanism of quality
control.

-Nat


Re: [ccp4bb] Comparison of Water Positions across PDBs

2013-11-05 Thread Nat Echols
On Tue, Nov 5, 2013 at 12:22 AM, Bernhard Rupp hofkristall...@gmail.com wrote:

 Given their otherwise almost paranoid sensitivity to ultimate author
 authority

 (resulting in things like still having 2hr0 etc in the bank because
 certain authors go AWOL or ignore major issues)


In defense of the PDB, it's not just the authors who went AWOL in that case
- it is ultimately the responsibility of the journals to retract clearly
fraudulent publications.

-Nat


Re: [ccp4bb] MacBook Pro graphics card options

2013-10-23 Thread Nat Echols
On Wed, Oct 23, 2013 at 1:10 PM, Kristin Low kristin@queensu.ca wrote:

 I’m looking at upgrading my current laptop to a newer MacBook Pro. I’m
 torn as to whether I need integrated vs discrete graphics for structural
 biology, including molecular modelling, especially since the latest
 advances by Intel in terms of integrated graphics. Right now with the new
 releases, the options are between Intel Iris Pro (5200 series) and Intel
 Iris Pro + Nvidia GT 750M.


It depends on how demanding your graphics needs are.  I have no experience
with the Iris Pro chips, but I've been using a MacBook Air from late 2011
almost exclusively for most of the last two years, including heavy use of
Coot and PyMOL, and the only times I've been annoyed by slow graphics is
when I'm trying to visualize a very large and/or high-resolution region of
density.  And even that doesn't happen too often.  Most of the time it runs
very smoothly.  However, there are some persistent glitches with the
graphical display in Coot - sometimes I get weird visual artifacts, or
messed up depth perception.  Whether this is the fault of Coot, XQuartz, or
Intel is unknown.  But speed is not an issue.

The premium for the model with the Nvidia chip is quite steep at $600
(perhaps it's less with academic discount?).  I don't think the graphics
upgrade alone is worth it - but I'd be very tempted by the faster
processor, doubled memory, and doubled SSD, all of which will come in handy
when refining or rendering.

Disclaimer: I know absolutely nothing about the availability of stereo
options with any of these systems.  It's possible the NVidia card has
additional capabilities in that respect.

-Nat


Re: [ccp4bb] Problematic PDBs

2013-10-17 Thread Nat Echols
On Thu, Oct 17, 2013 at 6:51 AM, Lucas lucasbleic...@gmail.com wrote:

 I wonder if there's a list of problematic structures somewhere that I
 could use for that practice? Apart from a few ones I'm aware of because of
 (bad) publicity, what I usually do is an advanced search on PDB for entries
 with poor resolution and bound ligands, then checking them manually,
 hopefully finding some examples of creative map interpretation. But it
 would be nice to have specific examples for each thing that can go wrong in
 a PDB construction.


This would be a good place to start:

http://www.ncbi.nlm.nih.gov/pubmed/23385452

The retracted ABC transporter structures are also good, although less
obvious to the untrained eye.  I forget what the PDB IDs are but I'll see
if I can dig them up.

-Nat


Re: [ccp4bb] OT: Who's Afraid of Peer Review?

2013-10-10 Thread Nat Echols
On Wed, Oct 9, 2013 at 6:56 PM, Marco Lolicato chimbio...@gmail.com wrote:

 Anyway, for those reasons and more, I was wondering whether the
 peer-review process now needs to be revisited.



Apologies for the lengthy response, but I really do think the current
publication system is broken, just not for the same reasons as others.

One of the interesting suggestions put forth by Michael Eisen (PLoS
co-founder, and author of the previously-linked rant), among others, is
that post-publication peer review should become more widely used and
partially substitute for the current system.  This has always been done
informally at conferences, journal clubs, ccp4bb emails, and so on, but
this is all very unstructured and not always public.  The only formal
outlets are by writing a letter to the editor of a journal, which is a very
time-consuming process, or by writing a more thorough follow-up article
dissecting the problems.  There are some advantages to this - the discovery
of fraudulent structures would probably not have been as widely noticed if
the analysis took the form of blog comments.  However, the overhead (in
time and effort) is so massive as to deter all but the most determined
scientists.  At a minimum, I'd like to see a more structured but very
lightweight way to discuss *and track* problems with the literature,
ideally at the source(s).  For instance, if I go to this PDB entry, there
is absolutely no indication of anything suspicious:

http://www.rcsb.org/pdb/explore/explore.do?structureId=2hr0

If I follow the publication links, I can see that there is a "brief
communication arising" associated with the article:

http://www.nature.com/nature/journal/v444/n7116/full/nature05258.html

but since Nature's editors watered down the letter and accepted a response
that did nothing to address any of the questions, it's difficult for a
non-expert to reach any conclusions.  Few of the multiple derived databases
have anything either; Proteopedia is the big exception.  One has to do a
surprising amount of digging to find out that the senior author's
university publicly disclaimed the structure as fraudulent.  There is also
a very large volume of comment on the ccp4bb, including some excellent
specific (illustrated!) examples of problems with the structure.  But none
of this is centrally available, because the journal (and databases) do not
provide any mechanism other than the lengthy formal route.

It's true that a little Googling will quickly uncover problems with this
particular paper.  However, from what I've seen there are depressingly many
scientists who are unable to use Google even for the questions they already
know to ask.  And this is an exceptional case; there are many other
problematic structures (anyone working in methods development probably has
a long list) for which no such information is available, because we don't
have time to write a formal letter to the journal editors, especially since
there's no guarantee that they'll even pay attention to us.  It would be
far more efficient if I could simply post a comment at the source saying
"ligand density does not support binding - see attached image".

In the long term, if there was a better system for this kind of peer
review, the current system could mostly go away.  Post a manuscript on
arXiv (or equivalent), let the community comment on it and rate it, and the
eventual consensus determines its credibility and importance.  Scientists
would stop wasting months tailoring the paper to fit into arbitrary and
obsolete length restrictions or to impress the editors of high-profile
journals or please a handful of anonymous reviewers.  There would be no
disincentive to publish negative results or brief technical write-ups.
Both publication and review would be immediate, inexpensive, and public.
The scientific literature would become truly self-correcting over any time
scale.

Undoubtedly there are issues with this, and I'm sure there are other
approaches that could work too.  But the present system is both horribly
inefficient and too permissive of outright junk, and I think it's really
holding us back.

-Nat


Re: [ccp4bb] Reply: [ccp4bb] Why nobody comments about the Nobel committee decision?

2013-10-09 Thread Nat Echols
Levitt also contributed to DEN refinement (Schroder et al. 2007, 2010).

-Nat


On Wed, Oct 9, 2013 at 2:29 PM, Boaz Shaanan bshaa...@bgu.ac.il wrote:

  Good point. Now since you mentioned contributions of the recent Nobel
 laureates to crystallography, Mike Levitt also had a significant
 contribution through the by-now-forgotten Jack-Levitt refinement, which to
 the best of my knowledge was the first time that an x-ray term was added to
 the energy minimization algorithm. I think I'm right about this. This was
 later adapted by Axel Brunger in Xplor, and other programs followed.
 Cheers, Boaz



 Original message 
 From: Alexander Aleshin aales...@sanfordburnham.org
 Date: 10/10/2013 0:07 (GMT+02:00)
 To: CCP4BB@JISCMAIL.AC.UK
 Subject: [ccp4bb] Why nobody comments about the Nobel committee decision?


  Sorry for a provocative question, but I am surprised why nobody
 comments on/congratulates the laureates with regard to the recently awarded
 Nobel prizes? However, one of the laureates in chemistry contributed to a
 popular method in computational crystallography:
 CHARMM - XPLOR - CNS - PHENIX - …

 Alex Aleshin



Re: [ccp4bb] Rmerge of the last shell is zero

2013-08-14 Thread Nat Echols
On Wed, Aug 14, 2013 at 10:31 PM, Edward A. Berry ber...@upstate.edu wrote:

 If you refine once in phenix you can use phenix.cc_star to calculate cc*
 and compare with R and R-free; from the output mtz file and your unmerged
 .sca file.


FYI, this should also work with structures refined in Refmac, assuming it
can recalculate the R-factors to within a reasonable margin of error.

-Nat


Re: [ccp4bb] mmCIF as working format?

2013-08-07 Thread Nat Echols
On Wed, Aug 7, 2013 at 12:54 PM, James Stroud xtald...@gmail.com wrote:

 All that needs to happen is that the community agree on

 1. What is the finite set of essential/useful attributes of macromolecular
 structural data.
 2. What is the syntax of (a) accessing and (b) modifying those attributes.
 3. What is the syntax of selecting subsets of structural data based on
 those attributes.

 The resulting syntax (i.e. language) itself should be terse, easy to
 learn, easy to use, and preferably easy to implement.


Ah, but the nice thing about mmCIF is that it isn't truly "finite" - the
PDB may limit what tags are actually included in the distributed files, but
there is nothing preventing other developers from including their own tags,
and there is a community process for extending the officially defined
tags.  Item (2) is very well-established, unlike the current chaos of
REMARK records.  I think (3) will be left to the various libraries to deal
with.
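
For illustration only (the category and item names below are invented, not
part of any official dictionary), a program could emit something like:

  data_my_structure
  loop_
  _myprog_refine_step.cycle
  _myprog_refine_step.r_work
  _myprog_refine_step.r_free
  1 0.285 0.310
  2 0.243 0.271

and a conforming parser that doesn't recognize the category can simply skip
or preserve it.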

-Nat


Re: [ccp4bb] mmCIF as working format?

2013-08-07 Thread Nat Echols
On Wed, Aug 7, 2013 at 2:36 PM, James Stroud xtald...@gmail.com wrote:

 Although it is likely the best library for working with structural data,
 CCTBX requires a loop just to change a specific chain ID (to the best of my
 knowledge):

 ...

 I don't intend to pick on CCTBX specifically (because the CCTBX developers
 have specific needs to which they program), but loop/test mechanisms are
 awkward for selecting and modifying structural data, and get much more
 awkward as selections get more complex (e.g. selecting the C-alpha of every
 alanine of chain A, etc.).


True - it's really an issue of what purpose the libraries were designed
for.  CCTBX wasn't intended to be a general-purpose tool for users to
perform quick manipulations of a model; the goal was to build large,
complex, and more-or-less automated crystallography applications on top of
it.  (The same applies to the CCP4 libraries, mmdb, clipper, etc.;
BioPython I guess is designed for bioinformatics.)  The design of CNS (for
example) reflects an era where it was much more likely that the average
crystallographer knew some programming, worked exclusively on the command
line, built new models manually, and didn't have access to a large number
of convenient tools for purposes like this.  (Or so I've heard; I was
still in high school.)

Personally, if I need to change a chain ID, I can use Coot or pdbset or
many other tools.  Writing code for this should only be necessary if you're
processing large numbers of models, or have a spectacularly misformatted
PDB file.  Again, I'll repeat what I said before: if it's truly necessary
to view or edit a model by hand or with custom shell scripts, this often
means that the available software is deficient.  PLEASE tell the developers
what you need to get your job done; we can't read minds.
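
(For the record, scripted selections of the kind quoted above aren't too bad
either - a sketch from memory of the cctbx idiom, so check the current iotbx
documentation for the exact API:

  import iotbx.pdb

  pdb_in = iotbx.pdb.input(file_name="model.pdb")
  hierarchy = pdb_in.construct_hierarchy()
  # "the C-alpha of every alanine of chain A" as a selection string
  cache = hierarchy.atom_selection_cache()
  sel = cache.selection("chain A and resname ALA and name CA")
  print(hierarchy.select(sel).atoms().size())

No explicit loop required.)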

-Nat


Re: [ccp4bb] mmCIF as working format?

2013-08-05 Thread Nat Echols
On Mon, Aug 5, 2013 at 11:11 AM, Phil Jeffrey pjeff...@princeton.edu wrote:

 While alternative programs exist to do almost everything I prefer
 something that works well, works quickly, and provides instant visual
 feedback.  CCP4 and Phenix are stuck in a batch processing paradigm that I
 don't find useful for these manipulations.


Speaking as a developer, it's probably much easier and faster for us to
write software that *does* do what you want, instead of piling on hacks to
keep the PDB format alive another 30+ years.

While PDB is limited and has a lot of redundant information it's for the
 latter reason it's a rather useful format for quickly making changes in a
 text editor.  It's certainly far faster than using any GUI, and it's also
 faster than the command line in many instances - and I have my own command
 line programs for hacking PDB files (and ultimately whatever formats come
 next)


Most complaints of this sort seem to be based on an unrealistic expectation
that your own experiences and skills are representative of the rest of the
community.  The vast majority of crystallographers don't have their own
command-line programs, aren't familiar with the intricacies of PDB format,
and as often as not botch the job when they attempt to edit their PDB files
by hand.  (I get a lot of bug reports like this.)  They're not going to
care whether they can use 'awk' on their structures.


 Using mmCIF as an archive format makes sense, but I doubt it's going to
 make building structures any easier except for particularly large
 structures where some extended-PDB format might work just as well or better.


There is a lot of information that can't easily be stored simply by making
the ATOM records wider.  Right now some of this gets crammed into the
REMARK section, but usually in an unstructured and/or poorly documented
format.  This isn't just problematic for archival - it limits what
information can be transferred between programs.  mmCIF has none of these
limitations.  I have some reservations about the current specification (for
instance, the fact that the original R-free flags are not stored separately
in deposited structure factor files, and are instead mixed into the
"status" flag, which can have multiple other meanings), but at least there
is a clear process for extending this in a way that does not (or should
not, anyway) break existing parsers.

-Nat


Re: [ccp4bb] mmCIF as working format?

2013-08-05 Thread Nat Echols
On Mon, Aug 5, 2013 at 12:37 PM, Boaz Shaanan bshaa...@bgu.ac.il wrote:

  There seems to be some kind of a gap between users and developers as far
 the eagerness to abandon PDB in favour of mmCIF. I myself fully agree with
 Jeffrey about the ease of manipulating PDB's during work, particularly when
 encountering unusual circumstances (and there are many of those, as we all
 know). And how about non-crystallographers that are using PDB's for
 visualization and understanding how their proteins work? I teach many such
 students and it's fairly easy to explain to them where to look in the PDB
 for particular pieces of information relevant to the structure. I can't
 imagine how they'll cope with the cryptic mmCIF format.


I think the only gap is between developers and *expert* users - most of the
community simply wants tools and formats that work with a minimum of
fiddling.  Again, if users are having to examine the raw PDB records
visually to find information, this is a failure of the software.

-Nat


Re: [ccp4bb] Where to cut the data in this medium resolution dataset

2013-07-22 Thread Nat Echols
On Mon, Jul 22, 2013 at 10:19 AM, Stefan Gajewski sgajew...@gmail.com wrote:

 The maps shows signs of over fitting, the B-factors do not look correct in
 my opinion.


What do "correct" B-factors look like?  What refinement strategy did you
use for them?


 Note that the R-free value in the 3.4A shell is lower than the R-work (and
 also the Rpim in that shell!) which clearly indicates this refinement was
 not stable.


I don't think it indicates anything about the stability of refinement -
my guess would be that the NCS is biasing R-free.  I suppose it could also
indicate that the data in the 3.6-3.4 range are basically noise, although
if the maps look better then that would suggest the opposite.

 The structure contains no beta sheets and refinement also profits greatly
 from very rigid high-order NCS. The maps are very detailed, in fact better
 than some 2.8A maps I've seen before.  The 0.2A in question here are
 actually quite helpful to increase the map quality, so I keep wondering if
 I should deposit the structure with them or keep them only for my own
 interpretation.


I would deposit the data to 3.4Å in any case; what cutoff you refine the
structure to is a separate decision.

Before I continue optimizing the integration/refinement I would like to
 hear suggestions from the experts where to make the resolution cut-off in
 this case?
 Do I have all information I need to make that decision?
 What arguments should I present when dealing with the reviewers? I mean,
 the Rrim/Rmerge values are really very high.


Do what Karplus & Diederichs suggest: take the structure refined to 3.4Å,
and recalculate the R-factors for that model with the data cut to 3.6Å.  If
the R-free calculated this way is below the R-free for the model refined to
only 3.6Å, then the extra 0.2Å is contributing real information and
improving the quality of your model, which is the best justification for
extending to higher resolution.
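
The bookkeeping is trivial, by the way - in plain Python (array names
invented, just to make the comparison concrete):

  def r_factor(f_obs, f_calc, d_spacings, d_min):
      # R = sum(|Fo - Fc|) / sum(Fo), restricted to reflections with
      # d-spacing >= d_min (i.e. the data cut at the lower resolution)
      kept = [(fo, fc) for (fo, fc, d) in zip(f_obs, f_calc, d_spacings)
              if d >= d_min]
      return sum(abs(fo - fc) for fo, fc in kept) / sum(fo for fo, _ in kept)

  # R-free of the 3.4A model, evaluated against the free set cut to 3.6A,
  # to be compared with R-free of the model refined only to 3.6A:
  # r_factor(f_obs_free, f_calc_model_3p4, d_free, d_min=3.6)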

-Nat


Re: [ccp4bb] Refinement of partly occupied water molecules

2013-07-12 Thread Nat Echols
On Fri, Jul 12, 2013 at 1:08 AM, Stefan Krimmer 
krim...@staff.uni-marburg.de wrote:

 in some of my macromolecular crystal structures with resolutions between
 1.1 - 1.4 Å, several round positive Fo-Fc electron density blobs are
 detectable which show after assignment of a water molecule to these blobs
 and subsequent refinement with Phenix.refine a good-looking  2Fo-Fc
 electron density. However, there also occurs a small negative Fo-Fc
 electron density detectable inside the 2Fo-Fc density blob. The negative
 Fo-Fc electron density disappears if the occupancy of the water molecule is
 automatically refined by Phenix.refine (occupancy manually set to a value
 below 100% followed by refinement) or manually set to 50% and fixed for
 this value (Fix occupancy option in phenix.refine). Therefore, I think
 these positions are partly occupied by water molecules, but I am not sure
 how I should handle it/how it is generally handled. Which one of the two
 options described above is the better one? I would be thankful for any
 advice and/or literature about this topic.


When I had to deal with this in the past, I followed this advice (from
Thomas Schneider):

http://www.embl-hamburg.de/~tschneider/shelxl/shelxl_faq/shelxlfaq.html#Q16

This is especially true at the resolutions you're working with; even with
subatomic resolution data I believe that the observation in the FAQ (that
refining the occupancies doesn't improve R-factors and may even make them
worse) will be true in most cases - and regardless of program used, btw.
(I can't remember if I ever tried comparing the outcomes myself, though.)

-Nat


Re: [ccp4bb] ctruncate bug?

2013-06-22 Thread Nat Echols
On Sat, Jun 22, 2013 at 3:18 PM, Frank von Delft 
frank.vonde...@sgc.ox.ac.uk wrote:

  In what scenarios would these improved estimates make a significant
 difference?


Perhaps datasets where an unusually large number of reflections are very
weak, for instance where TNCS is present, or where the intensity falls off
quickly at lower resolution (but remains detectable much further)?

-Nat


Re: [ccp4bb] Concerns about statistics

2013-06-13 Thread Nat Echols
On Thu, Jun 13, 2013 at 8:15 AM, Andrea Edwards edwar...@stanford.edu wrote:

 I have some rather (embarrassingly) basic questions to ask. Mainly.. when
 deciding the resolution limit, which statistics are the most important? I
 have always been taught that the highest resolution bin should be chosen
 with I/sig no less than 2.0, Rmerg no less than 40%, and %Completeness
 should be as high as possible. However, I am currently encountered with a
 set of statistics that are clearly outside this criteria. Is it acceptable
 cut off resolution using I/sig as low as 1.5 as long as the completeness is
 greater than 75%? Another way to put this.. if % completeness is the new
 criteria for choosing your resolution limit (instead of Rmerg or I/sig),
 then what %completeness is too low to be considered? Also, I am aware that
 Rmerg increases with redundancy, is it acceptable to report Rmerg (or Rsym)
 at 66% and 98% with redundancy at 3.8 and 2.4 for the highest resolution
 bin of these crystals? I appreciate any comments.


A (probably) better way:

http://www.ncbi.nlm.nih.gov/pubmed/22628654

Short version: don't try to use simplistic rules, instead use all data
that actually improve the model.  In practice, what I've noticed in some
recent articles is (paraphrasing) "data extend to 2.5Å with an I/sigma of 2
in the highest-resolution shell, but we used data to 2.2Å as suggested by
Karplus & Diederichs."  This allows you to actually use as much data as
possible while still (hopefully) pleasing any pedantic reviewers.
(Substitute 90% completeness, or whatever R-merge threshold you prefer, for
the I/sigma cutoff; the end result will still be the same.)

-Nat


Re: [ccp4bb] Extracting .pdb info with python

2013-06-06 Thread Nat Echols
On Fri, Jun 7, 2013 at 8:37 AM, Pete Meyer pame...@mcw.edu wrote:

 On the other hand, programming an implementation of something is a good
 way to make sure that you really understand it - even if you end up using
 another program.


I would argue that it's not really necessary to understand the column
formatting of a PDB file, any more than it's necessary to understand how
binary data is arranged in an MTZ file.  (Especially since the long-term
plan is to migrate to mmCIF, which is more flexible and can store far more
information.)  We're ultimately trying to answer questions of biology and
chemistry, not informatics, and writing a parser that actually handles all
of the variety in the PDB (let alone the garbage produced by some programs)
is far more difficult than it sounds.

-Nat


Re: [ccp4bb] Off-topic: PDB statistics

2013-04-15 Thread Nat Echols
On Mon, Apr 15, 2013 at 11:47 AM, James Holton jmhol...@lbl.gov wrote:

 However, I'm sure the day is not far off when phenix.refine or the like
 will check if the starting R factor is too high and just automatically
 invoke a run of MR to see if something clicks.


I think the latest Phaser code actually does the reverse: if the R-factor
is already relatively low, it just outputs the search model.  The more
problematic (and very common) situation is where the structures are
nearly isomorphous and rigid-body plus restrained refinement alone could
work, but MR might work better - I don't think anyone has comprehensively
evaluated this.  We usually just run Phaser because compared to rebuilding
and refinement, it's simply not that much of a bottleneck.

-Nat


Re: [ccp4bb] CCP4 Update victim of own success

2013-04-12 Thread Nat Echols
On Fri, Apr 12, 2013 at 10:27 AM, James Holton jmhol...@lbl.gov wrote:

  But, when it comes to GUIs, I have always found them counterproductive.
 In my humble opinion, the purpose of computers and other machines is to DO
 work for me, not create work for me, and I already have enough buttons to
 push each day.


This is a very defensible position with regards to your normal workflow (or
mine) - but beamline scientists (or software developers) are not very
representative of crystallographers as a group.  I've seen a lot of
reflexive anti-GUI mentality from users who don't fall into either
category, presumably because a senior postdoc or PI told them "real
crystallographers use the command line", when in reality they'd be better
served by figuring out on their own what workflow is most efficient for
them.

-Nat


Re: [ccp4bb] CCP4 Update victim of own success

2013-04-12 Thread Nat Echols
On Fri, Apr 12, 2013 at 2:45 PM, Boaz Shaanan
bshaa...@exchange.bgu.ac.ilwrote:

  Whichever way the input file for the run is prepared (via GUI or command
 line), anybody who doesn't inspect the log file at the end of the run is
 doomed and bound to commit senseless errors. I was taught a long time ago
 that computers always do what you told them to do and not what you think
 you told them, which is why inspecting the log file helps.


I agree in principle - I would not advocate that anyone (*especially*
novices) run crystallography software as a "black box".  However, whether
or not a program constitutes a black box has nothing to do with whether it runs
in a GUI or not.  The one advantage a GUI has is the ability to convey
inherently graphical information (plots, etc.).  That it is still necessary
to inspect the log file(s) carefully reflects the design of the underlying
programs; ideally any and all essential feedback should also be displayed
in the GUI (if one exists).  Obviously there is still much work to be done
here.

-Nat


Re: [ccp4bb] delete subject

2013-03-28 Thread Nat Echols
On Thu, Mar 28, 2013 at 11:28 AM,  mjvdwo...@netscape.net wrote:
 Although it is hard to imagine, there could be a mechanism by
 which you make all your data public, immediately when you get it and this
 public record shows who owns it.

http://deposit.rcsb.org

(or international equivalent)

 The advantage (in my mind) of such a system would be that you would also
 make public the data that does not make sense to you (it does not fit your
 scientific model) and this could (and has) lead to great discoveries.  The
 disadvantage to the method is that you will sometimes post experiments that
 are just completely wrong

There is a further problem: since as Frank pointed out, structures are
increasingly less valuable without accompanying non-crystallographic
experiments, there is a risk of other groups taking advantage of the
availability of data and performing the experiments that *you* had
hoped to do.  Or, similarly, a group who already has compelling
biochemical data lacking a structural explanation would immediately
have everything they needed to publish.  Either way, you would be
deprived of what might have been a thorough and genuinely novel
publication.   Since most employment and funding decisions in the
academic world are made on the basis of original and high-profile
research and not simply number of structures deposited in the PDB,
this puts the crystallographer at a distinct disadvantage.

This isn't purely hypothetical - a grad school classmate who worked on
genome sequences complained about the same problem (in her case, the
problem was bioinformatics groups analyzing the data - freely
available on the NCBI site, as mandated by the funding agencies -
before the sequencing was even complete).

Of course the same argument has been used in the past against
immediate release of PDB entries upon publication - and the community
(quite appropriately, IMHO) rejected it as nonsense.  I actually like
the idea of releasing data ASAP without waiting to publish, but it has
a lot of practical difficulties.

-Nat


Re: [ccp4bb] Need specific molecular replacement test cases

2013-03-08 Thread Nat Echols
On Fri, Mar 8, 2013 at 11:38 AM, Raji Edayathumangalam
r...@brandeis.edu wrote:
 I am looking for two specific test cases (below) and appreciate anyone
 pointing me to known structures/examples for the same.

 (1) For a successful case of molecular replacement in which the search model
 has an overall sequence identity to the target in the twilight zone or worse
 (25% or less).

There are some good examples here:

http://journals.iucr.org/d/issues/2011/04/00/ba5163/index.html
http://journals.iucr.org/d/issues/2004/07/00/gx5015/index.html

-Nat


Re: [ccp4bb] How to slow down crystallization? Need hep!

2013-02-25 Thread Nat Echols
On Mon, Feb 25, 2013 at 8:02 AM, lei feng spartanfeng...@hotmail.com wrote:
 I need your suggestion for slowing down crystallization for my protein
 my protein got hit in PEG/ION #5 ( 0.2 M MgCl2, 20% PEG 3350, pH 5.9), but
 it crystallize too fast. In 1 hr I can see tons of tiny needles.
 Can anyone give me some suggestion on how to slow down the process? I used
 lower conc. of protein, lower conc. of PEG ( 10%), it helped a little bit,
 giving me small rod crystal. but no improvement after that.

Sometimes you can do this by adding a tiny amount of glycerol to your
protein solution - I've seen 0.5% make the difference between awful
plate clusters and nice individual crystals.

I think it's also possible to use a combination of low concentrations
and micro-seeding, but I've never done this personally.

-Nat


Re: [ccp4bb] Improving Homology Models

2013-02-20 Thread Nat Echols
On Wed, Feb 20, 2013 at 12:39 PM, Jacob Keller
j-kell...@fsm.northwestern.edu wrote:
 it has been my experience that homology modelling programs get folds pretty
 well, but sometimes the details are pretty obviously bad, like too-close
 contacts. One might think that the modelling software would put in a sort of
 polishing step, but they don't seem to. Is there any way to trick the CCP4
 or other software to fix these things, such as by simulated annealing or
 otherwise, I guess without any weight on the [non-existent] structure
 factors?

What software were you using?  There must be dozens of papers (at
least) on this subject, and assessment of refinement and model quality
is a major part of the CASP competition.  The Rosetta relax protocol
is one of the best known, but there are many other approaches
(including MD), some of which are definitely integrated into modeling
pipelines.  I'd start here:

https://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/d6/d41/relax_commands.html

and also:

https://www.rosettacommons.org/manuals/archive/rosetta3.4_user_guide/d5/d4e/comparative_modeling.html

Of course if the model is too awful, there isn't much that can be done
to relieve gross errors without completely rebuilding.  I don't know
what the radius of convergence of the various protocols is; Rosetta
relax certainly can't fix some of the truly awful models in the PDB
(but it's by no means the only option).

-Nat


Re: [ccp4bb] protein crystals or salt crystals

2013-02-07 Thread Nat Echols
If SPG buffer is what I think it is, that means you have a significant
concentration of inorganic phosphate, which forms salt crystals when
mixed with divalent metal ions.

-Nat

On Thu, Feb 7, 2013 at 2:24 PM, amro selem amro_selem2...@yahoo.com wrote:




 Hallo my colleagues.
  i hope every one doing ok . i did screening since two weeks . i noticed
 today this crystals. i don`t know either it salt or protein crystal . my
 protein has zero tryptophan so i could distinguish by UV camera.
 the condition was conditions:
 0.1M SPG buffer pH 8 and 25%PEG 1500. in addition to Nickle chlorid 1mM.


 best regards
 Amr








Re: [ccp4bb] Fwd: Strange Density

2013-02-04 Thread Nat Echols
On Mon, Feb 4, 2013 at 12:24 PM, Roger Rowlett rrowl...@colgate.edu wrote:
 It's possibly a transition metal ion. Zinc is a common adventitious
 contaminant of solutions. Typical Zn-O distances (tetrahedral or
 pseudo-tetrahedral coordination) are 2.0 A. ICP-OES or ICP-MS of the protein
 solution might offer a clue to the possible identity of the metal ion, since
 it appears to be nearly stoichiometric with your protein.

Zinc tends not to bind carbonyl oxygens, but calcium does quite
frequently.  Also, the presence of calcium acetate in the
crystallization solution strongly suggests that this is what is
actually bound (especially if it's at a concentration around 200mM, as
is common in many crystallization screens).  Jared: what do you mean
by "proper coordination"?  Surface ions bound non-specifically
frequently don't have recognizable coordination, and they become even
more vague as resolution decreases.  As always, it would be worth
looking at the anomalous difference map, although whether you'll
actually see anything depends on the element and on wavelength, data
quality, anomalous completeness, etc.

-Nat


Re: [ccp4bb] off topic: DSSP

2013-01-28 Thread Nat Echols
On Mon, Jan 28, 2013 at 8:04 AM, Antony Oliver
antony.oli...@sussex.ac.uk wrote:
 If you don't mind using the ksDSSP implementation, it is already installed 
 with the phenix suite if you have it.

Correct, but although the method is supposed to be the same, the
output is not, and there are bugs in how it presents helix
annotations.  So I'm not sure it's a reliable substitute for the
original DSSP - we use it in Phenix to calculate secondary structure
restraints, with some extra filtering to catch the buggy annotations.
(Unfortunately it was the only open-source program I could find for
this purpose.)

-Nat


Re: [ccp4bb] off topic: DSSP

2013-01-28 Thread Nat Echols
On Mon, Jan 28, 2013 at 8:39 AM, Robbie Joosten
robbie_joos...@hotmail.com wrote:
 DSSP recently went open source with a very liberal license. So you can
 consider using the real DSSP now. This may also be the moment to integrate
 DSSP in CCP4.

Based on the info here:

http://swift.cmbi.ru.nl/gv/dssp/

the license isn't very liberal - it looks more like the
proprietary-with-source-code licenses used by CCP4 and Phenix (among
others), which preclude redistribution or anything resembling
commercial use without permission.  I also saw this: "The new DSSP
is distributed as executable only. You can get the new DSSP source
code only if you can convince us that it is really needed for a worthy
scientific cause."

Or am I looking in the wrong place?

-Nat


Re: [ccp4bb] refmac5 vs phenix refine mixed up

2013-01-25 Thread Nat Echols
On Fri, Jan 25, 2013 at 2:24 AM, Robbie Joosten
robbie_joos...@hotmail.com wrote:
 Phenix however needs to deal with the CCP4 type reflection binning. Now the
 size of the sets cannot be used which means that you have find a smarter
 solution. So I wonder how this is implemented. Does Phenix use the
 (reasonable) assumption that the test set is labeled 1.00 or 0.00? Or does
 it also check the sets with other labels?

I forget the exact rules, but the general assumption is that if you
have multiple flag values (such as 0 through 19), the test set is
marked by the lowest value.  If you have just two values, the test set
is whichever is less common.  (For SHELX files this would typically be
-1, for CNS files it would be 1, but you could just as easily swap
flag values and it would still pick the correct set.)  I'm sure
someone can figure out a way to break this (for instance, by assigning
the flags with CCP4, but using 7 instead of 0 as the test set), but in
practice nearly every file we've seen obeys these rules, and it can of
course be overridden by the user.
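
In simplified pseudologic (my paraphrase, not the actual implementation):

  def guess_test_flag_value(flags):
      # flags: one integer R-free flag per reflection
      values = sorted(set(flags))
      if len(values) > 2:
          # CCP4-style flags (e.g. 0..19): lowest value marks the test set
          return values[0]
      elif len(values) == 2:
          # two-valued flags (CNS, SHELX, etc.): the rarer value wins
          return min(values, key=flags.count)
      raise ValueError("only one flag value present - no test set?")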

Anyway this is all open-source, so you can check (and re-use!) the
logic for yourself here:

http://cctbx.svn.sourceforge.net/viewvc/cctbx/trunk/iotbx/reflection_file_utils.py?revision=16491view=markup

-Nat


Re: [ccp4bb] refmac5 vs phenix refine mixed up

2013-01-24 Thread Nat Echols
On Thu, Jan 24, 2013 at 10:34 AM, Leonid Sazanov
saza...@mrc-mbu.cam.ac.uk wrote:
 Most likely scenario is that Phenix by default assigns Rfree flag as 1, while 
 ccp4/refmac - as 0.
 That would explain your Rfree going down - because your Rfree reflections 
 were refined by refmac.

According to Garib, the current version of Refmac will automatically
switch to the proper flags, so this problem should go away.

-Nat


Re: [ccp4bb] B-factors

2013-01-24 Thread Nat Echols
On Thu, Jan 24, 2013 at 3:52 PM, Urmi Dhagat udha...@svi.edu.au wrote:
 If Rfree reflections are refined by refmac upon switching from phenix to
 refmac, does this contaminate the Rfree set? Should switching between
 refinement programs Phenix and Refmac be avoided?

Repeating what was said earlier today: if you use the newest version
of Refmac, either in CCP4 6.3 or downloaded from Garib's homepage, you
will not have any problem switching back and forth with Phenix (any
version).

-Nat


Re: [ccp4bb] Mac mini advice

2013-01-22 Thread Nat Echols
On Tue, Jan 22, 2013 at 9:59 AM, Cara Vaughan
c.vaug...@mail.cryst.bbk.ac.uk wrote:
 I've seen from the archive that some people do use the Mac Mini for
 crystallography and I've got two questions:
 1. Do I need the Quad core or is a Dual core processor enough?

You can survive with the dual, but I would definitely spring for the
quad if you can afford it - but I suppose it depends on how you work.
I like to run multiple jobs at once if possible, and still have a
core left for the web browser, etc.  Some programs will also take
advantage of multiple processors, for that matter.

I would definitely recommend maxing out the memory, but don't buy it
from Apple - we were able to get 16GB from CDW for less than $100.

 2. Is the intergrated Intel HD graphics card OK for crystallography
 requirements?

It depends on your requirements, but I've been using Coot and PyMOL
frequently on a MacBook Air for the last year, and usually the
graphics chip (also Intel HD) isn't the bottleneck.

-Nat


Re: [ccp4bb] Mac mini advice

2013-01-22 Thread Nat Echols
On Tue, Jan 22, 2013 at 10:05 PM, James Stroud xtald...@gmail.com wrote:
 On Mac v. Linux where calculations come secondary to office-type 
 calculations, you have to weigh your level of vendor lock-in. Do you run 
 Libreoffice or Microsoft Office? Inkscape or Illustrator? Gimp or Photoshop? 
 Etc. If you are locked-in to commercial products and haven't migrated to open 
 source, then you may want to think twice about a Linux box. Macs are very 
 seamless for an office environment, but I don't know if they are appropriate 
 for heavy-duty calculations given that you'll trade horsepower for the Mac 
 experience.

In my experience, yes they are (depending on your definition of
heavy-duty - everything I work with is either small or
low-resolution).  The real difficulty is integrating Macs into a
Linux-centric environment, for example configuring NFS, NIS, etc.
Far, far more painful than it needs to be, and for this reason I would
avoid Macs for shared workstations or (even worse) servers.  But they
make excellent standalone systems,  are very easy to maintain, and
while they may be relatively pricey, some of the premium features
(like SSDs) really do make a big difference, and the performance is
quite adequate even for the low-end laptops like the Air.  A $400
Celeron PC laptop, on the other hand, is probably large, heavy, and a
piece of junk.

-Nat


Re: [ccp4bb] how many metal sites

2013-01-16 Thread Nat Echols
On Wed, Jan 16, 2013 at 2:53 PM, Roger Rowlett rrowl...@colgate.edu wrote:
 When you are a building a metalloenzyme model you should really have some
 solid evidence that a metal ion is present by (1) inclusion in the
 crystallization medium, (2) direct determination by an analytical technique,
 (3) UV-visible spectroscopy (when appropriate--obviously Zn(II) is d10 and
 silent in the visible d-d transition wavelength range)  and/or (4)
 appropriate coordination geometry and bond lengths.

What about: (5) anomalous scattering (i.e. anomalous difference map)?
Even on a home source I suspect Zn should still be visible, and at
shorter wavelengths this should certainly be the case if the anomalous
data are reasonably good and complete.  The coordination geometry and
bond lengths aren't necessarily going to be definitive at this
resolution, although I agree that it should be approximately
tetrahedral.

-Nat


Re: [ccp4bb] a challenge

2013-01-14 Thread Nat Echols
On Mon, Jan 14, 2013 at 11:18 AM, Tim Gruene t...@shelx.uni-ac.gwdg.de wrote:
 I admit not having read all contributions to this thread. I understand
 the John Henry Challenge as whether there is an 'automated way of
 producing a model from impossible.mtz'. From looking at it and without
 having gone all the way to a PDB-file my feeling is one could without
 too much effort from the baton mode in e.g. coot.

This should be even more possible if one also uses existing knowledge
about the expected structure of the protein: a kinase domain is quite
distinctive.  So, James, how much external information from homologous
structures are we allowed to use?  Running Phaser would certainly be
cheating, but if I take (for instance) a 25% identical kinase
structure, manually align it to the map and/or a partial model, and
use that as a guide to manually rebuild the target model, does that
meet the terms of the challenge?

-Nat


Re: [ccp4bb] Fwd: Re: [ccp4bb] Convert cbf to png/tiff?

2013-01-11 Thread Nat Echols
I think the help message refers to another program.  Anyway, it's an
extremely simple script - having examined the code, the command-line
invocation is:

labelit.png input_file [output_file]

and that's it - no other options available.  But Nick or I will fix it
so it prints something more useful if run without arguments.

-Nat

On Fri, Jan 11, 2013 at 9:33 AM, Frank von Delft
frank.vonde...@sgc.ox.ac.uk wrote:
 I got that error blurb too when I run without an image on the commandline.
 Not very elegant.

 Try:
 labelit.png --help




 On 11/01/2013 16:34, Tim Gruene wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Hi Nat!

 How recent is recent? From today's 'phenix-online.org':
 New Phenix version 1.8.1 now available, but

 tg@slartibartfast:~/uni/datasets/nk/xds_run3$ labelit.png_1.8.1-1168
 DX-CORRECTIONS.cbf
 Traceback (most recent call last):
File

 /xtal/Suites/Phenix/phenix-1.8.1-1168/build/intel-linux-2.6-x86_64/../../labelit/command_line/png.py,
 line 27, in module
  OV = overlay_plain(infile,graphics_bin)
File

 /xtal/Suites/Phenix/phenix-1.8.1-1168/build/intel-linux-2.6-x86_64/../../labelit/command_line/png.py,
 line 6, in __init__
  OverlayDriverClass.__init__(self,infile,graphics_bin)
File

 /xtal/Suites/Phenix/phenix-1.8.1-1168/labelit/command_line/overlay_distl.py,
 line 19, in __init__
  self.I = GenericImageWorker(infile,binning=graphics_bin)
File
 /xtal/Suites/Phenix/phenix-1.8.1-1168/labelit/graphics/support.py,
 line 12, in __init__
  images = ImageFiles(imagenames,labelit_commands)
File

 /xtal/Suites/Phenix/phenix-1.8.1-1168/labelit/command_line/imagefiles.py,
 line 184, in __init__
  self.filenames = FileNames(arg_module,phil_params)
File

 /xtal/Suites/Phenix/phenix-1.8.1-1168/labelit/command_line/imagefiles.py,
 line 70, in __init__
  self.interface3_parse_command()
File

 /xtal/Suites/Phenix/phenix-1.8.1-1168/cctbx_project/spotfinder/diffraction/imagefiles.py,
 line 138, in interface3_parse_command

self.interface3_FN_factory(os.path.abspath(file),error_message="File
name not accepted")
File

 /xtal/Suites/Phenix/phenix-1.8.1-1168/cctbx_project/spotfinder/diffraction/imagefiles.py,
 line 127, in interface3_FN_factory
  raise Exception("Input error: "+error_message)
 Exception: Input error: File name not accepted

 Compared to:
 tg@slartibartfast:~/uni/datasets/nk/xds_run3$ adxv -sa
 DX-CORRECTIONS.cbf DX-CORRECTIONS.tiff
 Adxv Version 1.9.8
 Copyright (C) 1994-2011 by Andrew Arvai, Area Detector Systems Corporation
 Recognized CBF format data.
 Warning: Could not find diffrn_frame_data or diffrn_data_frame
 tg@slartibartfast:~/uni/datasets/nk/xds_run3$ identify
 DX-CORRECTIONS.tiff
 DX-CORRECTIONS.tiff TIFF 768x768 768x768+0+0 8-bit PseudoClass 256c
 592KB 0.000u 0:00.000

 Cheers,
 Tim

 On 01/10/2013 09:59 PM, Frank von Delft wrote:

 Brilliant - thanks Nat!!  Easy to work around that feature.

 And thanks Nick!!




  Original Message  Subject: Re: [ccp4bb]
 Convert cbf to png/tiff? Date: Thu, 10 Jan 2013 12:47:21 -0800
 From: Nat Echols nathaniel.ech...@gmail.com To: Frank von
 Delft frank.vonde...@sgc.ox.ac.uk








 Using any recent Phenix distribution:

 labelit.png file_name

For reasons unknown to me, the output is named "plain.png" - I
 will bug Nick about this.

 On Thu, Jan 10, 2013 at 12:36 PM, Frank von Delft
 frank.vonde...@sgc.ox.ac.uk wrote:

 Hello all - anybody know an easy way to convert CBF images
 (Pilatus) into something lossless like tiff or png?

 Ideally *easy* as in   r e a l l y   e a s y  and not requiring
 extensive installation of dependencies and stuff.  Because then I
 might as well write my own stuff using cbflib and PIL in python.

 Thanks! phx




 - -- - --
 Dr Tim Gruene
 Institut fuer anorganische Chemie
 Tammannstr. 4
 D-37077 Goettingen

 GPG Key ID = A46BEE1A

 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.12 (GNU/Linux)
 Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

 iD8DBQFQ8D8kUxlJ7aRr7hoRAkcsAJ9Z4JmZ3zAvFYDRMaGbzunMYTjLNACfQWBP
 s3YM17LqF9zvCsAK8ezG8xQ=
 =hXlq
 -END PGP SIGNATURE-


Re: [ccp4bb] About NCS and inhibitors

2013-01-07 Thread Nat Echols
On Mon, Jan 7, 2013 at 1:28 AM, Xiaopeng Hu huxp...@mail.sysu.edu.cn wrote:
 We recently resolved an enzyme/inhibitor complex structure. The enzyme 
 contains two NCS related active site and we did find extra density in both of 
 them.However we observed that the two inhbitor moleculors are not NCS 
 related, but partly overlaped if make a NCS moleculor. Has anyone else 
 observed this before? Thanks for any help and suggestion!

If I'm not misunderstanding the question, some HIV protease inhibitor
complexes do this too.  PDB ID 2fxe is a good example.

-Nat


Re: [ccp4bb] Acceptable Clash Score

2012-11-08 Thread Nat Echols
On Thu, Nov 8, 2012 at 12:20 AM, Mark J van Raaij
mjvanra...@cnb.csic.es wrote:
 Depends on what you call a solved structure.
 For deposition to the pdb ideally there should be very little clashes like 
 Nat writes.
 But perhaps you are referring to the clash score just after molecular 
 replacement, like that output by Phaser or Molrep?

A word of caution: if he's referring specifically to the MolProbity
clash score, this won't take crystal symmetry into account, so
depending on how the placed copies of the search model are arranged
with respect to NCS and crystal symmetry operators, it may actually
undercount the clashes after MR.  (Of course the packing score from
Phaser will properly account for symmetry, and I assume Molrep
displays something similar.)

-Nat


Re: [ccp4bb] Acceptable Clash Score

2012-11-07 Thread Nat Echols
On Wed, Nov 7, 2012 at 4:02 PM, Meisam Nosrati meisam.nosr...@gmail.com wrote:
 I want to know what is considered an acceptable Clash Score for a solved 
 structure.

The recommendation from MolProbity is less than 10.  If you have
low-resolution data and don't have a high-resolution starting model,
it could be a little higher, but I would really put 20 as the maximum,
and I think it should still be possible to lower it without too much
effort.

-Nat


Re: [ccp4bb] how to find and add water molecules in electron density map in coot??

2012-11-06 Thread Nat Echols
On Tue, Nov 6, 2012 at 12:06 PM, saleem raza mysaleemr...@hotmail.com wrote:
 I have to put water molecules in my model but it's difficult to judge whether
 electron density is for water or something else. How to differentiate?

 How the electron density look like for metal ions like Ca and Na???

Sodium can't be distinguished from water on the basis of electron
density.  You can use the bond valence method (e.g. a program like
WASP) to identify likely sodium ions, but this requires good data and
relatively high resolution.
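
(The arithmetic behind the bond valence sum is simple, for what it's worth -
a sketch using the commonly tabulated Na+...O parameters, R0 = 1.80 Å and
b = 0.37 Å (Brese & O'Keeffe, 1991); a sum near the formal charge of +1
supports the sodium assignment:

  import math

  def bond_valence_sum(distances, r0=1.80, b=0.37):
      # s_i = exp((R0 - R_i) / b), summed over coordinating oxygens
      return sum(math.exp((r0 - d) / b) for d in distances)

  print(bond_valence_sum([2.38, 2.41, 2.35, 2.44, 2.40]))  # ~1.0

The hard part, as noted, is having data good enough to trust the distances.)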

Calcium has nearly twice as many electrons as water, and should be
very obvious in the Fo-Fc map (assuming you've built in a water to
start with) unless it's only present at half occupancy.  Depending on
the wavelength you collected at and the quality of the data, you may
also be able to see a peak in the anomalous difference map.  The
chemical environment tends to be more distinctive than sodium too.

-Nat


Re: [ccp4bb] Ca or Zn

2012-10-30 Thread Nat Echols
On Tue, Oct 30, 2012 at 12:12 PM, Jim Pflugrath
jim.pflugr...@rigaku.com wrote:
 How would you distinguish between a mixture of Ca and Zn in the same 
 locations?

How often would they be likely to bind in the same place?  Some of the
other transition metals are difficult to tell apart, but Ca and Zn
have very different coordination preferences.

-Nat


Re: [ccp4bb] Convention on residue numbering of fusion proteins?

2012-10-23 Thread Nat Echols
On Tue, Oct 23, 2012 at 9:55 AM, Meindert Lamers
mlam...@mrc-lmb.cam.ac.uk wrote:
 Is there any convention on the numbering of residues in a fusion protein?

 I have a structure of two domains fused together but would like to keep the
 biological numbering intact.
 1st domain: residue 200-300 (protein A).
 2nd domain: residue 170-350 (protein B).
 The fusion is between A300 and B170

 Is it OK to label them chain A and B and create a LINK between the two (thus
 keeping the biological residue number intact).
 Or do I have to start the 2nd domain with residue number 301 (and loose all
 biological information).

You could use the insertion code: the first domain could be residues
200A - 300A, the second domain would be residues 170B - 350B, e.g.

ATOM   2743  CA  THR A 300A -9.899   6.476  21.720  1.00 27.53   C
ATOM   2750  CA  VAL A 170B -6.589   4.599  21.939  1.00 32.82   C

but the chain ID stays the same, with no BREAK or TER record (and no
LINK required).  The insertion code can be a pain to deal with from a
programmer's perspective, and it makes it more difficult to specify
residue ranges, but I think this is exactly what it's supposed to be
used for.
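
Pulling the pieces back out is at least mechanical - e.g. for fixed-column
records like the ones above (only the leading columns matter here; the
coordinate fields got squeezed by mail formatting):

  def residue_id(atom_record):
      # PDB fixed columns (1-based): chain ID = 22,
      # residue number = 23-26, insertion code = 27
      return (atom_record[21],            # chain ID
              int(atom_record[22:26]),    # residue number
              atom_record[26].strip())    # insertion code ('' if blank)

  rec = "ATOM   2743  CA  THR A 300A -9.899   6.476  21.720  1.00 27.53   C"
  print(residue_id(rec))  # -> ('A', 300, 'A')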

-Nat


Re: [ccp4bb] Etiquette on publishing if there is a crystallization report from someone else.

2012-09-25 Thread Nat Echols
On Tue, Sep 25, 2012 at 6:51 AM, Tim Gruene t...@shelx.uni-ac.gwdg.de wrote:
 I would assume that someone who publishes crystallisation conditions has
 given up solving the structure or some other reason to encourage others
 to pick up the project, i.e., no, I don't see much point NOT
 publishing your data.

I always assumed that the point of publishing crystallization
conditions was to establish priority, and apparently there was once
such an unspoken rule about publishing the structure.  Or so I'm told;
from what I've seen it's long abandoned.

A bit of historical perspective (about a very high-profile project):

http://www.sciencemag.org/content/285/5436/2048.full

-Nat


Re: [ccp4bb] Off-topic: Best Scripting Language

2012-09-12 Thread Nat Echols
On Wed, Sep 12, 2012 at 7:32 AM, Jacob Keller
j-kell...@fsm.northwestern.edu wrote:
 since this probably comes up a lot in manipulation of pdb/reflection files
 and so on, I was curious what people thought would be the best language for
 the following: I have some huge (100s MB) tables of tab-delimited data on
 which I would like to do some math (averaging, sigmas, simple arithmetic,
 etc) as well as some sorting and rejecting. It can be done in Excel, but
 this is exceedingly slow even in 64-bit, so I am looking to do it through
 some scripting. Just as an example, a sort which takes 10 min in Excel
 takes ~10 sec max with the unix command sort (seems crazy, no?). Any
 suggestions?

Anything but Fortran.

Seriously, there are probably a dozen (or more) good solutions, and it
depends on whose syntax you prefer, what external libraries you need,
whether you want to someday apply your new programming skills to
another project, and whether you want anyone else to be able to read
your code.  For me, Python wins easily, but the suggestions of Octave
or R are probably just as good for a one-time script of the sort you
describe.

-Nat


Re: [ccp4bb] Off-topic: Best Scripting Language

2012-09-12 Thread Nat Echols
On Wed, Sep 12, 2012 at 12:49 PM, James Stroud xtald...@gmail.com wrote:
 Also, python (aka python 2) and python 3000 (aka python 3) are considered
 two different languages. It's not reasonable to consider them one language
 and then complain that they are incompatible. Python 3 was created as a new
 language (and should be treated as such) precisely because it breaks
 compatibility with python 2. That was the intent of the language authors.

Actually, despite having endorsed Python, I have to agree with the
complaints about Python 3, for several reasons:

1) It doesn't actually introduce many fundamentally new features that
would have changed how we code for it.  (Like getting rid of "self" or
the Global Interpreter Lock, or writing the interpreter in C++ and
improving the API for writing extensions.)  The only really huge
change is Unicode support, which is probably good but doesn't really
make it a different programming language.
2) The changes that really break code compatibility - like getting rid
of the print statement (see the snippet below) - seem to have been done on a whim rather than
because of any pressing need.  Maybe this was done to try to force
everyone to migrate immediately (since module developers couldn't
easily maintain code that works with 2.x and 3.x), but it has had the
opposite effect.
3) Development on Python 2 is being shut down.
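
The print change, for instance, is trivial but guarantees that naive 2.x
scripts die immediately under 3.x:

  # Python 2 only - a SyntaxError under Python 3:
  #     print "refined to convergence"
  # works under both 2.x (statement with a parenthesized argument)
  # and 3.x (function call):
  print("refined to convergence")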

Despite all this, I would still choose Python over nearly anything
else for scripting (and most other purposes, but eventually C++ will
be necessary too).

 You blame the authors for recognizing limitations of a language and
 inventing a new one to overcome those limitations.
 If the FORTRAN authors would have done that about 30 years ago, we all might
 be programming in FORTRAN.

I think this is what Fortran 90 was supposed to do (unsuccessfully, at
least in the world of crystallography) - but F77 code is still valid
F90 code, just like ANSI C is still valid C++.

-Nat


Re: [ccp4bb] Calculating I/sig when sig = 0

2012-08-23 Thread Nat Echols
On Thu, Aug 23, 2012 at 10:44 AM, Jim Pflugrath
jim.pflugr...@rigaku.com wrote:
 Singly-measured reflections should have a sigma from Poisson counting
 statistics, so that should not be a problem.  A problem might occur if the
 X-ray background is exactly zero and the observed (sic) intensity is exactly
 zero.

Or the data-processing program truncates the sigma to 0.0 because it
writes it out in %.1f format... but maybe this has been fixed since
the last time it happened to me?
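
(To illustrate the fixed-width problem:

  print("%8.1f" % 0.04)   # -> '     0.0' - a small but nonzero sigma is lost

and every downstream program then sees sigma = 0.)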

(For what it's worth, Xtriage throws out reflections where sigma=0; I
imagine other programs do the same thing.)

-Nat


Re: [ccp4bb] Various OSes and Crystallography

2012-08-09 Thread Nat Echols
On Thu, Aug 9, 2012 at 6:55 AM, Jacob Keller
j-kell...@fsm.northwestern.edu wrote:
 one. Are there any really reasonable arguments for preferring Mac over
 windows (or linux) with regard to crystallography? What can Mac/Linux do
 that windows cannot (especially considering that there is Cygwin)? What
 wonderful features am I missing?

Mac vs. Linux: mostly a matter of personal preference, but I agree
with Graeme.  Most programs run equally well on either - with Coot a
partial exception, apparently due to problems with the X11
implementation (but once you get used to these, it's not a big deal).

Windows, on the other hand, simply doesn't support the full range of
modern crystallography software.  And in my experience, it has
crippling flaws that mean some programs will always work better on
Mac/Linux.  I wouldn't ever endorse trying to use Windows for serious
scientific computing unless you need to run an application that won't
work on any other OS, and as far as I know there isn't a single
(macromolecular) crystallography program that falls into this
category.

-Nat


Re: [ccp4bb] Various OSes and Crystallography

2012-08-09 Thread Nat Echols
On Thu, Aug 9, 2012 at 8:14 AM, Quentin Delettre q...@hotmail.fr wrote:
 I have seen that in the last Mac Os, X11 have been removed... But can still
 be used with some package installation.

I guess it isn't distributed with the OS any more - but it is still available:

http://xquartz.macosforge.org/landing/

-Nat


Re: [ccp4bb] Mac or PC?

2012-08-09 Thread Nat Echols
On Thu, Aug 9, 2012 at 12:52 PM, Lee, Ting Wai twlee.scie...@gmail.com wrote:
 May I ask a very general question? I am going to buy a laptop. I am going to
 do a lot of structural biology work on it using programs such as CCP4,
 Phenix, Coot and Pymol. Mac or PC, which is better?

See this morning's thread.  Short answer: either works, just avoid Windows.

 I have never installed
 this kind of programs and done structural biology work on laptops except
 using Pymol. Will these programs cause any problems when they are run on
 laptops? I mean, will they slow down very much or even freeze the laptops?
 Can the programs finish the jobs at an OK speed? I mean, maybe not as fast
 as desktops, but not taking too long like days or weeks.

It depends on how big the structures you work with are, and what
you're trying to run.  I have a MacBook Air and it is quite adequate
for crystallography, but I've only worked with small and/or
low-resolution structures where there's no danger of exceeding the 4GB
memory limit.  (Of course, one can buy far more powerful laptops, but
the price goes up steeply.)  The important thing is *not* to buy the
cheapest PC laptop you can find, because the really low-end hardware
probably won't work very well.

-Nat


Re: [ccp4bb] MR with Phaser

2012-08-01 Thread Nat Echols
On Wed, Aug 1, 2012 at 11:27 AM, Uma Ratu rosiso2...@gmail.com wrote:
 The protein is in tetramer form. I define this by using the residue number
 (1332) which is 4 x monomer.

 After run, Phaser only gave 9 partial solutions, and no solution with all
 components. The resulted PDB contains only dimer form of the protein, not
 the tetramer. And the first TFZ score is around 2.5, which is too low for
 MR.

 I have the report file of data processing and the summary of Phaser
 attached.

 Could you please advice which part is wrong, why can I get the tetramer form
 of the protein?

Your data are processed as P2, which is much less common (for
proteins) than P21, and it looks like you haven't told Phaser to try
P21 too.  There are many other reasons why MR might not work, but I
think it's very likely that the space group is wrong.

-Nat


Re: [ccp4bb] How to identify unknow heavy atom??

2012-07-24 Thread Nat Echols
On Tue, Jul 24, 2012 at 10:14 AM, Haytham Wahba haytham_wa...@yahoo.com wrote:
 1- if i have anomalous peak of unknown heavy atom, How can i identify this
 heavy atom in general. (different methods)

 2- in my case, i see anomalous peak in heavy atom binding site (without any
 soaking). preliminary i did mass spec. i got Zn++ and Cu, How can i know
 which one give the anomalous peak in my protein.

 3- there is way to know if i have Cu+ or Cu++.

You may be able to identify the element based on the coordination
geometry - I'm assuming (perhaps incorrectly) that it is actually
different for Cu and Zn.  Marjorie Harding has written extensively on
the geometry of ion binding:

http://tanna.bch.ed.ac.uk/

The only way to be certain crystallographically, if you have easy
access to a synchrotron, is to collect data above and below the K edge
of any candidate element, and compare the difference maps.  (For
monovalent ions it is more complicated, since they don't have
accessible K edges.)  On a home source, Cu should have a larger
anomalous map peak, but I'm not sure if this will be enough to
identify it conclusively.

-Nat


Re: [ccp4bb] How to identify unknow heavy atom??

2012-07-24 Thread Nat Echols
On Tue, Jul 24, 2012 at 10:33 AM, Ethan Merritt
merr...@u.washington.edu wrote:
 As to the home source - no.
 Neither Cu nor Zn has appreciable anomalous signal when excited with a
 Cu K-alpha home source.
   http://www.bmsc.washington.edu/scatter

 An element's emission edge (Cu K-alpha in this case) is about 1 keV below
 the corresponding absorption edge.  This makes sense, because after
 absorbing a photon it can only emit at an equal or lower energy, not a
 higher energy.  So you can't reach the Cu absorption edge, where the
 anomalous signal is, by exciting with Cu K-alpha.

Oops, sorry, I was of course comparing the wrong numbers.

-Nat


Re: [ccp4bb] Structure Refinement Program

2012-07-23 Thread Nat Echols
On Mon, Jul 23, 2012 at 9:50 AM, Scott Foy s...@mail.umkc.edu wrote:
 We are computationally averaging several homologous protein structures into a 
 single structure. This of course will lead to a single protein structure that 
 possesses poor biophysical characteristics of bond lengths, bond angles, 
 steric hindrance, etc. Therefore, we will need a refinement program that is 
 very rapid and that will restore optimal protein parameters upon input of a 
 single PDB coordinate file. We are considering several programs such as 
 Phenix and CNS and would appreciate any comments or opinions as to 
 recommendations, advantages, and disadvantage for these, or other, programs. 
 We will need to refine thousands of PDB files so speed is a significant 
 consideration.

Can you clarify whether you intended to refine against experimental
data, or just clean up the model geometry?

-Nat


Re: [ccp4bb] harvesting in cold room (was: cryo for high salt crystal)

2012-07-13 Thread Nat Echols
On Fri, Jul 13, 2012 at 2:19 PM, Radisky, Evette S., Ph.D.
radisky.eve...@mayo.edu wrote:
 Several have mentioned harvesting in the cold room to reduce evaporation.  I
 used to do this also as a postdoc, but I worried whether I risked nitrogen
 gas poisoning from liquid N2 boil-off, since the cold room did not seem very
 well-ventilated.  I’ve also hesitated to recommend it to trainees in my
 current lab for the same reason.  Does anyone have solid information on
 this?  I would like to be convinced that such fears are unfounded …

Aside from safety concerns, won't this reduce the solubility?  I hated
harvesting high-salt conditions in the cold room for exactly this
reason.

-Nat


Re: [ccp4bb] Rfactors stuck very high

2012-07-08 Thread Nat Echols
On Sun, Jul 8, 2012 at 2:11 PM, James Garnett j.garn...@imperial.ac.uk wrote:
 I have found a molecular replacement solution in I212121 using an NMR 
 structure of the same protein and MR-ROSETTA/PHENIX (PHASER LLG=128 
 TFZ=12.3), although I can not refine this below R ~45% and Rfree ~50%. The 
 maps look OK in parts but in other regions the connectivity is much reduced. 
 In case of model bias I have used density modification and also used 
 simulated annealing etc in case it is stuck in a local minima - these did not 
 help. This protein is an Ig-like fold (potential for pseudo-internal 
 symmetry) and so I have also played around with rotations of the structure 
 but this has not helped. Although twinning analysis in all spacegroups 
 suggest there is no twinning I have tried refinement in PHENIX and REFMAC 
 using twin laws but this does not help.

Several questions:

1) Are you certain you crystallized the protein you're interested in
and not a contaminant?  This is unlikely to be the culprit, but it's
always good to check.  (An R-free of 50% does not guarantee that the
model is correct.)

2) What happens if you delete the regions of the model where the map
connectivity is poor, and refine the partial model?

3) Did you try rebuilding the model completely from scratch, i.e.
starting from the map without an input PDB file?  I'm pretty sure
MR-Rosetta will only do this if the sequence is significantly
different than the template.  I'd recommend trying several different
programs to do this (AutoBuild, ARP/wARP, Buccaneer), as the methods
involved are quite different, and you may be able to combine different
fragments together afterwards.

-Nat


Re: [ccp4bb] help regarding structure solution

2012-06-20 Thread Nat Echols
On Wed, Jun 20, 2012 at 11:13 AM, sonali dhindwal 
sonali11dhind...@yahoo.co.in wrote:


 I am working on a protein for last so many years and for which i have got
 crystal now in a tray which i kept 1 years ago. It diffracts well and
 resolution is 2.2A, which is good.

 I indexed in HKL2000, mosflm and automar and it shows P21 space group in
 all data reduction packages. But when I tried using molrep or phaser then I
 do not get any solution. The sequence of my protein is having  46% identity
 with other available crystal structure.
 Also when I tried to get the Matthews coefficient, it calculates the
 molecular mass as less (about 35 kDa) than it should be (originally 54 kDa),
 with solvent content 47%.

 I have also run the silver staining gel of the protein which contained
 crystal that shows about 45 kD protein band which is 10 less than the
 original.  Also I tried to run gel on crystal but it did not give anything
 as it was a small crystal.

 I have tried all combinations of the search model and tried to break
 available pdb many ways to make different search models but have not got
 any good solution. Molrep gives contrast even 10 or more but no good
 electron density map yet. Free R and figure of merit becomes 52% and 42%
 respectively in Refmac with all the solutions.


Have you tried using an automated building program on the best solutions
you have so far?  Refinement programs will often get stuck quickly if the
MR solution is poor, but rebuilding from scratch can sometimes do a much
better job.  Other things to try in cases like this are DEN refinement or
MR-Rosetta - both require significant computational resources but also have
a wider radius of convergence.
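
For example, rebuilding from the best MR solution with AutoBuild is
usually a one-liner - something along these lines (the file names are
placeholders, and it's worth checking phenix.autobuild --help for the
exact keywords in your version):

phenix.autobuild data=data.mtz model=mr_solution.pdb seq_file=sequence.fa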

The other question to ask yourself in situations like this is "did I really
crystallize the protein I'm interested in, or something else?"  It's
surprisingly easy to crystallize minor contaminants; at last count, I've
met at least four different people who've done this.  (One of them actually
ended up with a decent paper describing a structure he'd never intended to
solve.)  I suspect there are dozens if not hundreds of datasets lying
abandoned because they couldn't be phased or reproduced because they
weren't what the researcher thought they were.

If there is any way to obtain the mass spec of the crystallized protein,
this will be the most useful confirmation either way.  The next thing to
try is searching for similar unit cells in the PDB, although this doesn't
take into account changes in space group that result in a different unit
cell without actually changing the lattice.  (There are probably multiple
tools that can account for this; I can point to one if you're interested.)
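
For example, the cctbx can do this comparison by reducing both cells to
their Niggli settings first, which removes the dependence on the
space-group setting.  A rough sketch (the cells here are made up and the
tolerances are arbitrary):

from cctbx import crystal

def similar_lattices(symm1, symm2):
  # Niggli-reduced cells are independent of the space-group setting
  n1 = symm1.niggli_cell().unit_cell()
  n2 = symm2.niggli_cell().unit_cell()
  return n1.is_similar_to(n2,
    relative_length_tolerance=0.05,
    absolute_angle_tolerance=3.0)

symm1 = crystal.symmetry(unit_cell=(77.3, 77.3, 37.1, 90, 90, 90),
  space_group_symbol="P43212")
symm2 = crystal.symmetry(unit_cell=(77.5, 77.5, 37.2, 90, 90, 90),
  space_group_symbol="P41212")
print similar_lattices(symm1, symm2)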
 As a last resort, I would recommend a brute-force approach:

https://portal.sbgrid.org/d/apps/wsmr/

-Nat


Re: [ccp4bb] Model submission

2012-06-19 Thread Nat Echols
On Tue, Jun 19, 2012 at 8:35 AM, RHYS GRINTER
r.grinte...@research.gla.ac.uk wrote:
 There's no significant difference between the high-res and low-res proteins 
 in the shared region (amino acid 38+) (r.m.s.d. 0.46 A), and while there is 
 broken density for the first 38 aa in the full-length data, it's too poor 
 to model into.

 I want to present a figure which shows the density corresponding to the 
 first 38 aa and where that fits with the rest of the protein molecule. What 
 I'm unsure of is whether I will be required by the journal to submit a 
 model from the lower-resolution data to the PDB in order to present this 
 figure - bearing in mind the density doesn't allow any additional residues 
 to be modelled compared to the high-res structure.

This may be true today, but there is no guarantee that it will still
be the case in five years, or ten, or however long it takes for the
software to improve.  I'd argue that anything you illustrate in the
paper should end up in the PDB anyway, but if there is any chance that
someone could improve on your structure in the future and possibly
learn something new as a result, it's worth depositing for that reason
as well.  Otherwise the data will probably be lost, and we'll never
know if those extra residues could have been modeled.  (Although I
suspect that it would be more helpful if the images were also
available, instead of having to start from the processed data.)

-Nat


Re: [ccp4bb] how to get phase of huge complex

2012-06-12 Thread Nat Echols
On Tue, Jun 12, 2012 at 8:53 PM, Frank von Delft
frank.vonde...@sgc.ox.ac.uk wrote:
 Finding 111 sites should be feasible without other tricks than very careful
 data collection (see below);  if you have two or more copies in the ASU, you
 may find you need to do what the ribosome guys did, namely use other
 derivatives (e.g. TaBr clusters) to locate your seleniums, and then phase.

With 40% of the complex having homologues in the PDB, you may be able
to place those subunits by MR, then use the phases from the incomplete
model to locate the seleniums.

-Nat


Re: [ccp4bb] metal modelling in coot

2012-05-05 Thread Nat Echols
On Sat, May 5, 2012 at 2:23 PM, Pavel Afonine pafon...@gmail.com wrote:
 Maybe I'm missing something, but I think all you need to do is to place (add
 to the PDB file) a Zn2+ into the blob of density that you believe the Zn
 belongs to, and then most refinement tools will take care of it
 automatically. So I'm not seeing why you need files for a Zn atom ... I
 guess the task is as simple as I just wrote, isn't it?

Not quite - as Roger noted, the charge would need to be set
separately.  (Actually, having Coot do this automatically for ions
would be a very nice feature, and hopefully not difficult to add - or
alternately, make this an option in the pointer atoms dialog.)  This
probably won't make a huge difference at most resolutions since the
B-factor will soak up some of the discrepancy, but it may result in
cleaner maps.
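
For reference, the formal charge lives in the last two columns (79-80) of
the ATOM/HETATM record, written as digit-then-sign.  A zinc ion would look
something like this (the serial number, residue number, coordinates, and
B-factor are made up; the column alignment is what actually matters):

HETATM 2101 ZN    ZN A 501      12.504  23.118   9.372  1.00 18.25          ZN2+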

-Nat


Re: [ccp4bb] Refmac executables - win vs linux in RHEL VM

2012-04-07 Thread Nat Echols
On Sat, Apr 7, 2012 at 9:50 AM, Roger Rowlett rrowl...@colgate.edu wrote:
 I don't know the state of current software, because I haven't tried
 recently, but when I set up my student crystallography workstations a few
 years back I noticed many packages (e.g. EPMR, Phaser) that had potentially
 long run times (where it is really noticeable) would run on the identical
 hardware about 2-3 times faster in Linux than in Windows XP. Memory swapping
 wasn't the issue. I was astounded there could be that much overhead in
 Windows. A Linux VM on a windows machine being faster than native Win7 is
 pretty weird, though.

Different compiler implementations will often have a huge effect on
runtimes.  I recently spent some time trying to get a large amount of
C++ code (converted from F77) to compile under Visual C++ 9.0, and I
had to disable optimization of at least ten different functions to
prevent cl.exe from crashing.  This was not especially complex code
(and g++ never complains) - just nested 'for' loops over three
dimensions.  I did not attempt to compare runtimes since I was running
Windows in a virtual machine on a Mac, but I would be surprised if the
resulting Windows binaries were not slower on identical hardware.  And
even if the compiler isn't broken, the math libraries may be; one of
my colleagues found (on Linux) that the exp() function provided by g77
was 20-fold slower than the equivalent in the Intel math library.

So I suspect it is related to the compilers (and optimization flags)
used by CCP4 for these platforms.  Another good reason to avoid
Windows!

-Nat


Re: [ccp4bb] very informative - Trends in Data Fabrication

2012-04-02 Thread Nat Echols
On Mon, Apr 2, 2012 at 11:00 AM, Maria Sola i Vilarrubias
msv...@ibmb.csic.es wrote:
 For a wrongly fitted compound, the reviewer can ask for images of the model
 in a map calculated at a specific sigma level and in different orientations.

This will often be insufficient, I'm afraid.  We generally assume good
faith on the part of the authors: if the caption says "the 2mFo-DFc
map is shown contoured at 1.5 sigma", we assume that this is an honest
statement, but we also have no way of verifying it until the
experimental data are available.  I know of at least one case offhand
where the maps could not possibly have been contoured at that level -
the ligands are not misfit, they are simply not present in the
crystals, and the paper is misleading (deliberately or not, I don't
know).  Most reviewers do not have the patience to spend weeks
pursuing these issues.  (Although it would certainly help if reviewers
insisted that the density around ligands not be shown in isolation.)

That aside, I completely understand why someone would be reluctant to
share their data with potential competitors.  Someone once suggested
making the model and maps viewable via a web applet (AstexViewer or
similar), but even that sounds like it could be prone to abuse.

-Nat


Re: [ccp4bb] Crystal Structures as Snapshots

2012-02-10 Thread Nat Echols
On Fri, Feb 10, 2012 at 12:29 PM, James Stroud xtald...@gmail.com wrote:
 How could they not be snapshots of conformations adopted in solution?

Packing billions of copies of an irregularly-shaped protein into a
compact lattice and freezing it to 100K isn't necessarily
representative of solution, especially when your solution contains
non-physiological amounts of salt and various organics (and possibly
non-physiological pH too).

-Nat


Re: [ccp4bb] Crystal Structures as Snapshots

2012-02-10 Thread Nat Echols
Just to clarify - I actually think the original assumption that Jacob
posted is generally reasonable.  But it doesn't necessarily follow
that the conformation we see in crystal structures is always
representative of the solution state; given the extreme range of
conditions in which crystals grow, I would be surprised if there
weren't counter-examples.  I'm not familiar enough with the literature
on domain swapping (e.g. diphtheria toxin) to know if any of those
structures are crystal packing artifacts.

On Fri, Feb 10, 2012 at 1:04 PM, George gkontopi...@vet.uth.gr wrote:
 "Packing billions of copies into a compact lattice"
 Not so compact - there is 40-80% water.
 "freezing it to 100K"
 We have frozen protein solutions in liquid nitrogen many times and thawed
 them, and they were working OK.
 "non-physiological amounts of salt and various organics"
 What is the amount of salt and the osmotic pressure in the cell?
 "non-physiological pH too"
 What is non-physiological pH? I am sure that some enzymes do not work at
 pH 7. Also, most proteins have been crystallized at pH close to 7, so I
 would not say non-physiological.

 George

 PS There are also lots of solution NMR structures supporting the
 physiological relevance of the crystal structures.





Re: [ccp4bb] Soaking Kinase Crystals with ATP analogues

2012-02-01 Thread Nat Echols
On Wed, Feb 1, 2012 at 11:17 AM, Dianfan Li l...@tcd.ie wrote:
 I am working on a kinase and would like to get an ATP analogue into
 the crystals. When soaked with AMP-PCP, the kinase crystals crack in
 about 15 min at 4 C.

This isn't too surprising; most kinases undergo global conformational
changes (domain closure) when binding ATP.

 I could try other analogues like AMP-PNP etc., but those would probably
 behave in the same way as AMP-PCP. Is it a good idea to try quick
 soaks at high concentrations of AMP-PCP? Co-crystallization is another
 option, but AMP-PCP is a substrate of the kinase (with a low
 rate).

 What are other ways of getting ATP analogues into a crystal?

I'd recommend trying ATP-gammaS - it could also be a substrate, but
it's worth a look.  (Is there any reason to believe that AMP-PNP is a
substrate?)  I've noticed that the various analogues can result in
different conformations in the crystal structure, so it may be a good
idea to try more than one anyway.

-Nat


Re: [ccp4bb] New Faster-than-fast Fourier transform

2012-01-24 Thread Nat Echols
On Tue, Jan 24, 2012 at 1:38 AM, Adam Ralph adam.ra...@nuim.ie wrote:
    CUDA is a set of extensions for C which will allow you to access
 hardware accelerators (certain NVidia cards in this case). CUDA has been
 around for a while, and there are CUDA libraries for FFT and BLAS.
    I have not used cuFFT myself, but I know that its APIs are based on
 those of FFTW. The capabilities and ease of use of these cards are
 improving with each generation. If you are in the game of speeding up
 your FFTs then I recommend you take a look.

Unfortunately this isn't going to make refinement programs much faster
either.  I found that cuFFT was about 20x faster on a state-of-the-art
NVidia accelerator versus a single Intel Xeon core - but the memory
transfer knocks it down to 4x.  OpenMP parallelization can give a
similar speedup without spending $2500 extra on the GPU, and with much
less butchering of the code.  (And even that doesn't help much,
because FFTs still take up less than half the runtime during
refinement, at least in Phenix - I would be surprised if other
programs were significantly different in this respect.)

-Nat


Re: [ccp4bb] writing scripts-off topic

2012-01-24 Thread Nat Echols
On Tue, Jan 24, 2012 at 10:24 AM, Ian Tickle ianj...@gmail.com wrote:
 reassuring air of finality!  Maybe a Python expert will answer this,
 but I've often wondered: what happens if, as can easily happen when you
 use different editors at different times depending on where you are
 working (as I do - Windows when working remotely from home, Linux at
 work), you end up with a mixture of space and tab characters in the
 file?

Yes, this can happen.  In practice, one learns very quickly to
configure the text editors to prevent this kind of mix-up, i.e. by
always inserting spaces instead of tab characters.  In vim, for
instance, you can do this:

set expandtab
set tabstop=2

For CCTBX, the rule is to use two spaces for indentation (tabs
strictly forbidden), and even with multiple active contributors,
this is almost never a problem.  (The rule for most other Python
modules appears to be four spaces, which I personally find too wide.)
It becomes second nature after a while, just like adding a semicolon
at the end of each statement in C/C++/Java/etc.  I agree that it seems
annoying and confusing at first, but if you've ever tried to edit
someone else's C or Perl code where the indentation was totally
inconsistent, you'll quickly learn to appreciate Python's style.

 So does Python automatically expand the tabs to the
 equivalent number of spaces or (as in data input) are they treated as
 single characters?  And anyway how does Python know what tab stops my
 editors are set to and indeed how exactly my editors treat tabs?

The answers are "no" and "it doesn't" - at least, I don't think so.  The
safest thing (especially if you need to copy and paste code from any
other module) is to never use literal tabs.
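
If you're worried that a file already has a mixture, the standard library
ships a checker for exactly this:

python -m tabnanny some_script.py

(and in vim, :retab will convert existing tabs according to the settings
above).  Python 3 is stricter and raises a TabError outright when tabs
and spaces are mixed inconsistently.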

-Nat

