All interesting points. (And good to see a reference to
*"P.A. Machin, J.W. Campbell, M. Elder (Eds), Refinement of Protein
Structures, SERC Daresbury Laboratory, Warrington, UK (1980)"*
- for those who remember, a super exciting discussion of what was
feasible for refinement, and how to do it!)

My take: if a crystal diffracts to 1 A we can be fairly sure of the
positions of most atoms, see alternative conformations for some
regions, and give realistic B values to most atoms.
If the crystal only diffracts to 3 A then the lattice is not perfect, and
there must be multiple conformations for much of the molecule.
There will not be enough experimental data to model this properly, so
every parameter that assumes a single conformer (coordinate, B value,
occupancy) is an approximation. Restraints help to some extent, but
they impose prior knowledge rather than glean information from the
experimental data.
The "trash can" should indicate the degree of uncertainty, but interpreting
it is a bit problematic. B values twice the overall B?? Hmm, do NOT
put too much faith in that part of the model. As crystallographers I
think we need to flag this better for the trusting users of the
information. Omitting that region? I am not sure. How do others model
those floppy lysines? I usually make a sort of informed guess, but indeed
giving a single conformation is not the truth, the whole truth, and nothing
but the truth.
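
One way to flag such regions for users is to list the residues whose mean B
exceeds some multiple of the overall B. Below is a minimal sketch of that
idea. The factor of 2 is only the rule of thumb floated above, not an
established cutoff, and the function names (atom_line, flag_high_b) are
made up for illustration. It assumes standard fixed-column PDB ATOM/HETATM
records.

```python
from collections import defaultdict


def atom_line(serial, name, res, chain, seq, occ, b):
    """Build a fixed-column PDB ATOM record (coordinates set to zero)."""
    return ("ATOM  {:5d} {:<4s} {:3s} {:1s}{:4d}    "
            "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}").format(
                serial, name, res, chain, seq, 0.0, 0.0, 0.0, occ, b)


def flag_high_b(lines, factor=2.0):
    """Return (overall mean B, [(residue key, residue mean B), ...]).

    A residue is flagged when its mean B exceeds factor * overall mean B.
    The residue key is (chain ID, residue number, residue name).
    """
    per_residue = defaultdict(list)
    all_b = []
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        key = (line[21], int(line[22:26]), line[17:20].strip())
        b = float(line[60:66])  # tempFactor, columns 61-66
        per_residue[key].append(b)
        all_b.append(b)
    if not all_b:
        return 0.0, []
    overall = sum(all_b) / len(all_b)
    flagged = []
    for key, values in per_residue.items():
        mean_b = sum(values) / len(values)
        if mean_b > factor * overall:
            flagged.append((key, mean_b))
    return overall, flagged


if __name__ == "__main__":
    demo = [
        atom_line(1, "N", "ALA", "A", 1, 1.0, 20.0),
        atom_line(2, "CA", "ALA", "A", 1, 1.0, 20.0),
        atom_line(3, "N", "GLY", "A", 2, 1.0, 20.0),
        atom_line(4, "CA", "GLY", "A", 2, 1.0, 20.0),
        atom_line(5, "CE", "LYS", "A", 3, 1.0, 100.0),
        atom_line(6, "NZ", "LYS", "A", 3, 1.0, 100.0),
    ]
    overall, flagged = flag_high_b(demo)
    print("overall B:", round(overall, 2))
    for key, mean_b in flagged:
        print("flag", key, "mean B", mean_b)
```

A list like this says nothing about why the B values are high; it only
marks where a reader should be cautious.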


On Fri, 2 Aug 2024 at 01:14, James Holton <[email protected]> wrote:

> I submit that modern B factor restraints make them much less trashy than
> they were in the early days.  As Pavel points out the exact strategies
> differ from program to program, but I don't think anybody does unrestrained
> B factor refinement. Not by default.
>
> Besides, all we are really doing is fitting Gaussian-shaped peaks to the
> "curve" of the data.  These peaks have a width and a height.  For example,
> a carbon atom with B=20 has a peak density of 1.6 e-/A^3 and a
> full-width-at-half-max (FWHM) of 1.4 A.  That is it! That is the model
> density being fit. If you increase to B=80 the peak drops to 0.3 e-/A^3 and
> the FWHM increases to 2.6 A.  At the largest B you can stuff into a PDB
> file (999.99), the peak height is 0.008 e-/A^3 and the "peak" is 8.45A
> wide. Your disordered loop, however, is probably not sampling from a
> symmetric Gaussian distribution like that. This is the real problem with
> large B factors. High-B atoms can fit better than sharper, low-B atoms, but
> that doesn't mean they fit well.
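
The peak heights and widths quoted above can be roughly reproduced by
Fourier-transforming a sum-of-Gaussians scattering factor with the B factor
added to each term. The sketch below does that for carbon. The coefficients
are the commonly tabulated four-Gaussian-plus-constant values for neutral
carbon, quoted from memory, so treat them as an assumption and check them
against International Tables. The function names are made up for
illustration, and bulk solvent and resolution cutoffs are ignored.

```python
import math

# Assumed scattering-factor coefficients for neutral carbon:
# f(s) = sum_i a_i * exp(-b_i * s^2) + c, with s = sin(theta)/lambda.
C_A = (2.3100, 1.0200, 1.5886, 0.8650)
C_B = (20.8439, 10.2075, 0.5687, 51.6512)
C_C = 0.2156


def carbon_density(r, b_iso, occ=1.0):
    """Model density (e-/A^3) at distance r (A) from the atom centre."""
    if b_iso <= 0.0:
        raise ValueError("b_iso must be positive in this sketch")
    terms = list(zip(C_A, C_B)) + [(C_C, 0.0)]
    rho = 0.0
    for a, b in terms:
        b_total = b + b_iso  # the B factor broadens every Gaussian term
        rho += (a * (4.0 * math.pi / b_total) ** 1.5
                * math.exp(-4.0 * math.pi ** 2 * r * r / b_total))
    return occ * rho  # occupancy only scales the height


def fwhm(b_iso, occ=1.0):
    """Full width at half maximum (A) of the peak, found by bisection."""
    half = 0.5 * carbon_density(0.0, b_iso, occ)
    lo, hi = 0.0, 50.0  # density falls monotonically with r
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if carbon_density(mid, b_iso, occ) > half:
            lo = mid
        else:
            hi = mid
    return lo + hi  # twice the half-width (lo and hi have converged)


if __name__ == "__main__":
    for b in (20.0, 80.0, 999.99):
        print(b, carbon_density(0.0, b), fwhm(b))
```

If the coefficients are right, the printed values should land close to the
figures in the paragraph above. Passing occ=0.5 halves the peak and leaves
the width alone.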
>
> Occupancy is easy because all it does is scale the height without
> affecting the width.  So a 0.5-occupancy atom has half the peak height of
> a full-occupancy one.  The width is unchanged.  B factors impact both width
> and height because they must preserve the number of electrons in the peak.
> This is perhaps why they are often confusing and mysterious.  We should
> also never forget that bulk solvent gets excluded with exactly the same
> radii rules from every modeled atom, regardless of B factor and occupancy.
> So, the "change in density" from adding or deleting an atom is a little
> more complicated than adding or subtracting a Gaussian peak.
>
> Nevertheless, if you want to fit peak height and width independently (like
> we do in pretty much every other kind of curve fitting), then you should
> refine occupancy and B factors at the same time.
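
To make the height-and-width argument concrete, here is a small sketch
under a deliberately crude assumption: a point atom with Z electrons
smeared by a single isotropic Gaussian, with no intrinsic atomic width. In
that toy model the width fixes B, and the height then fixes occupancy. The
function names are hypothetical.

```python
import math


def peak_and_fwhm(occ, b_iso, z=6.0):
    """Peak height (e-/A^3) and FWHM (A) of a single-Gaussian point atom."""
    height = occ * z * (4.0 * math.pi / b_iso) ** 1.5
    width = math.sqrt(math.log(2.0) * b_iso) / math.pi
    return height, width


def occ_and_b(height, width, z=6.0):
    """Invert peak_and_fwhm: the width gives B, then the height gives occ."""
    b_iso = (math.pi * width) ** 2 / math.log(2.0)
    occ = height / (z * (4.0 * math.pi / b_iso) ** 1.5)
    return occ, b_iso


if __name__ == "__main__":
    # Two models with the same peak height but different widths.
    h1, w1 = peak_and_fwhm(1.0, 40.0)
    h2, w2 = peak_and_fwhm(0.5 ** 1.5, 20.0)
    print(h1, w1)
    print(h2, w2)
    print(occ_and_b(h1, w1))
```

With peak height alone the two models in the demo are indistinguishable.
The width is what separates a full-occupancy, higher-B atom from a
partial-occupancy, lower-B one.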
>
> Over-fitting you say?  Hardly. Polynomials are easy to over-fit, but not
> Gaussians. Observations/parameters is a useful guide for polynomial fits,
> but in general the hallmark of over-fitting is that the prediction passes
> exactly through all the observed points (but not through the
> cross-validation or "Rfree" points). I have never seen a macromolecular
> refinement end up with Rwork = 0.  Have you?
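
The hallmark described above is easy to show with a toy polynomial. This is
only an analogy, not a crystallographic refinement: seven "work" points are
fitted with a seven-parameter interpolating polynomial, two points are held
out, and an R-like statistic is computed for each set. All names and data
are invented for illustration.

```python
def lagrange(xs, ys, x):
    """Evaluate the polynomial through the points (xs, ys) at x."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        weight = 1.0
        for k, xk in enumerate(xs):
            if k != j:
                weight *= (x - xk) / (xj - xk)
        total += yj * weight
    return total


def r_factor(obs, calc):
    """R-like statistic: sum|obs - calc| / sum|obs|."""
    num = sum(abs(o - c) for o, c in zip(obs, calc))
    return num / sum(abs(o) for o in obs)


# A straight line plus alternating +/-0.1 "noise".
x_all = list(range(9))
y_all = [2.0 * x + 1.0 + 0.1 * (-1) ** i for i, x in enumerate(x_all)]

free_idx = {2, 5}  # held-out "free" points
x_work = [x for i, x in enumerate(x_all) if i not in free_idx]
y_work = [y for i, y in enumerate(y_all) if i not in free_idx]
x_free = [x for i, x in enumerate(x_all) if i in free_idx]
y_free = [y for i, y in enumerate(y_all) if i in free_idx]

# As many parameters as work observations: the fit passes through each one.
r_work = r_factor(y_work, [lagrange(x_work, y_work, x) for x in x_work])
r_free = r_factor(y_free, [lagrange(x_work, y_work, x) for x in x_free])

if __name__ == "__main__":
    print("R_work:", r_work)
    print("R_free:", r_free)
```

The polynomial reproduces every work point, noise included, so R_work
collapses to zero while R_free does not.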
>
> At the end of the day, what we do with our models is look at their
> parameters and try to extract the physically meaningful reality they are
> trying to capture. Restraints are very helpful in preventing many types of
> unrealistic situations, but ultimately it is up to you to decide if the
> fitted model makes sense.
>
> -James Holton
> MAD Scientist
>
> On 7/30/2024 11:30 AM, Ian Tickle wrote:
>
>
> Obviously no refined parameters can ever be completely error-free, it's
> just that for the co-ordinates we have very accurate geometric restraints
> so that the relative uncertainty in the refined co-ordinates is small (but
> try refining co-ordinates without restraints!).  For the B factors we don't
> have accurate estimates (if any) for their restraints so their relative
> uncertainty after refinement is much greater.
>
> -- Ian
>
>
> On Tue, Jul 30, 2024 at 6:57 PM Oganesyan, Vaheh <
> [email protected]> wrote:
>
>> Yes, it is and I like the definition of shared “trash bin”. It will have
>> more physical meaning if we can separate those contributions into separate
>> bins.
>>
>>
>>
>> Vaheh
>>
>>
>>
>>
>>
>>
>>
>> *From:* Pavel Afonine <[email protected]>
>> *Sent:* Tuesday, July 30, 2024 1:51 PM
>> *To:* Oganesyan, Vaheh <[email protected]>
>> *Cc:* [email protected]
>> *Subject:* Re: [ccp4bb] How high a B factor is too high to assume a loop
>> is in place, in the AlphaFold era?
>>
>>
>>
>> Vaheh,
>>
>> I think coordinates are no different from B factors, occupancies, f', or
>> f'' in this respect. Coordinates can play their "trash bin" role by
>> adjusting to the noise at the expense of violated geometry (bonds, angles,
>> planes, torsions, etc.). As I mentioned in my previous email, their trash
>> bin capacity is much smaller (but definitely not zero!) because the number
>> and strength (confidence) of geometry restraints are much greater than
>> those of ADP restraints.
>>
>> I agree that all refined parameters share this trash bin capacity, but to
>> varying degrees. Isn't this essentially what we call the error on the
>> refined parameter? All refined parameters have their error bars, which we
>> have referred to as the "trash bin" in this thread.
>>
>> Pavel
>>
>>
>>
>> On Tue, Jul 30, 2024 at 10:09 AM Oganesyan, Vaheh <
>> [email protected]> wrote:
>>
>> Your point is taken, Pavel. However, regardless of resolution, you define
>> the coordinate of an atom as a geometric point with no width. Although
>> coordinates are “refinable”, they have no capacity for “trash”. Their
>> “trash” still goes into the B-factor “trash bin”. At least this is how I see it.
>>
>>
>>
>> Thank you.
>>
>>
>>
>> *Vaheh Oganesyan, Ph.D.*
>>
>> *R&D | Biologics Engineering*
>>
>> One Medimmune Way, Gaithersburg, MD 20878
>>
>> T:  301-398-5851
>>
>> *[email protected] <[email protected]>*
>>
>>
>>
>>
>>
>>
>>
>> *From:* Pavel Afonine <[email protected]>
>> *Sent:* Tuesday, July 30, 2024 11:45 AM
>> *To:* Oganesyan, Vaheh <[email protected]>
>> *Cc:* [email protected]
>> *Subject:* Re: [ccp4bb] How high a B factor is too high to assume a loop
>> is in place, in the AlphaFold era?
>>
>>
>>
>> From this perspective, all refinable atomic model parameters can be
>> viewed as trash bins, with the size of each bin shrinking as the amount
>> of prior information (restraints) imposed on that parameter grows. For
>> example, coordinates have the most restraints and thus are the smallest
>> trash bins, while B factors have the fewest restraints and thus are among
>> the largest bins.
>>
>> Pavel
>>
>>
>>
>>
>>
>> On Tue, Jul 30, 2024 at 8:25 AM Oganesyan, Vaheh <
>> [email protected]> wrote:
>>
>> Early in my crystallography life I was a postdoc with Robert Huber in
>> Munich. We had gatherings once a week where, in a very informal way, we
>> could ask and answer questions. I remember my question about B factors: how
>> is it possible to have a high-resolution structure with an average B factor
>> of 100 A^2? I think it was Robert or Albrecht Messerschmidt who said that
>> the B factor is a “trash can” that describes not only loosely positioned
>> atoms but also all the other problems that you either created during
>> harvesting and processing or that the crystal had from the beginning.
>>
>>
>>
>> *Vaheh Oganesyan, Ph.D.*
>>
>> *R&D | Biologics Engineering*
>>
>> One Medimmune Way, Gaithersburg, MD 20878
>>
>> T:  301-398-5851
>>
>> *[email protected] <[email protected]>*
>>
>>
>>
>>
>>
>>
>>
>> *From:* CCP4 bulletin board <[email protected]> *On Behalf Of *James
>> Holton
>> *Sent:* Tuesday, July 30, 2024 10:35 AM
>> *To:* [email protected]
>> *Subject:* Re: [ccp4bb] How high a B factor is too high to assume a loop
>> is in place, in the AlphaFold era?
>>
>>
>>
>> How high B factors can go depends on the refinement program you are
>> using.
>>
>> In fact, my impression is that the division between the "let the B
>> factors blow up" and "delete the unseen" camps is correlated to their
>> preferred refinement program. You see, phenix.refine is relatively
>> aggressive with B factor refinement, and will allow "missing" atoms to
>> attain very high B factors. Refmac, on the other hand, has restraints that
>> try to make B factor distributions look like those found in the PDB, and so
>> tends to keep nearby B factors similar. As a result, you may get "red
>> density" for disordered regions from refmac, inviting you to delete the
>> offending atoms, but not from phenix, which will raise the B factor until
>> the density fits.
>>
>> Then there are programs like VagaBond that don't formally have B factors,
>> but rather let an ensemble of chains spread out in the loopy regions you
>> are concerned about.  This might be the way to go?
>>
>> You can also do ensemble refinement in the latest Amber.  That is, you
>> run an MD simulation of a unit cell (or more) and gradually increase
>> structure factor restraints. This would probably result in the "fan" of
>> loops you have in mind?
>>
>> -James Holton
>> MAD Scientist
>>
>> On 7/28/2024 8:13 AM, Javier Gonzalez wrote:
>>
>>
>>
>> Dear CCP4bb,
>>
>>
>>
>> I'm refining the ~3A crystal structure of a big protein, largely composed
>> of alpha helices connected by poorly-resolved loops.
>>
>> In the old pre-AlphaFold (AF) days I used to simply remove those
>> loops/regions with too-high B factors, because there was little to no
>> density at 1 sigma in a 2Fo-Fc map.
>>
>> However, considering that the quality of a readily-computable AF model is
>> comparable to a 3A experimental structure, and that the UniProt database is
>> flooded with noodle-like AF models, I was considering depositing a combined
>> model in the PDB.
>>
>> Once R/Rfree reach a minimum for the model truncated at the poorly resolved
>> loops, I would build an augmented model with the AF-predicted missing
>> regions (provided they have an acceptable pLDDT value), assign them zero
>> occupancy, and run only one cycle of refinement to calculate the formal
>> refinement statistics.
>>
>> Would that be acceptable? Has anyone tried a similar approach?
>>
>> I'd rather do that instead of depositing a counterintuitive model with
>> truncated regions that few people would find useful!!
>>
>>
>>
>> Thank you for your comments,
>>
>>
>>
>> Javier
>>
>>
>> --
>>
>> Dr. Javier M. González
>> Instituto de Bionanotecnología del NOA (INBIONATEC-CONICET)
>> Universidad Nacional de Santiago del Estero (UNSE)
>> RN9, Km 1125. Villa El Zanjón. (G4206XCP)
>> Santiago del Estero. Argentina
>>
>> Tel: +54-(0385)-4238352
>>
>> Email <[email protected]> Twitter <https://twitter.com/_biojmg>
>>
>>
>>
>>
>> ------------------------------
>>
>> To unsubscribe from the CCP4BB list, click the following link:
>> https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

########################################################################

To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB&A=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/
