Re: [ccp4bb] Hydrogens in PDB File

2020-03-02 Thread Pavel Afonine
Clearly, it is a good idea to keep hydrogens:

http://phenix-online.org/presentations/hydrogens.pdf

Not sure why this keeps coming up as a topic given how much has been said
about it in the past, all the MolProbity arguments, etc.

The issue of missing side chains and loops is trickier indeed.

Pavel

On Mon, Mar 2, 2020 at 11:33 AM Dale Tronrud  wrote:

> On 3/2/2020 10:12 AM, Alexander Aleshin wrote:
> > Dear Dale,
> > You raised a very important issue that has been largely ignored by the
> > crystallographic community. The riding hydrogens are just the tip of the
> > iceberg. It is absolutely unclear even to an experienced crystallographer
> > how to treat poorly ordered side chains or even whole residues. As a matter
> > of fact, their models are "riding atoms", and consumers have no clue how
> > much they can trust our modeling.
>
>Oh no!  Now I've opened up this can of worms.
>
>The matter of describing completely disordered side chains has been
> discussed heavily on this BB, along with the advantages and shortcomings
> of overloading the meaning of "B factor" or "Occupancy" to describe this
> situation.
>
>Using one data item to describe multiple things is never a good idea,
> in my opinion.  The move to mmCIF for model storage does open the
> possibility of adding new tags to uniquely describe model properties.
> Creating such a tag for "place-holder" side chain atoms was one of the
> recommendations in "Outcome of the First wwPDB/CCDC/D3R Ligand
> Validation Workshop" (https://www.ncbi.nlm.nih.gov/pubmed/27050687).  I
> don't know the status of the implementation of any of these
> recommendations.  The wheels of the wwPDB grind exceedingly slowly.
>
>This is just another part of the huge problem of describing the
> nature of the deposited model and the origin of the information
> supporting all of its parts.
>
> 1) Riding Hydrogen atoms vs. free-floating and refined
> 2) Placeholder side chains vs. visible in density
> 3) Placeholder loops vs. visible in density
> 4) TLS anisotropic B's vs. restrained individual atom aniso vs.
> unrestrained individual - The options here are many and multiple types
> may be present in one model
> 5) NCS restraint/constraint - The options here are many and multiple
> types may be present in one model
> 6) Concerted alternative conformation spread over multiple residues
> 7) Sequence heterogeneity
>
> These are just the topics that bother me with almost every model I
> download.  I'm sure there are plenty of other topics that don't come to
> mind at the moment.
>
>With the move to mmCIF we now have the opportunity to create
> descriptions of these model properties w/o just adding more and more
> REMARK cards.
>
>Until that wondrous day arrives we are stuck with trying to create
> models that are useful to the general community and provide minimal
> opportunity for confusion.  We can argue as to where that line is, but
> shouldn't lose sight of the ultimate goal.
>
>Ethan and I disagree over the relative damage caused by the inclusion
> of riding hydrogen atom positions vs. the confusion that results from
> their absence.  I think we agree strongly that all of the list items
> above need to be tackled by the wwPDB and are of extreme importance.  I
> think we need a comprehensive solution, not a piecemeal, special case,
> for each.
>
> Dale Tronrud
>
> > Moreover, some programs (including the version of PyMOL that I use) get
> > confused when they detect residues with multiple conformations. For example,
> > my PyMOL version fails to build cartoon elements in those areas, and it is
> > not obvious to a beginner how to remove the alternative conformations. I
> > presume many consumers just ignore such structures and pick analogues
> > that are displayed without problems.
> >
> > PyMOL developers, is it so difficult to report to the user, when a structure
> > is loaded, that it has residues with alternative conformations and that one
> > of the conformers should be hidden for a correct presentation of the
> > secondary structure elements?
> >
> > Alex
> >
> >
> >
> > On 3/2/20, 12:57 AM, "CCP4 bulletin board on behalf of Dale Tronrud" <
> CCP4BB@JISCMAIL.AC.UK on behalf of de...@daletronrud.com> wrote:
> >
> > Dear Tim,
> >
> >I am in agreement with Ethan and you that a complete description of
> > the restraints and constraints applied to the model should be included
> > in the deposition.  This is currently a major failing of the wwPDB.  For
> > hydrogen atoms we, at least, have the "Riding hydrogen atoms were added"
> > remark but that simple statement is inadequate to allow anyone (or
> > program) to reproduce what the depositor had on disk before the hydrogen
> > atoms were redacted.  We know that shelxl and MolProbity produce
> > hydrogen models that differ, and that shelxl requires additional
> > information about the temperature of the molecule at least.
> >
> >How could someone hope to 

Re: [ccp4bb] Hydrogens in PDB File

2020-03-02 Thread Dale Tronrud
On 3/2/2020 10:12 AM, Alexander Aleshin wrote:
> Dear Dale,
> You raised a very important issue that has been largely ignored by the
> crystallographic community. The riding hydrogens are just the tip of the
> iceberg. It is absolutely unclear even to an experienced crystallographer how
> to treat poorly ordered side chains or even whole residues. As a matter of
> fact, their models are "riding atoms", and consumers have no clue how much
> they can trust our modeling.

   Oh no!  Now I've opened up this can of worms.

   The matter of describing completely disordered side chains has been
discussed heavily on this BB, along with the advantages and shortcomings
of overloading the meaning of "B factor" or "Occupancy" to describe this
situation.

   Using one data item to describe multiple things is never a good idea,
in my opinion.  The move to mmCIF for model storage does open the
possibility of adding new tags to uniquely describe model properties.
Creating such a tag for "place-holder" side chain atoms was one of the
recommendations in "Outcome of the First wwPDB/CCDC/D3R Ligand
Validation Workshop" (https://www.ncbi.nlm.nih.gov/pubmed/27050687).  I
don't know the status of the implementation of any of these
recommendations.  The wheels of the wwPDB grind exceedingly slowly.

   This is just another part of the huge problem of describing the
nature of the deposited model and the origin of the information
supporting all of its parts.

1) Riding Hydrogen atoms vs. free-floating and refined
2) Placeholder side chains vs. visible in density
3) Placeholder loops vs. visible in density
4) TLS anisotropic B's vs. restrained individual atom aniso vs.
unrestrained individual - The options here are many and multiple types
may be present in one model
5) NCS restraint/constraint - The options here are many and multiple
types may be present in one model
6) Concerted alternative conformation spread over multiple residues
7) Sequence heterogeneity

These are just the topics that bother me with almost every model I
download.  I'm sure there are plenty of other topics that don't come to
mind at the moment.

   With the move to mmCIF we now have the opportunity to create
descriptions of these model properties w/o just adding more and more
REMARK cards.

   Until that wondrous day arrives we are stuck with trying to create
models that are useful to the general community and provide minimal
opportunity for confusion.  We can argue as to where that line is, but
shouldn't lose sight of the ultimate goal.

   Ethan and I disagree over the relative damage caused by the inclusion
of riding hydrogen atom positions vs. the confusion that results from
their absence.  I think we agree strongly that all of the list items
above need to be tackled by the wwPDB and are of extreme importance.  I
think we need a comprehensive solution, not a piecemeal, special case,
for each.

Dale Tronrud

> Moreover, some programs (including the version of PyMOL that I use) get
> confused when they detect residues with multiple conformations. For example,
> my PyMOL version fails to build cartoon elements in those areas, and it is not
> obvious to a beginner how to remove the alternative conformations. I presume
> many consumers just ignore such structures and pick analogues that are
> displayed without problems.
>
> PyMOL developers, is it so difficult to report to the user, when a structure
> is loaded, that it has residues with alternative conformations and that one of
> the conformers should be hidden for a correct presentation of the secondary
> structure elements?
> 
> Alex
> 
> 
> 
> On 3/2/20, 12:57 AM, "CCP4 bulletin board on behalf of Dale Tronrud" wrote:
> 
> Dear Tim,
> 
>I am in agreement with Ethan and you that a complete description of
> the restraints and constraints applied to the model should be included
> in the deposition.  This is currently a major failing of the wwPDB.  For
> hydrogen atoms we, at least, have the "Riding hydrogen atoms were added"
> remark but that simple statement is inadequate to allow anyone (or
> program) to reproduce what the depositor had on disk before the hydrogen
> atoms were redacted.  We know that shelxl and MolProbity produce
> hydrogen models that differ, and that shelxl requires additional
> information about the temperature of the molecule at least.
> 
>How could someone hope to develop a better technique for generating
> hydrogen atom models if the results could never be deposited and used?
> 
>There is an additional matter of practical importance.  While the two
> of us share a lack of confidence in the care taken by some depositors in
> the creation of hydrogen atoms, I believe the PDB customers are even
> worse.  If a crystallographer or microscopist should not be trusted to
> add hydrogen atoms should we expect an undergrad or, maybe, a high
> school student to do 

Re: [ccp4bb] Hydrogens in PDB File

2020-03-02 Thread Alexander Aleshin
Dear Thomas,
Thank you for the explanation. I use an old version, 1.7.2.3, and I believe that
the problem is fixed in a newer one. What about residues with missing side
chains? How are they treated when a surface is calculated? I began placing
"riding side chains" into my models to avoid problems with the various
representations in graphical programs.

As a matter of fact, why don't we use atom occupancy to indicate that a
residue or side chain does not have reliable density? There should be a
parameter other than the B-factor to indicate the reliability of the model.
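
A minimal sketch of how a consumer of a PDB file could list atoms flagged in this way. It assumes, purely for illustration, that placeholder atoms are written with zero occupancy; no such convention is agreed on in this thread, and the function name is hypothetical. The column positions follow the fixed-width ATOM/HETATM record layout of the PDB format.

```python
def zero_occupancy_atoms(pdb_lines, threshold=0.0):
    """Return (chain, residue number, residue name, atom name) for every
    ATOM/HETATM record whose occupancy is at or below `threshold`.

    Assumption: zero occupancy marks a placeholder atom. That is only one
    possible convention and is not something the PDB format guarantees.
    """
    flagged = []
    for line in pdb_lines:
        if line.startswith(("ATOM  ", "HETATM")):
            occupancy = float(line[54:60])      # columns 55-60
            if occupancy <= threshold:
                chain = line[21]                # column 22
                resseq = int(line[22:26])       # columns 23-26
                resname = line[17:20]           # columns 18-20
                atom = line[12:16].strip()      # columns 13-16
                flagged.append((chain, resseq, resname, atom))
    return flagged
```

Typical use would be `zero_occupancy_atoms(open("model.pdb"))`, where `model.pdb` is a placeholder file name. Because occupancy also encodes partial and alternate conformations, a listing like this only hints at which atoms were not placed in density.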


Alex



On 3/2/20, 10:41 AM, "Thomas Holder"  wrote:

Dear Alex,

PyMOL by default shows cartoon (secondary structure) only for the first alt
conformation. Do you have an example PDB code where you do not get the expected
representation?

See also:

https://pymolwiki.org/index.php/Cartoon_all_alt

Thomas

> On Mar 2, 2020, at 7:12 PM, Alexander Aleshin wrote:
>
> Dear Dale,
> You raised a very important issue that has been largely ignored by the
> crystallographic community. The riding hydrogens are just the tip of the
> iceberg. It is absolutely unclear even to an experienced crystallographer how
> to treat poorly ordered side chains or even whole residues. As a matter of
> fact, their models are "riding atoms", and consumers have no clue how much
> they can trust our modeling.
> Moreover, some programs (including the version of PyMOL that I use) get
> confused when they detect residues with multiple conformations. For example,
> my PyMOL version fails to build cartoon elements in those areas, and it is not
> obvious to a beginner how to remove the alternative conformations. I presume
> many consumers just ignore such structures and pick analogues that are
> displayed without problems.
>
> PyMOL developers, is it so difficult to report to the user, when a structure
> is loaded, that it has residues with alternative conformations and that one of
> the conformers should be hidden for a correct presentation of the secondary
> structure elements?
>
> Alex
>
>
>
> On 3/2/20, 12:57 AM, "CCP4 bulletin board on behalf of Dale Tronrud" wrote:
>
>Dear Tim,
>
>   I am in agreement with Ethan and you that a complete description of
>the restraints and constraints applied to the model should be included
>in the deposition.  This is currently a major failing of the wwPDB.  For
>hydrogen atoms we, at least, have the "Riding hydrogen atoms were added"
>remark but that simple statement is inadequate to allow anyone (or
>program) to reproduce what the depositor had on disk before the hydrogen
>atoms were redacted.  We know that shelxl and MolProbity produce
>hydrogen models that differ, and that shelxl requires additional
>information about the temperature of the molecule at least.
>
>   How could someone hope to develop a better technique for generating
>hydrogen atom models if the results could never be deposited and used?
>
>   There is an additional matter of practical importance.  While the two
>of us share a lack of confidence in the care taken by some depositors in
>the creation of hydrogen atoms, I believe the PDB customers are even
>worse.  If a crystallographer or microscopist should not be trusted to
>add hydrogen atoms should we expect an undergrad or, maybe, a high
>school student to do better?
>
>   When someone downloads a model they expect they will be able to use
>that model without performing a host of technical manipulations just to
>be able to see where the depositor thought the atoms were located.  We
>should certainly give them enough information to understand how those
>atoms were placed (and we are failing at that), but anyone should be
>able to fire up Coot, load a PDB and map, and make some sense of it.
>Maybe someday Coot will be able to automatically generate hydrogen
>atoms, but currently the files do not contain enough information for it
>to do a reasonable job.
>
>   If hydrogen atoms are to be deleted because they can, sort of, be
>recalculated, there are other aspects of the PDB file that also could be
>removed.  I think I could do a pretty good job of resurrecting deleted
>CB atoms for any of the nineteen amino acids that contain them.  Should
>we just drop all CB's and add a remark saying that their locations can
>be inferred from the deposited atoms?
>

Re: [ccp4bb] Hydrogens in PDB File

2020-03-02 Thread Alexander Aleshin
Dear Dale,
You raised a very important issue that has been largely ignored by the
crystallographic community. The riding hydrogens are just the tip of the
iceberg. It is absolutely unclear even to an experienced crystallographer how
to treat poorly ordered side chains or even whole residues. As a matter of
fact, their models are "riding atoms", and consumers have no clue how much
they can trust our modeling.
Moreover, some programs (including the version of PyMOL that I use) get
confused when they detect residues with multiple conformations. For example,
my PyMOL version fails to build cartoon elements in those areas, and it is not
obvious to a beginner how to remove the alternative conformations. I presume
many consumers just ignore such structures and pick analogues that are
displayed without problems.

PyMOL developers, is it so difficult to report to the user, when a structure
is loaded, that it has residues with alternative conformations and that one of
the conformers should be hidden for a correct presentation of the secondary
structure elements?
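
For readers who hit the same display problem, here is a minimal sketch of stripping all but one conformer before loading a file into a viewer. It is not a PyMOL feature; it is a plain-text filter over fixed-column PDB records, and the function and helper names are hypothetical. It keeps atoms with a blank altLoc field plus, for each residue, the first altLoc label encountered.

```python
def keep_first_altloc(pdb_lines):
    """Drop ATOM/HETATM/ANISOU records that belong to any alternate
    conformation other than the first one seen for that residue."""
    chosen = {}   # (chain, resSeq + iCode) -> altLoc label kept
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM  ", "HETATM", "ANISOU")):
            altloc = line[16]                     # column 17
            if altloc != " ":
                key = (line[21], line[22:27])     # chain, resSeq + iCode
                if chosen.setdefault(key, altloc) != altloc:
                    continue                      # a later conformer: skip it
        kept.append(line)
    return kept


def atom_line(serial, name, altloc, resname, chain, resseq, x, y, z, occ, b, elem):
    """Format one ATOM record (helper for building small examples)."""
    return ("ATOM  %5d %-4s%1s%3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f          %2s"
            % (serial, name, altloc, resname, chain, resseq, x, y, z, occ, b, elem))
```

The filter leaves occupancies untouched, so the surviving conformer still carries its partial occupancy. Choosing the first label rather than the highest-occupancy one is a simplification made for this sketch.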

Alex



On 3/2/20, 12:57 AM, "CCP4 bulletin board on behalf of Dale Tronrud" wrote:

Dear Tim,

   I am in agreement with Ethan and you that a complete description of
the restraints and constraints applied to the model should be included
in the deposition.  This is currently a major failing of the wwPDB.  For
hydrogen atoms we, at least, have the "Riding hydrogen atoms were added"
remark but that simple statement is inadequate to allow anyone (or
program) to reproduce what the depositor had on disk before the hydrogen
atoms were redacted.  We know that shelxl and MolProbity produce
hydrogen models that differ, and that shelxl requires additional
information about the temperature of the molecule at least.

   How could someone hope to develop a better technique for generating
hydrogen atom models if the results could never be deposited and used?

   There is an additional matter of practical importance.  While the two
of us share a lack of confidence in the care taken by some depositors in
the creation of hydrogen atoms, I believe the PDB customers are even
worse.  If a crystallographer or microscopist should not be trusted to
add hydrogen atoms should we expect an undergrad or, maybe, a high
school student to do better?

   When someone downloads a model they expect they will be able to use
that model without performing a host of technical manipulations just to
be able to see where the depositor thought the atoms were located.  We
should certainly give them enough information to understand how those
atoms were placed (and we are failing at that), but anyone should be
able to fire up Coot, load a PDB and map, and make some sense of it.
Maybe someday Coot will be able to automatically generate hydrogen
atoms, but currently the files do not contain enough information for it
to do a reasonable job.

   If hydrogen atoms are to be deleted because they can, sort of, be
recalculated, there are other aspects of the PDB file that also could be
removed.  I think I could do a pretty good job of resurrecting deleted
CB atoms for any of the nineteen amino acids that contain them.  Should
we just drop all CB's and add a remark saying that their locations can
be inferred from the deposited atoms?

Dale Tronrud

P.S. I realize that I am open to charges of inconsistency since I have
advocated not depositing an atomic model for atoms that weren't placed
by the depositor (i.e. disordered side chains).  I don't believe I'm
committing this sin.  I'm just saying if the depositor comes up with
locations for atoms they should be deposited.  If the location of an
atom is not known it should not be deposited.  I do not have a desire
for completeness for completeness' sake, just a complete listing of all
the atoms placed by the depositor.  Let that high school student see our
work in all its glory!


On 3/1/2020 4:53 AM, Tim Gruene wrote:
> Dear Dale,
>
> your last sentence is of great importance:
>
> "leaving the (hopefully) manually inspected and curated Hydrogen atoms in
> the deposited PDB"
>
> I believe this hope is unrealistic. Most people probably do not think about or
> understand what refinement programs do with hydrogen atoms. In Refmac5 it has
> long been an option to generate hydrogen atoms for refinement but not write
> them out to the PDB file. Like Ethan, I believe this is best practice. Of
> course, in case hydrogen atoms have been curated, one may leave them in for
> deposition. It is not useful to see all the H-atoms in Coot, and chemists omit
> hydrogen atoms as well, even in 2D drawings.
>
> @Matthew Whitley: Adding hydrogen atoms in calculated (riding)  positions
  

Re: [ccp4bb] Hydrogens in PDB File

2020-03-02 Thread Dale Tronrud
Dear Tim,

   I am in agreement with Ethan and you that a complete description of
the restraints and constraints applied to the model should be included
in the deposition.  This is currently a major failing of the wwPDB.  For
hydrogen atoms we, at least, have the "Riding hydrogen atoms were added"
remark but that simple statement is inadequate to allow anyone (or
program) to reproduce what the depositor had on disk before the hydrogen
atoms were redacted.  We know that shelxl and MolProbity produce
hydrogen models that differ, and that shelxl requires additional
information about the temperature of the molecule at least.

   How could someone hope to develop a better technique for generating
hydrogen atom models if the results could never be deposited and used?

   There is an additional matter of practical importance.  While the two
of us share a lack of confidence in the care taken by some depositors in
the creation of hydrogen atoms, I believe the PDB customers are even
worse.  If a crystallographer or microscopist should not be trusted to
add hydrogen atoms should we expect an undergrad or, maybe, a high
school student to do better?

   When someone downloads a model they expect they will be able to use
that model without performing a host of technical manipulations just to
be able to see where the depositor thought the atoms were located.  We
should certainly give them enough information to understand how those
atoms were placed (and we are failing at that), but anyone should be
able to fire up Coot, load a PDB and map, and make some sense of it.
Maybe someday Coot will be able to automatically generate hydrogen
atoms, but currently the files do not contain enough information for it
to do a reasonable job.

   If hydrogen atoms are to be deleted because they can, sort of, be
recalculated, there are other aspects of the PDB file that also could be
removed.  I think I could do a pretty good job of resurrecting deleted
CB atoms for any of the nineteen amino acids that contain them.  Should
we just drop all CB's and add a remark saying that their locations can
be inferred from the deposited atoms?
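
To make the CB thought experiment concrete, here is a minimal sketch of rebuilding an idealized CB position from the N, CA and C coordinates of one residue. The three coefficients are values commonly seen in structure-prediction code and are an assumption of this sketch, not something taken from this thread or from any refinement program; the function name is hypothetical. Glycine has no CB, so a caller would skip it.

```python
import math


def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def ideal_cb(n, ca, c):
    """Return an idealized CB position from backbone N, CA, C coordinates.

    The position is a fixed linear combination of the N->CA vector, the
    CA->C vector and their cross product, added to CA. The coefficients
    are assumed values for ideal tetrahedral geometry.
    """
    b = _sub(ca, n)
    cv = _sub(c, ca)
    a = _cross(b, cv)
    return tuple(-0.58273431 * a[i] + 0.56802827 * b[i]
                 - 0.54067466 * cv[i] + ca[i] for i in range(3))
```

A position rebuilt this way carries no information about real distortions at CA, so a file with only recalculated CB atoms would lose whatever deviations refinement against the data had found.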

Dale Tronrud

P.S. I realize that I am open to charges of inconsistency since I have
advocated not depositing an atomic model for atoms that weren't placed
by the depositor (i.e. disordered side chains).  I don't believe I'm
committing this sin.  I'm just saying if the depositor comes up with
locations for atoms they should be deposited.  If the location of an
atom is not known it should not be deposited.  I do not have a desire
for completeness for completeness' sake, just a complete listing of all
the atoms placed by the depositor.  Let that high school student see our
work in all its glory!


On 3/1/2020 4:53 AM, Tim Gruene wrote:
> Dear Dale,
> 
> your last sentence is of great importance:
> 
> "leaving the (hopefully) manually inspected and curated Hydrogen atoms in
> the deposited PDB"
> 
> I believe this hope is unrealistic. Most people probably do not think about or
> understand what refinement programs do with hydrogen atoms. In Refmac5 it has
> long been an option to generate hydrogen atoms for refinement but not write
> them out to the PDB file. Like Ethan, I believe this is best practice. Of
> course, in case hydrogen atoms have been curated, one may leave them in for
> deposition. It is not useful to see all the H-atoms in Coot, and chemists omit
> hydrogen atoms as well, even in 2D drawings.
>
> @Matthew Whitley: Adding hydrogen atoms in calculated (riding) positions
> should be rather independent of the resolution of the data, since their major
> role is in improving anti-bumping restraints, and since their major
> contribution to the diffraction data is in the low-resolution data.
> 
> Best,
> Tim
> 
> 
> On Sunday, March 1, 2020 9:26:29 AM CET Dale Tronrud wrote:
>> Dear Ethan,
>>
>>To move away from an abstract discussion of hydrogen atoms I'd like
>> to describe a concrete example.  In 2008 I deposited a model of the FMO
>> (Bacteriochlorophyll containing) protein.  The ID code is 3EOJ.  The
>> model was refined to a data set cut off at 1.3 A resolution using the
>> criteria of the day.  I used shelxl for the final stage of refinement
>> and added riding hydrogen atoms to the mix.  When I deposited the model
>> I succumbed to peer pressure and removed the hydrogen atoms.
>>
>>If you look at the map calculated by the Electron Density Server you
>> will see many peaks in the Fo-Fc map indicating the missing hydrogen
>> atoms.  (I have attached a screen-shot from Coot but I recommend that
>> you fire up Coot and explore the map yourself.)  In my picture you can
>> see the three peaks around a methyl group.  Above and to the left is the
>> peak for the hydrogen of a CH bridging atom in the Bacteriochlorophyll-a
>> ring.  To the right and in the distance is a peak for the hydrogen of a
>> CH2 group.  Not every hydrogen is represented by a positive peak, but
>> they exist 

Re: [ccp4bb] Hydrogens in PDB File

2020-03-01 Thread Tim Gruene
Dear Dale,

your last sentence is of great importance:

"leaving the (hopefully) manually inspected and curated Hydrogen atoms in
the deposited PDB"

I believe this hope is unrealistic. Most people probably do not think about or
understand what refinement programs do with hydrogen atoms. In Refmac5 it has
long been an option to generate hydrogen atoms for refinement but not write
them out to the PDB file. Like Ethan, I believe this is best practice. Of
course, in case hydrogen atoms have been curated, one may leave them in for
deposition. It is not useful to see all the H-atoms in Coot, and chemists omit
hydrogen atoms as well, even in 2D drawings.

@Matthew Whitley: Adding hydrogen atoms in calculated (riding) positions
should be rather independent of the resolution of the data, since their major
role is in improving anti-bumping restraints, and since their major
contribution to the diffraction data is in the low-resolution data.

Best,
Tim


On Sunday, March 1, 2020 9:26:29 AM CET Dale Tronrud wrote:
> Dear Ethan,
> 
>To move away from an abstract discussion of hydrogen atoms I'd like
> to describe a concrete example.  In 2008 I deposited a model of the FMO
> (Bacteriochlorophyll containing) protein.  The ID code is 3EOJ.  The
> model was refined to a data set cut off at 1.3 A resolution using the
> criteria of the day.  I used shelxl for the final stage of refinement
> and added riding hydrogen atoms to the mix.  When I deposited the model
> I succumbed to peer pressure and removed the hydrogen atoms.
> 
>If you look at the map calculated by the Electron Density Server you
> will see many peaks in the Fo-Fc map indicating the missing hydrogen
> atoms.  (I have attached a screen-shot from Coot but I recommend that
> you fire up Coot and explore the map yourself.)  In my picture you can
> see the three peaks around a methyl group.  Above and to the left is the
> peak for the hydrogen of a CH bridging atom in the Bacteriochlorophyll-a
> ring.  To the right and in the distance is a peak for the hydrogen of a
> CH2 group.  Not every hydrogen is represented by a positive peak, but
> they exist throughout the map.  This Fo-Fc map is useless for the
> purpose of assessing the quality of my model, since the true residuals
> are hidden among all these hydrogen peaks.
> 
>A critic might say that these peaks are simply the result of the
> model being biased toward the presence of hydrogen atoms and therefore
> an artifact.  A model refined to this data set w/o hydrogen atoms would
> not likely show peaks indicating that hydrogen atoms need to be built.
> 
>I would say that the map calculated from a Hydrogen-free model is the
> biased one.  I am 99% confident in the location of most of the riding
> hydrogen atoms and leaving them out results in a model that is
> fantastically unlikely.  The absence of peaks in an apo map is a flaw in
> that map.  I would describe it as "vacuum bias".  "Biasing" a model
> toward reality is not a problem.
> 
>This example shows that the current PDB is incompatible with models
> whose Hydrogen atoms have been stripped.  To get proper maps and
> validation reports one has to either preserve the Hydrogen atoms in the
> model, or modify all the software that uses coordinate files to add the
> hydrogen atoms back in.  That is a major programming task, which the
> authors have, apparently, been unwilling to do.
> 
>I will go further and disagree with you that even this is a solution.
>  It is very difficult to add the Hydrogen atoms back into 3EOJ, and I
> expect this difficulty is the reason software has not been written that
> successfully does it.
> 
>There are two major problems to be overcome in 3EOJ.  shelxl has an
> option to twirl the methyl groups and select the torsion angle with the
> best fit to the map.  The hydrogen atoms in the pictured methyl group
> weren't built as staggered -- All values for the torsion angle were
> tested and it happens that the best fit placed them in a staggered
> conformation.  That is a much more interesting result.  There are other
> methyl groups around the edges of the Bchl-a molecules that are crowded
> and the methyl groups are observed to have torsion angles that are not
> standard for riding Hydrogen atoms.  The neighboring methyl groups avoid
> H-H bumps by twisting and that twist can be detected by shelxl in the
> 1.3 A data.
> 
>The second problem is the matter of Histidine residues.  There are
> two Nitrogen atoms in the side chain.  A hydrogen atom could be on
> either one, and sometimes both have hydrogens.  A very clever program
> could work out from the hydrogen bonding pattern the most likely
> placement, but I've not seen any program that is very good with hydrogen
> bonding networks.  Worse still, I've often seen programs place the
> hydrogen atom *between* the Nitrogen and Magnesium atoms of a Histidine
> ligand to a Bacteriochlorophyll a.  This mistake will certainly lead to
> very bad geometry!

Re: [ccp4bb] Hydrogens in PDB File

2020-03-01 Thread Robbie Joosten
Hi Dale,

You make very valid points and there are good reasons to keep the refined
hydrogen positions (methyl twists and protonation of HIS are good examples).
There is a way of distinguishing refined and modelled hydrogens in mmCIF and we
should start using that.
About protonation of histidines: WHAT IF does quite a decent job, although
there is room for improvement around ligands.
About maps from EDS: These were (and perhaps are) made by running 0 cycles of
unrestrained refinement in Refmac. Unrestrained refinement takes away the need
to sort out restraints, which was an absolute nightmare at the time. A
side effect of unrestrained refinement is that Refmac cannot (could not) add
hydrogens. PDB-REDO needs restraints anyway, so it does 0 cycles of restrained
refinement to generate the first maps and adds hydrogens in the process. This
addition is not that sophisticated, but it should make the maps better. Have a
look: https://pdb-redo.eu/db/3eoj.zip

Cheers,
Robbie

On 1 Mar 2020 09:26, Dale Tronrud  wrote:

Dear Ethan,

   To move away from an abstract discussion of hydrogen atoms I'd like
to describe a concrete example.  In 2008 I deposited a model of the FMO
(Bacteriochlorophyll containing) protein.  The ID code is 3EOJ.  The
model was refined to a data set cut off at 1.3 A resolution using the
criteria of the day.  I used shelxl for the final stage of refinement
and added riding hydrogen atoms to the mix.  When I deposited the model
I succumbed to peer pressure and removed the hydrogen atoms.

   If you look at the map calculated by the Electron Density Server you
will see many peaks in the Fo-Fc map indicating the missing hydrogen
atoms.  (I have attached a screen-shot from Coot but I recommend that
you fire up Coot and explore the map yourself.)  In my picture you can
see the three peaks around a methyl group.  Above and to the left is the
peak for the hydrogen of a CH bridging atom in the Bacteriochlorophyll-a
ring.  To the right and in the distance is a peak for the hydrogen of a
CH2 group.  Not every hydrogen is represented by a positive peak, but
they exist throughout the map.  This Fo-Fc map is useless for the
purpose of assessing the quality of my model, since the true residuals
are hidden among all these hydrogen peaks.

   A critic might say that these peaks are simply the result of the
model being biased toward the presence of hydrogen atoms and therefore
an artifact.  A model refined to this data set w/o hydrogen atoms would
not likely show peaks indicating that hydrogen atoms need to be built.

   I would say that the map calculated from a Hydrogen-free model is the
biased one.  I am 99% confident in the location of most of the riding
hydrogen atoms and leaving them out results in a model that is
fantastically unlikely.  The absence of peaks in an apo map is a flaw in
that map.  I would describe it as "vacuum bias".  "Biasing" a model
toward reality is not a problem.

   This example shows that the current PDB is incompatible with models
whose Hydrogen atoms have been stripped.  To get proper maps and
validation reports one has to either preserve the Hydrogen atoms in the
model, or modify all the software that uses coordinate files to add the
hydrogen atoms back in.  That is a major programming task, which the
authors have, apparently, been unwilling to do.

   I will go further and disagree with you that even this is a solution.
It is very difficult to add the Hydrogen atoms back into 3EOJ, and I
expect this difficulty is the reason software has not been written that
successfully does it.

   There are two major problems to be overcome in 3EOJ.  shelxl has an
option to twirl the methyl groups and select the torsion angle with the
best fit to the map.  The hydrogen atoms in the pictured methyl group
weren't built as staggered -- All values for the torsion angle were
tested and it happens that the best fit placed them in a staggered
conformation.  That is a much more interesting result.  There are other
methyl groups around the edges of the Bchl-a molecules that are crowded
and the methyl groups are observed to have torsion angles that are not
standard for riding Hydrogen atoms.  The neighboring methyl groups avoid
H-H bumps by twisting and that twist can be detected by shelxl in the
1.3 A data.

   The second problem is the matter of Histidine residues.  There are
two Nitrogen atoms in the side chain.  A hydrogen atom could be on
either one, and sometimes both have hydrogens.  A very clever program
could work out from the hydrogen bonding pattern the most likely
placement, but I've not seen any program that is very good with hydrogen
bonding networks.  Worse still, I've often seen programs place the
hydrogen atom *between* the Nitrogen and Magnesium atoms of a Histidine
ligand to a Bacteriochlorophyll a.  This mistake will certainly lead to
very bad geometry!
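
   As a minimal sketch of the bookkeeping involved, the snippet below
enumerates the ring-nitrogen protonation states of a Histidine and drops
any state that would put a hydrogen on a nitrogen coordinating a metal.
The function name and the flags are hypothetical, the HID/HIE/HIP labels
are borrowed from common force-field naming, and choosing among the
surviving states from the hydrogen bonding network is deliberately left
out, since that is the hard part.

```python
def histidine_protonation_candidates(nd1_binds_metal=False, ne2_binds_metal=False):
    """Return {state: [hydrogen names]} for states compatible with metal ligation.

    A nitrogen that coordinates a metal (e.g. the Mg of a
    Bacteriochlorophyll) cannot also carry a hydrogen, so any state that
    protonates it is excluded.
    """
    # Which ring nitrogens carry a hydrogen in each state.
    states = {
        "HID": {"ND1"},           # neutral, H on N-delta
        "HIE": {"NE2"},           # neutral, H on N-epsilon
        "HIP": {"ND1", "NE2"},    # charged, H on both
    }
    hydrogen_name = {"ND1": "HD1", "NE2": "HE2"}

    blocked = set()
    if nd1_binds_metal:
        blocked.add("ND1")
    if ne2_binds_metal:
        blocked.add("NE2")

    return {
        state: sorted(hydrogen_name[n] for n in nitrogens)
        for state, nitrogens in states.items()
        if not (nitrogens & blocked)
    }
```

   For a Histidine whose NE2 ligates the Magnesium, only the state with
the hydrogen on ND1 survives, which is exactly the case the programs
mentioned above get wrong.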

   Until a hydrogenation program is written that can handle all
ligands, all hydrogen bonding networks (even overlapping 

Re: [ccp4bb] Hydrogens in PDB File

2020-02-28 Thread Ethan A Merritt
Matthew:

I think your nice summary leaves out an important point that has not been
explicitly mentioned.  That is the question of whether depositing hydrogens
actually adds information to the model. I submit that for a typical protein
refinement it does not.  The model is adequately described by saying
"hydrogens were added in their riding positions". This, together with 
knowledge of the refinement program used, is sufficient to reconstruct
the full model.  

This is an example of a recurring concern of mine that model validation
should include consideration of whether the model is overly complex.
Unless you have an abundance of data (which admittedly your 1.0Å case
might) there are insufficient observations to refine 3 positional parameters
for each hydrogen as if they were free variables.  We typically bypass this
by instead using the riding hydrogen model, which adds effectively a single 
on/off parameter for the entire model (plus a small number of implicit
parameters that describe the ideal riding geometry, but those are 
normally taken as a priori knowledge rather than free variables).
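
To put rough numbers on this argument, here is a small counting sketch.
The function names and the atom and reflection counts in the usage note
are made up for illustration; only the "3 positional parameters per freely
refined atom" rule comes from the paragraph above, and the single on/off
switch of the riding model is counted as adding no refined positional
parameters.

```python
def positional_parameters(n_heavy, n_hydrogen, riding=True):
    """Count refined positional parameters (x, y, z per freely refined atom)."""
    heavy = 3 * n_heavy
    # Riding hydrogens follow their parent atoms and add no free coordinates.
    hydrogens = 0 if riding else 3 * n_hydrogen
    return heavy + hydrogens

def observations_per_parameter(n_reflections, n_parameters):
    """Crude data-to-parameter ratio, ignoring restraints and ADPs."""
    return n_reflections / n_parameters
```

For a hypothetical model with 1000 non-H atoms and 1000 hydrogens,
positional_parameters(1000, 1000) gives 3000, while riding=False gives
6000, so freeing the hydrogens doubles the positional parameters without
adding a single observation.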

So I find deposition of hydrogens for a typical resolution structure to
be more misleading than useful.  The correct, parsimonious, description is the
one-line statement that a riding hydrogen model was used.

It is tangential to your question, but I hold the same view about depositing
ANISOU records for a structure when the source of the anisotropy is solely
a TLS model, either with or without individual Biso contributions.
The parsimonious description is to give the TLS parameters and the Biso
component, if any.   These can be expanded to regenerate per-atom
ANISOU parameters if desired by a downstream program.  
If you deposit ANISOU records it implies that the Uij terms they describe
are free variables, but they are not.  (or anyway IMHO they should not be,
although PHENIX can violate this stricture).
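
The expansion mentioned above can be sketched as follows, using the usual
rigid-body relation U = T + A L A^T + A S + S^T A^T, where A is built from
the atom position relative to the TLS origin. This is a minimal sketch
under stated assumptions, not the code of any refinement program: the
function name is hypothetical, L is assumed to be in rad^2 and S in A*rad
(values quoted in degrees would need converting first), and the sign
convention for the S terms should be checked against the program that
produced the tensors.

```python
import math

def _matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def _transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def tls_to_u(T, L, S, atom_xyz, origin, b_iso=0.0):
    """Per-atom U (3x3, A^2) from TLS tensors plus an optional residual Biso.

    T is in A^2, L in rad^2, S in A*rad (assumed units). `b_iso` is the
    individual isotropic contribution, added as B / (8 pi^2) on the diagonal.
    """
    x, y, z = (atom_xyz[i] - origin[i] for i in range(3))
    # A maps a small rotation vector lam to the displacement lam x r.
    A = [[0.0,   z,  -y],
         [ -z, 0.0,   x],
         [  y,  -x, 0.0]]
    At = _transpose(A)
    ALAt = _matmul(_matmul(A, L), At)
    AS = _matmul(A, S)
    StAt = _matmul(_transpose(S), At)
    u_iso = b_iso / (8.0 * math.pi ** 2)
    return [[T[i][j] + ALAt[i][j] + AS[i][j] + StAt[i][j]
             + (u_iso if i == j else 0.0)
             for j in range(3)] for i in range(3)]
```

A downstream program that wants ANISOU-style values could call this for
each atom in a TLS group; the point is that the six Uij per atom are fully
determined by the group tensors and the Biso, not free variables.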

My view is that for a typical structure (i.e. worse than say 1Å resolution data)
depositing hydrogen positions and ANISOU records at best does no harm. 
Unfortunately it implies a statistically unjustifiable model treatment.
The justifiable model is adequately described by the small number of
parameters in the header records;  the hydrogen coordinates and ANISOU
parameters are redundant dross.  

I fully understand that your original question was driven by cases where
you do have very high resolution data and so the statistical justification of
refining individual hydrogens or anisotropic ADPs enters a different realm.

Ethan


On Friday, 28 February 2020 20:22:17 PST Whitley, Matthew J wrote:
> Dear all,
> 
> I want to thank everyone who responded to my query about whether or not to
> include hydrogens in PDB depositions when they were explicitly included in
> the model during refinement.  In addition to the replies posted to this
> bulletin board, I received numerous replies sent directly to my email
> address.
> 
> To clarify one more time for casual readers so that we are all on the same
> page: because these two structures happen to be at high resolution (1.0 and
> 1.2 Å, respectively), I decided to include explicit hydrogens in the model
> for refinement, as recommended by the documentation for both Phenix and
> Buster, which I used for these refinements.  For the Phenix refinements,
> hydrogens were added by phenix.ready_set, whereas for the Buster
> refinements the hydrogenate tool was used.  My understanding is that both
> of these eventually call the reduce tool from MolProbity.  Unsurprisingly,
> the presence of hydrogens on the model led to both better model geometry
> and lower R-factors, although at these resolutions there is no observable
> density for the vast majority of the H-atoms in any of the refined maps.
> 
> Because the presence of the hydrogens improved the model, I have decided to
> leave the hydrogens at their refined positions for deposition.
> 
> I do want to point out one thing for readers interested in this topic: based
> on all the replies I received, there are a number of differing opinions
> (and therefore different practices) as to whether hydrogens should be
> included in deposited structures.  The expressed opinions ranged from the
> ethical (if the hydrogens were there for refinement, then it’s only fair
> that they be present in the deposited structure so that downstream users
> know what went into generating the reported statistics) to the practical
> (if the paper’s conclusions don’t rely on any arguments based on hydrogen
> atom positions, then there’s no compelling reason for them to be there;
> include them or don’t, it doesn’t matter.)  Because opinions seem to vary,
> perhaps it would be worthwhile for the PDB to issue some guidance on the
> matter for the future.
> 
> Have a nice weekend, everyone.
> 
> Matthew
> 
> ---
> Matthew J. Whitley, Ph.D.
> Research Instructor
> Department of Pharmacology & Chemical Biology
> University of Pittsburgh School of 

Re: [ccp4bb] Hydrogens in PDB File

2020-02-28 Thread Alexander Aleshin
Dear Diana and CCP4BB subscribers,

I apologize for the mess created by our recent communication with Diana. The 
misunderstanding occurred because my replies to “ALL” went to senders, but were 
not posted at the CCP4BB site. This happened because my email address changed 
recently. As a result, Diana could only see my message with comments inserted 
by Ethan, which was hard to understand.

In reality, I agree with Diana that riding hydrogens should not be omitted 
from the PDB submission if they were used during refinement. Omitting them would 
raise questions about the submitted structure, because the Depositor refinement 
statistics would differ from those calculated from the submitted structure 
missing the hydrogens.
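
A quick way to spot this situation in a PDB-format file is sketched below.
It assumes standard fixed-column ATOM/HETATM records with the element
symbol in columns 77-78, and a REMARK 3 line mentioning riding hydrogens;
the function names are hypothetical and mmCIF files would need a different
reader.

```python
def is_hydrogen_record(line):
    """True for ATOM/HETATM records whose element field is H or D."""
    if not line.startswith(("ATOM  ", "HETATM")):
        return False
    element = line[76:78].strip().upper()   # columns 77-78, right-justified
    return element in ("H", "D")

def summarize_hydrogens(pdb_text):
    """Count hydrogen records and look for a riding-hydrogen REMARK 3 line."""
    lines = pdb_text.splitlines()
    n_hydrogen = sum(1 for line in lines if is_hydrogen_record(line))
    riding_remark = any(
        line.startswith("REMARK   3") and "RIDING" in line.upper()
        for line in lines
    )
    return {"hydrogen_atoms": n_hydrogen, "riding_remark": riding_remark}
```

A file that reports riding_remark as True but hydrogen_atoms as 0 is one
where the hydrogens were used in refinement and then stripped, so the
recalculated statistics can be expected to differ from the Depositor ones.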

Regards,
Alex


From: Diana Tomchick 
Date: Friday, February 28, 2020 at 12:34 PM
To: Alexander Aleshin 
Cc: CCP4 
Subject: Re: [ccp4bb] Hydrogens in PDB File

My structure contains riding hydrogens because (surprise!) the protein contains 
hydrogens, no matter the resolution of the data. Hydrogens should contribute to 
the calculated structure factors, because they do contribute to the observed 
structure factors. If I delete the hydrogens from the coordinates and re-refine 
the model (even keeping the B-factors constant), the geometry will be worse.

When using Refmac, you can opt to explicitly write out the riding hydrogen 
atoms to the final mmCIF file, why don’t you try that and see if it works for 
you?

From the Phenix documentation web site 
(https://www.phenix-online.org/documentation/reference/refine_gui.html)

Optimization methods and other 
options<https://www.phenix-online.org/documentation/reference/refine_gui.html#id9>
Several other options are available for the types of optimization used; these 
typically apply globally, except as noted.

  *   Automatically add hydrogens to model runs phenix.ready_set to add 
hydrogen atoms (or deuteriums, if performing neutron crystallography) where 
appropriate. This usually only affects the R-factors at high resolution, but 
can be very helpful for improving geometry at any resolution. We recommend 
using explicit hydrogens on protein, nucleic acid, and ligand molecules 
throughout refinement. (We do not recommend adding hydrogens to waters unless 
you have exceptionally high resolution.) Hydrogen atoms will still be defined 
using the "riding" model unless otherwise requested, so they do not add 
parameters during refinement. (Note that this option can be left on if you 
already have hydrogen atoms in place and are refining as "riding"; if you are 
refining against neutron data and/or allowing hydrogen atoms to refine 
individually, you should uncheck the box, as it will otherwise replace the 
existing atoms.)

Diana

**
Diana R. Tomchick
Professor
Departments of Biophysics and Biochemistry
UT Southwestern Medical Center
5323 Harry Hines Blvd.
Rm. ND10.214A
Dallas, TX 75390-8816
diana.tomch...@utsouthwestern.edu
(214) 645-6383 (phone)
(214) 645-6353 (fax)


Re: [ccp4bb] Hydrogens in PDB File

2020-02-28 Thread Diana Tomchick
My structure contains riding hydrogens because (surprise!) the protein contains 
hydrogens, no matter the resolution of the data. Hydrogens should contribute to 
the calculated structure factors, because they do contribute to the observed 
structure factors. If I delete the hydrogens from the coordinates and re-refine 
the model (even keeping the B-factors constant), the geometry will be worse.
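
To make the point concrete, here is a deliberately crude one-dimensional
toy, not what Refmac or Phenix actually compute. The "atoms" are point
scatterers with made-up fractional coordinates and weights roughly equal
to their electron counts, the "observed" amplitudes are simply calculated
from the full model, and a single overall scale factor is applied. All
names and numbers are assumptions for illustration.

```python
import cmath
import math

def structure_factor_amplitudes(atoms, h_max):
    """|F(h)| for h = 1..h_max from (fractional_x, weight) point atoms."""
    return [
        abs(sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms))
        for h in range(1, h_max + 1)
    ]

def r_factor(f_obs, f_calc):
    """R = sum| |Fo| - k|Fc| | / sum|Fo| with one overall scale k."""
    k = sum(f_obs) / sum(f_calc)
    return sum(abs(o - k * c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)

# Hypothetical toy model: four heavy atoms and three hydrogens.
heavy = [(0.10, 6.0), (0.25, 7.0), (0.40, 6.0), (0.62, 8.0)]
hydrogens = [(0.05, 1.0), (0.31, 1.0), (0.45, 1.0)]

f_obs = structure_factor_amplitudes(heavy + hydrogens, h_max=20)
r_with_h = r_factor(f_obs, structure_factor_amplitudes(heavy + hydrogens, 20))
r_without_h = r_factor(f_obs, structure_factor_amplitudes(heavy, 20))
```

Here r_with_h is zero by construction and r_without_h is not, even though
nothing about the heavy atoms changed. The size of the effect in a real
structure depends on resolution, B-factors and solvent modelling, none of
which this toy includes.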

When using Refmac, you can opt to explicitly write out the riding hydrogen 
atoms to the final mmCIF file, why don’t you try that and see if it works for 
you?

From the Phenix documentation web site 
(https://www.phenix-online.org/documentation/reference/refine_gui.html)

Optimization methods and other 
options<https://www.phenix-online.org/documentation/reference/refine_gui.html#id9>

Several other options are available for the types of optimization used; these 
typically apply globally, except as noted.

  *   Automatically add hydrogens to model runs phenix.ready_set to add 
hydrogen atoms (or deuteriums, if performing neutron crystallography) where 
appropriate. This usually only affects the R-factors at high resolution, but 
can be very helpful for improving geometry at any resolution. We recommend 
using explicit hydrogens on protein, nucleic acid, and ligand molecules 
throughout refinement. (We do not recommend adding hydrogens to waters unless 
you have exceptionally high resolution.) Hydrogen atoms will still be defined 
using the "riding" model unless otherwise requested, so they do not add 
parameters during refinement. (Note that this option can be left on if you 
already have hydrogen atoms in place and are refining as "riding"; if you are 
refining against neutron data and/or allowing hydrogen atoms to refine 
individually, you should uncheck the box, as it will otherwise replace the 
existing atoms.)

Diana

**
Diana R. Tomchick
Professor
Departments of Biophysics and Biochemistry
UT Southwestern Medical Center
5323 Harry Hines Blvd.
Rm. ND10.214A
Dallas, TX 75390-8816
diana.tomch...@utsouthwestern.edu
(214) 645-6383 (phone)
(214) 645-6353 (fax)

On Feb 28, 2020, at 2:23 PM, Alexander Aleshin <aales...@sbpdiscovery.org> wrote:

Diana,
Your structure 6PWD contains riding hydrogens even though the resolution is 
only 2.5 Å. So, the lack of differences between Depositor and DCC values does 
not contradict my point. Now, try to delete the hydrogens and recalculate 
R/Rfree without re-refining B-factors. You’ll see a noticeable difference. It 
is because hydrogens contribute to the calculated structure factors.

Alex

From: CCP4 bulletin board <CCP4BB@JISCMAIL.AC.UK> on behalf of Diana Tomchick <diana.tomch...@utsouthwestern.edu>
Reply-To: Diana Tomchick <diana.tomch...@utsouthwestern.edu>
Date: Friday, February 28, 2020 at 11:52 AM
To: "CCP4BB@JISCMAIL.AC.UK" <CCP4BB@JISCMAIL.AC.UK>
Subject: Re: [ccp4bb] Hydrogens in PDB File

Here is just one of many examples that illustrates what I was trying to say 
about deposition of the final mmCIF file from the final output of Phenix 
refinement. This is from the freely downloadable PDB validation report for PDB 
ID 6PWD. Search the PDB on my last name and download several of the validation 
reports, and if you find one that diverges a lot between the Depositor and DCC 
values, let me know. The two should be so close that any difference is 
negligible, at least for depositions within the last 3-5 years (older ones may 
indeed diverge a lot, but we can’t all be held responsible for the outputs of 
now-obsolete software).

Diana





To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=CCP4BB=1


Re: [ccp4bb] Hydrogens in PDB File

2020-02-28 Thread Ethan A Merritt
On Friday, 28 February 2020 11:19:37 PST Diana Tomchick wrote:
> If you deposit an mmCIF file that contains both the observed and calculated
> structure factors from your final round of refinement, then the PDB
> auto-validation reports the same (or so close to the same as to be
> negligible) R factors.

Yes, that's the "Depositor" line in the validation report.
The separate "DCC" line tries to recalculate from the model description.
It matches quite well if your model matches its default expectations,
but otherwise it can diverge.  It's a useful check that you haven't made
some inadvertent mistake in preparing the deposition files, but beyond
that it is less useful than, say, confirming that the model refinement
can be reproduced by feeding it to PDB-Redo.

Ethan


 
> Phenix outputs all of this automatically for you if you click the correct
> radio button in the GUI.
 
> Diana
> 

Re: [ccp4bb] Hydrogens in PDB File

2020-02-28 Thread Diana Tomchick
If you deposit an mmCIF file that contains both the observed and calculated 
structure factors from your final round of refinement, then the PDB 
auto-validation reports the same (or so close to the same as to be negligible) 
R factors.

Phenix outputs all of this automatically for you if you click the correct radio 
button in the GUI.

Diana

**
Diana R. Tomchick
Professor
Departments of Biophysics and Biochemistry
UT Southwestern Medical Center
5323 Harry Hines Blvd.
Rm. ND10.214A
Dallas, TX 75390-8816
diana.tomch...@utsouthwestern.edu
(214) 645-6383 (phone)
(214) 645-6353 (fax)


Re: [ccp4bb] Hydrogens in PDB File

2020-02-28 Thread Ethan A Merritt
On Thursday, 27 February 2020 16:34:50 PST Alexander Aleshin wrote:
> Ethan wrote:
> - If you are not making claims about hydrogens but just want to
> describe what you did during refinement, I'd go with taking them out
> 
> I've noticed that REFMAC and Phenix use riding hydrogens to calculate the
> refinement statistics, and their exclusion affects R/Rfree. As a result, it
> is not clear what values should be reported. 

> In my opinion, riding hydrogens play the same role as the TLS parameters,
> which we keep in a PDB submission. So, I am not convinced their omission is
> a good idea. I think PDB curators should provide guidance on how to deal
> with issues like the resolution of anisotropic or incomplete data sets,
> riding hydrogens, sequence numbering, etc.

Alex:

You are right that the PDB auto-validation step of recalculating R factors
from the deposited model and observed F's is far from perfect.
I have not looked at the DCC source code, but my impression from the 
R factors it spits back at me during deposition:
- it ignores TLS records
- it ignores the header record specifying choice of solvent model
- it does use scattering factors f' and f" from the mmcif coordinate file
- I have no idea what it does with twinning descriptions

As a result there is often a noticeable discrepancy between the R-factors
from "Depositor" and "DCC" in the validation reports.
 
> Regards,
> Alex 
> 
>   
> 
> 
> On 2/27/20, 4:05 PM, "CCP4 bulletin board on behalf of Ethan A Merritt"
> wrote:
> 
> On Thursday, 27 February 2020 15:35:05 PST Whitley, Matthew J wrote:
> 
> > Hello all,
> >
> > I am nearly finished refining the structures of two mutant proteins from
> > crystals that diffracted to very high resolution, 1 Å and 1.2 Å,
> > respectively.  Refinement was conducted in the presence of explicit
> > hydrogens on the models.  I am preparing to deposit these models into the
> > PDB but am unsure about whether to retain or remove the hydrogens for
> > deposition.  On one hand, these hydrogens were explicitly used during
> > refinement, so that makes me want to keep them, but on the other hand, they
> > were added at theoretical positions by MolProbity’s reduce tool for
> > refinement and were not positioned on the basis of experimentally observed
> > electron density, so that makes me want to delete them from the
> > experimental model.  Which is the preferred option for this situation?
> 
> 
> The order of operations you describe is unclear.
> 
> If you explicitly refined hydrogens then their final positions are indeed
> based on experimentally determined data.
> The fact that you initially placed them into ideal geometry is not really
> any different from the non-H atoms of individual protein residues in your
> model, whose original positions were also based on known stereochemistry.
> 
> On the other hand, if you mean that the hydrogens you used for refinement
> were deleted and replaced during validation by Molprobity (which I think it
> may do by default) that's not good.  You should rather keep the hydrogen
> positions from refinement, not the ones from Molprobity.
> 
> Assuming (since this is ccp4bb) you refined with refmac...
> - If you are at the level of investigating hydrogen positions, you may want
>   to consider taking the refinement into shelxl.
> - If you are not making claims about hydrogens but just want to describe
>   what you did during refinement, I'd go with taking them out and
>   settling for the standard record in the resulting PDB file:
>     REMARK   3  HYDROGENS HAVE BEEN ADDED IN THE RIDING POSITIONS
>   which looks like this in the corresponding mmcif file:
>     _refine.details   'Hydrogens have been added in their riding positions'
> 
> Ethan
> 
> 
> >
> >
> > Thanks,
> > Matthew
> >
> >
> >
> > ---
> > Matthew J. Whitley, Ph.D.
> > Research Instructor
> > Department of Pharmacology & Chemical Biology
> > University of Pittsburgh School of Medicine


Re: [ccp4bb] Hydrogens in PDB File

2020-02-27 Thread 00000c2488af9525-dmarc-request
I was going to say there's no harm in depositing hydrogen atom positions!
However, some structures of similar resolution that we refined with shelx
using riding H's and anisotropic B's, somewhat embarrassingly, do not seem to
have them! These were deposited about 20 years ago so my memory is vague.
Maybe there is or was some restriction on depositing H-atom coordinates, or
more probably we found that having them in the PDB file (in the pre-Coot era)
was guaranteed to crash the average graphics program of the day or at least
scramble the displayed structure for whoever downloaded the coordinates. I
suspect the latter.

Jon Cooper

On 28 Feb 2020 00:07, "Whitley, Matthew J"  wrote:

Hi Ethan, thanks for your reply.  The correct situation is the former:
hydrogens added at idealized positions *before* refinement and then subjected
to refinement along with the rest of the model.

After refinement, MolProbity (the online server) does indeed remove any
hydrogens in the PDB file and add them back at idealized positions for the
purpose of its calculations, but I am most definitely *not* talking about
depositing these post-refinement hydrogens manipulated by MolProbity.

 
Matthew
 
---
Matthew J. Whitley, Ph.D.
Research Instructor
Department of Pharmacology & Chemical Biology
University of Pittsburgh School of Medicine
 

From: Ethan A Merritt
Sent: Thursday, February 27, 2020 6:57 PM
To: Whitley, Matthew J
Cc: CCP4BB@jiscmail.ac.uk
Subject: Re: [ccp4bb] Hydrogens in PDB File

 
On Thursday, 27 February 2020 15:35:05 PST Whitley, Matthew J wrote:
> Hello all,
> 
> I am nearly finished refining the structures of two mutant proteins from
> crystals that diffracted to very high resolution, 1 Å and 1.2 Å,
> respectively.  Refinement was conducted in the presence of explicit
> hydrogens on the models.  I am preparing to deposit these models into the
> PDB but am unsure about whether to retain or remove the hydrogens for
> deposition.  On one hand, these hydrogens were explicitly used during
> refinement, so that makes me want to keep them, but on the other hand, they
> were added at theoretical positions by MolProbity’s reduce tool for
> refinement and were not positioned on the basis of experimentally observed
> electron density, so that makes me want to delete them from the
> experimental model.  Which is the preferred option for this situation?

The order of operations you describe is unclear.

If you explicitly refined hydrogens then their final positions are indeed
based on experimentally determined data.
The fact that you initially placed them into ideal geometry is not really
any different from the non-H atoms of individual protein residues in your
model, whose original positions were also based on known stereochemistry.

On the other hand, if you mean that the hydrogens you used for refinement
were deleted and replaced during validation by Molprobity (which I think it
may do by default) that's not good.  You should rather keep the hydrogen
positions from refinement, not the ones from Molprobity.

Assuming (since this is ccp4bb) you refined with refmac...
- If you are at the level of investigating hydrogen positions, you may want
to consider taking the refinement into shelxl.  
- If you are not making claims about hydrogens but just want to describe
what you did during refinement, I'd go with taking them out and settling
for the standard record in the resulting PDB file:
  REMARK   3  HYDROGENS HAVE BEEN ADDED IN THE RIDING POSITIONS
which looks like this in the corresponding mmcif file:
  _refine.details   'Hydrogens have been added in their riding positions'

    Ethan

> 
> Thanks,
> Matthew
> 
> ---
> Matthew J. Whitley, Ph.D.
> Research Instructor
> Department of Pharmacology & Chemical Biology
> University of Pittsburgh School of Medicine


-- 
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
MS 357742,   University of Washington, Seattle 98195-7742


 







Re: [ccp4bb] Hydrogens in PDB File

2020-02-27 Thread Whitley, Matthew J
Hi Ethan, thanks for your reply.  The correct situation is the former: 
hydrogens added at idealized positions *before* refinement and then subjected 
to refinement along with the rest of the model.

After refinement, MolProbity (the online server) does indeed remove any 
hydrogens in the PDB file and add them back at idealized positions for the 
purpose of its calculations, but I am most definitely *not* talking about 
depositing these post-refinement hydrogens manipulated by MolProbity.

Matthew

---
Matthew J. Whitley, Ph.D.
Research Instructor
Department of Pharmacology & Chemical Biology
University of Pittsburgh School of Medicine

From: Ethan A Merritt
Sent: Thursday, February 27, 2020 6:57 PM
To: Whitley, Matthew J
Cc: CCP4BB@jiscmail.ac.uk
Subject: Re: [ccp4bb] Hydrogens in PDB File

On Thursday, 27 February 2020 15:35:05 PST Whitley, Matthew J wrote:
> Hello all,
>
> I am nearly finished refining the structures of two mutant proteins from
> crystals that diffracted to very high resolution, 1 Å and 1.2 Å,
> respectively.  Refinement was conducted in the presence of explicit
> hydrogens on the models.  I am preparing to deposit these models into the
> PDB but am unsure about whether to retain or remove the hydrogens for
> deposition.  On one hand, these hydrogens were explicitly used during
> refinement, so that makes me want to keep them, but on the other hand, they
> were added at theoretical positions by MolProbity’s reduce tool for
> refinement and were not positioned on the basis of experimentally observed
> electron density, so that makes me want to delete them from the
> experimental model.  Which is the preferred option for this situation?

The order of operations you describe is unclear.

If you explicitly refined hydrogens then their final positions are indeed
based on experimentally determined data.
The fact that you initially placed them into ideal geometry is not really
any different from the non-H atoms of individual protein residues in your
model, whose original positions were also based on known stereochemistry.

On the other hand, if you mean that the hydrogens you used for refinement
were deleted and replaced during validation by Molprobity (which I think it
may do by default) that's not good.  You should rather keep the hydrogen
positions from refinement, not the ones from Molprobity.

Assuming (since this is ccp4bb) you refined with refmac...
- If you are at the level of investigating hydrogen positions, you may want
to consider taking the refinement into shelxl.
- If you are not making claims about hydrogens but just want to describe
what you did during refinement, I'd go with taking them out and settling
for the standard record in the resulting PDB file:
  REMARK   3  HYDROGENS HAVE BEEN ADDED IN THE RIDING POSITIONS
which looks like this in the corresponding mmcif file:
  _refine.details   'Hydrogens have been added in their riding positions'

Ethan

>
> Thanks,
> Matthew
>
> ---
> Matthew J. Whitley, Ph.D.
> Research Instructor
> Department of Pharmacology & Chemical Biology
> University of Pittsburgh School of Medicine


--
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
MS 357742,   University of Washington, Seattle 98195-7742







Re: [ccp4bb] Hydrogens in PDB File

2020-02-27 Thread Ethan A Merritt
On Thursday, 27 February 2020 15:35:05 PST Whitley, Matthew J wrote:
> Hello all,
> 
> I am nearly finished refining the structures of two mutant proteins from
> crystals that diffracted to very high resolution, 1 Å and 1.2 Å,
> respectively.  Refinement was conducted in the presence of explicit
> hydrogens on the models.  I am preparing to deposit these models into the
> PDB but am unsure about whether to retain or remove the hydrogens for
> deposition.  On one hand, these hydrogens were explicitly used during
> refinement, so that makes me want to keep them, but on the other hand, they
> were added at theoretical positions by MolProbity’s reduce tool for
> refinement and were not positioned on the basis of experimentally observed
> electron density, so that makes me want to delete them from the
> experimental model.  Which is the preferred option for this situation?

The order of operations you describe is unclear.

If you explicitly refined hydrogens then their final positions are indeed
based on experimentally determined data.
The fact that you initially placed them into ideal geometry is not really
any different from the non-H atoms of individual protein residues in your
model, whose original positions were also based on known stereochemistry.

On the other hand, if you mean that the hydrogens you used for refinement
were deleted and replaced during validation by Molprobity (which I think it
may do by default) that's not good.  You should rather keep the hydrogen
positions from refinement, not the ones from Molprobity.
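
If it is unclear which set of hydrogens a given file contains, one way to check is to compare hydrogen coordinates between the model written by the refinement program and the file that came back from validation. The following is a minimal Python sketch under stated assumptions: both files are fixed-column PDB format with the element symbol in columns 77-78, and both programs use the same hydrogen atom names (this is not guaranteed, so atoms that cannot be paired are only counted). The names `hydrogen_coords` and `compare_hydrogens` are hypothetical helpers, not part of MolProbity or any refinement package.

```python
import math


def hydrogen_coords(pdb_text):
    """Map (chain, resseq+icode, resname, atom name, altloc) -> (x, y, z)
    for every hydrogen or deuterium atom in PDB-format text.

    Assumes the element symbol is present in columns 77-78.
    """
    coords = {}
    for line in pdb_text.splitlines():
        if line[:6].strip() not in ("ATOM", "HETATM"):
            continue
        if line[76:78].strip().upper() not in ("H", "D"):
            continue
        key = (line[21], line[22:27], line[17:20], line[12:16].strip(), line[16])
        coords[key] = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return coords


def compare_hydrogens(refined_text, other_text):
    """Summarize how far hydrogens moved between two versions of a model."""
    refined = hydrogen_coords(refined_text)
    other = hydrogen_coords(other_text)
    shared = refined.keys() & other.keys()
    shifts = [math.dist(refined[k], other[k]) for k in shared]
    return {
        "matched": len(shared),                              # H atoms found in both files
        "unmatched": len(refined.keys() ^ other.keys()),     # H atoms found in only one file
        "max_shift": max(shifts, default=0.0),               # largest displacement, in Å
    }
```

A `max_shift` of essentially zero suggests the hydrogens are the ones that went through refinement; clearly nonzero shifts, or many unmatched atoms, suggest they were regenerated afterwards.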

Assuming (since this is ccp4bb) you refined with refmac...
- If you are at the level of investigating hydrogen positions, you may want
to consider taking the refinement into shelxl.  
- If you are not making claims about hydrogens but just want to describe
what you did during refinement, I'd go with taking them out and settling
for the standard record in the resulting PDB file:
  REMARK   3  HYDROGENS HAVE BEEN ADDED IN THE RIDING POSITIONS
which looks like this in the corresponding mmcif file:
  _refine.details   'Hydrogens have been added in their riding positions'
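
For the second option, taking the hydrogens out of a PDB-format file can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file is fixed-column PDB format with the element symbol filled in at columns 77-78, and `strip_hydrogens` is a hypothetical helper name. Atom serial numbers are not renumbered, CONECT records are left untouched, and the REMARK 3 line is not added by this sketch.

```python
def strip_hydrogens(pdb_text):
    """Return PDB-format text with hydrogen and deuterium atom records removed.

    Assumes the element symbol is present in columns 77-78 of every
    ATOM, HETATM and ANISOU record. All other lines pass through unchanged.
    """
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "HETATM", "ANISOU"):
            element = line[76:78].strip().upper()
            if element in ("H", "D"):
                continue  # drop this hydrogen/deuterium record
        kept.append(line)
    return "\n".join(kept) + "\n"
```

Typical use would be to read the refined model into a string, pass it through `strip_hydrogens`, and write the result to a new file so that the refined coordinates with hydrogens are kept alongside it.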

Ethan

> 
> Thanks,
> Matthew
> 
> ---
> Matthew J. Whitley, Ph.D.
> Research Instructor
> Department of Pharmacology & Chemical Biology
> University of Pittsburgh School of Medicine


-- 
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
MS 357742,   University of Washington, Seattle 98195-7742





[ccp4bb] Hydrogens in PDB File

2020-02-27 Thread Whitley, Matthew J
Hello all,

I am nearly finished refining the structures of two mutant proteins from 
crystals that diffracted to very high resolution, 1 Å and 1.2 Å, respectively.  
Refinement was conducted in the presence of explicit hydrogens on the models.  
I am preparing to deposit these models into the PDB but am unsure about whether 
to retain or remove the hydrogens for deposition.  On one hand, these hydrogens 
were explicitly used during refinement, so that makes me want to keep them, but 
on the other hand, they were added at theoretical positions by MolProbity’s 
reduce tool for refinement and were not positioned on the basis of 
experimentally observed electron density, so that makes me want to delete them 
from the experimental model.  Which is the preferred option for this situation?

Thanks,
Matthew

---
Matthew J. Whitley, Ph.D.
Research Instructor
Department of Pharmacology & Chemical Biology
University of Pittsburgh School of Medicine



