Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-17 Thread Tim Gruene
Hi Pavel,

maybe I should have explained in better detail to avoid confusion. The
importance of the contribution to the X-ray term from hydrogens has
been well-known (and used in refinement programs) for - I guess - more
than 40 years, but I am sure you know that.

I meant to say that small variations in the hydrogen position don't
show up for X-ray data, which is why they can be calculated even at
0.8A resolution. Maybe you could repeat the study you point at and
move the hydrogen atoms, e.g. 0.3A out of their calculated position -
my guess is such a variation would not show up in the map or the
R-values at resolution ranges significant for macromolecules. But
these differences may matter when it comes to quantum chemical
calculations, and Jeffrey's post seems to support this.
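A minimal sketch of the test suggested above, assuming the model is available as a plain list of (element, coordinates) pairs: every H or D atom is moved by exactly 0.3 A in a random direction, heavy atoms stay put. The function names and the data layout are hypothetical, not part of shelxl, refmac5 or phenix; the shifted model would then have to go through one's usual structure factor calculation to compare maps and R-values.

```python
import math
import random


def displace(xyz, distance, rng):
    """Move a point by exactly `distance` (in A) in a uniformly random direction."""
    while True:
        # A normalised Gaussian triple is uniformly distributed on the sphere.
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            break
    return tuple(x + distance * c / norm for x, c in zip(xyz, v))


def shift_hydrogens(atoms, distance=0.3, seed=0):
    """Return a copy of `atoms` with every H/D displaced; other atoms unchanged.

    `atoms` is a list of (element, (x, y, z)) tuples (hypothetical layout).
    """
    rng = random.Random(seed)
    return [
        (el, displace(xyz, distance, rng) if el.upper() in ("H", "D") else xyz)
        for el, xyz in atoms
    ]


if __name__ == "__main__":
    model = [("C", (0.0, 0.0, 0.0)), ("H", (1.0, 0.0, 0.0))]
    for (el, old), (_, new) in zip(model, shift_hydrogens(model)):
        print(el, "moved by", math.dist(old, new), "A")
```

A fixed seed keeps the perturbation reproducible, so two refinement programs can be fed the same shifted model.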

I am puzzled that you observe the same with neutron data, i.e. you see
no difference between the riding hydrogen model and restraining them.
Does phenix, by any chance, refine the B-values rather than constrain
them as implied by 'riding atom model'? This would be very unusual and
might explain why the differences are not detected with phenix.

In our work using shelxl, which I mentioned before, the quality
differences were really striking despite the low data completeness.
Interesting that phenix seems to wipe out the most interesting part of
(MX) neutron data.

Cheers,
Tim

On 06/16/2014 08:09 PM, Pavel Afonine wrote:
> Hi Tim,
> 
> just to spice your words up with some numbers
> 
>> You may also want to note that constrained hydrogen positions are a
>> crude approximation and only work with X-ray data where hydrogen
>> atoms have little impact on the data.
> 
> 
> This contribution can be as large as 1.5% difference in R-factor
> (with vs without H), as shown in Figure 2 (page 19; "On the
> contribution of hydrogen atoms to X-ray scattering"): 
> http://phenix-online.org/newsletter/CCN_2012_01.pdf
> 
> 
>> Our comparison between hydrogen restraints and constraints
>> (http://dx.doi.org/10.1107/S1600576713027659) reports the greater
>> quality of restraints vs. constraints when it comes to neutron
>> data, where hydrogen atoms do matter.
> 
> 
> I just re-refined (phenix.refine) all neutron structures available
> in PDB (for which I could extract diffraction data without manual
> labor; 55 in total) with two ways of handling H (D and H/D) atoms:
> a) refine H individually, and b) using riding model for H
> (rotatable H are adjusted to fit the map). In terms of Rfree and
> Rwork I don't see a huge difference. However, using riding model
> results in less overfitting: 
> http://cci.lbl.gov/~afonine/tmp/r_stats.pdf
> 
> This is not surprising given typical quality of neutron data:
> average (all neutron entries in PDB) completeness of neutron data
> sets is 76%, while average completeness of comparable X-ray data
> sets is 94% (page 21): 
> http://phenix-online.org/presentations/latest/2012_afonine_ecm27-final.pdf
>
>  All the best, Pavel
> 

-- 
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A


Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-16 Thread Jeffrey Bell
Hi, Tim,

When we were working on our paper in 2011, refmac had a bug that always 
indicated in depositions that riding hydrogens were used, whether they were or 
not. Published methods did not always clarify this issue. Since we could not be 
definitive, we refrained from saying too much about it. However, then-current 
refmac structures did have plenty of close non-bonded contacts as all-atom 
models, even though some (most? all?) of them were refined with riding 
hydrogens 'on'. We should look again with a recent sample of structures and 
find out.

Your point is well taken that constrained hydrogen coordinates may not agree 
well where accurate data is available for hydrogen positions. One builds the 
best model that one can with the available data. PrimeX is intended for use 
with moderate resolution X-ray structures. Hydrogen positions are determined by 
the force field while heavier atom positions are refined to agree with the 
diffraction data and force field.

Cheers, Jeff
 


On Monday, June 16, 2014 7:14 AM, Tim Gruene  wrote:
  


Dear Jeff,

I would assume that clashing hydrogen atoms become less and less an issue
with current refinement programs, since those I am familiar with
(refmac5 and phenix) both generate constrained hydrogen atoms by
default now, and it has been like this for quite some time - so the
situation should become better for modellers.

You may also want to note that constrained hydrogen positions are a
crude approximation and only work with X-ray data where hydrogen atoms
have little impact on the data. Our comparison between hydrogen
restraints and constraints (http://dx.doi.org/10.1107/S1600576713027659)
reports the greater quality of restraints vs. constraints when it comes
to neutron data, where hydrogen atoms do matter. Hydrogen positions are
much more flexible than the usual riding atom model may imply. This may
affect in silico simulations.

Best,
Tim


On 06/16/2014 04:34 AM, Jeffrey Bell wrote:
> Hi, all,
> 
> I am glad to see these matters being discussed.  I think we all believe that 
> protein crystallographers should be concerned with producing models that 
> modelers and chemists can respect and use. 
> 
> Schrödinger spends a lot of time thinking about ligands; its refinement 
> program, PrimeX, has a very simple way of handling ligand issues. All that a 
> crystallographer has to do is get the charge and bond order right, and the 
> force field then automatically does atom typing and generates all restraints. 
> 
> 
> However, use of our cif library files brings up another matter that must be 
> understood first. Use of PrimeX, even at low resolution, involves refinement 
> of all hydrogen atom positions. The Richardsons have abundantly demonstrated 
> how important hydrogen coordinates are to accurate model building. 
> 
> This matter of hydrogen coordinate refinement is closely connected to the 
> editorial that started this thread. Computational chemistry in drug 
> discovery, and elsewhere, uses all-atom models. When you add hydrogen atoms 
> to most models in the PDB, many chemically-impossible overlaps of atoms 
> result (see Acta Cryst. 2012, D68, 935-952 for more information; ask me for a 
> copy). This issue is almost as much of an annoyance for computational 
> chemists as bad ligand geometry because they see it in almost every structure.
> 
> If anyone would like to try PrimeX, either for ligand restraint generation or 
> refinement, please let me know. Schrödinger offers academic institutions one 
> year of free access, which may be renewed on a case-by-case basis. 
> Crystallographers at companies will also qualify for a free evaluation trial. 
> 
> 
> Cheers,
> 
> Jeff Bell
> PrimeX developer
> Schrödinger, Inc.
> 

-- 
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A

Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-16 Thread Pavel Afonine
Hi Tim,

just to spice your words up with some numbers

> You may also want to note that constrained hydrogen positions are a
> crude approximation and only work with X-ray data where hydrogen atoms
> have little impact on the data.


This contribution can be as large as 1.5% difference in R-factor (with vs
without H), as shown in Figure 2 (page 19; "On the contribution of hydrogen
atoms to X-ray scattering"):
http://phenix-online.org/newsletter/CCN_2012_01.pdf


> Our comparison between hydrogen
> restraints and constraints (http://dx.doi.org/10.1107/S1600576713027659)
> reports the greater quality of restraints vs. constraints when it comes
> to neutron data, where hydrogen atoms do matter.


I just re-refined (phenix.refine) all neutron structures available in PDB
(for which I could extract diffraction data without manual labor; 55 in
total) with two ways of handling H (D and H/D) atoms: a) refine H
individually, and b) using riding model for H (rotatable H are adjusted to
fit the map). In terms of Rfree and Rwork I don't see a huge difference.
However, using riding model results in less overfitting:
http://cci.lbl.gov/~afonine/tmp/r_stats.pdf
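For readers who want the overfitting argument in numbers: a rough sketch of Rwork, Rfree and their gap, assuming plain lists of structure factor amplitudes and a boolean free-set flag per reflection. This is not what phenix.refine does internally (no bulk solvent, no anisotropic scaling; a single least-squares scale factor taken from the work set is an assumption made here), and the amplitudes in the usage lines are made up.

```python
def scale_factor(f_obs, f_calc):
    """Least-squares scale k minimising sum (Fobs - k*Fcalc)^2."""
    return sum(o * c for o, c in zip(f_obs, f_calc)) / sum(c * c for c in f_calc)


def r_factor(f_obs, f_calc, k=1.0):
    """R = sum | Fobs - k*Fcalc | / sum Fobs, on amplitudes."""
    return sum(abs(o - k * c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def r_work_free(f_obs, f_calc, free_flags):
    """Return (Rwork, Rfree, Rfree - Rwork); a larger gap hints at overfitting."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    if not work or not test:
        raise ValueError("need at least one work and one free reflection")
    work_obs, work_calc = zip(*work)
    test_obs, test_calc = zip(*test)
    k = scale_factor(work_obs, work_calc)  # scale from the work set only
    r_work = r_factor(work_obs, work_calc, k)
    r_free = r_factor(test_obs, test_calc, k)
    return r_work, r_free, r_free - r_work


if __name__ == "__main__":
    # Invented amplitudes, for illustration only.
    f_obs = [120.0, 85.0, 40.0, 63.0, 97.0, 15.0]
    f_calc = [115.0, 90.0, 35.0, 70.0, 92.0, 22.0]
    flags = [False, False, False, False, True, True]
    print(r_work_free(f_obs, f_calc, flags))
```

Comparing the gap for the two hydrogen treatments (individual vs riding) on the same data set is the comparison the statistics above summarise.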

This is not surprising given typical quality of neutron data: average (all
neutron entries in PDB) completeness of neutron data sets is 76%, while
average completeness of comparable X-ray data sets is 94% (page 21):
http://phenix-online.org/presentations/latest/2012_afonine_ecm27-final.pdf

All the best,
Pavel


Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-16 Thread Tim Gruene
Dear Jeff,

I would assume that clashing hydrogen atoms become less and less an issue
with current refinement programs, since those I am familiar with
(refmac5 and phenix) both generate constrained hydrogen atoms by
default now, and it has been like this for quite some time - so the
situation should become better for modellers.

You may also want to note that constrained hydrogen positions are a
crude approximation and only work with X-ray data where hydrogen atoms
have little impact on the data. Our comparison between hydrogen
restraints and constraints (http://dx.doi.org/10.1107/S1600576713027659)
reports the greater quality of restraints vs. constraints when it comes
to neutron data, where hydrogen atoms do matter. Hydrogen positions are
much more flexible than the usual riding atom model may imply. This may
affect in silico simulations.
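To make "constrained" concrete: in a riding model the hydrogen coordinates are a pure function of the heavy-atom coordinates and are recomputed rather than refined. A minimal sketch for one case only, a hydrogen on a planar sp2 atom with two bonded neighbours (e.g. an aromatic C-H), follows. The function name and the 0.95 A bond length are illustrative assumptions, not the values or code of refmac5, phenix or shelxl.

```python
import math


def place_riding_h(parent, nb1, nb2, bond_length=0.95):
    """Place H on `parent` along the outward bisector of its two neighbours.

    All arguments are (x, y, z) tuples in A; `bond_length` is an assumed value.
    """
    def unit(a, b):
        d = [bi - ai for ai, bi in zip(a, b)]
        n = math.sqrt(sum(c * c for c in d))
        return [c / n for c in d]

    u1 = unit(parent, nb1)
    u2 = unit(parent, nb2)
    # The H direction points away from the sum of the two bond directions.
    s = [-(a + b) for a, b in zip(u1, u2)]
    n = math.sqrt(sum(c * c for c in s))
    if n < 1e-8:
        raise ValueError("neighbours are collinear with parent; direction undefined")
    return tuple(p + bond_length * c / n for p, c in zip(parent, s))


if __name__ == "__main__":
    # Idealised ring geometry: neighbours at +/-120 degrees, 1.39 A away.
    c1 = (0.0, 0.0, 0.0)
    c2 = (1.39 * math.cos(math.radians(120)), 1.39 * math.sin(math.radians(120)), 0.0)
    c6 = (1.39 * math.cos(math.radians(240)), 1.39 * math.sin(math.radians(240)), 0.0)
    print(place_riding_h(c1, c2, c6))
```

With restraints, by contrast, this ideal position would only enter as a target with a weight, and the hydrogen would keep its own refined coordinates.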

Best,
Tim

On 06/16/2014 04:34 AM, Jeffrey Bell wrote:
> Hi, all,
> 
> I am glad to see these matters being discussed.  I think we all believe that 
> protein crystallographers should be concerned with producing models that 
> modelers and chemists can respect and use. 
> 
> Schrödinger spends a lot of time thinking about ligands; its refinement 
> program, PrimeX, has a very simple way of handling ligand issues. All that a 
> crystallographer has to do is get the charge and bond order right, and the 
> force field then automatically does atom typing and generates all restraints. 
> 
> 
> However, use of our cif library files brings up another matter that must be 
> understood first. Use of PrimeX, even at low resolution, involves refinement 
> of all hydrogen atom positions. The Richardsons have abundantly demonstrated 
> how important hydrogen coordinates are to accurate model building. 
> 
> This matter of hydrogen coordinate refinement is closely connected to the 
> editorial that started this thread. Computational chemistry in drug 
> discovery, and elsewhere, uses all-atom models. When you add hydrogen atoms 
> to most models in the PDB, many chemically-impossible overlaps of atoms 
> result (see Acta Cryst. 2012, D68, 935-952 for more information; ask me for a 
> copy). This issue is almost as much of an annoyance for computational 
> chemists as bad ligand geometry because they see it in almost every structure.
> 
> If anyone would like to try PrimeX, either for ligand restraint generation or 
> refinement, please let me know. Schrödinger offers academic institutions one 
> year of free access, which may be renewed on a case-by-case basis. 
> Crystallographers at companies will also qualify for a free evaluation trial. 
> 
> 
> Cheers,
> 
> Jeff Bell
> PrimeX developer
> Schrödinger, Inc.
> 

-- 
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A




Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-15 Thread Jeffrey Bell
Hi, all,

I am glad to see these matters being discussed.  I think we all believe that 
protein crystallographers should be concerned with producing models that 
modelers and chemists can respect and use. 

Schrödinger spends a lot of time thinking about ligands; its refinement 
program, PrimeX, has a very simple way of handling ligand issues. All that a 
crystallographer has to do is get the charge and bond order right, and the 
force field then automatically does atom typing and generates all restraints. 


However, use of our cif library files brings up another matter that must be 
understood first. Use of PrimeX, even at low resolution, involves refinement of 
all hydrogen atom positions. The Richardsons have abundantly demonstrated how 
important hydrogen coordinates are to accurate model building. 

This matter of hydrogen coordinate refinement is closely connected to the 
editorial that started this thread. Computational chemistry in drug discovery, 
and elsewhere, uses all-atom models. When you add hydrogen atoms to most models 
in the PDB, many chemically-impossible overlaps of atoms result (see Acta 
Cryst. 2012, D68, 935-952 for more information; ask me for a copy). This issue 
is almost as much of an annoyance for computational chemists as bad ligand 
geometry because they see it in almost every structure.
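A rough sketch of what "chemically-impossible overlap" means operationally, assuming an all-atom model held as a dictionary of atom name to (element, coordinates): a non-bonded pair is flagged when the two van der Waals spheres interpenetrate by more than a cutoff. This is not the PrimeX or MolProbity implementation. The radii table, the 0.4 A cutoff and the function name are assumptions for illustration, and the caller is expected to list bonded pairs (and any other pairs to be exempted, such as atoms two or three bonds apart or hydrogen-bonded pairs) in `excluded`.

```python
import math
from itertools import combinations

# Illustrative van der Waals radii in A (assumed values for this sketch).
VDW = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}


def find_clashes(atoms, excluded=frozenset(), overlap_cutoff=0.4):
    """Return (name_a, name_b, overlap) for non-excluded pairs that overlap.

    `atoms` maps atom name -> (element, (x, y, z)); `excluded` is a set of
    frozenset({name_a, name_b}) pairs that must not be tested.
    """
    clashes = []
    for a, b in combinations(sorted(atoms), 2):
        if frozenset((a, b)) in excluded:
            continue
        (el_a, xyz_a), (el_b, xyz_b) = atoms[a], atoms[b]
        overlap = VDW[el_a] + VDW[el_b] - math.dist(xyz_a, xyz_b)
        if overlap >= overlap_cutoff:
            clashes.append((a, b, round(overlap, 2)))
    return clashes


if __name__ == "__main__":
    model = {
        "HA": ("H", (0.0, 0.0, 0.0)),
        "HB": ("H", (1.5, 0.0, 0.0)),   # far too close to HA
        "C1": ("C", (10.0, 0.0, 0.0)),
    }
    print(find_clashes(model))
```

The all-pairs loop is fine for a ligand site; a whole structure would need a cell list or similar neighbour search.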

If anyone would like to try PrimeX, either for ligand restraint generation or 
refinement, please let me know. Schrödinger offers academic institutions one 
year of free access, which may be renewed on a case-by-case basis. 
Crystallographers at companies will also qualify for a free evaluation trial. 


Cheers,

Jeff Bell
PrimeX developer
Schrödinger, Inc.


Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-13 Thread Ethan Merritt
On Friday, 13 June 2014 10:12:50 AM Tim Gruene wrote:
> Hi Ethan,
> 
> Maybe I miss something, but whenever an error in one of the cif-files
> has been reported, be it directly to Garib, or publicly on the ccp4bb,
> Garib (I assume) fixed it very quickly - I don't quite understand why we
> need a new term for this process?

See the other thread "ccp4 ligand tools +  wwPDB validation = bug reports"

Because the error is not in a pre-packaged cif file.
Nor is it in a ccp4 program per se.
It is in a library that is used by cprodrg to generate a cif file
for previously unknown ligands.

This library originally came from the Dundee folks,
not ccp4, and it was not clear who if anyone was maintaining it.

In an admirably quick response, Alexander Schuettelkopf has now
expressed his willingness to respond to such bug reports and update
the library.

So that's good news for cprodrg, and I gather that indeed the fixes
will appear in future ccp4 updates.

But the problem is more general.
For example, I have had analogous problems with Grade.
There again it is clear that this can affect other ccp4-ers,
so ccp4bb seems to me a good place to mention any bugs or quirks that
contribute to structure refinement errors so that others are aware of
potential problems.  The eventual fix may have to come from elsewhere
(e.g. GlobalPhasing in the case of Grade).  Unlike prodrg, the Grade
code and libraries so far as I know are not available for inspection or
patching locally.

Paul Emsley has emailed me separately that there is a new project
ACEDRG in the offing that may take over the prodrg/Grade niche inside ccp4.
Perhaps someone involved in ACEDRG will post a summary of what it
will offer?


cheers,

Ethan

> 
> On 06/12/2014 10:45 PM, Ethan A Merritt wrote:
> > [...]
> > Indeed.  All of the library-generation tools I am aware of are flawed in
> > their own idiosyncratic ways.   I think I shall start a campaign to treat
> > errors in the cif libraries as "bugs", and encourage people to report
> > these bugs in the libraries we all use just as they do for bugs in the
> > programs we all use.  
> > 
> > Ethan
> > [...]
> 

-- 
mail:   Biomolecular Structure Center,  K-428 Health Sciences Bldg
MS 357742,   University of Washington, Seattle 98195-7742


Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-13 Thread Engin Özkan
Do those fixes also make it to the phenix version of the library? Yes, 
this is the CCP4bb, but the monomer library is also used by Phenix, and 
a good number of structures (almost half of those deposited this year?) 
in the PDB now come from phenix.refine. Or in other words, is there a 
central, high quality monomer library shared among the two most common 
refinement programs? (The phenix version is more like a fork of the 
CCP4, I think.)


And not all fixes are obvious. Think of the thread from June 4 this year 
about XYP and HSZ.


https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ccp4bb;e617308e.1406

The atoms here have non-standard numbering, which would break the way 
sugar linkages are defined (and sugar linkages in glycoproteins are just 
as standard as peptide linkages are in glyco- and non-glyco-proteins.) 
Would this be fixed, or not, I am not sure.


I think it might encourage those of us who spot these errors to report, 
if there was a clear call for, or a place to submit these errors. CCP4bb 
might be that place; that should depend on what Garib and the phenix 
folk prefer.


Engin

P.S. I should, as always, say that while the libraries we use are 
imperfect, without them the situation would be much, much worse, so many 
thanks to Garib et al for their hard work.


On 6/13/14, 3:12 AM, Tim Gruene wrote:

Hi Ethan,

Maybe I miss something, but whenever an error in one of the cif-files
has been reported, be it directly to Garib, or publicly on the ccp4bb,
Garib (I assume) fixed it very quickly - I don't quite understand why we
need a new term for this process?

Best,
Tim

On 06/12/2014 10:45 PM, Ethan A Merritt wrote:

[...]
Indeed.  All of the library-generation tools I am aware of are flawed in
their own idiosyncratic ways.   I think I shall start a campaign to treat
errors in the cif libraries as "bugs", and encourage people to report
these bugs in the libraries we all use just as they do for bugs in the
programs we all use.

Ethan
[...]


Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-13 Thread Tim Gruene
Hi Ethan,

Maybe I miss something, but whenever an error in one of the cif-files
has been reported, be it directly to Garib, or publicly on the ccp4bb,
Garib (I assume) fixed it very quickly - I don't quite understand why we
need a new term for this process?

Best,
Tim

On 06/12/2014 10:45 PM, Ethan A Merritt wrote:
> [...]
> Indeed.  All of the library-generation tools I am aware of are flawed in
> their own idiosyncratic ways.   I think I shall start a campaign to treat
> errors in the cif libraries as "bugs", and encourage people to report
> these bugs in the libraries we all use just as they do for bugs in the
> programs we all use.  
> 
>   Ethan
> [...]
-- 
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A





Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-13 Thread Frank von Delft
I sense someone should quickly point out, on behalf of the developers (I 
assume it's they that are requested to magically "give optimal 
support"), that it's not for want of awareness, intelligence or 
diligence that these functions are suboptimal:  it's want of TIME.  As 
Paul suggested, in ccp4 alone there are several active projects on 
carbohydrates, nucleic acids, and general restraints.


And it's not as if the tools to do this aren't available: defining 
custom restraints has been possible for ages.  It's just not as 
fantastically convenient as everything else.


The fact that /these/ errors are now becoming prominent is testament to 
how excellent the /rest/ of our toolset has become...  presumably the 
reason we've become too lazy to look at our restraints in detail, 
because to be honest, few /other/ aspects of the analysis require such 
detailed attention.  Now THAT's what I call impressive.


phx.

P.S. The first "C" in CCP4 stands for "Collaborative", i.e. anybody with 
good ideas is welcome to contribute the tools they write into the mix...




On 13/06/2014 08:43, Ute Krengel wrote:


Hei Tom,


We may not be able to prevent deposition of dodgy structures, but we 
could at least give optimal support to those of us wanting to do a 
good job. With respect to ligands, the support could sometimes be 
better - thinking in particular of carbohydrate ligands.



Best,


Ute



*From:* CCP4 bulletin board  on behalf of Tom 
Peat 

*Sent:* 13 June 2014 09:08
*To:* CCP4BB@JISCMAIL.AC.UK
*Subject:* Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

I’ll wade into this quagmire before the weekend starts.

There are without question some dodgy structures and some of these are 
due to poor inputs in terms of the cif files that we use for the 
ligands, or the way we have constructed the ligands, or allowed them 
to distort during refinement. Some distortions are probably legitimate 
as maybe you have 1.4 A data and some strain has been introduced, or 
in fact you put the substrate in and it is partially reacted, so we 
are seeing an intermediate and that wasn’t taken into account, or some 
other story. Chemical space is big (really, really BIG) and it is hard 
to account for all possibilities by just defining each bond type, 
angle, etc. although we can certainly do better than we have (and in 
fact I think we are!). And even with everything defined, your mileage 
will vary (look at all of the safety features we have in cars these 
days that weren’t there 20 years ago and still thousands die on the 
road each year). We are really good at protein structures- but if the 
crystallographer doesn’t look at the data carefully, you get the 
attached (a recent one that I pulled down that has 1.4 A data and 
still got this loop wrong- clearly wasn’t looked at very carefully or 
at all).


So, even with really good data, and many automated features, and even 
good input files, you need the person to actually look at the data and 
see whether the model fits the density and make a judgement call as to 
whether the chemistry is plausible and correct. As some people are too 
busy (or lazy) to do this, there will be structures put into the 
database which are not only not perfect, but not very good. There are 
lots of people working on this- PDB REDO, better programs for 
generating more plausible dictionary files, etc. and they have made 
our lives much, much easier- Thank You! But all of these won’t 
eliminate the bad structures deposited (just make it harder to justify 
a poor structure) unless there is a change in the way structures are 
deposited (actual criteria for deposition). Do we want that? That is a 
big question and would really change the dynamics that we currently have.


My 2 cents.

Cheers, tom

*From:*CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] *On Behalf 
Of *Jeffrey Bell

*Sent:* Friday, 13 June 2014 3:05 AM
*To:* CCP4BB@JISCMAIL.AC.UK
*Subject:* Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

Hi, Tim,

Thanks for your comment. Do you agree with the editorial's claim that 
some 25% of the deposited protein-ligand complexes might be dodgy in 
significant details?


This editorial comment represents something that I often hear from 
drug discovery professionals. Is it a matter of PR between 
crystallographers and other scientists, or does a real problem exist?


Cheers,

Jeff Bell

PrimeX developer

Schrödinger, Inc.

On Tuesday, June 10, 2014 10:27 AM, Tim Gruene 
<t...@shelx.uni-ac.gwdg.de> wrote:


I hope that the contents of this section is obvious to most readers of
the ccp4 bulletin board.

Cheers,
Tim


On 06/10/2014 03:40 PM, Jeffrey Bell wrote:
> An editorial comment about protein crystallography appeared under
> that title. It's short and worth considering.
>

Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-13 Thread Ute Krengel
Hei Tom,


We may not be able to prevent deposition of dodgy structures, but we could at 
least give optimal support to those of us wanting to do a good job. With 
respect to ligands, the support could sometimes be better - thinking in 
particular of carbohydrate ligands.


Best,


Ute



From: CCP4 bulletin board  on behalf of Tom Peat 

Sent: 13 June 2014 09:08
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

I’ll wade into this quagmire before the weekend starts.
There are without question some dodgy structures and some of these are due to 
poor inputs in terms of the cif files that we use for the ligands, or the way 
we have constructed the ligands, or allowed them to distort during refinement. 
Some distortions are probably legitimate as maybe you have 1.4 A data and some 
strain has been introduced, or in fact you put the substrate in and it is 
partially reacted, so we are seeing an intermediate and that wasn’t taken into 
account, or some other story. Chemical space is big (really, really BIG) and it 
is hard to account for all possibilities by just defining each bond type, 
angle, etc. although we can certainly do better than we have (and in fact I 
think we are!). And even with everything defined, your mileage will vary (look 
at all of the safety features we have in cars these days that weren’t there 20 
years ago and still thousands die on the road each year). We are really good at 
protein structures- but if the crystallographer doesn’t look at the data 
carefully, you get the attached (a recent one that I pulled down that has 1.4 A 
data and still got this loop wrong- clearly wasn’t looked at very carefully or 
at all).

So, even with really good data, and many automated features, and even good 
input files, you need the person to actually look at the data and see whether 
the model fits the density and make a judgement call as to whether the 
chemistry is plausible and correct. As some people are too busy (or lazy) to do 
this, there will be structures put into the database which are not only not 
perfect, but not very good. There are lots of people working on this- PDB REDO, 
better programs for generating more plausible dictionary files, etc. and they 
have made our lives much, much easier- Thank You! But all of these won’t 
eliminate the bad structures deposited (just make it harder to justify a poor 
structure) unless there is a change in the way structures are deposited (actual 
criteria for deposition). Do we want that? That is a big question and would 
really change the dynamics that we currently have.

My 2 cents.
Cheers, tom

From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Jeffrey 
Bell
Sent: Friday, 13 June 2014 3:05 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

Hi, Tim,

Thanks for your comment. Do you agree with the editorial's claim that some 25% 
of the deposited protein-ligand complexes might be dodgy in significant details?

This editorial comment represents something that I often hear from drug 
discovery professionals. Is it a matter of PR between crystallographers and 
other scientists, or does a real problem exist?

Cheers,

Jeff Bell
PrimeX developer
Schrödinger, Inc.

On Tuesday, June 10, 2014 10:27 AM, Tim Gruene 
<t...@shelx.uni-ac.gwdg.de> wrote:

I hope that the contents of this section is obvious to most readers of
the ccp4 bulletin board.

Cheers,
Tim

On 06/10/2014 03:40 PM, Jeffrey Bell wrote:
> An editorial comment about protein crystallography appeared under
> that title. It's short and worth considering.
> http://pipeline.corante.com/

>

-- 
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A





Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-12 Thread Joel Tyndall
Great news and I am in full support of the bug option. Where do we start?

(Caveat: we all make mistakes, I am sure I have!)

-Original Message-
From: Ethan A Merritt [mailto:merr...@u.washington.edu] 
Sent: Friday, 13 June 2014 8:45 a.m.
To: Joel Tyndall
Cc: CCP4BB@jiscmail.ac.uk
Subject: Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

On Thursday, 12 June, 2014 20:24:43 Joel Tyndall wrote:
> Hi,
> 
> I saw Jeff's post with interest and have held off until now.  It is relatively 
> easy to find structures with bad geometry for small molecules but it does not 
> do any good simply pointing them out. What I believe is needed is a way to 
> fix the problem. There are several possible ways. The pdb could parse new 
> structures through a checking process to check the geometry of small 
> molecules. This, I would presume, could be done via the CSD.

As of January 2014 this is indeed being done as part of the PDB deposition 
process.

Anyone who has deposited a structure containing a ligand this year has probably 
been surprised, pleasantly or otherwise, by the table of geometry 
violations/outliers for each ligand.

If you missed the various announcements, you may wish to try it out on your own 
structures here:

http://wwpdb-validation.wwpdb.org/validservice/


> I also believe that cif file generation can be improved.

Indeed.  All of the library-generation tools I am aware of are flawed in
their own idiosyncratic ways.   I think I shall start a campaign to treat
errors in the cif libraries as "bugs", and encourage people to report these 
bugs in the libraries we all use just as they do for bugs in the programs we 
all use.  

Ethan


> The developers of the available programs are doing a great job but as 
> intelligent scientists we strive for perfection. I am unfortunately not in a 
> position to develop software myself ( so maybe I should pipe down) but I 
> would be happy to offer assistance (from my personal experience). In my 
> experience I have had some issues with small molecule parametrisation ( or 
> maybe I just deal with unusual molecules). By that I mean, on occasion I have 
> had .cif files that simply do not make sense or contradict what you would 
> expect in the geometry of a small molecule. I am aware of one "service" that 
> does check against the CSD when generating cif files.
> 
> I read in one of the editorials, or related posts that one of the structures 
> was corrected. This is also an option assuming the data has been deposited.
> 
> My two cents
> 
> J
> 
> -Original Message-
> From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of 
> Tim Gruene
> Sent: Friday, 13 June 2014 5:54 a.m.
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem
> 
> Hi Jeff,
> 
> there are quite a few implications in your brief email that each might open a 
> long thread of discussion. As brief as possible I think one has to be a good 
> scientist and a good crystallographer to fully understand the meaning of a 
> crystal structure, and I think many people believe a crystal structure is 
> just a set of coordinates.
> 
> If someone tells me some distance in yards and I assume that's about 
> the same as a meter I will surely get some dodgy results up to 
> creating the first car accident on Mars ;-)
> 
> Cheers,
> Tim
> 
> On 06/12/2014 07:04 PM, Jeffrey Bell wrote:
> > Hi, Tim,
> > 
> > Thanks for your comment. Do you agree with the editorial's claim that some 
> > 25% of the deposited protein-ligand complexes might be dodgy in significant 
> > details? 
> > 
> > 
> > This editorial comment represents something that I often hear from drug 
> > discovery professionals. Is it a matter of PR between crystallographers and 
> > other scientists, or does a real problem exist?
> > 
> > Cheers,
> > 
> > Jeff Bell
> > PrimeX developer
> > Schrödinger, Inc.
> > 
> > 
> > On Tuesday, June 10, 2014 10:27 AM, Tim Gruene  
> > wrote:
> >  
> > 
> > 
> > I hope that the contents of this section are obvious to most readers 
> > of the ccp4 bulletin board.
> > 
> > Cheers,
> > Tim
> > 
> > 
> > On 06/10/2014 03:40 PM, Jeffrey Bell wrote:
> >> An editorial comment about protein crystallography appeared under 
> >> that title. It's short and worth considering.
> >> http://pipeline.corante.com/
> > 
> > 
> > 
> 
> --
> Dr Tim Gruene
> Institut fuer anorganische Chemie
> Tammannstr. 4
> D-37077 Goettingen
> 
> GPG Key ID = A46BEE1A
--
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
MS 357742,   University of Washington, Seattle 98195-7742



Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-12 Thread Ethan A Merritt
On Thursday, 12 June, 2014 20:24:43 Joel Tyndall wrote:
> Hi,
> 
> I saw Jeff's post with interest and have held off until now.  It is relatively 
> easy to find structures with bad geometry for small molecules but it does not 
> do any good simply pointing them out. What I believe is needed is a way to 
> fix the problem. There are several possible ways. The pdb could parse new 
> structures through a checking process to check the geometry of small 
> molecules. This, I would presume, could be done via the CSD.

As of January 2014 this is indeed being done as part of the PDB deposition 
process.

Anyone who has deposited a structure containing a ligand this year has 
probably been surprised, pleasantly or otherwise, by the table of geometry
violations/outliers for each ligand.

If you missed the various announcements, you may wish to try it out on
your own structures here:

http://wwpdb-validation.wwpdb.org/validservice/


> I also believe that cif file generation can be improved.

Indeed.  All of the library-generation tools I am aware of are flawed in
their own idiosyncratic ways.   I think I shall start a campaign to treat
errors in the cif libraries as "bugs", and encourage people to report
these bugs in the libraries we all use just as they do for bugs in the
programs we all use.  

Ethan


> The developers of the available programs are doing a great job but as 
> intelligent scientists we strive for perfection. I am unfortunately not in a 
> position to develop software myself (so maybe I should pipe down) but I 
> would be happy to offer assistance (from my personal experience). In my 
> experience I have had some issues with small molecule parametrisation (or 
> maybe I just deal with unusual molecules). By that I mean, on occasion I have 
> had .cif files that simply do not make sense or contradict what you would 
> expect in the geometry of a small molecule. I am aware of one "service" that 
> does check against the CSD when generating cif files.
> 
> I read in one of the editorials or related posts that one of the structures 
> was corrected. This is also an option assuming the data has been deposited.
> 
> My two cents
> 
> J
> 
> -Original Message-
> From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Tim 
> Gruene
> Sent: Friday, 13 June 2014 5:54 a.m.
> To: CCP4BB@JISCMAIL.AC.UK
> Subject: Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem
> 
> Hi Jeff,
> 
> there are quite a few implications in your brief email that each might open a 
> long thread of discussion. As briefly as possible: I think one has to be a good 
> scientist and a good crystallographer to fully understand the meaning of a 
> crystal structure, and I think many people believe a crystal structure is 
> just a set of coordinates.
> 
> If someone tells me some distance in yards and I assume that's about the same 
> as a meter I will surely get some dodgy results up to creating the first car 
> accident on Mars ;-)
> 
> Cheers,
> Tim
> 
> On 06/12/2014 07:04 PM, Jeffrey Bell wrote:
> > Hi, Tim,
> > 
> > Thanks for your comment. Do you agree with the editorial's claim that some 
> > 25% of the deposited protein-ligand complexes might be dodgy in significant 
> > details? 
> > 
> > 
> > This editorial comment represents something that I often hear from drug 
> > discovery professionals. Is it a matter of PR between crystallographers and 
> > other scientists, or does a real problem exist?
> > 
> > Cheers,
> > 
> > Jeff Bell
> > PrimeX developer
> > Schrödinger, Inc.
> > 
> > 
> > On Tuesday, June 10, 2014 10:27 AM, Tim Gruene  
> > wrote:
> >  
> > 
> > 
> > I hope that the contents of this section are obvious to most readers of 
> > the ccp4 bulletin board.
> > 
> > Cheers,
> > Tim
> > 
> > 
> > On 06/10/2014 03:40 PM, Jeffrey Bell wrote:
> >> An editorial comment about protein crystallography appeared under 
> >> that title. It's short and worth considering.
> >> http://pipeline.corante.com/
> > 
> > 
> > 
> 
> --
> Dr Tim Gruene
> Institut fuer anorganische Chemie
> Tammannstr. 4
> D-37077 Goettingen
> 
> GPG Key ID = A46BEE1A
-- 
Ethan A Merritt
Biomolecular Structure Center,  K-428 Health Sciences Bldg
MS 357742,   University of Washington, Seattle 98195-7742


Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-12 Thread Joel Tyndall
Hi,

I saw Jeff's post with interest and have held off until now.  It is relatively 
easy to find structures with bad geometry for small molecules but it does not 
do any good simply pointing them out. What I believe is needed is a way to fix 
the problem. There are several possible ways. The pdb could parse new 
structures through a checking process to check the geometry of small molecules. 
This, I would presume, could be done via the CSD. I also believe that cif file 
generation can be improved. The developers of the available programs are doing 
a great job but as intelligent scientists we strive for perfection. I am 
unfortunately not in a position to develop software myself (so maybe I should 
pipe down) but I would be happy to offer assistance (from my personal 
experience). In my experience I have had some issues with small molecule 
parametrisation (or maybe I just deal with unusual molecules). By that I mean, 
on occasion I have had .cif files that simply do not make sense or 
contradict what you would expect in the geometry of a small molecule. I am 
aware of one "service" that does check against the CSD when generating cif 
files.

I read in one of the editorials or related posts that one of the structures 
was corrected. This is also an option assuming the data has been deposited.

My two cents

J

-Original Message-
From: CCP4 bulletin board [mailto:CCP4BB@JISCMAIL.AC.UK] On Behalf Of Tim Gruene
Sent: Friday, 13 June 2014 5:54 a.m.
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

Hi Jeff,

there are quite a few implications in your brief email that each might open a 
long thread of discussion. As briefly as possible: I think one has to be a good 
scientist and a good crystallographer to fully understand the meaning of a 
crystal structure, and I think many people believe a crystal structure is just 
a set of coordinates.

If someone tells me some distance in yards and I assume that's about the same 
as a meter I will surely get some dodgy results up to creating the first car 
accident on Mars ;-)

Cheers,
Tim

On 06/12/2014 07:04 PM, Jeffrey Bell wrote:
> Hi, Tim,
> 
> Thanks for your comment. Do you agree with the editorial's claim that some 
> 25% of the deposited protein-ligand complexes might be dodgy in significant 
> details? 
> 
> 
> This editorial comment represents something that I often hear from drug 
> discovery professionals. Is it a matter of PR between crystallographers and 
> other scientists, or does a real problem exist?
> 
> Cheers,
> 
> Jeff Bell
> PrimeX developer
> Schrödinger, Inc.
> 
> 
> On Tuesday, June 10, 2014 10:27 AM, Tim Gruene  
> wrote:
>  
> 
> 
> I hope that the contents of this section are obvious to most readers of 
> the ccp4 bulletin board.
> 
> Cheers,
> Tim
> 
> 
> On 06/10/2014 03:40 PM, Jeffrey Bell wrote:
>> An editorial comment about protein crystallography appeared under 
>> that title. It's short and worth considering.
>> http://pipeline.corante.com/
> 
> 
> 

--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A


Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-12 Thread Tim Gruene
Hi Jeff,

there are quite a few implications in your brief email that each might
open a long thread of discussion. As briefly as possible: I think one has
to be a good scientist and a good crystallographer to fully understand
the meaning of a crystal structure, and I think many people believe a
crystal structure is just a set of coordinates.

If someone tells me some distance in yards and I assume that's about the
same as a meter I will surely get some dodgy results up to creating the
first car accident on Mars ;-)

Cheers,
Tim

On 06/12/2014 07:04 PM, Jeffrey Bell wrote:
> Hi, Tim,
> 
> Thanks for your comment. Do you agree with the editorial's claim that some 
> 25% of the deposited protein-ligand complexes might be dodgy in significant 
> details? 
> 
> 
> This editorial comment represents something that I often hear from drug 
> discovery professionals. Is it a matter of PR between crystallographers and 
> other scientists, or does a real problem exist?
> 
> Cheers,
> 
> Jeff Bell
> PrimeX developer
> Schrödinger, Inc.
> 
> 
> On Tuesday, June 10, 2014 10:27 AM, Tim Gruene  
> wrote:
>  
> 
> 
> I hope that the contents of this section are obvious to most readers of
> the ccp4 bulletin board.
> 
> Cheers,
> Tim
> 
> 
> On 06/10/2014 03:40 PM, Jeffrey Bell wrote:
>> An editorial comment about protein crystallography appeared under
>> that title. It's short and worth considering. 
>> http://pipeline.corante.com/
> 
> 
> 

-- 
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A





Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-12 Thread Jeffrey Bell
Hi, Tim,

Thanks for your comment. Do you agree with the editorial's claim that some 25% 
of the deposited protein-ligand complexes might be dodgy in significant 
details? 


This editorial comment represents something that I often hear from drug 
discovery professionals. Is it a matter of PR between crystallographers and 
other scientists, or does a real problem exist?

Cheers,

Jeff Bell
PrimeX developer
Schrödinger, Inc.


On Tuesday, June 10, 2014 10:27 AM, Tim Gruene  
wrote:
 


I hope that the contents of this section are obvious to most readers of
the ccp4 bulletin board.

Cheers,
Tim


On 06/10/2014 03:40 PM, Jeffrey Bell wrote:
> An editorial comment about protein crystallography appeared under
> that title. It's short and worth considering. 
> http://pipeline.corante.com/
> 

--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A


Re: [ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-10 Thread Tim Gruene
I hope that the contents of this section are obvious to most readers of
the ccp4 bulletin board.

Cheers,
Tim

On 06/10/2014 03:40 PM, Jeffrey Bell wrote:
> An editorial comment about protein crystallography appeared under
> that title. It's short and worth considering. 
> http://pipeline.corante.com/
> 

--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A



[ccp4bb] Hosed-Up X-Ray Structures: A Big Problem

2014-06-10 Thread Jeffrey Bell
An editorial comment about protein crystallography appeared under that title. 
It's short and worth considering.
http://pipeline.corante.com/