Re: [ccp4bb] About Staraniso

2024-02-16 Thread vincent Chaptal

Hi Gerard,

indeed, thank you for your clarification, I'm rusty since I'm on 
something else now.


Anisotropy might not be the right word, as it simply means the opposite of 
isotropy.
I meant two phenomena at play: lack of completeness in the high 
resolution shells, and a different intensity falloff compared to a soluble 
protein.


As I answered separately to some of you, I'll make the data available 
soon and will report here once it's done.


Best
Vincent


Re: [ccp4bb] About Staraniso

2024-02-16 Thread Gerard Bricogne
Dear Vincent,

 Thank you for chipping in as you did, with so much useful feedback
about the difficulty of re-using PDB depositions containing anisotropic
data. It is a very useful picture of what is definitely a "bleeding edge" on
the "R" side of the wwPDB's "FAIR" ideal.

 It would be most helpful if you could share with us (off-line) the
details of the specific problems you encountered and that led to the odd
recommendation that you should use a text editor to combine the contents of
mmCIF files. 

 It is interesting that you used the turn of phrase "huge anisotropy
combined with lack of completeness in high resolution shells" as if they
were two distinct evils that had conspired to combine in order to make your
life difficult. In reality, as you are of course well aware, they are one
and the same evil, as strong anisotropy will necessarily result in low
completeness in the *spherical* shell to the highest diffraction limit. The
diagram at the beginning of the page 

 https://staraniso.globalphasing.org/anisotropy_about.html

makes this totally obvious. The only way to have full completeness in the
outermost (spherical) shell is to have isotropic diffraction. Forgive me for
perhaps belabouring something you are perfectly aware of, but it provided an
opportunity to push back against "isotropic thinking" that is still so
deeply ingrained in the collective worldview.


 With best wishes,

  Gerard.


Re: [ccp4bb] About Staraniso

2024-02-16 Thread vincent Chaptal

Hi Clemens and all,

I've been following this with great interest, of course: anisotropy has 
taken up a lot of space in my membrane-protein crystallography life.
I remember the many exchanges we've had with you and Global Phasing, as 
well as with many other software developers over the years.


I would like to chip in to give the point of view of a user struggling 
with it, and since 6r72 was mentioned and I'm the author...


6r72 has been an epic battle experimentally, and afterwards with data 
processing as well. The huge anisotropy combined with lack of 
completeness in high resolution shells, and low resolution, made a 
cocktail for not-so-great electron density. But since this was all I 
had, I had to make something out of it. Since the quality of the data 
was what it was, I definitely wanted others to be able to re-investigate 
it with modern tools, or just see for themselves the map on which I had 
to base my modelling choices.
Remember, I asked for your help to combine all the data of the entry, 
and Global Phasing was very helpful in making the tool for this entry. I 
believe this was the moment you created the deposition tool you mention 
below.


The fact that the data are not easily retrievable is my point here. Pavel 
knows about this entry as we've been exchanging about it. I completely 
share his difficulty in parsing PDB entries, having myself performed 
several statistical analyses of the PDB to search for the root cause of 
anisotropy in membrane proteins 
(https://pubmed.ncbi.nlm.nih.gov/36206830/).
I encountered the same issue for a recent entry (8qq7, soon available I 
hope) where we tried to combine the same files. I vaguely remembered the 
deposition tool you mention but I couldn't locate it (sorry), so we 
engaged in a dialogue with the PDB, who basically had us merge the files 
with text editors. Off the top of my head, I think we labelled the 
blocks "original_data", "modified_SA".


A standardization of this deposition is basically what you all want, to 
perform all the different downstream analyses you want. This is not 
available at the moment. I've been going out of my way to combine the 
files, but it would have been much easier to just give the SA_modified 
file to the PDB. I thus encourage all of you software developers, 
despite your different points of view, to design and agree on such a 
tool to be placed at OneDep.


To finish, in case you want to reprocess datasets with high anisotropy, a 
large difference in high-resolution limits, and low resolution, I have the 
images available for 6r72, 2y5y, and 8qq7 (all of which have the 
structure factors with original data, maps and modified data).


All the best
Vincent


Re: [ccp4bb] About Staraniso

2024-02-15 Thread Clemens Vonrhein
Dear Pavel & CCP4bb readers,

On Wed, Feb 14, 2024 at 08:28:03PM -0800, Pavel Afonine wrote:
> What follows below is not very specific to the particular program
> (STARANISO) nor the original questions, but nonetheless, I believe it is
> relevant.

Thanks for joining the discussion: always good to have different viewpoints
or opinions made visible - especially for less knowledgeable users and
readers of the CCP4bb.

And apologies to anyone getting tired of "another long post" here, but
some remarks do require follow-ups that hopefully will help keep the
discussion at a level useful to all readers.

> In the past, performing any adjustments to the diffraction data intended
> for solving and refining atomic models was more or less considered taboo.

That is a very broad statement that I have trouble making sense of: what do
you mean by "adjustments" and what do you mean by "diffraction data"?
If we are truly looking at diffraction data as it comes out of our
experiment, we are looking at the raw images, right?  Those are then
handled roughly as follows (as an example for MX):

  * initial integrated intensities (simplifying 3D pixel data)

  * profile fitting of integrated intensities

  * scaling (with various parametrisation models)

  * selection of data (excluding image ranges due to radiation damage or
    because a crystal moves out of the beam, excluding/handling ice-ring
    contamination, selecting datasets in SSX etc)

  * adjustment of error model (to get "meaningful" error estimates,
    i.e. sigma values)

  * outlier rejection (based largely on those sigmas)

  * merging (inverse-variance weighted)
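
As an illustration of the last step in the list above, here is a minimal Python sketch of inverse-variance weighted merging of repeated measurements of one unique reflection. It is not taken from any particular scaling program: the function name and the two measurements are made up for the example, and real programs apply this only after scaling, error-model adjustment and outlier rejection.

```python
from math import sqrt


def merge_inverse_variance(observations):
    """Merge repeated (intensity, sigma) measurements of one reflection.

    Each measurement is weighted by 1/sigma**2. The merged sigma is
    1/sqrt(sum of weights). Returns (merged_intensity, merged_sigma).
    """
    weights = [1.0 / (sigma * sigma) for _, sigma in observations]
    total_weight = sum(weights)
    merged_i = sum(w * i for w, (i, _) in zip(weights, observations)) / total_weight
    return merged_i, 1.0 / sqrt(total_weight)


# Hypothetical, made-up measurements of a single reflection: (I, sigma)
obs = [(100.0, 10.0), (120.0, 20.0)]
merged_i, merged_sigma = merge_inverse_variance(obs)
print(f"merged I = {merged_i:.1f}, sigma = {merged_sigma:.2f}")
```

The more precise measurement dominates the merged value, which is why a realistic error model matters before this step.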

Maybe all those "adjustments to the diffraction data" are not what you are
referring to in your remark above? So let's assume you are referring to the
merged intensity data after all of the above steps as being "the
diffraction data" ... and what we are doing after that:

  * conversion from intensities to amplitudes (using different methods and
    priors) which most often will include an adjustment of weak and
    negative intensities (e.g. via the French & Wilson method [1]).

  * decision on which reflections to use for subsequent steps

    - defined by geometric constraints (we can only use those that hit the
      detector)

    - defined by some significance criterion

    resulting in an adjustment of the dataset (not the values themselves)
    coming out of the raw diffraction data.

Maybe this is still not what you are referring to as "adjustments to
the diffraction data"? Let's see what additional "adjustments to the
diffraction data" might happen further along ...

  * anisotropic scaling of the diffraction data /without/ the use of any
    atomic model, as provided e.g. by

    - the UCLA Anisotropy Server [2,3], using the anisotropy analysis
      from Phaser [4]

    - STARANISO [5], using its own analysis

  * relative anisotropic scaling of the diffraction data and the current
    model in refinement, e.g. in

    - REFMAC [6,7]

      Note: this includes writing a set of observed amplitudes into the
      output MTZ file that have been corrected using the model-based
      overall anisotropy factors (as far as we know and at least up to
      version 5.8.0352). So any "structure factor" deposition using only
      the output reflection data from such a run will have
      anisotropy-corrected observed data in the PDB archive. Our
      aB_deposition_combine tool described below detects and undoes this
      (when combining the reflection mmCIF data from processing with the
      reflection data after refinement) to ensure that data exactly as
      used as /input/ to the refinement program is deposited.

    - CCTBX [8] and Phenix [9,10]

    - SHELX [11,12,13]

    - BUSTER [14]

  * classification and rejection of model-based outlier reflections

    - in Phenix [15] (still default?)

  * DFc completion for missing observations in 2mFo-DFc electron density maps

    - default in REFMAC (into single FWT/PHWT by default as far as we know)

    - default in Phenix (into an additional set of map coefficients?)

    - default in BUSTER (into two additional sets of map coefficients,
      2FOFCWT_iso-fill/PH2FOFCWT_iso-fill using a sphere and
      2FOFCWT_aniso-fill/PH2FOFCWT_aniso-fill using the anisotropic cut-off
      information from STARANISO).

Which of all of the above are you referring to as being "considered taboo"?
It would be helpful if you could clarify this so that we can then focus on
that particular point in our discussion.

> When cryo-EM emerged as a competitor to x-ray crystallography, the paradigm
> began to shift. In cryo-EM, manipulations applied to the data (the map) are
> a standard practice. The map can be boxed, filtered (sharpened, blurred,
> etc.), modified (e.g., setting something outside the molecular region), and
> so forth; you name it. One might wonder why the same isn't done to x-ray data.

I don't think it is true that 

Re: [ccp4bb] About Staraniso

2024-02-14 Thread Pavel Afonine
Dear All,

What follows below is not very specific to the particular program
(STARANISO) nor the original questions, but nonetheless, I believe it is
relevant.

In the past, performing any adjustments to the diffraction data intended
for solving and refining atomic models was more or less considered taboo.
When cryo-EM emerged as a competitor to x-ray crystallography, the paradigm
began to shift. In cryo-EM, manipulations applied to the data (the map) are
a standard practice. The map can be boxed, filtered (sharpened, blurred,
etc.), modified (e.g., setting something outside the molecular region), and
so forth; you name it. One might wonder why the same isn't done to x-ray
data. Historical analogies include truncating data beyond 6-8Å resolution
to avoid dealing with the bulk solvent or default sharpening (a feature
available in X-PLOR for some time, then removed for obvious reasons,
AFAIK), choosing resolution limits (PAIREF), and anisotropic data
massaging by the UCLA server as a more recent example. STARANISO is the
leader in doing things along these lines as of today.

Indeed, why not if this is helpful to solve the structure? However, it's
important that the deposition clearly contains and annotates at least the
following:

- the original unmanipulated data;
- modified data (by whatever method or program);
- accessible information about the data that was used to obtain the final
deposited atomic model.

Note *accessible* above as this is the key for what follows below.

Let's consider this example: https://files.rcsb.org/download/6R72-sf.cif ,
which is representative of the class of problems I'm trying to convey here.

The file has everything, kudos to the authors: The original data, the
manipulated data and a whole lot more.

Are these data accessible?

YES, if you download the file, open it in your favorite text editor, and
carefully scroll and read through its 76,566 lines and use your best guess
to infer what are the original data arrays, what are the modified data
arrays and so on.

NO, absolutely NO, if you parse data files in the PDB automatically with a
script, and attempt to extract particular data (e.g., the original
unmanipulated data). And this is what I find problematic, especially given
the 215+k entries in the PDB as of today.
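
To make the scripting problem concrete, here is a minimal Python sketch that lists the data blocks of a structure-factor mmCIF file together with the `_refln.*` items each block declares. It is only an illustration: the function name, the block names and the two-block sample text are invented for the example, and the line-based scan ignores multi-line text fields, so a dedicated CIF parser would be preferable in practice.

```python
def summarize_sf_blocks(cif_text):
    """Map each data block name to the list of _refln.* items it declares."""
    blocks = {}
    current = None
    for raw in cif_text.splitlines():
        line = raw.strip()
        if line.lower().startswith("data_"):
            # A new data block starts here; remember its name.
            current = line[5:]
            blocks[current] = []
        elif current is not None and line.startswith("_refln."):
            # Keep only the item name; a value may follow on the same line.
            blocks[current].append(line.split()[0])
    return blocks


# Hypothetical two-block file; block names and values are made up.
SAMPLE = """\
data_example_original
loop_
_refln.index_h
_refln.index_k
_refln.index_l
_refln.intensity_meas
_refln.intensity_sigma
1 0 0 250.0 12.0
data_example_modified
loop_
_refln.index_h
_refln.index_k
_refln.index_l
_refln.F_meas_au
_refln.F_meas_sigma_au
1 0 0 15.8 0.4
"""

summary = summarize_sf_blocks(SAMPLE)
for name, items in summary.items():
    print(name, items)
```

A script like this can enumerate blocks and columns, but it cannot tell which block holds the original data and which the modified data unless the blocks are annotated in an agreed, machine-readable way.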

Hope someone does something about it!

All the best!
Pavel






To unsubscribe from the CCP4BB list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/WA-JISC.exe?SUBED1=CCP4BB=1

This message was issued to members of www.jiscmail.ac.uk/CCP4BB, a mailing list 
hosted by www.jiscmail.ac.uk, terms & conditions are available at 
https://www.jiscmail.ac.uk/policyandsecurity/


Re: [ccp4bb] About Staraniso

2024-02-13 Thread Arpita Goswami
Dear all,

Good day. Thank you all for the very extensive discussions. Both on- and
off-list discussions were very helpful.

Thank you and a very happy Valentine's Day to all.

Best regards,
Arpita







Re: [ccp4bb] About Staraniso

2024-02-13 Thread Kay Diederichs
Dear readers of CCP4BB,

for various reasons I don't feel inclined to reply to this.

I'm really sorry,
Kay

On Tue, 13 Feb 2024 15:25:03 +, Gerard Bricogne  
wrote:

>Dear Kay,
>
...





Re: [ccp4bb] About Staraniso

2024-02-13 Thread Gerard Bricogne
Dear Kay,

 I think I should add a few comments and annotations to your reply to
Arpita, as it was addressed not just to her but to the general readership of
the CCP4BB. This will involve introducing an extra level of interleaving,
which can get a bit unsightly but can hardly be avoided.

 Please see below.


On Mon, Feb 12, 2024 at 04:55:02PM +, Kay Diederichs wrote:
> Dear Arpita,
> 
> I'll try to answer below -
> 
> On Mon, 12 Feb 2024 13:57:28 +0530, Arpita Goswami  
> wrote:
> 
> >Dear all,
> >
> >Greetings to all! Apologies if the below query seems very naive!
> >
> >This is to query on the consensus to use Staraniso for pdb submission. We
> >have solved a structure previously at 2.3 A resolution. The same data
> >(after reindexing the diffraction images in autoPROC) and after
> >reprocessing by ellipsoidal scaling in Staraniso gave structure at ~2.16 A.
> >
> >The previously solved structure did not have significant anisotropy
> >according to Aimless, so anisotropic scaling was not performed that time.
> 
> Maybe different resolution cutoffs were used: previously the
> data were cut at 2.3A (according to some rule, e.g. "cut at =2"),
> now they are cut at ~2.16A (according to a different rule, e.g. "cut at
> =1.5").
> 
> Alternatively, different programs were used for data processing, and for
> the processing by the one used previously, the rule said "you should cut at
> 2.3A", and for the one used now, the same rule said "cut at ~2.16A".
> After all, different programs give different data!
> 
> Speaking of rules, there are different "schools of thought" concerning the
> "best" or "correct" rule for finding the resolution cutoff. I don't want to
> repeat the arguments here. It is important to know that there exists an
> experimental and practical way (called "paired refinement") to find a
> meaningful resolution cutoff, and this requires refinement of a model, and
> comparison of (basically) Rfree values. The CCP4 program for this purpose
> is called PAIREF.
> There is also a web server called PDB-REDO that does paired refinement with
> your data and model.

 Arpita may not have come across the arguments that you don't want to
repeat, so she may find it useful to peruse the following item in the CCP4BB
archive: 

   https://www.mail-archive.com/ccp4bb@jiscmail.ac.uk/msg54145.html

that provides a useful introduction to the content of your paragraph above.
In particular, her use of PAIREF as available from CCP4 or PDB-REDO would
give her a resolution cut-off as a *single* number: applying such a cut-off
would then get her back to the situation where either good data will be
discarded in the better-diffracting direction(s), or essentially pure noise
will be kept in the worse-diffracting one(s). So this would be a huge step
backward from the results she has been alluding to, where we happen to know
that the default STARANISO analysis shows that these tetragonal crystals
diffract to 2.526 A along a* and b*, and to 2.039 A along c*. Trusting
PAIREF to give a useful indication of a single-number resolution in this
situation would seem totally unrealistic.
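
The link between anisotropy and spherical completeness can be estimated with a minimal Python sketch. It assumes an idealised picture that is not STARANISO's actual cut-off surface: observable reflections fill an ellipsoid in reciprocal space with semi-axes 1/d along three orthogonal principal directions, and reciprocal lattice points are uniformly dense. The function name is made up for the example; the limits are the ones quoted above.

```python
def spherical_completeness_estimate(d_limits):
    """Fraction of the sphere out to the best diffraction limit that lies
    inside an ellipsoid with semi-axes 1/d along each principal direction.

    The ratio of the ellipsoid volume to the sphere volume reduces to the
    product of d_min/d over the three directions.
    """
    d_min = min(d_limits)
    fraction = 1.0
    for d in d_limits:
        fraction *= d_min / d
    return fraction


# Diffraction limits in Angstrom along a*, b* and c* quoted in the text
limits = (2.526, 2.526, 2.039)
frac = spherical_completeness_estimate(limits)
print(f"estimated spherical completeness to {min(limits)} A: {frac:.3f}")
```

Under these assumptions roughly a third of the reflections in the sphere out to 2.039 A are unobservable, even if every reflection inside the ellipsoid has been measured. With equal limits in all three directions the estimate is 1.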

 There has been a publication on the simultaneous use of PAIREF and
STARANISO, using verbatim (and without acknowledgement) the procedure that
was suggested in the archive e-mail cited above (paragraph starting with
"Quite the contrary:"), namely using datasets obtained by running STARANISO
with different cut-off values for the local average of I/sig(I) and applying
the PAIREF procedure to assess which one is the most appropriate, and then
taking the associated diffraction limits as being the *three* diffraction
limits of the dataset. This way, one preserves the benefits of the
anisotropy analysis and one simply tweaks the STARANISO cut-off threshold
around its default value (1.2) by specifying another value through a
command-line parameter provided for that purpose since the inception of the
program. The publication in question can be found at 

  https://journals.iucr.org/j/issues/2023/04/00/ap5049/index.html

The process of exploring different STARANISO thresholds is manually driven,
and none of the available versions of PAIREF offer that feature - so
directing Arpita towards them may not be to her advantage. Before investing
too much hope in this type of investigation, it should be noted that this
article concludes that 

 "Corrections for diffraction anisotropy using the STARANISO server [with a
 default cut-off of 1.2] and [its output] data dramatically improved the
 quality of the observed electron density maps. No differences were observed
 with the extension of data from [those obtained with a cut-off value of
 1.2] to [those obtained with a cut-off value of 0.5]".


> >The overall spherical completeness of Staraniso structure is low (~73%)
> >while Ellipsoidal completeness is ~94%. Parallel isotropic scaling gives
> >structure with 99.6% completeness (but 2.3 A resolution). The statistics (R
> >merge 

Re: [ccp4bb] About Staraniso

2024-02-12 Thread Kay Diederichs
Dear Arpita,

I'll try to answer below -

On Mon, 12 Feb 2024 13:57:28 +0530, Arpita Goswami  wrote:

>Dear all,
>
>Greetings to all! Apologies if the below query seems very naive!
>
>This is to query on the consensus to use Staraniso for pdb submission. We
>have solved a structure previously at 2.3 A resolution. The same data
>(after reindexing the diffraction images in autoPROC) and after
>reprocessing by ellipsoidal scaling in Staraniso gave structure at ~2.16 A.
>
>The previously solved structure did not have significant anisotropy
>according to Aimless, so anisotropic scaling was not performed that time.

Maybe different resolution cutoffs were used: previously the 
data were cut a 2.3A (according to some rule, e.g. "cut at =2"), 
now they are cut at ~2.16A (according to a different rule, e.g. "cut at 
=1.5").

Alternatively, different programs were used for data processing, and for
the processing by the one used previously, the rule said "you should cut at 
2.3A", and for the one used now, the same rule said "cut at ~2.16A".
After all, different programs give different data!

Speaking of rules, there are different "schools of thought" concerning the 
"best" or "correct" rule for finding the resolution cutoff. I don't want to 
repeat the arguments here. It is important to know that there exists an 
experimental and practical way (called "paired refinement") to find a 
meaningful resolution cutoff, and this requires refinement of a model, and 
comparison of (basically) Rfree values. The CCP4 program for this purpose is 
called PAIREF.
There is also a web server called PDB-REDO that does paired refinement with
your data and model.
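
A minimal sketch of the comparison at the heart of paired refinement, in
Python. It assumes the same model has been refined twice, once against data
cut at the lower resolution and once with an extra shell included, and that
both sets of R values were then computed on the same reflections (those
within the lower-resolution cutoff). The class, the function name and the
numbers are hypothetical; this is not how PAIREF or PDB-REDO are invoked.

```python
from dataclasses import dataclass


@dataclass
class RefinementResult:
    cutoff: float  # high-resolution cutoff used during refinement, in Angstrom
    r_work: float  # R-work evaluated on the common (lower-resolution) reflections
    r_free: float  # R-free evaluated on the common (lower-resolution) reflections


def extra_shell_helps(low_res, high_res, tolerance=0.0):
    """Return True if including the extra shell did not worsen R-free.

    Both results must carry R values computed on the same reflections,
    otherwise the comparison is meaningless.
    """
    if high_res.cutoff >= low_res.cutoff:
        raise ValueError("high_res must extend to a smaller d-spacing than low_res")
    return (high_res.r_free - low_res.r_free) <= tolerance


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    at_2_30 = RefinementResult(cutoff=2.30, r_work=0.200, r_free=0.250)
    at_2_16 = RefinementResult(cutoff=2.16, r_work=0.198, r_free=0.246)
    print(extra_shell_helps(at_2_30, at_2_16))
```

In practice the cutoff is usually extended in several small steps, and the
behaviour of R-work and of the gap between R-work and R-free is inspected as
well, rather than relying on a single yes/no comparison.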

>
>The overall spherical completeness of Staraniso structure is low (~73%)
>while Ellipsoidal completeness is ~94%. Parallel isotropic scaling gives
>structure with 99.6% completeness (but 2.3 A resolution). The statistics (R
>merge and others) are better for Staraniso structure (also benefited from
>removing specific frames with high R merge as indicated by Staraniso). Also

I understand that, based on what you wrote, autoPROC was used for processing
the most recent data. Actually, Staraniso is run from autoPROC; it is part of 
the autoPROC expert system (but there is also a web server for it).
The removal of specific frames is done by a different part of autoPROC, 
not by Staraniso.

>the interatomic distances in regions of interest in the staraniso structure
>is on par with parallel molecular dynamics simulation data.

This needs a lot more explanation and context information, and discussion.

>
>The questions are:
>
>1. Can the Staraniso structure be submitted to pdb saying reprocessed
>structure at higher resolution (through Staraniso)?

As recently reported here on CCP4BB (IIRC), there exists a PDB 
policy that only the previous authors of the PDB entry can re-submit.
If existing data are re-processed and a new PDB entry submitted, then this 
must be accompanied by a paper.

>
>2. What is the factor more important for a structure: completeness
>(spherical vs ellipsoidal) or R statistics?

You ask the correct general question: how do I obtain the best structure?
The answer to your specific question is: it depends. There is not one answer 
that fits all situations.
Paired refinement should help you to find the best dataset for refinement.

>
>3. Why is the extra resolution not detected during indexing by iMOSFLM or
>XDS (using default setup)? The indexed outputs of either of them did not
>give extra resolution (through anisotropic scaling) in Staraniso, although
>it said some data was missing.

I'm afraid there is some confusion here.
To try and clarify: indexing has nothing to do with resolution (indexing means
assigning indices, i.e. hkl triples, to reflections that are found during an 
initial spot finding procedure), so it is not clear that "extra resolution not 
detected during indexing" is an appropriate description for what you see.
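
To make the distinction concrete, here is a toy sketch in Python. It assumes
an orthorhombic cell whose axes are aligned with the laboratory axes and
whose dimensions are already known, which is a strong simplification: real
indexing in XDS or iMOSFLM also has to determine the cell and the crystal
orientation. The function names, the cell and the spot position are
hypothetical.

```python
import math


def index_spot(s, cell):
    """Assign integer hkl to a reciprocal-space position s (in 1/Angstrom).

    Assumes an orthorhombic cell (a, b, c) aligned with the laboratory axes.
    """
    a, b, c = cell
    return (round(s[0] * a), round(s[1] * b), round(s[2] * c))


def d_spacing(hkl, cell):
    """d-spacing in Angstrom of reflection hkl for an orthorhombic cell."""
    h, k, l = hkl
    a, b, c = cell
    inv_d_sq = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    return 1.0 / math.sqrt(inv_d_sq)


if __name__ == "__main__":
    cell = (50.0, 60.0, 70.0)          # hypothetical cell, Angstrom
    spot = (0.1003, -0.0498, 0.0287)   # hypothetical spot position, 1/Angstrom
    hkl = index_spot(spot, cell)
    print(hkl, d_spacing(hkl, cell))
```

Indexing only attaches the label hkl to each spot. How far out in d-spacing
usable reflections extend is decided later, from the integrated and scaled
intensities and their error estimates.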

>
>4. Is there any option for using all reflections detection (like autoPROC)
>in iMOSFLM or XDS?

Again, there is some confusion here: autoPROC is an expert system that uses 
XDS. More explicitly: autoPROC creates XDS.INP and then runs XDS. Then it 
analyzes the XDS results, and afterwards uses that analysis to create an 
improved XDS.INP, and re-runs XDS. 

There is nothing to prevent you from running XDS manually, to obtain results
like those that autoPROC obtains (or even better!). If you want to learn
about these options: some of them are discussed at 
https://wiki.uni-konstanz.de/xds/index.php/Optimisation , or elsewhere in 
XDSwiki. 

Similarly, iMOSFLM can be optimized for your data, and this will result in 
better data, and higher resolution than is obtained with default parameters. 
I'd google "imosflm tutorial"; there are some interesting hits.

There are workshops, organized by CCP4 and others, where you can learn these 
things.

Hope this helps,
Kay

>
>
>Thanks in advance,
>
>Best Regards,
>
>Arpita
>

Re: [ccp4bb] About Staraniso

2024-02-12 Thread Clemens Vonrhein
Dear Arpita,

> Apologies if the below query seems very naive!

Your query is not at all naive, it is very probing.  Sorry for the
necessarily long reply to your questions - but there are a number of topics
you raise where we think a large amount of confusion still exists.  Please
note that this reply represents /our/ view of things only of course.

> This is to query on the consensus to use Staraniso for pdb submission. We
> have solved a structure previously at 2.3 A resolution.

So you had a dataset where you decided on a sphere in reciprocal space with
a radius of 2.3A as a cut-off surface - based on some kind of local
analysis that convinced you that all the measured reflections within that
sphere (i.e. 2.3A and lower) are observed and should be kept, while all
measured reflections outside that sphere should be regarded as unobserved
and can be discarded as pure noise.

> The same data (after reindexing the diffraction images in autoPROC) and
> after reprocessing by ellipsoidal scaling in Staraniso gave structure at
> ~2.16 A.

OK, first some clarification:

  * The scaling of the unmerged reflection data in autoPROC (using AIMLESS)
is neither spherical nor ellipsoidal in itself: it uses the data as it
is with the typical scale parameterisation in AIMLESS, i.e. a scale k
and an image B-factor (plus some absorption), all with default
smoothing.

This then leads to two output reflection files:

 aimless_alldata_unmerged.mtz  = Scaled and unmerged reflections without cut-off.
 aimless_alldata.mtz           = Scaled and merged reflections without cut-off.

  * The latter (scaled and merged reflection data without any cut-off) is
then given to STARANISO to do the following:

(a) Compute various local statistics that are then used to define a
cut-off surface.

(b) Assume that all reflections within that cut-off surface should be
kept (and could have been observed) and all those outside should be
ignored.

==> See how that is extremely similar to the type of analysis you did
with the initial 2.3A data?  One type of analysis (local 1D-shells
of data in d*) led to an isotropic sphere as a cut-off surface,
while another (local 3D-spheres in reciprocal space) led to an
anisotropic cut-off surface.

Remember that "anisotropic" just means "not isotropic" - it doesn't
mean "ellipsoidal" (diffraction from a cubic crystal can be
anisotropic since the [100], [110] and [111] directions have quite
different properties, yet attempts to fit an ellipsoid to it will
produce a sphere).  The cut-off surface assigned this way by
STARANISO can have any shape really (including being a sphere)
because the analysis via local spheres doesn't assume/enforce
isotropy - while the analysis via spherical shells does.

So up to that point there is no difference really between the two
approaches: using a criterion to define a cut-off surface and
considering data within the surface as observable and data outside
as unobservable. It is only the assumptions on which the criterion
is based that differ: one assumes the data is isotropic, while the
other doesn't.

==> The notion of "resolution" is a bit complicated in general here: if
your crystal diffracted better in some directions than in others, a
better description is the use of "diffraction limit" in some
directions - e.g. defined as the principal axes of an ellipsoid
fitted to the cut-off surface. This is what autoPROC/STARANISO
provides.

   (c) Analyse the anisotropic fall-off in intensity of the data within the
   cut-off surface to derive anisotropic correction factors and apply
   them to the data.

   This is similar to the anisotropic scaling a refinement program
   would perform using the current model as a reference (to
   anisotropically scale the observed data to the model).  Here we
   apply an internal anisotropic scaling, without a reference to any
   model.
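
The effect of an anisotropic cut-off surface on the spherical completeness
can be checked with a toy calculation in Python. The sketch below makes
several assumptions that are not from the analysis above: an orthorhombic
cell, no symmetry or Friedel averaging, an ellipsoidal cut-off surface
aligned with the reciprocal axes, and every reflection inside that ellipsoid
counted as measured. The cell and the three diffraction limits are
hypothetical and are not the values of the dataset under discussion.

```python
def spherical_completeness(cell, limits):
    """Fraction of the reflections within the sphere at the best diffraction
    limit that also lie inside an axis-aligned ellipsoidal cut-off surface.

    cell   : (a, b, c) in Angstrom, orthorhombic
    limits : diffraction limits along a*, b*, c* in Angstrom
    """
    a, b, c = cell
    s_max = 1.0 / min(limits)                 # sphere radius, 1/Angstrom
    ra, rb, rc = (1.0 / d for d in limits)    # ellipsoid semi-axes, 1/Angstrom
    in_sphere = 0
    in_ellipsoid = 0
    for h in range(-int(a * s_max), int(a * s_max) + 1):
        for k in range(-int(b * s_max), int(b * s_max) + 1):
            for l in range(-int(c * s_max), int(c * s_max) + 1):
                if (h, k, l) == (0, 0, 0):
                    continue
                sx, sy, sz = h / a, k / b, l / c
                if sx * sx + sy * sy + sz * sz > s_max * s_max:
                    continue
                in_sphere += 1
                if (sx / ra) ** 2 + (sy / rb) ** 2 + (sz / rc) ** 2 <= 1.0:
                    in_ellipsoid += 1
    return in_ellipsoid / in_sphere


if __name__ == "__main__":
    # Hypothetical cell and diffraction limits, for illustration only.
    print(spherical_completeness((50.0, 60.0, 70.0), (2.16, 2.5, 2.8)))
```

For these limits the ellipsoid fills about two thirds of the volume of the
sphere drawn at 2.16 A, so the spherical completeness cannot exceed roughly
that value even though, by construction, every reflection inside the
cut-off surface has been measured.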

> The previously solved structure did not have significant anisotropy
> according to Aimless, so anisotropic scaling was not performed that time.

See above: you most likely did use anisotropic scaling during refinement
(with the model as the reference).
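
For orientation, the general form of an anisotropic fall-off of intensities
can be sketched in Python as below. It uses the standard anisotropic
Debye-Waller-type factor exp(-0.5 * s^T B s) on intensities, with s the
reciprocal-space vector in 1/Angstrom (|s| = 1/d) and B a symmetric tensor in
Angstrom^2 expressed on Cartesian axes. This is only the textbook
functional form, not the procedure used by STARANISO or by any refinement
program, and the tensor values are hypothetical.

```python
import math


def intensity_falloff(s, b_tensor):
    """Relative intensity fall-off exp(-0.5 * s^T B s).

    s        : reciprocal-space vector (sx, sy, sz) in 1/Angstrom
    b_tensor : symmetric 3x3 tensor in Angstrom^2 (Cartesian axes)
    """
    quad = sum(s[i] * b_tensor[i][j] * s[j] for i in range(3) for j in range(3))
    return math.exp(-0.5 * quad)


# Hypothetical tensor: weakest fall-off along x, strongest along z.
B_HYPOTHETICAL = [
    [20.0, 0.0, 0.0],
    [0.0, 40.0, 0.0],
    [0.0, 0.0, 60.0],
]

if __name__ == "__main__":
    s_along_x = (0.4, 0.0, 0.0)  # d = 2.5 Angstrom along x
    s_along_z = (0.0, 0.0, 0.4)  # d = 2.5 Angstrom along z
    ratio = (intensity_falloff(s_along_x, B_HYPOTHETICAL)
             / intensity_falloff(s_along_z, B_HYPOTHETICAL))
    print(ratio)
```

With such a tensor, reflections at the same d-spacing differ in expected
intensity by more than an order of magnitude depending on direction. An
anisotropic correction or scaling step estimates a factor of this kind,
either against a model (during refinement) or internally from the data.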

Please note that using AIMLESS alone is not the best way to detect
anisotropy; that is not its main purpose.  As far as we know, it looks only
along the crystal axes for anisotropy (whereas STARANISO looks in all
directions). That means it will not detect anisotropy eigenvectors lying
close to diagonals as can happen in monoclinic (a*-c* plane only since an
anisotropy eigenvector is constrained to be parallel to the b* axis) and
triclinic lattices.  So if AIMLESS says there is significant anisotropy you
can believe it; OTOH if it says no anisotropy was detected you should

[ccp4bb] About Staraniso

2024-02-12 Thread Arpita Goswami
Dear all,

Greetings to all! Apologies if the below query seems very naive!

This is to query on the consensus to use Staraniso for pdb submission. We
have solved a structure previously at 2.3 A resolution. The same data
(after reindexing the diffraction images in autoPROC) and after
reprocessing by ellipsoidal scaling in Staraniso gave structure at ~2.16 A.

The previously solved structure did not have significant anisotropy
according to Aimless, so anisotropic scaling was not performed that time.

The overall spherical completeness of Staraniso structure is low (~73%)
while Ellipsoidal completeness is ~94%. Parallel isotropic scaling gives
structure with 99.6% completeness (but 2.3 A resolution). The statistics (R
merge and others) are better for Staraniso structure (also benefited from
removing specific frames with high R merge as indicated by Staraniso). Also
the interatomic distances in regions of interest in the staraniso structure
is on par with parallel molecular dynamics simulation data.

The questions are:

1. Can the Staraniso structure be submitted to pdb saying reprocessed
structure at higher resolution (through Staraniso)?

2. What is the factor more important for a structure: completeness
(spherical vs ellipsoidal) or R statistics?

3. Why is the extra resolution not detected during indexing by iMOSFLM or
XDS (using default setup)? The indexed outputs of either of them did not
give extra resolution (through anisotropic scaling) in Staraniso, although
it said some data was missing.

4. Is there any option for using all reflections detection (like autoPROC)
in iMOSFLM or XDS?


Thanks in advance,

Best Regards,

Arpita


