Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-18 Thread Kay Diederichs

Dear Graeme,

some points for further clarification-

The CORRECT corrections you mention all depend on the geometric description of
the experiment.

This geometric description of the experiment is refined by CORRECT, to come up
with accurate values for
a) application of polarization correction (which you were mentioning)
b) application of the zeta factor, which accounts for the lengthening of
the path of a reflection passing through the Ewald sphere at an angle
c) intensity correction depending on finite detector thickness, detector
material, wavelength, and angle (you were also mentioning this; it is the first
item in the SILICON article in XDSwiki)
d) positions of reflections on the surface of the detector. This is the second
item in the SILICON article in XDSwiki. See also
http://xds.mpimf-heidelberg.mpg.de/html_doc/xds_parameters.html#SILICON=
e) air absorption
(http://xds.mpimf-heidelberg.mpg.de/html_doc/xds_parameters.html#AIR=)
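
As an illustration of item b), here is a minimal sketch of the zeta factor, assuming the definition usually stated for XDS: zeta = m2 . e1 with e1 = (S x S0) / |S x S0|, where m2 is the rotation axis, S0 the incident beam and S the diffracted beam. The function names are hypothetical, this is not XDS code, and taking the path lengthening as 1/|zeta| is an assumption of the sketch.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def zeta(rotation_axis, s0, s):
    """zeta = m2 . e1, with e1 = (S x S0) / |S x S0| (assumed definition)."""
    normal = cross(s, s0)
    normal_length = math.sqrt(dot(normal, normal))
    if normal_length == 0.0:
        raise ValueError("S and S0 are parallel; zeta is undefined")
    axis_length = math.sqrt(dot(rotation_axis, rotation_axis))
    return dot(rotation_axis, normal) / (normal_length * axis_length)


def path_lengthening(z):
    """Assumed lengthening of the path through the Ewald sphere: 1/|zeta|."""
    return math.inf if z == 0.0 else 1.0 / abs(z)
```

With the rotation axis along x and the beam along z, a reflection scattered in the y-z plane gives |zeta| = 1 (no lengthening), whereas one scattered towards the rotation axis gives zeta = 0, i.e. it never passes cleanly through the sphere.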

Maybe I forgot something, but this may complete the picture somewhat.

best,

Kay

On Mon, 17 Nov 2014 09:44:13 +, Graeme Winter graeme.win...@gmail.com
wrote:

Dear Nukri,

The following is my opinion, which I think is worth discussing, and is
based on my understanding of what XDS does in the CORRECT step.

Firstly, I tend to find the global refinement in the CORRECT step useful
for getting a good unit cell and recycling the orientation matrix etc. for
reintegration. This is not related to scaling, but is useful, e.g.:

http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Optimisation#Re-INTEGRATEing_with_the_correct_spacegroup.2C_refined_geometry_and_fine-slicing_of_profiles

More relevant to the intensities: in integration the LP correction is
calculated assuming an unpolarized beam - if the data are from a
synchrotron these need to be corrected again for the correct polarization -
something which the correct step does (obviously given this on the
command-line). Pointless will also do this but assumes unless given a
correct value that the beam is quite polarized. Mostly: care needs to be
taken, particularly if using a wavelength which may be confused with a lab
source...

I also understand that the XDS CORRECT step applies a DQE correction for
Pilatus data, taking into account the geometry of the experiment, the
sensor thickness and photon energy. If you have a two theta offset and are
using relatively high energy (say 14 keV or so?) then this may have odd
effects on your data. At detector two theta = 0 this is less of a problem.
This can be a gotcha with processing small molecule data recorded with a
little Pilatus.

Best wishes Graeme

On Fri Nov 14 2014 at 6:15:31 PM Sanishvili, Ruslan rsanishv...@anl.gov
wrote:

Dear Graeme,

Could you elaborate on "There are also some subtleties to making (b) work
properly..." some more? I have a feeling, from observing the beamline
users, that many choose to use this option. It would be very helpful for
them to know what those subtleties are and how best to make it work
properly.
Many thanks,
Nukri

Ruslan Sanishvili (Nukri)
Macromolecular Crystallographer
GM/CA@APS
X-ray Science Division, ANL
9700 S. Cass Ave.
Lemont, IL 60439

Tel: (630)252-0665
Fax: (630)252-0667
rsanishv...@anl.gov

--
*From:* CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Graeme
Winter [graeme.win...@gmail.com]
*Sent:* Thursday, November 13, 2014 2:15 AM
*To:* CCP4BB@JISCMAIL.AC.UK
*Subject:* Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to
POINTLESS/AIMLESS

Dear Kay

Just to comment on (e) since you say you don't know why anyone would want
to do this, yet this is exactly what xia2 -3d does :o)

I use AIMLESS to merge data already scaled by XDS CORRECT or XSCALE as a
way to get a report on the merging statistics which includes all of the
AIMLESS analysis, and to generate harvesting files for deposition.

Like you, I look forward to studies of (a) - (e) and think of all of these
(c) is by far the worst idea, from gut instinct. There are also some
subtleties to making (b) work properly...

For anyone who has time on their hands and would like to do this study, be
sure to consider a range of crystal symmetries as it is possible that some
strategies which are safe in PG 422 (say) are not in PG 2.

Best wishes Graeme

On Wed Nov 12 2014 at 10:07:10 PM Kay Diederichs
kay.diederi...@uni-konstanz.de wrote:

Hi Wolfram,

it took me a while until I realized that you mean overfitting when you
said "o-word".

You can abuse XDS in a number of ways, and I would call them overfitting
the data although that would be using the word in a somewhat strained way:
reducing WFAC1 below 1, decreasing REFLECTIONS/CORRECTION_FACTOR below 50
come to mind, but in an extended sense there are other ways: rejecting
frames for no other reason than that they have low I/sigma or high Rmeas,
...

People always seem to find ways to beautify their precision indicators,
but

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-17 Thread Graeme Winter

Dear Nukri,

The following is my opinion, which I think is worth discussing, and is
based on my understanding of what XDS does in the CORRECT step.

http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Optimisation#Re-INTEGRATEing_with_the_correct_spacegroup.2C_refined_geometry_and_fine-slicing_of_profiles

Best wishes Graeme

On Fri Nov 14 2014 at 6:15:31 PM Sanishvili, Ruslan rsanishv...@anl.gov
wrote:

Dear Graeme,

Ruslan Sanishvili (Nukri)
Macromolecular Crystallographer
GM/CA@APS
X-ray Science Division, ANL
9700 S. Cass Ave.
Lemont, IL 60439

Tel: (630)252-0665
Fax: (630)252-0667
rsanishv...@anl.gov

Dear Kay

Just to comment on (e) since you say you don't know why anyone would want
to do this, yet this is exactly what xia2 -3d does :o)

Like you, I look forward to studies of (a) - (e) and think of all of these
(c) is by far the worst idea, from gut instinct. There are also some
subtleties to making (b) work properly...

Best wishes Graeme

On Wed Nov 12 2014 at 10:07:10 PM Kay Diederichs
kay.diederi...@uni-konstanz.de wrote:

Hi Wolfram,

it took me a while until I realized that you mean overfitting when you
said "o-word".

People always seem to find ways to beautify their precision indicators,
but they are just fooling themselves, because rejecting data just for
cosmetic reasons creates bias. In other words, they trade random error
against systematic error. Guess what is worse. A deeper reason for the
problem is that crystallographers have been fixated on data R-factors for
decades, and have become really spoilt by this. Our science has been
completely misled when it comes to data statistics, and is recovering
only slowly.

Concerning non-cautious use of SCALA/AIMLESS after CORRECT: actually I
know of no systematic studies in this respect. But I know one thing: it is
better to be critical with respect to recipes, than to follow them blindly.
So I suggest the following project: compare SAD structure solution with the
following routes
a) INTEGRATE - CORRECT scaling - SHELXD
b) INTEGRATE - AIMLESS scaling - SHELXD
c) INTEGRATE - CORRECT+AIMLESS scaling - SHELXD
d) INTEGRATE - CORRECT but scaling switched off - AIMLESS scaling -
SHELXD
e) INTEGRATE - CORRECT scaling - AIMLESS but scaling switched off -
SHELXD
and report here.
You can add XSCALE into the mix but that won't change the

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-17 Thread Phil Evans

Actually Pointless knows that the INTEGRATE file is corrected for an
unpolarised beam and recorrects for a synchrotron unless the wavelength is one
of the home source ones. See docs. You can specify explicitly I think
Phil

Sent from my iPhone

On 17 Nov 2014, at 09:44, Graeme Winter graeme.win...@gmail.com wrote:

Dear Nukri,

The following is my opinion, which I think is worth discussing, and is based
on my understanding of what XDS does in the CORRECT step.

Firstly, I tend to find the global refinement in the CORRECT step useful for
getting a good unit cell and recycling the orientation matrix etc. for
reintegration. This is not related to scaling, but is useful, e.g.:

http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Optimisation#Re-INTEGRATEing_with_the_correct_spacegroup.2C_refined_geometry_and_fine-slicing_of_profiles

More relevant to the intensities: in integration the LP correction is
calculated assuming an unpolarized beam - if the data are from a synchrotron
these need to be corrected again for the correct polarization - something
which the correct step does (obviously given this on the command-line).
Pointless will also do this but assumes unless given a correct value that the
beam is quite polarized. Mostly: care needs to be taken, particularly if
using a wavelength which may be confused with a lab source...

I also understand that the XDS CORRECT step applies a DQE correction for
Pilatus data, taking into account the geometry of the experiment, the sensor
thickness and photon energy. If you have a two theta offset and are using
relatively high energy (say 14 keV or so?) then this may have odd effects on
your data. At detector two theta = 0 this is less of a problem. This can be a
gotcha with processing small molecule data recorded with a little Pilatus.

Best wishes Graeme

On Fri Nov 14 2014 at 6:15:31 PM Sanishvili, Ruslan rsanishv...@anl.gov
wrote:
Dear Graeme,

Could you elaborate on "There are also some subtleties to making (b) work
properly..." some more? I have a feeling, from observing the beamline users,
that many choose to use this option. It would be very helpful for them to
know what those subtleties are and how best to make it work properly.
Many thanks,
Nukri

Ruslan Sanishvili (Nukri)
Macromolecular Crystallographer
GM/CA@APS
X-ray Science Division, ANL
9700 S. Cass Ave.
Lemont, IL 60439

Tel: (630)252-0665
Fax: (630)252-0667
rsanishv...@anl.gov

From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Graeme Winter
[graeme.win...@gmail.com]
Sent: Thursday, November 13, 2014 2:15 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to
POINTLESS/AIMLESS

Dear Kay

Just to comment on (e) since you say you don't know why anyone would want to
do this, yet this is exactly what xia2 -3d does :o)

I use AIMLESS to merge data already scaled by XDS CORRECT or XSCALE as a way
to get a report on the merging statistics which includes all of the AIMLESS
analysis, and to generate harvesting files for deposition.

Like you, I look forward to studies of (a) - (e) and think of all of these (c)
is by far the worst idea, from gut instinct. There are also some subtleties
to making (b) work properly...

Best wishes Graeme

On Wed Nov 12 2014 at 10:07:10 PM Kay Diederichs
kay.diederi...@uni-konstanz.de wrote:
Hi Wolfram,

it took me a while until I realized that you mean overfitting when you
said "o-word".

People always seem to find ways to beautify their precision indicators, but
they are just fooling themselves, because rejecting data just for cosmetic
reasons creates bias. In other words, they trade random error against
systematic error. Guess what is worse. A deeper reason for the problem is
that crystallographers have been fixated on data R-factors for decades, and
have become really spoilt by this. Our science has been completely misled
when it comes to data statistics, and is recovering only slowly.

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-17 Thread Graeme Winter

Hi Phil,

Yep: I was not clear perhaps but I did say this:

Pointless will also do this but assumes unless given a correct value that
the beam is quite polarized. Mostly: care needs to be taken, particularly
if using a wavelength which may be confused with a lab source...

For others: if you do wish to set the polarization exactly, the Pointless
keyword is POLARISATION:

POLARISATION [XDS | MOSFLM] [polarisation_factor]

INTEGRATE.HKL files from XDS have been corrected for the polarisation from
an unpolarised incident beam, but not for the additional polarisation
correction from a synchrotron source, so this additional correction needs
to be applied, and will be applied by default for this file type. This
command has no effect on other types of input files.

There seem to be two definitions for the polarisation ratio
polarisation_factor:
- the definition used in e.g. Mosflm, which follows Kahn et al. (ref below), J'
  in their Appendix: this has a value of 0 for an unpolarised beam and +1.0
  for a fully polarised synchrotron beam
- the definition used in XDS, parameter FRACTION_OF_POLARIZATION: this has a
  value 0.5 for an unpolarised beam and +1.0 for a fully polarised
  synchrotron beam
Here the value given is assumed to be in the XDS convention, unless the
subkeyword MOSFLM is given, and it is then converted to the Kahn/Mosflm
convention for internal use. Set polarisation_factor = 0.0 for an
unpolarised beam.

The default value if not set explicitly = 0.99, = XDS 0.98, unless the
wavelength corresponds to a likely in-house source, in which case the
unpolarised value is left unchanged (recognised wavelengths are CuKalpha
1.5418 +- 0.0019, Mo 0.7107 +- 0.0002, Cr 2.29 +- 0.01)

(Reference: Kahn, Fourme, Gadet, Janin, Dumas, André, J. Appl. Cryst.
(1982). 15, 330-337)

From:

http://www.ccp4.ac.uk/html/pointless.html#polarisation
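
The two conventions and the in-house wavelength check quoted above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the linear mapping is an assumption obtained by interpolating between the two endpoints given in the documentation (XDS 0.5 and 1.0 correspond to Kahn/Mosflm 0 and 1.0); the conversion actually used inside Pointless may differ. The wavelengths and tolerances are the ones listed in the quoted text.

```python
# Recognised in-house wavelengths (Angstrom) and tolerances, taken from the
# Pointless documentation excerpt quoted above.
HOME_SOURCES = {
    "CuKalpha": (1.5418, 0.0019),
    "Mo": (0.7107, 0.0002),
    "Cr": (2.29, 0.01),
}


def xds_to_mosflm(fraction_of_polarization):
    """Assumed linear map: XDS 0.5 -> 0.0 (unpolarised), XDS 1.0 -> 1.0."""
    return 2.0 * fraction_of_polarization - 1.0


def mosflm_to_xds(polarisation_factor):
    """Inverse of the assumed linear map above."""
    return 0.5 * (polarisation_factor + 1.0)


def likely_home_source(wavelength):
    """Return the name of a matching in-house source, or None."""
    for name, (centre, tolerance) in HOME_SOURCES.items():
        if abs(wavelength - centre) <= tolerance:
            return name
    return None
```

A check like `likely_home_source` shows why care is needed at synchrotrons: data collected at a wavelength that happens to fall inside one of these windows would be treated as unpolarised unless the polarisation is set explicitly.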

Cheerio Graeme

On Mon Nov 17 2014 at 10:22:56 AM Phil Evans p...@mrc-lmb.cam.ac.uk wrote:

Actually Pointless knows that the INTEGRATE file is corrected for an
unpolarised beam and recorrects for a synchrotron unless the wavelength is
one of the home source ones. See docs. You can specify explicitly I think
Phil

Sent from my iPhone

On 17 Nov 2014, at 09:44, Graeme Winter graeme.win...@gmail.com wrote:

Dear Nukri,

The following is my opinion, which I think is worth discussing, and is
based on my understanding of what XDS does in the CORRECT step.

http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Optimisation#Re-INTEGRATEing_with_the_correct_spacegroup.2C_refined_geometry_and_fine-slicing_of_profiles

Best wishes Graeme

On Fri Nov 14 2014 at 6:15:31 PM Sanishvili, Ruslan rsanishv...@anl.gov
wrote:

Dear Graeme,

Ruslan Sanishvili (Nukri)
Macromolecular Crystallographer
GM/CA@APS
X-ray Science Division, ANL
9700 S. Cass Ave.
Lemont, IL 60439

Tel: (630)252-0665
Fax: (630)252-0667
rsanishv...@anl.gov

Dear Kay

Just to comment on (e) since you say you don't know why anyone would want
to do this, yet this is exactly what xia2 -3d does :o)

I use AIMLESS to merge data already scaled by XDS CORRECT or XSCALE as
a way to get a report on the merging statistics which

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-14 Thread Sanishvili, Ruslan

Dear Graeme,

Could you elaborate on "There are also some subtleties to making (b) work
properly..." some more? I have a feeling, from observing the beamline users,
that many choose to use this option. It would be very helpful for them to know
what those subtleties are and how best to make it work properly.
Many thanks,
Nukri


Ruslan Sanishvili (Nukri)
Macromolecular Crystallographer
GM/CA@APS
X-ray Science Division, ANL
9700 S. Cass Ave.
Lemont, IL 60439

Tel: (630)252-0665
Fax: (630)252-0667
rsanishv...@anl.gov


From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of Graeme Winter 
[graeme.win...@gmail.com]
Sent: Thursday, November 13, 2014 2:15 AM
To: CCP4BB@JISCMAIL.AC.UK
Subject: Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to 
POINTLESS/AIMLESS

Dear Kay

Just to comment on (e) since you say you don't know why anyone would want to do 
this, yet this is exactly what xia2 -3d does :o)

I use AIMLESS to merge data already scaled by XDS CORRECT or XSCALE as a way to 
get a report on the merging statistics which includes all of the AIMLESS 
analysis, and to generate harvesting files for deposition.

Like you, I look forward to studies of (a) - (e) and think of all of these (c) is
by far the worst idea, from gut instinct. There are also some subtleties to
making (b) work properly...

For anyone who has time on their hands and would like to do this study, be sure
to consider a range of crystal symmetries as it is possible that some
strategies which are safe in PG 422 (say) are not in PG 2.

Best wishes Graeme



On Wed Nov 12 2014 at 10:07:10 PM Kay Diederichs 
kay.diederi...@uni-konstanz.de wrote:
Hi Wolfram,

it took me a while until I realized that you mean overfitting when you said
"o-word".

You can abuse XDS in a number of ways, and I would call them overfitting the 
data although that would be using the word in a somewhat strained way: 
reducing WFAC1 below 1, decreasing REFLECTIONS/CORRECTION_FACTOR below 50 come 
to mind, but in an extended sense there are other ways: rejecting frames for no 
other reason than that they have low I/sigma or high Rmeas, ...

People always seem to find ways to beautify their precision indicators, but 
they are just fooling themselves, because rejecting data just for cosmetic 
reasons creates bias. In other words, they trade random error against 
systematic error. Guess what is worse. A deeper reason for the problem is that
crystallographers have been fixated on data R-factors for decades, and have
become really spoilt by this. Our science has been completely misled when it
comes to data statistics, and is recovering only slowly.

Concerning non-cautious use of SCALA/AIMLESS after CORRECT: actually I know of 
no systematic studies in this respect. But I know one thing: it is better to be 
critical with respect to recipes, than to follow them blindly. So I suggest the 
following project: compare SAD structure solution with the following routes
a) INTEGRATE - CORRECT scaling  - SHELXD
b) INTEGRATE - AIMLESS scaling - SHELXD
c) INTEGRATE - CORRECT+AIMLESS scaling - SHELXD
d) INTEGRATE - CORRECT but scaling switched off - AIMLESS scaling - SHELXD
e) INTEGRATE - CORRECT scaling - AIMLESS but scaling switched off - SHELXD
and report here.
You can add XSCALE into the mix but that won't change the picture, since it 
does the exact same calculations for multiple datasets as CORRECT does for 
single datasets.
Personally, I don't understand why people would _want_ to do c),d) or e) 
because that's just added complexity, and additional sources of error.

I'm looking forward to the results of such studies!
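
To make the five routes easier to compare, they can be written down as data. The sketch below is purely illustrative (the `Step` type and helper names are hypothetical); it records, for each route, which program actually performs scaling, so that double scaling (route c) and "merge only" use of AIMLESS (route e) are easy to tell apart.

```python
from typing import NamedTuple


class Step(NamedTuple):
    program: str
    scales: bool  # True if this program refines scale factors in this route


def route(*middle):
    """Every route starts with INTEGRATE and ends with SHELXD."""
    return [Step("INTEGRATE", False), *middle, Step("SHELXD", False)]


ROUTES = {
    "a": route(Step("CORRECT", True)),
    "b": route(Step("AIMLESS", True)),
    "c": route(Step("CORRECT", True), Step("AIMLESS", True)),
    "d": route(Step("CORRECT", False), Step("AIMLESS", True)),
    "e": route(Step("CORRECT", True), Step("AIMLESS", False)),
}


def scaling_programs(name):
    """Programs that scale the data in the given route."""
    return [step.program for step in ROUTES[name] if step.scales]


def scaled_twice(name):
    """True if more than one program scales the data."""
    return len(scaling_programs(name)) > 1
```

For example, `scaling_programs("e")` returns only CORRECT even though AIMLESS is run, and route c is the only one for which `scaled_twice` is true.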

Kay


On Wed, 12 Nov 2014 12:41:28 -0500, wtempel 
wtem...@gmail.com wrote:

Hello Kay,
you said the o-word, and you are familiar with the inner workings of XDS.
Has the data-to-parameter ratio in even complex scaling models become so
small that a doubling (worst case) of model parameters would be a serious
concern? Could one detect such overfitting by, say, comparing (molecular)
model R-factors between refinement against the once (CORRECT) scaled or
twice (CORRECT+AIMLESS) scaled data?
Thank you,
Wolfram

On Wed, Nov 12, 2014 at 10:32 AM, Kay Diederichs 
kay.diederi...@uni-konstanz.de wrote:

 Hi Tim,

 this is incorrect.

 XSCALE determines the relative scale and B in a first step (this is what
 you describe).

 It then, in a second step, re-determines all scale factors (exactly as
 CORRECT does for the individual data sets), at the exact same supporting
 points that CORRECT used.  (This avoids over-fitting which would result
 from a scaling model with different basis functions; a worry that I have
 when people use SCALA/AIMLESS after CORRECT without taking precautions.)
 The resulting scale factors are written to files MODPIX*.cbf, DECAY*.cbf,
 ABSORP*.cbf for inspection.

 Thirdly, it produces statistics and writes output

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-13 Thread Graeme Winter

Dear Kay

Just to comment on (e) since you say you don't know why anyone would want
to do this, yet this is exactly what xia2 -3d does :o)

I use AIMLESS to merge data already scaled by XDS CORRECT or XSCALE as a
way to get a report on the merging statistics which includes all of the
AIMLESS analysis, and to generate harvesting files for deposition.

Like you, I look forward to studies of (a) - (e) and think of all of these
(c) is by far the worst idea, from gut instinct. There are also some
subtleties to making (b) work properly...

For anyone who has time on their hands and would like to do this study, be
sure to consider a range of crystal symmetries as it is possible that some
strategies which are safe in PG 422 (say) are not in PG 2.

Best wishes Graeme



On Wed Nov 12 2014 at 10:07:10 PM Kay Diederichs 
kay.diederi...@uni-konstanz.de wrote:

 Hi Wolfram,

 it took me a while until I realized that you mean overfitting when you
 said "o-word".

 You can abuse XDS in a number of ways, and I would call them overfitting
 the data although that would be using the word in a somewhat strained way:
 reducing WFAC1 below 1, decreasing REFLECTIONS/CORRECTION_FACTOR below 50
 come to mind, but in an extended sense there are other ways: rejecting
 frames for no other reason than that they have low I/sigma or high Rmeas,
 ...

 People always seem to find ways to beautify their precision indicators,
 but they are just fooling themselves, because rejecting data just for
 cosmetic reasons creates bias. In other words, they trade random error
 against systematic error. Guess what is worse. A deeper reason for the
 problem is that crystallographers have been fixated on data R-factors for
 decades, and have become really spoilt by this. Our science has been
 completely misled when it comes to data statistics, and is recovering
 only slowly.

 Concerning non-cautious use of SCALA/AIMLESS after CORRECT: actually I
 know of no systematic studies in this respect. But I know one thing: it is
 better to be critical with respect to recipes, than to follow them blindly.
 So I suggest the following project: compare SAD structure solution with the
 following routes
 a) INTEGRATE - CORRECT scaling  - SHELXD
 b) INTEGRATE - AIMLESS scaling - SHELXD
 c) INTEGRATE - CORRECT+AIMLESS scaling - SHELXD
 d) INTEGRATE - CORRECT but scaling switched off - AIMLESS scaling -
 SHELXD
 e) INTEGRATE - CORRECT scaling - AIMLESS but scaling switched off -
 SHELXD
 and report here.
 You can add XSCALE into the mix but that won't change the picture, since
 it does the exact same calculations for multiple datasets as CORRECT does
 for single datasets.
 Personally, I don't understand why people would _want_ to do c),d) or e)
 because that's just added complexity, and additional sources of error.

 I'm looking forward to the results of such studies!

 Kay


 On Wed, 12 Nov 2014 12:41:28 -0500, wtempel wtem...@gmail.com wrote:

 Hello Kay,
 you said the o-word, and you are familiar with the inner workings of XDS.
 Has the data-to-parameter ratio in even complex scaling models become so
 small that a doubling (worst case) of model parameters would be a serious
 concern? Could one detect such overfitting by, say, comparing (molecular)
 model R-factors between refinement against the once (CORRECT) scaled or
 twice (CORRECT+AIMLESS) scaled data?
 Thank you,
 Wolfram
 
 On Wed, Nov 12, 2014 at 10:32 AM, Kay Diederichs 
 kay.diederi...@uni-konstanz.de wrote:
 
  Hi Tim,
 
  this is incorrect.
 
  XSCALE determines the relative scale and B in a first step (this is what
  you describe).
 
  It then, in a second step, re-determines all scale factors (exactly as
  CORRECT does for the individual data sets), at the exact same supporting
  points that CORRECT used.  (This avoids over-fitting which would result
  from a scaling model with different basis functions; a worry that I have
  when people use SCALA/AIMLESS after CORRECT without taking precautions.)
  The resulting scale factors are written to files MODPIX*.cbf, DECAY*.cbf,
  ABSORP*.cbf for inspection.
 
  Thirdly, it produces statistics and writes output files.
 
  best,
 
  Kay
 
 
  On Wed, 12 Nov 2014 11:22:51 +0100, Tim Gruene t...@shelx.uni-ac.gwdg.de
  wrote:
 
  Dear Wolfram Tempel,
  
  there might be some confusion about terms.
  
  It is correct that xscale scales several data sets together. However,
  in crystallography, 'merging' might be the better term for this process.
  
  Crystallographic 'Scaling' is far more complicated than 'merging'. It
  applies correction factors which try to make up for experimental
  errors in your data set. These corrections include the sigma-values,
  which is particularly important for experimental phasing. In that
  respect it can actually hamper the data quality if you
  (crystallographically) scale your data twice, although the effect is
  rather subtle.
  
  CORRECT carries out these corrections, hence

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-13 Thread Kay Diederichs

Dear Graeme,

good that you set this straight.

I consider getting the statistics output from AIMLESS a perfectly valid
reason for going with e), and as long as this is well-tested (which I'd bet is
the case for xia2) it's ok. There is one issue I can see: 99% (obviously my guess
could be wrong; just an estimate based on reading the Methods section of papers) of
xia2 -3d users are not aware that their data then are _not_ scaled by AIMLESS.
They see the AIMLESS tables and think "so it must have been AIMLESS that scaled
the data". And they publish and PDB-deposit their misconception. This is how
the misunderstanding spreads, which is then why I get asked "can CORRECT scale
a data set?" and other misunderstandings along these lines ...

best,

Kay

On Thu, 13 Nov 2014 08:15:12 +, Graeme Winter graeme.win...@gmail.com 
wrote:

Dear Kay

Just to comment on (e) since you say you don't know why anyone would want
to do this, yet this is exactly what xia2 -3d does :o)

I use AIMLESS to merge data already scaled by XDS CORRECT or XSCALE as a
way to get a report on the merging statistics which includes all of the
AIMLESS analysis, and to generate harvesting files for deposition.

Like you, I look forward to studies of (a) - (e) and think of all of these
(c) is by far the worst idea, from gut instinct. There are also some
subtleties to making (b) work properly...

For anyone who has time on their hands and would like to do this study, be
sure to consider a range of crystal symmetries as it is possible that some
strategies which are safe in PG 422 (say) are not in PG 2.

Best wishes Graeme



On Wed Nov 12 2014 at 10:07:10 PM Kay Diederichs 
kay.diederi...@uni-konstanz.de wrote:

 Hi Wolfram,

 it took me a while until I realized that you mean overfitting when you
 said "o-word".

 You can abuse XDS in a number of ways, and I would call them overfitting
 the data although that would be using the word in a somewhat strained way:
 reducing WFAC1 below 1, decreasing REFLECTIONS/CORRECTION_FACTOR below 50
 come to mind, but in an extended sense there are other ways: rejecting
 frames for no other reason than that they have low I/sigma or high Rmeas,
 ...

 People always seem to find ways to beautify their precision indicators,
 but they are just fooling themselves, because rejecting data just for
 cosmetic reasons creates bias. In other words, they trade random error
 against systematic error. Guess what is worse. A deeper reason for the
 problem is that crystallographers have been fixated on data R-factors for
 decades, and have become really spoilt by this. Our science has been
 completely misled when it comes to data statistics, and is recovering
 only slowly.

 Concerning non-cautious use of SCALA/AIMLESS after CORRECT: actually I
 know of no systematic studies in this respect. But I know one thing: it is
 better to be critical with respect to recipes, than to follow them blindly.
 So I suggest the following project: compare SAD structure solution with the
 following routes
 a) INTEGRATE - CORRECT scaling  - SHELXD
 b) INTEGRATE - AIMLESS scaling - SHELXD
 c) INTEGRATE - CORRECT+AIMLESS scaling - SHELXD
 d) INTEGRATE - CORRECT but scaling switched off - AIMLESS scaling -
 SHELXD
 e) INTEGRATE - CORRECT scaling - AIMLESS but scaling switched off -
 SHELXD
 and report here.
 You can add XSCALE into the mix but that won't change the picture, since
 it does the exact same calculations for multiple datasets as CORRECT does
 for single datasets.
 Personally, I don't understand why people would _want_ to do c),d) or e)
 because that's just added complexity, and additional sources of error.

 I'm looking forward to the results of such studies!

 Kay


 On Wed, 12 Nov 2014 12:41:28 -0500, wtempel wtem...@gmail.com wrote:

 Hello Kay,
 you said the o-word, and you are familiar with the inner workings of XDS.
 Has the data-to-parameter ratio in even complex scaling models become so
 small that a doubling (worst case) of model parameters would be a serious
 concern? Could one detect such overfitting by, say, comparing (molecular)
 model R-factors between refinement against the once (CORRECT) scaled or
 twice (CORRECT+AIMLESS) scaled data?
 Thank you,
 Wolfram
 
 On Wed, Nov 12, 2014 at 10:32 AM, Kay Diederichs 
 kay.diederi...@uni-konstanz.de wrote:
 
  Hi Tim,
 
  this is incorrect.
 
  XSCALE determines the relative scale and B in a first step (this is what
  you describe).
 
  It then, in a second step, re-determines all scale factors (exactly as
  CORRECT does for the individual data sets), at the exact same supporting
  points that CORRECT used.  (This avoids over-fitting which would result
  from a scaling model with different basis functions; a worry that I have
  when people use SCALA/AIMLESS after CORRECT without taking precautions.)
  The resulting scale factors are written to files MODPIX*.cbf, DECAY*.cbf,
  ABSORP*.cbf for inspection.
 
  Thirdly, it produces statistics and writes output files.
 
  best,
 
  Kay
 
 
  On

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-13 Thread Graeme Winter

Dear Kay

I cannot comment on the accuracy or otherwise of your 99%, but every time I
talk about xia2 or write down what the options do, I try to make it clear
that XDS / XSCALE is used for integration and scaling, then AIMLESS to merge
the data. I have had an interest for a while in scaling the data with
AIMLESS from INTEGRATE.HKL purely for the purpose of performing the
analysis you described, but this would be a different option to xia2 *which
does not yet exist*

If you have a way of avoiding misconceptions in users I am sure I will not
be alone in my interest :o) and on a more practical note if you think the
description of how xia2 uses XDS / XSCALE can be improved I would welcome
that. It does always list the appropriate references for users to cite at
the end...

Best wishes Graeme



On Thu Nov 13 2014 at 8:35:20 AM Kay Diederichs 
kay.diederi...@uni-konstanz.de wrote:

 Dear Graeme,

 good that you set this straight.

 I consider getting the statistics output from AIMLESS a perfectly valid
 reason for going with e), and as long as this is well-tested (which I'd bet
 is the case for xia2) it's ok. There is one issue I can see: 99% (obviously my
 guess could be wrong; just an estimate based on reading the Methods section
 of papers) of xia2 -3d users are not aware that their data then are _not_
 scaled by AIMLESS. They see the AIMLESS tables and think "so it must have
 been AIMLESS that scaled the data". And they publish and PDB-deposit their
 misconception. This is how the misunderstanding spreads, which is then why
 I get asked "can CORRECT scale a data set?" and other misunderstandings
 along these lines ...

 best,

 Kay

 On Thu, 13 Nov 2014 08:15:12 +, Graeme Winter graeme.win...@gmail.com
 wrote:

 Dear Kay
 
 Just to comment on (e) since you say you don't know why anyone would want
 to do this, yet this is exactly what xia2 -3d does :o)
 
 I use AIMLESS to merge data already scaled by XDS CORRECT or XSCALE as a
 way to get a report on the merging statistics which includes all of the
 AIMLESS analysis, and to generate harvesting files for deposition.
 
 Like you, I look forward to studies of (a) - (e) and think of all of these
 (c) is by far the worst idea, from gut instinct. There are also some
 subtleties to making (b) work properly...
 
 For anyone who has time on their hands and would like to do this study, be
 sure to consider a range of crystal symmetries as it is possible that some
 strategies which are safe in PG 422 (say) are not in PG 2.
 
 Best wishes Graeme
 
 
 
 On Wed Nov 12 2014 at 10:07:10 PM Kay Diederichs 
 kay.diederi...@uni-konstanz.de wrote:
 
  Hi Wolfram,
 
  it took me a while until I realized that you mean "overfitting" when you
  said "o-word".

  You can abuse XDS in a number of ways, and I would call them overfitting
  the data although that would be using the word in a somewhat strained
  way: reducing WFAC1 below 1, decreasing REFLECTIONS/CORRECTION_FACTOR
  below 50 come to mind, but in an extended sense there are other ways:
  rejecting frames for no other reason than that they have low I/sigma or
  high Rmeas, ...
 
  People always seem to find ways to beautify their precision indicators,
  but they are just fooling themselves, because rejecting data just for
  cosmetic reasons creates bias. In other words, they trade random error
  against systematic error. Guess what is worse. A deeper reason for the
  problem is that crystallographers have been fixated on data R-factors
  for decades, and have become really spoilt by this. Our science has been
  completely misled when it comes to data statistics, and is recovering
  only slowly.
 
  Concerning non-cautious use of SCALA/AIMLESS after CORRECT: actually I
  know of no systematic studies in this respect. But I know one thing: it
  is better to be critical with respect to recipes, than to follow them
  blindly. So I suggest the following project: compare SAD structure
  solution with the following routes
  a) INTEGRATE - CORRECT scaling - SHELXD
  b) INTEGRATE - AIMLESS scaling - SHELXD
  c) INTEGRATE - CORRECT+AIMLESS scaling - SHELXD
  d) INTEGRATE - CORRECT but scaling switched off - AIMLESS scaling - SHELXD
  e) INTEGRATE - CORRECT scaling - AIMLESS but scaling switched off - SHELXD
  and report here.
  You can add XSCALE into the mix but that won't change the picture, since
  it does the exact same calculations for multiple datasets as CORRECT
  does for single datasets.
  Personally, I don't understand why people would _want_ to do c), d) or e)
  because that's just added complexity, and additional sources of error.
 
  I'm looking forward to the results of such studies!
 
  Kay
 
 
  On Wed, 12 Nov 2014 12:41:28 -0500, wtempel wtem...@gmail.com wrote:
 
  Hello Kay,
  you said the o-word, and you are familiar with the inner workings of
  XDS. Has the data-to-parameter ratio in even complex scaling models
  become so small that a doubling (worst case) of model parameters would
  be a serious concern? Could

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-12 Thread Tim Gruene

Dear Wolfram Tempel,

there might be some confusion about terms.

It is correct that xscale scales several data sets together. However,
in crystallography, 'merging' might be the better term for this process.

Crystallographic 'Scaling' is far more complicated than 'merging'. It
applies correction factors which try to make up for experimental
errors in your data set. These corrections include the sigma-values,
which is particularly important for experimental phasing. In that
respect it can actually hamper the data quality if you
(crystallographically) scale your data twice, although the effect is
rather subtle.

CORRECT carries out these corrections, hence CORRECT scales your data
set, while XSCALE does not repeat this step - it only merges your
data in the sense that it puts your data on a common scale. This is
the application of a not too difficult mathematical formula (which is
listed in the xds wiki, but I don't remember the URL).

Regards,
Tim

On 11/11/2014 10:07 PM, Sudhir Babu Pothineni wrote:

http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Xscale

XSCALE
(http://www.mpimf-heidelberg.mpg.de/%7Ekabsch/xds/html_doc/xscale_parameters.html)
is the scaling program of the XDS suite. It scales reflection files
(typically called XDS_ASCII.HKL) produced by XDS. Since the CORRECT
step of XDS already scales an individual dataset, XSCALE is only
/needed/ if several datasets should be scaled relative to one another.
However, it does not deteriorate a dataset if it is scaled again
in XSCALE, since the supporting points of the scale factors are at
the same positions in detector and batch space. The advantage of
using XSCALE for a single dataset is that the user can specify the
limits of the resolution shells.

_Scaling with scala/aimless_

http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Scaling_with_SCALA_%28or_better:_aimless%29

-Sudhir

*** Sudhir Babu Pothineni GM/CA @ APS 436D
Argonne National Laboratory 9700 S Cass Ave Argonne IL 60439

Ph : 630 252 0672

On 11/11/14 14:42, wtempel wrote:
Thank you Boaz. So if CORRECT can do a fully corrected scaling,
are there no corrections that XSCALE might apply to XDS_ASCII.HKL
data that are beyond CORRECT's capabilities? Wolfram

On Tue, Nov 11, 2014 at 3:05 PM, Boaz Shaanan
bshaa...@bgu.ac.il wrote:

Hi,

I actually choose the option 'constant' further down in the
aimless gui but I guess the effect is similar to 'onlymerge'.

Boaz

Boaz Shaanan, Ph.D. Dept. of Life Sciences Ben-Gurion University
of the Negev Beer-Sheva 84105 Israel

E-mail: bshaa...@bgu.ac.il Phone: 972-8-647-2220
Skype: boaz.shaanan Fax: 972-8-647-2992 or 972-8-646-1710

*From:* CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of wtempel
[wtem...@gmail.com]
*Sent:* Tuesday, November 11, 2014 9:50 PM
*To:* CCP4BB@JISCMAIL.AC.UK
*Subject:* [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to
POINTLESS/AIMLESS

Hello all, in a discussion

https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1307L=CCP4BBH=1P=186901

on this board, Kay Diederichs questioned the effect of scaling
data in AIMLESS after prior scaling in XDS (CORRECT). I
understand that the available alternatives in this work flow are
to specify the AIMLESS ‘onlymerge’ command, or not. Are there any
arguments for the preference of one alternative over the other?
Thank you for your insights, Wolfram Tempel

--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-12 Thread Kay Diederichs

Hi Tim,

this is incorrect.

XSCALE determines the relative scale and B in a first step (this is what you
describe).
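
To make the first step concrete, here is a minimal sketch of fitting one relative scale k and one relative B between a reference dataset and a second dataset. It is not XSCALE's code: the function name, the log-linear least-squares fit, and the toy numbers are assumptions made purely for illustration. It assumes the model I_ref = k * exp(-2 B s^2) * I_obs with s = sin(theta)/lambda = 1/(2d).

```python
import math


def fit_scale_and_b(pairs):
    """Fit k and B in I_ref = k * exp(-2*B*s^2) * I_obs.

    pairs: iterable of (d_spacing, i_ref, i_obs) with positive intensities.
    Taking logs gives ln(I_ref/I_obs) = ln(k) - 2*B*s^2, a straight line
    in s^2, which is fitted here by ordinary least squares.
    """
    xs, ys = [], []
    for d, i_ref, i_obs in pairs:
        xs.append(1.0 / (4.0 * d * d))      # s^2 = (1/(2d))^2
        ys.append(math.log(i_ref / i_obs))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope / 2.0  # (k, B)


# Toy data (invented): the second dataset is weaker by k = 2 and B = 5.
toy = [(d, 2.0 * math.exp(-2.0 * 5.0 / (4.0 * d * d)) * 100.0, 100.0)
       for d in (4.0, 3.0, 2.5, 2.0)]
k, b = fit_scale_and_b(toy)
print(k, b)
```

A real program would weight the observations by their sigmas and handle weak or negative intensities; the sketch only shows what "relative scale and B" means as two numbers per dataset.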

It then, in a second step, re-determines all scale factors (exactly as CORRECT
does for the individual data sets), at the exact same supporting points that
CORRECT used. (This avoids over-fitting which would result from a scaling
model with different basis functions; a worry that I have when people use
SCALA/AIMLESS after CORRECT without taking precautions.) The resulting scale
factors are written to files MODPIX*.cbf, DECAY*.cbf, ABSORP*.cbf for
inspection.

Thirdly, it produces statistics and writes output files.

best,

Kay

On Wed, 12 Nov 2014 11:22:51 +0100, Tim Gruene t...@shelx.uni-ac.gwdg.de
wrote:

Dear Wolfram Tempel,

there might be some confusion about terms.

It is correct that xscale scales several data sets together. However,
in crystallography, 'merging' might be the better term for this process.

Regards,
Tim

On 11/11/2014 10:07 PM, Sudhir Babu Pothineni wrote:

http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Xscale

XSCALE
http://www.mpimf-heidelberg.mpg.de/%7Ekabsch/xds/html_doc/xscale_parameters.html

_Scaling with scala/aimless_

http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Scaling_with_SCALA_%28or_better:_aimless%29

-Sudhir

*** Sudhir Babu Pothineni GM/CA @ APS 436D
Argonne National Laboratory 9700 S Cass Ave Argonne IL 60439

Ph : 630 252 0672

On Tue, Nov 11, 2014 at 3:05 PM, Boaz Shaanan
bshaa...@bgu.ac.il wrote:

Hi,

I actually choose the option 'constant' further down in the
aimless gui but I guess the effect is similar to 'onlymerge'.

Boaz

Boaz Shaanan, Ph.D. Dept. of Life Sciences Ben-Gurion University
of the Negev Beer-Sheva 84105 Israel

E-mail: bshaa...@bgu.ac.il Phone: 972-8-647-2220
Skype: boaz.shaanan Fax: 972-8-647-2992 or 972-8-646-1710

Hello all, in a discussion

https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1307L=CCP4BBH=1P=186901

--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-12 Thread Kay Diederichs

On Wed, 12 Nov 2014 15:32:04 +, Kay Diederichs 
kay.diederi...@uni-konstanz.de wrote:
...
It then, in a second step, re-determines all scale factors (exactly as CORRECT 
does for the individual data sets), at the exact same supporting points that 
CORRECT used.  (This avoids over-fitting which would result from a scaling 
model with different basis functions; a worry that I have when people use 
SCALA/AIMLESS after CORRECT without taking precautions.) The resulting scale 
factors are written to files MODPIX*.cbf, DECAY*.cbf, ABSORP*.cbf for 
inspection.


Maybe needless to add, but I'll write it nevertheless: XSCALE _also_ adjusts the
error model in this step, and adjusts the sigmas accordingly.

Kay

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-12 Thread Tim Gruene

Hi Kay,

thank you for the clarification. I had understood that using XSCALE
after CORRECT does no harm, but did not understand that the reason lies
in the consistent choice of support points rather than not repeating
what might already have been done.

Regards,
Tim

On 11/12/2014 04:32 PM, Kay Diederichs wrote:
Hi Tim,

this is incorrect.

XSCALE determines the relative scale and B in a first step (this is what you
describe).

It then, in a second step, re-determines all scale factors (exactly as
CORRECT does for the individual data sets), at the exact same supporting
points that CORRECT used. (This avoids over-fitting which would result from
a scaling model with different basis functions; a worry that I have when
people use SCALA/AIMLESS after CORRECT without taking precautions.) The
resulting scale factors are written to files MODPIX*.cbf, DECAY*.cbf,
ABSORP*.cbf for inspection.

Thirdly, it produces statistics and writes output files.

best,

Kay

On Wed, 12 Nov 2014 11:22:51 +0100, Tim Gruene t...@shelx.uni-ac.gwdg.de
wrote:

Dear Wolfram Tempel,

there might be some confusion about terms.

It is correct that xscale scales several data sets together. However,
in crystallography, 'merging' might be the better term for this process.

Regards,
Tim

On 11/11/2014 10:07 PM, Sudhir Babu Pothineni wrote:

http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Xscale

XSCALE
http://www.mpimf-heidelberg.mpg.de/%7Ekabsch/xds/html_doc/xscale_parameters.html

_Scaling with scala/aimless_

http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Scaling_with_SCALA_%28or_better:_aimless%29

-Sudhir

*** Sudhir Babu Pothineni GM/CA @ APS 436D
Argonne National Laboratory 9700 S Cass Ave Argonne IL 60439

Ph : 630 252 0672

On Tue, Nov 11, 2014 at 3:05 PM, Boaz Shaanan
bshaa...@bgu.ac.il wrote:

Hi,

I actually choose the option 'constant' further down in the
aimless gui but I guess the effect is similar to 'onlymerge'.

Boaz

Boaz Shaanan, Ph.D. Dept. of Life Sciences Ben-Gurion University
of the Negev Beer-Sheva 84105 Israel

E-mail: bshaa...@bgu.ac.il Phone: 972-8-647-2220
Skype: boaz.shaanan Fax: 972-8-647-2992 or 972-8-646-1710

Hello all, in a discussion

https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1307L=CCP4BBH=1P=186901

--
Dr Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A

signature.asc
Description: OpenPGP digital signature

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-12 Thread wtempel

Hello Kay,
you said the o-word, and you are familiar with the inner workings of XDS.
Has the data-to-parameter ratio in even complex scaling models become so
small that a doubling (worst case) of model parameters would be a serious
concern? Could one detect such overfitting by, say, comparing (molecular)
model R-factors between refinement against the once (CORRECT) scaled or
twice (CORRECT+AIMLESS) scaled data?
Thank you,
Wolfram

On Wed, Nov 12, 2014 at 10:32 AM, Kay Diederichs
kay.diederi...@uni-konstanz.de wrote:

Hi Tim,

this is incorrect.

XSCALE determines the relative scale and B in a first step (this is what
you describe).

It then, in a second step, re-determines all scale factors (exactly as
CORRECT does for the individual data sets), at the exact same supporting
points that CORRECT used. (This avoids over-fitting which would result
from a scaling model with different basis functions; a worry that I have
when people use SCALA/AIMLESS after CORRECT without taking precautions.)
The resulting scale factors are written to files MODPIX*.cbf, DECAY*.cbf,
ABSORP*.cbf for inspection.

Thirdly, it produces statistics and writes output files.

best,

Kay

On Wed, 12 Nov 2014 11:22:51 +0100, Tim Gruene t...@shelx.uni-ac.gwdg.de
wrote:

Dear Wolfram Tempel,

there might be some confusion about terms.

It is correct that xscale scales several data sets together. However,
in crystallography, 'merging' might be the better term for this process.

Regards,
Tim

On 11/11/2014 10:07 PM, Sudhir Babu Pothineni wrote:

http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Xscale

XSCALE

http://www.mpimf-heidelberg.mpg.de/%7Ekabsch/xds/html_doc/xscale_parameters.html

_Scaling with scala/aimless_

http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Scaling_with_SCALA_%28or_better:_aimless%29

-Sudhir

*** Sudhir Babu Pothineni GM/CA @ APS 436D
Argonne National Laboratory 9700 S Cass Ave Argonne IL 60439

Ph : 630 252 0672

On Tue, Nov 11, 2014 at 3:05 PM, Boaz Shaanan
bshaa...@bgu.ac.il wrote:

Hi,

I actually choose the option 'constant' further down in the
aimless gui but I guess the effect is similar to 'onlymerge'.

Boaz

Boaz Shaanan, Ph.D. Dept. of Life Sciences Ben-Gurion University
of the Negev Beer-Sheva 84105 Israel

E-mail: bshaa...@bgu.ac.il Phone: 972-8-647-2220
Skype: boaz.shaanan Fax: 972-8-647-2992 or 972-8-646-1710

Hello all, in a discussion

https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1307L=CCP4BBH=1P=186901

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-12 Thread Kay Diederichs

Hi Wolfram,

it took me a while until I realized that you mean "overfitting" when you said
"o-word".

You can abuse XDS in a number of ways, and I would call them overfitting the 
data although that would be using the word in a somewhat strained way: 
reducing WFAC1 below 1, decreasing REFLECTIONS/CORRECTION_FACTOR below 50 come 
to mind, but in an extended sense there are other ways: rejecting frames for no 
other reason than that they have low I/sigma or high Rmeas, ...

People always seem to find ways to beautify their precision indicators, but 
they are just fooling themselves, because rejecting data just for cosmetic 
reasons creates bias. In other words, they trade random error against 
systematic error. Guess what is worse. A deeper reason for the problem is that
crystallographers have been fixated on data R-factors for decades, and have
become really spoilt by this. Our science has been completely misled when it
comes to data statistics, and is recovering only slowly.
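
A toy calculation may help show the trade of random against systematic error. The sketch below computes Rmeas with the usual redundancy-independent formula for groups of symmetry-equivalent observations, then repeats it after a "cosmetic" rejection of the weakest observations. The intensities are invented for illustration and the function name is an assumption; nothing here comes from XDS or AIMLESS.

```python
import math


def rmeas(groups):
    """Redundancy-independent merging R-factor.

    groups: list of lists, each holding the observed intensities of one
    unique reflection. Reflections measured only once are skipped.
    """
    numerator = 0.0
    denominator = 0.0
    for obs in groups:
        n = len(obs)
        if n < 2:
            continue
        mean = sum(obs) / n
        numerator += math.sqrt(n / (n - 1)) * sum(abs(i - mean) for i in obs)
        denominator += sum(obs)
    return numerator / denominator


# Invented observations of one reflection, scattered around a mean of 10.
full = [[4.0, 8.0, 10.0, 12.0, 16.0]]
# Same reflection after discarding the two weakest observations.
trimmed = [[10.0, 12.0, 16.0]]

for label, data in (("all data", full), ("after rejection", trimmed)):
    mean_i = sum(data[0]) / len(data[0])
    print(label, "Rmeas =", round(rmeas(data), 3), "mean I =", round(mean_i, 2))
```

In this toy case the rejection lowers Rmeas while shifting the merged intensity upwards, which is the bias described above: the statistic looks better and the estimate gets worse.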

Concerning non-cautious use of SCALA/AIMLESS after CORRECT: actually I know of 
no systematic studies in this respect. But I know one thing: it is better to be 
critical with respect to recipes, than to follow them blindly. So I suggest the 
following project: compare SAD structure solution with the following routes
a) INTEGRATE - CORRECT scaling  - SHELXD
b) INTEGRATE - AIMLESS scaling - SHELXD
c) INTEGRATE - CORRECT+AIMLESS scaling - SHELXD
d) INTEGRATE - CORRECT but scaling switched off - AIMLESS scaling - SHELXD
e) INTEGRATE - CORRECT scaling - AIMLESS but scaling switched off - SHELXD
and report here.
You can add XSCALE into the mix but that won't change the picture, since it 
does the exact same calculations for multiple datasets as CORRECT does for 
single datasets.
Personally, I don't understand why people would _want_ to do c), d) or e)
because that's just added complexity, and additional sources of error. 

I'm looking forward to the results of such studies!

Kay


On Wed, 12 Nov 2014 12:41:28 -0500, wtempel wtem...@gmail.com wrote:

Hello Kay,
you said the o-word, and you are familiar with the inner workings of XDS.
Has the data-to-parameter ratio in even complex scaling models become so
small that a doubling (worst case) of model parameters would be a serious
concern? Could one detect such overfitting by, say, comparing (molecular)
model R-factors between refinement against the once (CORRECT) scaled or
twice (CORRECT+AIMLESS) scaled data?
Thank you,
Wolfram

On Wed, Nov 12, 2014 at 10:32 AM, Kay Diederichs 
kay.diederi...@uni-konstanz.de wrote:

 Hi Tim,

 this is incorrect.

 XSCALE determines the relative scale and B in a first step (this is what
 you describe).

 It then, in a second step, re-determines all scale factors (exactly as
 CORRECT does for the individual data sets), at the exact same supporting
 points that CORRECT used.  (This avoids over-fitting which would result
 from a scaling model with different basis functions; a worry that I have
 when people use SCALA/AIMLESS after CORRECT without taking precautions.)
 The resulting scale factors are written to files MODPIX*.cbf, DECAY*.cbf,
 ABSORP*.cbf for inspection.

 Thirdly, it produces statistics and writes output files.

 best,

 Kay


 On Wed, 12 Nov 2014 11:22:51 +0100, Tim Gruene t...@shelx.uni-ac.gwdg.de
 wrote:

 Dear Wolfram Tempel,
 
 there might be some confusion about terms.
 
 It is correct that xscale scales several data sets together. However,
 in crystallography, 'merging' might be the better term for this process.
 
 Crystallographic 'Scaling' is far more complicated than 'merging'. It
 applies correction factors which try to make up for experimental
 errors in your data set. These corrections include the sigma-values,
 which is particularly important for experimental phasing. In that
 respect it can actually hamper the data quality if you
 (crystallographically) scale your data twice, although the effect is
 rather subtle.
 
 CORRECT carries out these corrections, hence CORRECT scales your data
 set, while XSCALE does not repeat this step - it only merges your
 data in the sense that it puts your data on a common scale. This is
 the application of a not too difficult mathematical formula (which is
 listed in the xds wiki, but I don't remember the URL).
 
 Regards,
 Tim
 
 On 11/11/2014 10:07 PM, Sudhir Babu Pothineni wrote:
 
  http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Xscale
 
  XSCALE
  (http://www.mpimf-heidelberg.mpg.de/%7Ekabsch/xds/html_doc/xscale_parameters.html)
  is the scaling program of the XDS suite. It scales reflection files
  (typically called XDS_ASCII.HKL) produced by XDS. Since the CORRECT
  step of XDS already scales an individual dataset, XSCALE is only
  /needed/ if several datasets should be scaled relative to one another.
  However, it does not deteriorate a dataset if it is scaled again
  in XSCALE, since the supporting points of the scalefactors

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-11 Thread Boaz Shaanan

Hi,

I actually choose the option 'constant' further down in the aimless gui but I guess the effect is similar to 'onlymerge'.

Boaz

Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220
Skype: boaz.shaanan
Fax: 972-8-647-2992 or 972-8-646-1710

From: CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of wtempel [wtem...@gmail.com]
Sent: Tuesday, November 11, 2014 9:50 PM
To: CCP4BB@JISCMAIL.AC.UK
Subject: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

Hello all,
in a discussion on this board, Kay Diederichs questioned the effect of scaling data in AIMLESS after prior scaling in XDS (CORRECT). I understand that the available alternatives in this work flow are to specify the AIMLESS ‘onlymerge’ command, or not.
Are there any arguments for the preference of one alternative over the other?
Thank you for your insights,
Wolfram Tempel

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-11 Thread wtempel

Thank you Boaz.
So if CORRECT can do a fully corrected scaling, are there no corrections
that XSCALE might apply to XDS_ASCII.HKL data that are beyond CORRECT's
capabilities?
Wolfram


On Tue, Nov 11, 2014 at 3:05 PM, Boaz Shaanan bshaa...@bgu.ac.il wrote:

 Hi,

 I actually choose the option 'constant' further down in the aimless gui
 but I guess the effect is similar to 'onlymerge'.

 Boaz

 Boaz Shaanan, Ph.D.
 Dept. of Life Sciences
 Ben-Gurion University of the Negev
 Beer-Sheva 84105
 Israel
 E-mail: bshaa...@bgu.ac.il
 Phone: 972-8-647-2220
 Skype: boaz.shaanan
 Fax: 972-8-647-2992 or 972-8-646-1710

 --
 *From:* CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of wtempel
 [wtem...@gmail.com]
 *Sent:* Tuesday, November 11, 2014 9:50 PM
 *To:* CCP4BB@JISCMAIL.AC.UK
 *Subject:* [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to
 POINTLESS/AIMLESS

Hello all,
 in a discussion
 https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1307L=CCP4BBH=1P=186901
 on this board, Kay Diederichs questioned the effect of scaling data in
 AIMLESS after prior scaling in XDS (CORRECT). I understand that the
 available alternatives in this work flow are to specify the AIMLESS
 ‘onlymerge’ command, or not.
 Are there any arguments for the preference of one alternative over the
 other?
 Thank you for your insights,
 Wolfram Tempel

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-11 Thread Sudhir Babu Pothineni



http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Xscale

XSCALE
(http://www.mpimf-heidelberg.mpg.de/%7Ekabsch/xds/html_doc/xscale_parameters.html)
is the scaling program of the XDS suite. It scales reflection files
(typically called XDS_ASCII.HKL) produced by XDS. Since the CORRECT step
of XDS already scales an individual dataset, XSCALE is only /needed/ if
several datasets should be scaled relative to one another. However, it does
not deteriorate a dataset if it is scaled again in XSCALE, since the
supporting points of the scale factors are at the same positions in
detector and batch space. The advantage of using XSCALE for a single
dataset is that the user can specify the limits of the resolution shells.


_Scaling with scala/aimless_

http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Scaling_with_SCALA_%28or_better:_aimless%29

-Sudhir


***
Sudhir Babu Pothineni
GM/CA @ APS 436D
Argonne National Laboratory
9700 S Cass Ave
Argonne IL 60439

Ph : 630 252 0672




On 11/11/14 14:42, wtempel wrote:

Thank you Boaz.
So if CORRECT can do a fully corrected scaling, are there no 
corrections that XSCALE might apply to XDS_ASCII.HKL data that are 
beyond CORRECT's capabilities?

Wolfram


On Tue, Nov 11, 2014 at 3:05 PM, Boaz Shaanan bshaa...@bgu.ac.il wrote:

Hi,

I actually choose the option 'constant' further down in the
aimless gui but I guess the effect is similar to 'onlymerge'.

  Boaz

Boaz Shaanan, Ph.D.
Dept. of Life Sciences
Ben-Gurion University of the Negev
Beer-Sheva 84105
Israel

E-mail: bshaa...@bgu.ac.il
Phone: 972-8-647-2220  Skype: boaz.shaanan
Fax:   972-8-647-2992 or 972-8-646-1710

*From:* CCP4 bulletin board [CCP4BB@JISCMAIL.AC.UK] on behalf of wtempel
[wtem...@gmail.com]
*Sent:* Tuesday, November 11, 2014 9:50 PM
*To:* CCP4BB@JISCMAIL.AC.UK
*Subject:* [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input
to POINTLESS/AIMLESS

Hello all,
in a discussion

https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1307L=CCP4BBH=1P=186901
on this board, Kay Diederichs questioned the effect of scaling
data in AIMLESS after prior scaling in XDS (CORRECT). I understand
that the available alternatives in this work flow are to specify
the AIMLESS ‘onlymerge’ command, or not.
Are there any arguments for the preference of one alternative over
the other?
Thank you for your insights,
Wolfram Tempel

Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

2014-11-11 Thread Phil Evans

You can take XDS data into Pointless and Aimless (the CCP4 Data Reduction task)
either from the unscaled INTEGRATE.HKL or the scaled XDS_ASCII.HKL file (or 
files). In the case of a single XDS_ASCII.HKL you don't need to rescale it in 
Aimless, though you can if you want.
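
The two alternatives for a single XDS_ASCII.HKL can be written down as keyword input for Aimless. The sketch below is only an illustration: the helper name is hypothetical, and the only keyword it emits is 'onlymerge', which is the one named in this thread. Check it against the Aimless documentation for your CCP4 version before relying on it.

```python
def aimless_keywords(rescale: bool) -> str:
    """Build keyword input for Aimless when reading one XDS_ASCII.HKL.

    rescale=False -> merge only, keeping the scales CORRECT applied.
    rescale=True  -> no extra keyword, so Aimless runs its own scaling.
    """
    lines = []
    if not rescale:
        # 'onlymerge' is the keyword named in this thread (assumption:
        # spelled the same way in your Aimless version).
        lines.append("onlymerge")
    return "".join(line + "\n" for line in lines)


print(repr(aimless_keywords(rescale=False)))
print(repr(aimless_keywords(rescale=True)))
```

The returned text would be what is fed to Aimless as its keyword input; an empty string simply leaves the program at its defaults.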

Aimless uses a similar but not identical scaling model to XDS, which may be 
better or worse (and how do you judge?).

Phil

On 11 Nov 2014, at 19:50, wtempel wtem...@gmail.com wrote:

 Hello all,
 in a discussion on this board, Kay Diederichs questioned the effect of 
 scaling data in AIMLESS after prior scaling in XDS (CORRECT). I understand 
 that the available alternatives in this work flow are to specify the AIMLESS 
 ‘onlymerge’ command, or not.
 Are there any arguments for the preference of one alternative over the other?
 Thank you for your insights,
 Wolfram Tempel
