Re: [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS

Graeme Winter Thu, 13 Nov 2014 00:59:11 -0800

Dear Kay

I cannot comment on the accuracy or otherwise of your 99%, but every time I
talk about xia2 or write down what the options do, I try to make it clear
that XDS / XSCALE is used for integration & scaling then AIMLESS to merge
the data. I have had an interest for a while in scaling the data with
AIMLESS from INTEGRATE.HKL purely for the purpose of performing the
analysis you described, but this would be a different xia2 option, *which
does not yet exist*.


If you have a way of avoiding misconceptions in users I am sure I will not
be alone in my interest :o) and on a more practical note if you think the
description of how xia2 uses XDS / XSCALE can be improved I would welcome
that. It does always list the appropriate references for users to cite at
the end...

Best wishes Graeme
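
As an illustration of the distinction drawn above (scaling done in XDS /
XSCALE, merging done afterwards), here is a minimal Python sketch of a
merge-only step. It is not the actual AIMLESS or xia2 code: the function
names merge_scaled and r_meas, the (hkl, I, sigma) tuple layout and the
choice of an inverse-variance weighted mean are assumptions made purely for
illustration. It assumes the observations are already scaled and already
reduced to a unique asymmetric-unit index, so no scale factors are refined;
only averaging and one precision indicator (Rmeas, in its usual
multiplicity-corrected form with an unweighted mean) are computed.

```python
import math
from collections import defaultdict


def merge_scaled(observations):
    """Merge already-scaled observations without refining any scale factors.

    observations: iterable of (hkl, intensity, sigma), where hkl is the
    unique (asymmetric-unit) index. Returns {hkl: (mean_I, sigma_mean, n)}.
    """
    groups = defaultdict(list)
    for hkl, intensity, sigma in observations:
        groups[hkl].append((intensity, sigma))

    merged = {}
    for hkl, obs in groups.items():
        # Inverse-variance weights: an illustrative choice, not a claim
        # about what any particular program does internally.
        weights = [1.0 / (sigma * sigma) for _, sigma in obs]
        wsum = sum(weights)
        mean = sum(w * i for w, (i, _) in zip(weights, obs)) / wsum
        merged[hkl] = (mean, math.sqrt(1.0 / wsum), len(obs))
    return merged


def r_meas(observations):
    """Multiplicity-corrected merging R-factor over reflections with n >= 2."""
    groups = defaultdict(list)
    for hkl, intensity, _ in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for values in groups.values():
        n = len(values)
        if n < 2:
            continue  # a single observation carries no agreement information
        mean = sum(values) / n
        numerator += math.sqrt(n / (n - 1)) * sum(abs(v - mean) for v in values)
        denominator += sum(values)
    return numerator / denominator if denominator else float("nan")


if __name__ == "__main__":
    # Toy data, invented for the example.
    toy = [((1, 0, 0), 100.0, 10.0), ((1, 0, 0), 110.0, 10.0),
           ((2, 0, 0), 50.0, 5.0)]
    print(merge_scaled(toy))
    print(r_meas(toy))
```

The point of the sketch is only that a merge-only step has no free
parameters: the statistics it reports describe the scaling that was done
upstream, not any scaling of its own.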



On Thu Nov 13 2014 at 8:35:20 AM Kay Diederichs <
[email protected]> wrote:

> Dear Graeme,
>
> good that you set this straight.
>
> I consider getting the statistics output from AIMLESS a perfectly valid
> reason for taking route e), and as long as this is well-tested (which I'd bet in
> case of xia2) it's ok. There is one issue I can see: 99% (obviously my
> guess could be wrong; just an estimate based on reading the Methods section
> of papers) of xia2 -3d users are not aware that their data then are _not_
> scaled by AIMLESS. They see the AIMLESS tables and think "so it must have
> been AIMLESS that scaled the data". And they publish and PDB-deposit their
> misconception. This is how the misunderstanding spreads, which is then why
> I get asked "can CORRECT scale a data set?" and other misunderstandings
> along these lines ...
>
> best,
>
> Kay
>
> On Thu, 13 Nov 2014 08:15:12 +0000, Graeme Winter <[email protected]>
> wrote:
>
> >Dear Kay
> >
> >Just to comment on (e) since you say you don't know why anyone would want
> >to do this, yet this is exactly what xia2 -3d does :o)
> >
> >I use AIMLESS to merge data already scaled by XDS CORRECT or XSCALE as a
> >way to get a report on the merging statistics which includes all of the
> >AIMLESS analysis, and to generate harvesting files for deposition.
> >
> >Like you, I look forward to studies of (a) - (e) & think of all of these
> >(c) is by far the worst idea, from gut instinct. There are also some
> >subtleties to making (b) work properly...
> >
> >For anyone who has time on their hands & would like to do this study, be
> >sure to consider a range of crystal symmetries as it is possible that some
> >strategies which are "safe" in PG 422 (say) are not in PG 2.
> >
> >Best wishes Graeme
> >
> >
> >
> >On Wed Nov 12 2014 at 10:07:10 PM Kay Diederichs <
> >[email protected]> wrote:
> >
> >> Hi Wolfram,
> >>
> >> it took me a while until I realized that you mean "overfitting" when you
> >> said "o-word".
> >>
> >> You can abuse XDS in a number of ways, and I would call them
> >> "overfitting the data" although that would be using the word in a
> >> somewhat strained way: reducing WFAC1 below 1, decreasing
> >> REFLECTIONS/CORRECTION_FACTOR below 50 come to mind, but in an extended
> >> sense there are other ways: rejecting frames for no other reason than
> >> that they have low I/sigma or high Rmeas, ...
> >>
> >> People always seem to find ways to beautify their precision indicators,
> >> but they are just fooling themselves, because rejecting data just for
> >> cosmetic reasons creates bias. In other words, they trade random error
> >> against systematic error. Guess what is worse. A deeper reason for the
> >> problem is that crystallographers have been fixated on data R-factors
> >> for decades, and have become really spoilt by this. Our science has been
> >> completely misled when it comes to data statistics, and is recovering
> >> only slowly.
> >>
> >> Concerning non-cautious use of SCALA/AIMLESS after CORRECT: actually I
> >> know of no systematic studies in this respect. But I know one thing: it
> >> is better to be critical with respect to recipes, than to follow them
> >> blindly. So I suggest the following project: compare SAD structure
> >> solution with the following routes
> >> a) INTEGRATE -> CORRECT scaling  -> SHELXD
> >> b) INTEGRATE -> AIMLESS scaling -> SHELXD
> >> c) INTEGRATE -> CORRECT+AIMLESS scaling -> SHELXD
> >> d) INTEGRATE -> CORRECT but scaling switched off -> AIMLESS scaling ->
> >> SHELXD
> >> e) INTEGRATE -> CORRECT scaling -> AIMLESS but scaling switched off ->
> >> SHELXD
> >> and report here.
> >> You can add XSCALE into the mix but that won't change the picture, since
> >> it does the exact same calculations for multiple datasets as CORRECT
> >> does for single datasets.
> >> Personally, I don't understand why people would _want_ to do c),d) or e)
> >> because that's just added complexity, and additional sources of error.
> >>
> >> I'm looking forward to the results of such studies!
> >>
> >> Kay
> >>
> >>
> >> On Wed, 12 Nov 2014 12:41:28 -0500, wtempel <[email protected]> wrote:
> >>
> >> >Hello Kay,
> >> >you said the o-word, and you are familiar with the inner workings of
> >> >XDS. Has the data-to-parameter ratio in even complex scaling models
> >> >become so small that a doubling (worst case) of model parameters would
> >> >be a serious concern? Could one detect such overfitting by, say,
> >> >comparing (molecular) model R-factors between refinement against the
> >> >once (CORRECT) scaled or twice (CORRECT+AIMLESS) scaled data?
> >> >Thank you,
> >> >Wolfram
> >> >
> >> >On Wed, Nov 12, 2014 at 10:32 AM, Kay Diederichs <
> >> >[email protected]> wrote:
> >> >
> >> >> Hi Tim,
> >> >>
> >> >> this is incorrect.
> >> >>
> >> >> XSCALE determines the relative scale and B in a first step (this is
> >> >> what you describe).
> >> >>
> >> >> It then, in a second step, re-determines all scale factors (exactly
> >> >> as CORRECT does for the individual data sets), at the exact same
> >> >> supporting points that CORRECT used.  (This avoids over-fitting which
> >> >> would result from a scaling model with different basis functions; a
> >> >> worry that I have when people use SCALA/AIMLESS after CORRECT without
> >> >> taking precautions.) The resulting scale factors are written to files
> >> >> MODPIX*.cbf, DECAY*.cbf, ABSORP*.cbf for inspection.
> >> >>
> >> >> Thirdly, it produces statistics and writes output files.
> >> >>
> >> >> best,
> >> >>
> >> >> Kay
> >> >>
> >> >>
> >> >> On Wed, 12 Nov 2014 11:22:51 +0100, Tim Gruene <[email protected]>
> >> >> wrote:
> >> >>
> >> >> >-----BEGIN PGP SIGNED MESSAGE-----
> >> >> >Hash: SHA1
> >> >> >
> >> >> >Dear Wolfram Tempel,
> >> >> >
> >> >> >there might be some confusion about terms.
> >> >> >
> >> >> >It is correct that xscale scales several data sets together. However,
> >> >> >in crystallography, 'merging' might be the better term for this
> >> >> >process.
> >> >> >
> >> >> >Crystallographic 'Scaling' is far more complicated than 'merging'. It
> >> >> >applies correction factors which try to make up for experimental
> >> >> >errors in your data set. These corrections include the sigma-values,
> >> >> >which is particularly important for experimental phasing. In that
> >> >> >respect it can actually hamper the data quality if you
> >> >> >(crystallographically) scale your data twice, although the effect is
> >> >> >rather subtle.
> >> >> >
> >> >> >CORRECT carries out these corrections, hence CORRECT scales your data
> >> >> >set, while XSCALE does not repeat this step - it "only" merges your
> >> >> >data in the sense that it puts your data on a common scale. This is
> >> >> >the application of a not too difficult mathematical formula (which is
> >> >> >listed in the xds wiki, but I don't remember the URL).
> >> >> >
> >> >> >Regards,
> >> >> >Tim
> >> >> >
> >> >> >On 11/11/2014 10:07 PM, Sudhir Babu Pothineni wrote:
> >> >> >>
> >> >> >> http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Xscale
> >> >> >>
> >> >> >> XSCALE
> >> >> >> <http://www.mpimf-heidelberg.mpg.de/%7Ekabsch/xds/html_doc/xscale_parameters.html>
> >> >> >> is the scaling program of the XDS suite. It scales reflection files
> >> >> >> (typically called XDS_ASCII.HKL) produced by XDS. Since the CORRECT
> >> >> >> step of XDS already scales an individual dataset, XSCALE is only
> >> >> >> /needed/ if several datasets should be scaled relative to one another.
> >> >> >> However, it does not deteriorate a dataset if it is "scaled again"
> >> >> >> in XSCALE, since the supporting points of the scalefactors are at
> >> >> >> the same positions in detector and batch space. The advantage of
> >> >> >> using XSCALE for a single dataset is that the user can specify the
> >> >> >> limits of the resolution shells.
> >> >> >>
> >> >> >> _Scaling with scala/aimless_
> >> >> >>
> >> >> >>
> >> >> >> http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Scaling_with_SCALA_%28or_better:_aimless%29
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> -Sudhir
> >> >> >>
> >> >> >>
> >> >> >> *************************** Sudhir Babu Pothineni GM/CA @ APS 436D
> >> >> >> Argonne National Laboratory 9700 S Cass Ave Argonne IL 60439
> >> >> >>
> >> >> >> Ph : 630 252 0672
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> On 11/11/14 14:42, wtempel wrote:
> >> >> >>> Thank you Boaz. So if CORRECT can do a fully corrected scaling,
> >> >> >>> are there no corrections that XSCALE might apply to XDS_ASCII.HKL
> >> >> >>> data that are beyond CORRECT's capabilities? Wolfram
> >> >> >>>
> >> >> >>>
> >> >> >>> On Tue, Nov 11, 2014 at 3:05 PM, Boaz Shaanan
> >> >> >>> <[email protected] <mailto:[email protected]>> wrote:
> >> >> >>>
> >> >> >>> Hi,
> >> >> >>>
> >> >> >>> I actually choose the option 'constant' further down in the
> >> >> >>> aimless gui but I guess the effect is similar to 'onlymerge'.
> >> >> >>>
> >> >> >>> Boaz
> >> >> >>>
> >> >> >>> Boaz Shaanan, Ph.D. Dept. of Life Sciences Ben-Gurion University
> >> >> >>> of the Negev Beer-Sheva 84105 Israel
> >> >> >>>
> >> >> >>> E-mail: [email protected] <mailto:[email protected]> Phone:
> >> >> >>> 972-8-647-2220  Skype: boaz.shaanan Fax:   972-8-647-2992 or
> >> >> >>> 972-8-646-1710
> >> >> >>>
> >> >> >>>
> >> >> >>> ------------------------------------------------------------------------
> >> >> >>>
> >> >> >>> *From:* CCP4 bulletin board [[email protected]] on behalf of wtempel
> >> >> >>> [[email protected]]
> >> >> >>> *Sent:* Tuesday, November 11, 2014 9:50 PM
> >> >> >>> *To:* [email protected]
> >> >> >>> *Subject:* [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to
> >> >> >>> POINTLESS/AIMLESS
> >> >> >>>
> >> >> >>> Hello all, in a discussion
> >> >> >>> <https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1307&L=CCP4BB&H=1&P=186901>
> >> >> >>> on this board, Kay Diederichs questioned the effect of scaling
> >> >> >>> data in AIMLESS after prior scaling in XDS (CORRECT). I
> >> >> >>> understand that the available alternatives in this work flow are
> >> >> >>> to specify the AIMLESS ‘onlymerge’ command, or not. Are there any
> >> >> >>> arguments for the preference of one alternative over the other?
> >> >> >>> Thank you for your insights, Wolfram Tempel
> >> >> >>>
> >> >> >>> 
> >> >> >>>
> >> >> >>>
> >> >> >>
> >> >> >>
> >> >> >
> >> >> >- --
> >> >> >- --
> >> >> >Dr Tim Gruene
> >> >> >Institut fuer anorganische Chemie
> >> >> >Tammannstr. 4
> >> >> >D-37077 Goettingen
> >> >> >
> >> >> >GPG Key ID = A46BEE1A
> >> >> >
> >> >> >-----BEGIN PGP SIGNATURE-----
> >> >> >Version: GnuPG v1.4.12 (GNU/Linux)
> >> >> >
> >> >> >iD8DBQFUYzT7UxlJ7aRr7hoRAuO2AJ9P3kJAjP+8wWjXRvkZwgDs9UOo3ACfb1En
> >> >> >67VgyyqCTX6j5vOz3xMVwqE=
> >> >> >=ooTC
> >> >> >-----END PGP SIGNATURE-----
> >> >>
> >> >
> >>
> >
>
>
>
