Dear Graeme, good that you set this straight.
I consider getting the statistics output from AIMLESS a perfectly valid reason for going with e), and as long as this is well-tested (which I'd bet it is, in the case of xia2) it's ok.

There is one issue I can see: 99% (obviously my guess could be wrong; just an estimate based on reading the Methods section of papers) of xia2 -3d users are not aware that their data then are _not_ scaled by AIMLESS. They see the AIMLESS tables and think "so it must have been AIMLESS that scaled the data". And they publish and PDB-deposit their misconception. This is how the misunderstanding spreads, which is then why I get asked "can CORRECT scale a data set?" and other questions along these lines ...

best,

Kay

On Thu, 13 Nov 2014 08:15:12 +0000, Graeme Winter <[email protected]> wrote:

>Dear Kay
>
>Just to comment on (e) since you say you don't know why anyone would want
>to do this, yet this is exactly what xia2 -3d does :o)
>
>I use AIMLESS to merge data already scaled by XDS CORRECT or XSCALE as a
>way to get a report on the merging statistics which includes all of the
>AIMLESS analysis, and to generate harvesting files for deposition.
>
>Like you, I look forward to studies of (a) - (e) & think of all of these
>(c) is by far the worst idea, from gut instinct. There are also some
>subtleties to making (b) work properly...
>
>For anyone who has time on their hands & would like to do this study, be
>sure to consider a range of crystal symmetries as it is possible that some
>strategies which are "safe" in PG 422 (say) are not in PG 2.
>
>Best wishes Graeme
>
>On Wed Nov 12 2014 at 10:07:10 PM Kay Diederichs <[email protected]> wrote:
>
>> Hi Wolfram,
>>
>> it took me a while until I realized that you mean "overfitting" when you
>> said "o-word".
>>
>> You can abuse XDS in a number of ways, and I would call them "overfitting
>> the data" although that would be using the word in a somewhat strained way:
>> reducing WFAC1 below 1, decreasing REFLECTIONS/CORRECTION_FACTOR below 50
>> come to mind, but in an extended sense there are other ways: rejecting
>> frames for no other reason than that they have low I/sigma or high Rmeas,
>> ...
>>
>> People always seem to find ways to beautify their precision indicators,
>> but they are just fooling themselves, because rejecting data just for
>> cosmetic reasons creates bias. In other words, they trade random error
>> against systematic error. Guess what is worse. A deeper reason of the
>> problem is that crystallographers have been fixated on data R-factors for
>> decades, and have become really spoilt by this. Our science has been
>> completely misled when it comes to data statistics, and is recovering
>> only slowly.
>>
>> Concerning non-cautious use of SCALA/AIMLESS after CORRECT: actually I
>> know of no systematic studies in this respect. But I know one thing: it is
>> better to be critical with respect to recipes, than to follow them blindly.
>> So I suggest the following project: compare SAD structure solution with the
>> following routes
>> a) INTEGRATE -> CORRECT scaling -> SHELXD
>> b) INTEGRATE -> AIMLESS scaling -> SHELXD
>> c) INTEGRATE -> CORRECT+AIMLESS scaling -> SHELXD
>> d) INTEGRATE -> CORRECT but scaling switched off -> AIMLESS scaling -> SHELXD
>> e) INTEGRATE -> CORRECT scaling -> AIMLESS but scaling switched off -> SHELXD
>> and report here.
>> You can add XSCALE into the mix but that won't change the picture, since
>> it does the exact same calculations for multiple datasets as CORRECT does
>> for single datasets.
>> Personally, I don't understand why people would _want_ to do c), d) or e)
>> because that's just added complexity, and additional sources of error.
>>
>> I'm looking forward to the results of such studies!
>>
>> Kay
>>
>> On Wed, 12 Nov 2014 12:41:28 -0500, wtempel <[email protected]> wrote:
>>
>> >Hello Kay,
>> >you said the o-word, and you are familiar with the inner workings of XDS.
>> >Has the data-to-parameter ratio in even complex scaling models become so
>> >small that a doubling (worst case) of model parameters would be a serious
>> >concern? Could one detect such overfitting by, say, comparing (molecular)
>> >model R-factors between refinement against the once (CORRECT) scaled or
>> >twice (CORRECT+AIMLESS) scaled data?
>> >Thank you,
>> >Wolfram
>> >
>> >On Wed, Nov 12, 2014 at 10:32 AM, Kay Diederichs <[email protected]> wrote:
>> >
>> >> Hi Tim,
>> >>
>> >> this is incorrect.
>> >>
>> >> XSCALE determines the relative scale and B in a first step (this is what
>> >> you describe).
>> >>
>> >> It then, in a second step, re-determines all scale factors (exactly as
>> >> CORRECT does for the individual data sets), at the exact same supporting
>> >> points that CORRECT used. (This avoids over-fitting which would result
>> >> from a scaling model with different basis functions; a worry that I have
>> >> when people use SCALA/AIMLESS after CORRECT without taking precautions.)
>> >> The resulting scale factors are written to files MODPIX*.cbf, DECAY*.cbf,
>> >> ABSORP*.cbf for inspection.
>> >>
>> >> Thirdly, it produces statistics and writes output files.
>> >>
>> >> best,
>> >>
>> >> Kay
>> >>
>> >> On Wed, 12 Nov 2014 11:22:51 +0100, Tim Gruene <[email protected]> wrote:
>> >>
>> >> >-----BEGIN PGP SIGNED MESSAGE-----
>> >> >Hash: SHA1
>> >> >
>> >> >Dear Wolfram Tempel,
>> >> >
>> >> >there might be some confusion about terms.
>> >> >
>> >> >It is correct that xscale scales several data sets together. However,
>> >> >in crystallography, 'merging' might be the better term for this process.
>> >> >
>> >> >Crystallographic 'Scaling' is far more complicated than 'merging'. It
>> >> >applies correction factors which try to make up for experimental
>> >> >errors in your data set. These corrections include the sigma-values,
>> >> >which is particularly important for experimental phasing. In that
>> >> >respect it can actually hamper the data quality if you
>> >> >(crystallographically) scale your data twice, although the effect is
>> >> >rather subtle.
>> >> >
>> >> >CORRECT carries out these corrections, hence CORRECT scales your data
>> >> >set, while XSCALE does not repeat this step - it "only" merges your
>> >> >data in the sense that it puts your data on a common scale. This is
>> >> >the application of a not too difficult mathematical formula (which is
>> >> >listed in the xds wiki, but I don't remember the URL).
>> >> >
>> >> >Regards,
>> >> >Tim
>> >> >
>> >> >On 11/11/2014 10:07 PM, Sudhir Babu Pothineni wrote:
>> >> >>
>> >> >> http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Xscale
>> >> >>
>> >> >> XSCALE
>> >> >> <http://www.mpimf-heidelberg.mpg.de/%7Ekabsch/xds/html_doc/xscale_parameters.html>
>> >> >> is the scaling program of the XDS suite. It scales reflection files
>> >> >> (typically called XDS_ASCII.HKL) produced by XDS. Since the CORRECT
>> >> >> step of XDS already scales an individual dataset, XSCALE is only
>> >> >> /needed/ if several datasets should be scaled relative to another.
>> >> >> However, it does not deteriorate a dataset if it is "scaled again"
>> >> >> in XSCALE, since the supporting points of the scalefactors are at
>> >> >> the same positions in detector and batch space. The advantage of
>> >> >> using XSCALE for a single dataset is that the user can specify the
>> >> >> limits of the resolution shells.
>> >> >>
>> >> >> _Scaling with scala/aimless_
>> >> >>
>> >> >> http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Scaling_with_SCALA_%28or_better:_aimless%29
>> >> >>
>> >> >> -Sudhir
>> >> >>
>> >> >> ***************************
>> >> >> Sudhir Babu Pothineni
>> >> >> GM/CA @ APS 436D
>> >> >> Argonne National Laboratory
>> >> >> 9700 S Cass Ave
>> >> >> Argonne IL 60439
>> >> >>
>> >> >> Ph : 630 252 0672
>> >> >>
>> >> >> On 11/11/14 14:42, wtempel wrote:
>> >> >>> Thank you Boaz. So if CORRECT can do a fully corrected scaling,
>> >> >>> are there no corrections that XSCALE might apply to XDS_ASCII.HKL
>> >> >>> data that are beyond CORRECT's capabilities? Wolfram
>> >> >>>
>> >> >>> On Tue, Nov 11, 2014 at 3:05 PM, Boaz Shaanan
>> >> >>> <[email protected]> wrote:
>> >> >>>
>> >> >>> Hi,
>> >> >>>
>> >> >>> I actually choose the option 'constant' further down in the
>> >> >>> aimless gui but I guess the effect is similar to 'onlymerge'.
>> >> >>>
>> >> >>> Boaz
>> >> >>>
>> >> >>> Boaz Shaanan, Ph.D.
>> >> >>> Dept. of Life Sciences
>> >> >>> Ben-Gurion University of the Negev
>> >> >>> Beer-Sheva 84105
>> >> >>> Israel
>> >> >>>
>> >> >>> E-mail: [email protected]
>> >> >>> Phone: 972-8-647-2220
>> >> >>> Skype: boaz.shaanan
>> >> >>> Fax: 972-8-647-2992 or 972-8-646-1710
>> >> >>>
>> >> >>> ------------------------------------------------------------------------
>> >> >>>
>> >> >>> *From:* CCP4 bulletin board [[email protected]] on behalf of wtempel [[email protected]]
>> >> >>> *Sent:* Tuesday, November 11, 2014 9:50 PM
>> >> >>> *To:* [email protected]
>> >> >>> *Subject:* [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS
>> >> >>>
>> >> >>> Hello all, in a discussion
>> >> >>> <https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1307&L=CCP4BB&H=1&P=186901>
>> >> >>> on this board, Kay Diederichs questioned the effect of scaling
>> >> >>> data in AIMLESS after prior scaling in XDS (CORRECT). I
>> >> >>> understand that the available alternatives in this work flow are
>> >> >>> to specify the AIMLESS ‘onlymerge’ command, or not. Are there any
>> >> >>> arguments for the preference of one alternative over the other?
>> >> >>> Thank you for your insights, Wolfram Tempel
>> >> >>>
>> >> >
>> >> >- --
>> >> >Dr Tim Gruene
>> >> >Institut fuer anorganische Chemie
>> >> >Tammannstr. 4
>> >> >D-37077 Goettingen
>> >> >
>> >> >GPG Key ID = A46BEE1A
>> >> >
>> >> >-----BEGIN PGP SIGNATURE-----
>> >> >Version: GnuPG v1.4.12 (GNU/Linux)
>> >> >
>> >> >iD8DBQFUYzT7UxlJ7aRr7hoRAuO2AJ9P3kJAjP+8wWjXRvkZwgDs9UOo3ACfb1En
>> >> >67VgyyqCTX6j5vOz3xMVwqE=
>> >> >=ooTC
>> >> >-----END PGP SIGNATURE-----
