Dear Kay,

I cannot comment on the accuracy or otherwise of your 99%, but every time I talk about xia2 or write down what the options do, I try to make it clear that XDS / XSCALE is used for integration & scaling, and AIMLESS is then used to merge the data. I have had an interest for a while in scaling the data with AIMLESS from INTEGRATE.HKL purely for the purpose of performing the analysis you described, but this would be a different option in xia2 *which does not yet exist*.

If you have a way of avoiding misconceptions in users, I am sure I will not be alone in my interest :o) and, on a more practical note, if you think the description of how xia2 uses XDS / XSCALE can be improved, I would welcome that. It does always list the appropriate references for users to cite at the end...

Best wishes Graeme

On Thu Nov 13 2014 at 8:35:20 AM Kay Diederichs <[email protected]> wrote:

> Dear Graeme,
>
> good that you set this straight.
>
> I consider getting the statistics output from AIMLESS a perfectly valid reason for going e), and as long as this is well-tested (which I'd bet in case of xia2) it's ok. There is one issue I can see: 99% (obviously my guess could be wrong; just an estimate based on reading the Methods section of papers) of xia2 -3d users are not aware that their data then are _not_ scaled by AIMLESS. They see the AIMLESS tables and think "so it must have been AIMLESS that scaled the data". And they publish and PDB-deposit their misconception. This is how the misunderstanding spreads, which is then why I get asked "can CORRECT scale a data set?" and other questions along these lines ...
>
> best,
>
> Kay
>
> On Thu, 13 Nov 2014 08:15:12 +0000, Graeme Winter <[email protected]> wrote:
>
> >Dear Kay
> >
> >Just to comment on (e) since you say you don't know why anyone would want to do this, yet this is exactly what xia2 -3d does :o)
> >
> >I use AIMLESS to merge data already scaled by XDS CORRECT or XSCALE as a way to get a report on the merging statistics which includes all of the AIMLESS analysis, and to generate harvesting files for deposition.
> >
> >Like you, I look forward to studies of (a) - (e) & think that of all of these (c) is by far the worst idea, from gut instinct. There are also some subtleties to making (b) work properly...
> >
> >For anyone who has time on their hands & would like to do this study, be sure to consider a range of crystal symmetries, as it is possible that some strategies which are "safe" in PG 422 (say) are not in PG 2.
> >
> >Best wishes Graeme
> >
> >On Wed Nov 12 2014 at 10:07:10 PM Kay Diederichs <[email protected]> wrote:
> >
> >> Hi Wolfram,
> >>
> >> it took me a while until I realized that you mean "overfitting" when you said "o-word".
> >>
> >> You can abuse XDS in a number of ways, and I would call them "overfitting the data" although that would be using the word in a somewhat strained way: reducing WFAC1 below 1, decreasing REFLECTIONS/CORRECTION_FACTOR below 50 come to mind, but in an extended sense there are other ways: rejecting frames for no other reason than that they have low I/sigma or high Rmeas, ...
> >>
> >> People always seem to find ways to beautify their precision indicators, but they are just fooling themselves, because rejecting data just for cosmetic reasons creates bias. In other words, they trade random error against systematic error. Guess what is worse. A deeper reason for the problem is that crystallographers have been fixated on data R-factors for decades, and have become really spoilt by this. Our science has been completely misled when it comes to data statistics, and is recovering only slowly.
> >>
> >> Concerning non-cautious use of SCALA/AIMLESS after CORRECT: actually I know of no systematic studies in this respect. But I know one thing: it is better to be critical with respect to recipes, than to follow them blindly.
> >> So I suggest the following project: compare SAD structure solution with the following routes
> >> a) INTEGRATE -> CORRECT scaling -> SHELXD
> >> b) INTEGRATE -> AIMLESS scaling -> SHELXD
> >> c) INTEGRATE -> CORRECT+AIMLESS scaling -> SHELXD
> >> d) INTEGRATE -> CORRECT but scaling switched off -> AIMLESS scaling -> SHELXD
> >> e) INTEGRATE -> CORRECT scaling -> AIMLESS but scaling switched off -> SHELXD
> >> and report here.
> >> You can add XSCALE into the mix but that won't change the picture, since it does the exact same calculations for multiple datasets as CORRECT does for single datasets.
> >> Personally, I don't understand why people would _want_ to do c), d) or e) because that's just added complexity, and additional sources of error.
> >>
> >> I'm looking forward to the results of such studies!
> >>
> >> Kay
> >>
> >> On Wed, 12 Nov 2014 12:41:28 -0500, wtempel <[email protected]> wrote:
> >>
> >> >Hello Kay,
> >> >you said the o-word, and you are familiar with the inner workings of XDS. Has the data-to-parameter ratio in even complex scaling models become so small that a doubling (worst case) of model parameters would be a serious concern? Could one detect such overfitting by, say, comparing (molecular) model R-factors between refinement against the once (CORRECT) scaled or twice (CORRECT+AIMLESS) scaled data?
> >> >Thank you,
> >> >Wolfram
> >> >
> >> >On Wed, Nov 12, 2014 at 10:32 AM, Kay Diederichs <[email protected]> wrote:
> >> >
> >> >> Hi Tim,
> >> >>
> >> >> this is incorrect.
> >> >>
> >> >> XSCALE determines the relative scale and B in a first step (this is what you describe).
> >> >>
> >> >> It then, in a second step, re-determines all scale factors (exactly as CORRECT does for the individual data sets), at the exact same supporting points that CORRECT used.
> >> >> (This avoids over-fitting which would result from a scaling model with different basis functions; a worry that I have when people use SCALA/AIMLESS after CORRECT without taking precautions.) The resulting scale factors are written to files MODPIX*.cbf, DECAY*.cbf, ABSORP*.cbf for inspection.
> >> >>
> >> >> Thirdly, it produces statistics and writes output files.
> >> >>
> >> >> best,
> >> >>
> >> >> Kay
> >> >>
> >> >> On Wed, 12 Nov 2014 11:22:51 +0100, Tim Gruene <[email protected]> wrote:
> >> >>
> >> >> >Dear Wolfram Tempel,
> >> >> >
> >> >> >there might be some confusion about terms.
> >> >> >
> >> >> >It is correct that xscale scales several data sets together. However, in crystallography, 'merging' might be the better term for this process.
> >> >> >
> >> >> >Crystallographic 'Scaling' is far more complicated than 'merging'. It applies correction factors which try to make up for experimental errors in your data set. These corrections include the sigma-values, which is particularly important for experimental phasing. In that respect it can actually hamper the data quality if you (crystallographically) scale your data twice, although the effect is rather subtle.
> >> >> >
> >> >> >CORRECT carries out these corrections, hence CORRECT scales your data set, while XSCALE does not repeat this step - it "only" merges your data in the sense that it puts your data on a common scale. This is the application of a not too difficult mathematical formula (which is listed in the xds wiki, but I don't remember the URL).
> >> >> >
> >> >> >Regards,
> >> >> >Tim
> >> >> >
> >> >> >On 11/11/2014 10:07 PM, Sudhir Babu Pothineni wrote:
> >> >> >>
> >> >> >> http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Xscale
> >> >> >>
> >> >> >> XSCALE <http://www.mpimf-heidelberg.mpg.de/%7Ekabsch/xds/html_doc/xscale_parameters.html> is the scaling program of the XDS suite. It scales reflection files (typically called XDS_ASCII.HKL) produced by XDS. Since the CORRECT step of XDS already scales an individual dataset, XSCALE is only /needed/ if several datasets should be scaled relative to one another. However, it does not deteriorate a dataset if it is "scaled again" in XSCALE, since the supporting points of the scale factors are at the same positions in detector and batch space. The advantage of using XSCALE for a single dataset is that the user can specify the limits of the resolution shells.
> >> >> >>
> >> >> >> _Scaling with scala/aimless_
> >> >> >>
> >> >> >> http://strucbio.biologie.uni-konstanz.de/xdswiki/index.php/Scaling_with_SCALA_%28or_better:_aimless%29
> >> >> >>
> >> >> >> -Sudhir
> >> >> >>
> >> >> >> ***************************
> >> >> >> Sudhir Babu Pothineni
> >> >> >> GM/CA @ APS 436D
> >> >> >> Argonne National Laboratory
> >> >> >> 9700 S Cass Ave
> >> >> >> Argonne IL 60439
> >> >> >>
> >> >> >> Ph : 630 252 0672
> >> >> >>
> >> >> >> On 11/11/14 14:42, wtempel wrote:
> >> >> >>> Thank you Boaz. So if CORRECT can do a fully corrected scaling, are there no corrections that XSCALE might apply to XDS_ASCII.HKL data that are beyond CORRECT's capabilities?
> >> >> >>> Wolfram
> >> >> >>>
> >> >> >>> On Tue, Nov 11, 2014 at 3:05 PM, Boaz Shaanan <[email protected]> wrote:
> >> >> >>>
> >> >> >>> Hi,
> >> >> >>>
> >> >> >>> I actually choose the option 'constant' further down in the aimless gui but I guess the effect is similar to 'onlymerge'.
> >> >> >>>
> >> >> >>> Boaz
> >> >> >>>
> >> >> >>> Boaz Shaanan, Ph.D.
> >> >> >>> Dept. of Life Sciences
> >> >> >>> Ben-Gurion University of the Negev
> >> >> >>> Beer-Sheva 84105
> >> >> >>> Israel
> >> >> >>>
> >> >> >>> E-mail: [email protected]
> >> >> >>> Phone: 972-8-647-2220
> >> >> >>> Skype: boaz.shaanan
> >> >> >>> Fax: 972-8-647-2992 or 972-8-646-1710
> >> >> >>>
> >> >> >>> ------------------------------------------------------------------------
> >> >> >>>
> >> >> >>> *From:* CCP4 bulletin board [[email protected]] on behalf of wtempel [[email protected]]
> >> >> >>> *Sent:* Tuesday, November 11, 2014 9:50 PM
> >> >> >>> *To:* [email protected]
> >> >> >>> *Subject:* [ccp4bb] To scale or not to scale: XDS_ASCII.HKL input to POINTLESS/AIMLESS
> >> >> >>>
> >> >> >>> Hello all,
> >> >> >>> in a discussion <https://www.jiscmail.ac.uk/cgi-bin/webadmin?A2=ind1307&L=CCP4BB&H=1&P=186901> on this board, Kay Diederichs questioned the effect of scaling data in AIMLESS after prior scaling in XDS (CORRECT). I understand that the available alternatives in this workflow are to specify the AIMLESS ‘onlymerge’ command, or not. Are there any arguments for the preference of one alternative over the other?
> >> >> >>> Thank you for your insights,
> >> >> >>> Wolfram Tempel
> >> >> >
> >> >> >--
> >> >> >Dr Tim Gruene
> >> >> >Institut fuer anorganische Chemie
> >> >> >Tammannstr. 4
> >> >> >D-37077 Goettingen
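
Kay's remark in the thread above, that rejecting data for purely cosmetic reasons trades random error against systematic error, can be illustrated with a toy simulation. The sketch below is only an illustration under stated assumptions: it is not part of XDS, XSCALE, AIMLESS or xia2, and the function name `simulate`, the Gaussian noise model, the constant sigma per frame, and all numerical values are hypothetical choices made for this example. Every "frame" measures the same true intensity; the frames with the lowest I/sigma are then discarded, and the mean and scatter of the kept frames are compared with those of all frames.

```python
import random
import statistics


def simulate(n_frames=10000, true_intensity=100.0, sigma=20.0,
             reject_fraction=0.2, seed=0):
    """Toy model (hypothetical): each frame measures the same true intensity
    with Gaussian noise of constant standard deviation sigma.

    Returns the (mean, scatter) of all frames and of the frames kept after
    discarding the reject_fraction of frames with the lowest I/sigma.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    observations = [rng.gauss(true_intensity, sigma) for _ in range(n_frames)]

    # sigma is identical for every frame in this toy model, so ranking
    # frames by I/sigma is the same as ranking them by I.
    ranked = sorted(observations)
    n_reject = int(reject_fraction * n_frames)
    kept = ranked[n_reject:]  # drop the "ugliest" frames

    return {
        "all": (statistics.mean(observations), statistics.stdev(observations)),
        "kept": (statistics.mean(kept), statistics.stdev(kept)),
    }


if __name__ == "__main__":
    result = simulate()
    for label, (mean, scatter) in result.items():
        print(f"{label:>4}: mean I = {mean:7.2f}, scatter = {scatter:6.2f}")
```

In this toy model the kept frames show a smaller scatter, so the precision indicators look better, but their mean is shifted away from the true intensity: the random error has been exchanged for a systematic one. Real data have frame-dependent sigmas and real scale factors, so the numbers here say nothing quantitative about any particular dataset.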
