>>>>> Simon Urbanek >>>>> on Wed, 18 Dec 2024 13:19:04 +1300 writes:
> It seems benign, but has implications since checking time
> is actually not a cheap operation: adding just a time
> check alone incurs a penalty of ca. 700% compared with the
> time it takes to call R_CheckUserInterrupt().

Whoa!

> Generally, it makes no sense to check interrupts at every
> iteration - you'll find code like
>     if (++i % 10000 == 0) R_CheckUserInterrupt();
> in loops to make sure it's not called unnecessarily.

Thank you, Simon.
Tomas Kalibera proposed an even faster version of such if's to do the
same, e.g., in src/main/scan.c, IIRC.

I've patched (not yet committed) my version of wilcox.c and compared
in R-devel (inside ESS; i.e., not RStudio-crippled) both without and
with the patch:

## code on 1 line for easier cut'n'paste:
set.seed(101); twRdev <- replicate(20, {cat("."); W <- rwilcox(4e4,40,60); system.time(qwilcox(pwilcox(W,40,60), 40, 60))[1]})

summary(twRdev)
## _un_patched R Under development (unstable) (2024-12-17 r87446) -- "Unsuffered Consequences"
##   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## 0.9060  0.9185  0.9255  0.9524  0.9768  1.0910

## *PATCHED* R Under development (unstable) (2024-12-17 r87446) -- "Unsuffered Consequences"
summary(twRdev)
##   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## 0.5000  0.5058  0.5075  0.5230  0.5210  0.6380

I plan to commit a version of this later / tomorrow.

Martin

> Cheers,
> Simon

>> On Dec 18, 2024, at 4:04 AM, Ben Bolker <bbol...@gmail.com> wrote:
>>
>> This seems like a great idea. Would it help to escalate this to a
>> post on R-bugzilla, so it is less likely to fall through the cracks?
>>
>> On 12/17/24 09:51, Jeroen Ooms wrote:
>>> A more generic solution would be for R to throttle calls to
>>> R_CheckUserInterrupt(), because it makes no sense to check 1000 times
>>> per second if a user has interrupted, but it is difficult for the
>>> caller to know when R_CheckUserInterrupt() has been last called, or do
>>> it regularly without over-doing it.
>>> Here is a simple patch: https://github.com/r-devel/r-svn/pull/125
>>> See also: https://stat.ethz.ch/pipermail/r-devel/2023-May/082597.html
>>>
>>> On Tue, Dec 17, 2024 at 10:47 AM Martin Becker
>>> <martin.bec...@mx.uni-saarland.de> wrote:
>>>>
>>>> tl;dr: R_CheckUserInterrupt() can be a performance bottleneck
>>>> within GUIs. This also affects functions in the 'stats'
>>>> package, which could be improved by changing the position
>>>> of calls to R_CheckUserInterrupt().
>>>>
>>>>
>>>> Dear all,
>>>>
>>>> Recently I was puzzled because some code in a package under development,
>>>> which consisted almost entirely of a .Call() to a function written in C,
>>>> was running much slower within RStudio compared to R in a terminal. It
>>>> took me some time to identify the cause, so I thought I would share my
>>>> findings; perhaps they will be helpful to others.
>>>>
>>>> The performance drop was caused by R_CheckUserInterrupt(), which I call
>>>> (perhaps too often) in my C code. While calling R_CheckUserInterrupt()
>>>> seems to be quite cheap when running R or Rscript in a terminal, it is
>>>> more expensive when running R within a GUI, especially within RStudio,
>>>> as I noticed (but also, e.g., within R.app on MacOS). In fact, using a
>>>> GUI (especially RStudio) can change the cost of (frequent) calls to
>>>> R_CheckUserInterrupt() from negligible to critical (in real-world
>>>> applications). Significant performance drops are also visible for
>>>> functions in the 'stats' package, e.g., pwilcox().
>>>>
>>>> The following MWE (using Rcpp) illustrates the problem.
>>>> Consider the following code:
>>>>
>>>> ---
>>>>
>>>> library(Rcpp)
>>>> cppFunction('double nonsense(const int n, const int m, const int check) {
>>>>    int i, j;
>>>>    double result;
>>>>    for (i=0;i<n;i++) {
>>>>      if (check) R_CheckUserInterrupt();
>>>>      result = 1.;
>>>>      for (j=1;j<=m;j++) if (j%2) result *= j; else result /=j;
>>>>    }
>>>>    return(result);
>>>> }')
>>>>
>>>> tmp1 <- system.time(nonsense(1e8,10,0))[1]
>>>> tmp2 <- system.time(nonsense(1e8,10,1))[1]
>>>> cat("w/o check:",tmp1,"sec., with check:",tmp2,"sec., diff.:",tmp2-tmp1,"sec.\n")
>>>>
>>>> tmp3 <- system.time(pwilcox(rwilcox(1e5,40,60),40,60))[1]
>>>> cat("wilcox example:",tmp3,"sec.\n")
>>>>
>>>> ---
>>>>
>>>> Running this code when R (4.4.2) is started in a terminal window
>>>> produces the following measurements/output (Apple M1, MacOS 15.1.1):
>>>>
>>>> w/o check: 0.525 sec., with check: 0.752 sec., diff.: 0.227 sec.
>>>> wilcox example: 1.028 sec.
>>>>
>>>> Running the same code when R is used within R.app (1.81 (8462)
>>>> aarch64-apple-darwin20) on the same machine results in:
>>>>
>>>> w/o check: 0.525 sec., with check: 1.683 sec., diff.: 1.158 sec.
>>>> wilcox example: 2.13 sec.
>>>>
>>>> Running the same code when R is used within RStudio Desktop (2024.12.0
>>>> Build 467) on the same machine results in:
>>>>
>>>> w/o check: 0.507 sec., with check: 22.905 sec., diff.: 22.398 sec.
>>>> wilcox example: 29.686 sec.
>>>>
>>>> So, the performance drop is already remarkable for R.app, but really
>>>> huge for RStudio.
>>>>
>>>> Presumably, checking for user interrupts within a GUI is more involved
>>>> than within a terminal window, so there may not be much room for
>>>> improvement in R.app or RStudio (and I know that this list is not the
>>>> right place to suggest improvements for RStudio or to report unwanted
>>>> behaviour). However, it might be worth considering
>>>>
>>>> 1. an addition to the documentation in WRE (explaining that too many
>>>> calls to R_CheckUserInterrupt() can cause a performance bottleneck,
>>>> especially when the code is running within a GUI),
>>>> 2. check (and possibly change) the position of R_CheckUserInterrupt() in
>>>> some base R functions. For example, moving R_CheckUserInterrupt() from
>>>> cwilcox() to pwilcox() and qwilcox() in src/nmath/wilcox.c may lead to a
>>>> significant improvement (while still being feasible in terms of response
>>>> time).
>>>>
>>>> Best,
>>>> Martin
>>>>
>>>>
>>>> --
>>>> apl. Prof. Dr. Martin Becker, Akad. Oberrat
>>>> Lehrstab Statistik
>>>> Quantitative Methoden
>>>> Fakultät für Empirische Humanwissenschaften und Wirtschaftswissenschaft
>>>> Universität des Saarlandes
>>>> Campus C3 1, Raum 2.17
>>>> 66123 Saarbrücken
>>>> Deutschland
>>>>
>>>> ______________________________________________
>>>> R-devel@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>> --
>> Dr. Benjamin Bolker
>> Professor, Mathematics & Statistics and Biology, McMaster University
>> Director, School of Computational Science and Engineering
>> * E-mail is sent at my convenience; I don't expect replies outside of working hours.
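
For readers of this thread, here is a minimal C sketch of the counter-based throttling that Simon describes (if (++i % 10000 == 0) R_CheckUserInterrupt();), applied to the loop of the nonsense() MWE. It is not the patch discussed above, only an illustration under stated assumptions: check_interrupt_stub() and throttled_nonsense() are hypothetical names, the stub stands in for R_CheckUserInterrupt() so that the code compiles without R's headers, and the interval of 10000 is simply the value from Simon's example, not a tuned one.

```c
/* Hypothetical stand-in for R_CheckUserInterrupt(): it only counts how
   often it is called, so the sketch compiles without R's headers. */
static unsigned long n_checks = 0;
static void check_interrupt_stub(void) { n_checks++; }

/* Poll for interrupts only once per CHECK_EVERY outer iterations.
   10000 is the interval from Simon's example; a sensible value depends
   on how expensive one iteration is. */
#define CHECK_EVERY 10000

/* Same computation as the nonsense() MWE, but with a throttled check.
   result is initialised so the return value is defined even for n <= 0. */
double throttled_nonsense(int n, int m)
{
    double result = 0.0;
    unsigned long counter = 0;

    for (int i = 0; i < n; i++) {
        /* Cheap integer test on every iteration, real check only rarely. */
        if (++counter % CHECK_EVERY == 0)
            check_interrupt_stub();

        result = 1.0;
        for (int j = 1; j <= m; j++) {
            if (j % 2) result *= j; else result /= j;
        }
    }
    return result;
}
```

With 100000 outer iterations the stub is called 10 times instead of 100000 times. In package code the stub would be replaced by R_CheckUserInterrupt() (declared in R_ext/Utils.h); whether the interval still gives acceptable response time to an interrupt has to be judged per loop.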