Re: [R-pkg-devel] Examples are too long in computation for CRAN
On Sun, 13 Aug 2023 00:59:40 -0700, Michael Topper wrote:

> - Setting the nthreads argument to 2 in fixest::feols() in case
>   this is the problem as suggested.

Any chance you could be hitting some other code paths in the fixest
package while working on the objects returned by fixest::feols()?

Try running trace(fixest::getFixest_nthreads) and then running the
examples in the same R session. Assuming that (1) your only problem is
fixest and (2) every fixest function that uses OpenMP consults
getFixest_nthreads() by default, it should be possible to catch them
this way. If you see a call but aren't sure about its origin, try
tracer = quote(traceback()) or tracer = quote(browser()). (As a
precaution, untrace() the function before trace()ing it again.)

FWIW, modelsummary depends on both fixest and data.table, but it
doesn't look like you're creating threads via these.

> - Tried to use skip_cran_test() on the tests that include fixest
>   regressions

For tests, there's one more option: fixest::setFixest_nthreads(1) at
the beginning of the file. This should eliminate any extra threads
originating from fixest. If you do this and the problem persists, it
must be something else. Unfortunately, this is global state, and using
it in examples will involve saving the previous value and then
restoring it later.

(If you had separate test files in tests/*.R, R would be able to tell
you which one causes excessive CPU time. Unfortunately, testthat idioms
and core R idioms don't work well together.)

It really is unfortunate that you cannot reproduce this without a
computer with a lot of cores and working OpenMP.

--
Best regards,
Ivan

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
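The save-and-restore handling of the global thread setting described in the message above could be sketched roughly as follows. This is only an illustration: with_nthreads() is a hypothetical helper, not part of fixest, and the getter and setter are passed in as arguments so that the sketch runs without fixest installed. The commented-out call at the end assumes that fixest::getFixest_nthreads() and fixest::setFixest_nthreads() read and write the same setting, as the message implies; y, x and d there are placeholders.

```r
# Hypothetical helper (not part of fixest): temporarily change a global
# thread setting, evaluate some code, and restore the previous value,
# even if the code fails.
with_nthreads <- function(n, code, get, set) {
  old <- get()          # remember the current value
  on.exit(set(old))     # restore it whenever this function exits
  set(n)
  force(code)           # 'code' is evaluated here, under the new setting
}

# Stand-in getter/setter so that the sketch is self-contained.
state <- new.env()
state$nthreads <- 8L
get_nthreads <- function() state$nthreads
set_nthreads <- function(n) state$nthreads <- n

# Inside the call the setting is 1; afterwards it is back to 8.
seen <- with_nthreads(1L, get_nthreads(), get_nthreads, set_nthreads)

# With fixest installed, an example might use it like this
# (assumption: y, x and d are placeholders for a real model and data):
# est <- with_nthreads(1, fixest::feols(y ~ x, data = d),
#                      fixest::getFixest_nthreads,
#                      fixest::setFixest_nthreads)
```

Using on.exit() rather than a plain call at the end means the old value also comes back when the wrapped code throws an error, which matters for global state shared with the rest of the session.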
Re: [R-pkg-devel] Examples are too long in computation for CRAN
I have tried the following:

- Trimming down the examples substantially to only run 1 regression
  per function.
- Setting the nthreads argument to 2 in fixest::feols() in case this
  is the problem as suggested.
- Tried to use skip_cran_test() on the tests that include fixest
  regressions

However, while the time has been trimmed down substantially, it still
does not pass. At this point, I'm not sure what the next step is.
Below are the results:

Flavor: r-devel-linux-x86_64-debian-gcc
Check: examples, Result: NOTE
  Examples with CPU time > 2.5 times elapsed time
                    user system elapsed ratio
  panelsummary_raw 3.354  0.054   0.461 7.393
  clean_raw        3.436  0.091   0.571 6.177
  panelsummary     3.636  0.136   0.824 4.578

Flavor: r-devel-linux-x86_64-debian-gcc
Check: tests, Result: NOTE
    Running 'testthat.R' [39s/4s]
  Running R code in 'testthat.R' had CPU time 8.7 times elapsed time

On Sat, Aug 12, 2023 at 11:26 PM Uwe Ligges wrote:

> On 13.08.2023 08:14, Ivan Krylov wrote:
> > On Sat, 12 Aug 2023 22:49:01 -0700, Michael Topper wrote:
> >
> >> It appears that some of my examples/tests are taking too
> >> long to run for CRAN's standards.
> >
> > I don't think they are running too long; I think they are too
> > parallel. The elapsed time is below 1s, but the "user" time (CPU
> > time spent in the process) is 7 to 13 times that. This suggests
> > that your code resulted in starting more threads than CRAN allows
> > (up to 2 if you have to test parallelism). Are you using OpenMP?
> > data.table? makeCluster()?
> >
> > It's simplest to always default to a parallelism factor of 2 in
> > examples and tests, because determining the right number is a hard
> > problem. (What if the computer is busy doing something else? What
> > if the BLAS is already parallel enough?)
> >
> >> Moreover, is there any insight as to why this would happen on the
> >> third update of the package rather than on the first or second?
> >
> > The rule has always depended on the particular system running the
> > checks (five seconds on my 12-year-old ThinkPad or on my
> > ultraportable with an Intel Atom that had snails in its ancestry?).
> > Maybe some dependency of your package has updated and started
> > creating threads where it previously didn't.
>
> Good points, not only for examples and tests, but also for defaults.
>
> On shared resources (such as clusters) users may not expect the
> parallelization you use and then overallocate the resources.
>
> Example: 20 cores are available to the user, who runs makeCluster()
> for parallelization, but the underlying code does multithreading on
> 20 cores. Then we end up with 20*20 threads on the machine, slowing
> down the machine and the processes of other users.
> Hence, defaults should also not be more than 2. Simply allow the
> user to ask for more.
>
> Best,
> Uwe Ligges

--
Michael Topper
Re: [R-pkg-devel] Examples are too long in computation for CRAN
On 13.08.2023 08:14, Ivan Krylov wrote:
> On Sat, 12 Aug 2023 22:49:01 -0700, Michael Topper wrote:
>
>> It appears that some of my examples/tests are taking too
>> long to run for CRAN's standards.
>
> I don't think they are running too long; I think they are too
> parallel. The elapsed time is below 1s, but the "user" time (CPU
> time spent in the process) is 7 to 13 times that. This suggests
> that your code resulted in starting more threads than CRAN allows
> (up to 2 if you have to test parallelism). Are you using OpenMP?
> data.table? makeCluster()?
>
> It's simplest to always default to a parallelism factor of 2 in
> examples and tests, because determining the right number is a hard
> problem. (What if the computer is busy doing something else? What
> if the BLAS is already parallel enough?)
>
>> Moreover, is there any insight as to why this would happen on the
>> third update of the package rather than on the first or second?
>
> The rule has always depended on the particular system running the
> checks (five seconds on my 12-year-old ThinkPad or on my
> ultraportable with an Intel Atom that had snails in its ancestry?).
> Maybe some dependency of your package has updated and started
> creating threads where it previously didn't.

Good points, not only for examples and tests, but also for defaults.

On shared resources (such as clusters) users may not expect the
parallelization you use and then overallocate the resources.

Example: 20 cores are available to the user, who runs makeCluster() for
parallelization, but the underlying code does multithreading on 20
cores. Then we end up with 20*20 threads on the machine, slowing down
the machine and the processes of other users.
Hence, defaults should also not be more than 2. Simply allow the user
to ask for more.

Best,
Uwe Ligges
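The oversubscription arithmetic in the example above, together with the "default to at most 2, let the user ask for more" advice, might be sketched like this. Both function names are hypothetical and the body of fit_something() is a stand-in for a real multithreaded computation.

```r
# Total threads when every parallel worker also runs multithreaded code.
total_threads <- function(workers, threads_per_worker) {
  workers * threads_per_worker
}

oversubscribed <- total_threads(20, 20)  # 20 workers x 20 threads each
one_per_core   <- total_threads(20, 1)   # 20 workers x 1 thread each

# Hypothetical package function with a conservative default; the user
# can still ask for more threads explicitly.
fit_something <- function(x, nthreads = 2L) {
  stopifnot(is.numeric(nthreads), length(nthreads) == 1L, nthreads >= 1)
  # ... the real computation would use 'nthreads' threads here ...
  list(result = sum(x), nthreads = as.integer(nthreads))
}

default_fit <- fit_something(1:10)                 # default: 2 threads
bigger_fit  <- fit_something(1:10, nthreads = 8L)  # explicit opt-in
```

Keeping the default small makes the multiplication above the caller's deliberate choice rather than a surprise on a shared machine.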
Re: [R-pkg-devel] Examples are too long in computation for CRAN
On Sat, 12 Aug 2023 22:49:01 -0700, Michael Topper wrote:

> It appears that some of my examples/tests are taking too
> long to run for CRAN's standards.

I don't think they are running too long; I think they are too parallel.
The elapsed time is below 1s, but the "user" time (CPU time spent in
the process) is 7 to 13 times that. This suggests that your code
resulted in starting more threads than CRAN allows (up to 2 if you have
to test parallelism). Are you using OpenMP? data.table? makeCluster()?

It's simplest to always default to a parallelism factor of 2 in
examples and tests, because determining the right number is a hard
problem. (What if the computer is busy doing something else? What if
the BLAS is already parallel enough?)

> Moreover, is there any insight as to why this would happen on the
> third update of the package rather than on the first or second?

The rule has always depended on the particular system running the
checks (five seconds on my 12-year-old ThinkPad or on my ultraportable
with an Intel Atom that had snails in its ancestry?). Maybe some
dependency of your package has updated and started creating threads
where it previously didn't.

--
Best regards,
Ivan
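The relationship between CPU time and elapsed time that the reply relies on can be checked with a few lines of R. The sketch below recomputes the "ratio" column from the user, system and elapsed figures reported in the NOTE of the original post. cpu_ratio() and measure_ratio() are hypothetical helpers, and measure_ratio() only approximates what the check reports, since how the check itself accounts for child processes is an assumption here.

```r
# Ratio of CPU time (user + system) to elapsed wall-clock time.
cpu_ratio <- function(user, system, elapsed) {
  (user + system) / elapsed
}

# Figures taken from the NOTE in the original post.
note <- data.frame(
  example = c("panelsummary_raw", "panelsummary", "clean_raw"),
  user    = c(6.048, 9.574, 3.684),
  system  = c(0.164, 0.273, 0.064),
  elapsed = c(0.450, 0.817, 0.514)
)
note$ratio   <- with(note, cpu_ratio(user, system, elapsed))
note$flagged <- note$ratio > 2.5   # threshold quoted in the NOTE

# Measuring a comparable ratio for a piece of your own code.
measure_ratio <- function(expr) {
  t <- system.time(expr)
  # child-process times can be NA on some platforms, hence na.rm = TRUE
  cpu <- sum(t[c("user.self", "sys.self", "user.child", "sys.child")],
             na.rm = TRUE)
  unname(cpu / t[["elapsed"]])
}
```

A single-threaded computation should give a ratio close to 1 or below; a ratio well above 2 on a multi-core machine is a sign that something is starting extra threads.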
[R-pkg-devel] Examples are too long in computation for CRAN
Hello all,

Not sure how to handle this, as it had not been an issue on my previous
CRAN submissions. It appears that some of my examples/tests are taking
too long to run for CRAN's standards. Is there a way around this other
than the simple "change the example" or "change the test"? Moreover, is
there any insight as to why this would happen on the third update of
the package rather than on the first or second?

Thanks in advance, and see below for the NOTE:

Flavor: r-devel-linux-x86_64-debian-gcc
Check: examples, Result: NOTE
  Examples with CPU (user + system) or elapsed time > 5s
                    user system elapsed
  panelsummary     9.574  0.273   0.817
  panelsummary_raw 6.048  0.164   0.450
  Examples with CPU time > 2.5 times elapsed time
                    user system elapsed  ratio
  panelsummary_raw 6.048  0.164   0.450 13.804
  panelsummary     9.574  0.273   0.817 12.053
  clean_raw        3.684  0.064   0.514  7.292

Flavor: r-devel-linux-x86_64-debian-gcc
Check: tests, Result: NOTE
    Running 'testthat.R' [54s/6s]
  Running R code in 'testthat.R' had CPU time 9.3 times elapsed time