Re: [Rd] Apple M1 CRAN checks
Simon,

Yes, I can imagine it is not trivial testing. I hope you have a stack of minis in a cluster. It looks like a trivial transgression, max.error = 1.065814e-14 where my tolerance is set at 1e-14. Many of the other tests already have a larger tolerance. Very possibly it is not actually the tolerance that is the problem. The test value itself is just determined by a run on another machine, so that may not be in the middle of the result distribution. I'll fix it shortly.

Thanks, Paul

On 2021-02-28 4:23 p.m., Simon Urbanek wrote:

Paul, this is being worked on. As you can imagine, testing over 17,000 packages on an M1 Mac mini isn't quite trivial. The first priority was to get the nightly R builds to work. Second was to get CRAN package builds to work. Third is to provide checks. The first two were finished last week and the checks have been running for the past two days. Unfortunately, some pieces (like XQuartz) are still not quite stable, so it takes more manual intervention than one would expect. We are at close to 16k packages checked, so we're getting there. As for EvalEst, the check has finished, so I have:

Running ‘dse2tstgd2.R’ [13s/14s]
Running the tests in ‘tests/dse2tstgd2.R’ failed.
Last 13 lines of output:
  > ok <- fuzz.large > error
  > if (!ok) {if (is.na(max.error)) max.error <- error
  +           else max.error <- max(error, max.error)}
  > all.ok <- all.ok & ok
  > {if (ok) cat("ok\n") else cat("failed! error= ", error,"\n") }
  ok
  >
  > cat("All Brief User Guide example tests part 2 completed")
  All Brief User Guide example tests part 2 completed> if (all.ok) cat(" OK\n") else
  +     cat(", some FAILED! max.error = ", max.error,"\n")
  , some FAILED! max.error =  1.065814e-14
  >
  > if (!all.ok) stop("Some tests FAILED")
  Error: Some tests FAILED
  Execution halted

When I run it by hand I get ok for all but:

Guide part 2 test 10... failed!
error= 1.065814e-14

sum(fc1$forecastCov[[1]])
[1] 14.933660144821400806
sum(fc2$forecastCov[[1]])
[1] 14.933660144821400806
sum(fc2$forecastCov.zero)
[1] 31.654672476928304548
sum(fc2$forecastCov.trend)
[1] 18.324461923341957004
c(14.933660144821400806 - sum(fc1$forecastCov[[1]]),
+ 14.933660144821400806 - sum(fc2$forecastCov[[1]]),
+ 31.654672476928297442 - sum(fc2$forecastCov.zero),
+ 18.324461923341953451 - sum(fc2$forecastCov.trend) )
[1]  0.000e+00  0.000e+00 -1.0658141036401502788e-14
[4] -3.5527136788005009294e-15

I hope this helps you to track it down. Cheers, Simon

On Mar 1, 2021, at 4:50 AM, Paul Gilbert wrote:

If there was a response to the "how can I test it out" part of this question then I missed it. Can anyone point to a Win-builder-like site for testing on M1mac, or to the M1mac results from testing packages already on CRAN? They still do not seem to be on the CRAN daily site. Even a link to the 'Additional issues' on M1 Mac on the results pages would be helpful, because it does not seem to be in an obvious place. I am trying to respond to a demand to relax or remove some package testing that fails because M1mac gives results outside my specified tolerances. The tests in question (in package EvalEst) have been used since very early R versions (0.16, circa 1995), and were used on Splus prior to that. There has been a need to adjust tolerances occasionally, but they have been stable for a long time (more than 20 years, I believe). Since these tests date from a time when simple double precision was the norm, the tolerances are already fairly relaxed, so I hesitate to adjust them without actually examining the results. Paul Gilbert

On 2021-02-22 3:30 a.m., Travers Ching wrote:

I noticed CRAN is now doing checks against Apple M1, and some packages are failing including a dependency I use. Is building on M1 now a requirement, or can the check be ignored? If it's a requirement, how can one test it out?
Travers

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
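The failing check above comes from a tolerance-based regression test: a reference value recorded on one machine is compared against the locally computed value, and the run fails when the absolute error exceeds a fuzz threshold. A minimal sketch of that pattern (the function name, fuzz value, and reference constant here are illustrative, not EvalEst's actual code):

```r
# Compare a computed value against a stored reference, within a tolerance.
# The reference was recorded on one machine, so on other hardware the error
# may legitimately be a few ulps larger -- the reference value is not
# necessarily in the middle of the cross-platform result distribution.
check_with_fuzz <- function(value, reference, fuzz = 1e-14) {
  error <- max(abs(value - reference))
  ok <- fuzz > error
  if (ok) cat("ok\n") else cat("failed! error= ", error, "\n")
  ok
}

all.ok <- TRUE
# reference below is the double-precision value of sum(1/(1:10))
all.ok <- all.ok & check_with_fuzz(sum(1 / (1:10)), 2.9289682539682538)
if (!all.ok) stop("Some tests FAILED")
```

The failure in the thread is exactly the marginal case: a different machine lands 1.065814e-14 away from a reference recorded elsewhere, just past a 1e-14 fuzz.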
Re: [R-pkg-devel] CRAN check texi2dvi failure
Thanks Enrico for the great guess, and Georgi for the details. If I omit the space, as seems to be implied in some documentation, changing

\verb <https://www.bankofcanada.ca/2006/03/working-paper-2006-3> .

to

\verb<https://www.bankofcanada.ca/2006/03/working-paper-2006-3> .

then the R CMD check error "\verb ended by end of line" happens on my linux machine. I did not try replacing the space with another delimiter, which I guess would now be the correct way to use \verb. The solution of adding \usepackage{url} and changing to

\url{https://www.bankofcanada.ca/2006/03/working-paper-2006-3}.

does seem to work. (No "on CRAN" confirmation yet, but I have not had the immediate pre-test rejection that I got previously.) Paul

On 2021-01-10 8:04 a.m., Georgi Boshnakov wrote:
> The problem is not in the Warning from the example but from the \verb commands in the references. You use space to delimit the argument of \verb, and I was surprised that it worked, since TeX ignores spaces after commands. Apparently this has been an exception for \verb, but now this feature is considered a bug and has been recently fixed; see the stackexchange question below and the relevant paragraph from LaTeX News. Probably the linux machines have updated their TeX installations.
>
> In short, changing the space to, say, a + delimiter for the \verb command should fix the issue.
>
> Georgi Boshnakov

On 2021-01-09 6:52 p.m., Enrico Schumann wrote:

When I run R CMD check on my Linux machine [*], I also do not get an error. But here is a guess: the error mentions \verb, and the LaTeX manual says that \verb should be followed by a nonspace character. But in the vignette it is followed by a space. Maybe using \url in the vignette could fix the error?
kind regards Enrico

[*] R version 4.0.3 (2020-10-10) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 20.10

> On Sat, 09 Jan 2021, Paul Gilbert writes:

I am trying to debug a problem that is appearing in the linux and Solaris checks, but not the Windows or Mac checks, of my package tsfa, as reported at https://cran.r-project.org/web/checks/check_results_tsfa.html The problem is with re-building the vignette:

... this is package 'tsfa' version '2014.10-1' ...
checking re-building of vignette outputs ... [6s/9s] WARNING
Error(s) in re-building vignettes: ...
Running 'texi2dvi' on 'Guide.tex' failed.
LaTeX errors:
! LaTeX Error: \verb ended by end of line.
...

In responding to the threat of removal I have also fixed some long-standing warnings about adding imports to the NAMESPACE. The new version builds with --as-cran giving no errors or warnings with both R-devel on win-builder (2021-01-07 r79806) and on my linux machine (R 2021-01-08 r79812 on Linux Mint 19.3 Tricia). When I submit it to CRAN the Windows build is OK, but the same error happens at the 'texi2dvi' step in the debian vignette re-build. This seems to happen after an example that correctly gives a warning message (about Heywood cases). In my linux build the warning happens but the message does not appear in the pdf output, so one possibility is that the handling of the warning on the CRAN Unix check machines fails to produce clean tex or suppress output. Another possibility is that my build using --as-cran is different from the actual CRAN build options. For example, my 00check.log shows

... * checking package vignettes in ‘inst/doc’ ... OK
* checking re-building of vignette outputs ... OK
* checking PDF version of manual ... OK
* checking for non-standard things in the check directory ... OK ...

so I am not sure if it uses texi2dvi. (I haven't used dvi myself for a long time.) I'm not sure how to debug this when I can't reproduce the error. Suggestions would be appreciated.
Paul Gilbert __ R-package-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-package-devel
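For reference, the fix the thread converges on looks like this in the vignette source (a sketch; the URL is the one from the thread, and the \verb line is shown only as the construct being replaced):

```latex
% In the preamble:
\usepackage{url}

% In the references, instead of relying on \verb with a space delimiter,
% which newer LaTeX rejects with "\verb ended by end of line":
%   \verb <https://www.bankofcanada.ca/2006/03/working-paper-2006-3> .
% use \url, which handles the special characters in URLs and allows
% line breaking:
\url{https://www.bankofcanada.ca/2006/03/working-paper-2006-3}.
```

\url also removes the need to pick a \verb delimiter character that cannot occur in the URL itself.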
[R-pkg-devel] translation .mo files
I have been sent .po and .mo files with message translations for one of my packages. The .po file, I know, goes in the source package po/ directory, but I have not had .mo files previously. The translator thinks the .mo file goes in inst/po. The .mo file seems to be generated from the .po file, but I am not sure if that happens in the install of the source package, or in some pre-process. I thought I could determine this by looking at an installed package, but I don't see .po or .mo files in installed packages. So far I have had no luck finding documentation on these details. So I have three questions.

- Should the .mo file be included in the package, and if so, where?
- When a package is installed, where does the translation information go in the directory structure of the library?
- Is this documented somewhere? (Please, not a vague reference to 'Writing R Extensions'; I've looked there and many other places. I need a section or page reference.)

Thanks, Paul Gilbert
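As I understand the convention (this layout is my reading of current practice, so treat the details as an assumption to verify against 'Writing R Extensions'), the compiled .mo catalogs do ship in the source package, under inst/po/, in a language/LC_MESSAGES hierarchy, and inst/po is copied to the installed package's po/ directory where gettext looks for it:

```
mypkg/                             # source package
├── po/
│   ├── R-mypkg.pot                # template extracted from the R sources
│   └── R-fr.po                    # translator-edited catalog (French, say)
└── inst/
    └── po/
        └── fr/
            └── LC_MESSAGES/
                └── R-mypkg.mo     # compiled catalog, shipped in the package
```

The .mo is regenerated from the .po by the maintainer before building (e.g. with tools::update_pkg_po() or GNU gettext's msgfmt), not at install time; install simply copies inst/po into the installed tree.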
Re: [Rd] survival changes
Terry,

Let me call this 'things to think about', rather than advice. I went through a similar process twice, once about 30 years ago and once about 20 years ago. I had fewer dependent packages of course, but still enough to cause headaches. I don't recommend doing it often.

- I think you need to consider where you would like to end up before deciding how to get there. If you end up having to maintain a lot of legacy stuff I don't think you will be very happy. So then the problem becomes how to help people get off the part you want to abandon, rather than how to help them stay on it.

- I know you are very experienced, but I will be really impressed if you get the new approach perfect on the first shot. That argues for having a new package with hardly any users, so you can fiddle with the API more easily, and not deprecating the old one until you are really happy with the new one.

- There may be a part which is common to both old and new, and/or there may be a part which is what most dependent packages use. If you can separate that out as something like survivalBase it would make your life easier. That will be especially true if that part is more stable, so don't put in anything you are experimenting with.

Good luck, Paul Gilbert

On 6/1/19 8:02 PM, Therneau, Terry M., Ph.D. via R-devel wrote: On 6/1/19 1:32 PM, Marc Schwartz wrote: On Jun 1, 2019, at 12:59 PM, Peter Langfelder wrote: On Sat, Jun 1, 2019 at 3:22 AM Therneau, Terry M., Ph.D. via R-devel wrote:

In the next version of the survival package I intend to make a non-upwardly compatible change to the survfit object. With over 600 dependent packages this is not something to take lightly, and I am currently undecided about the best way to go about it. I'm looking for advice. The change: 20+ years ago I had decided not to include the initial x=0, y=1 data point in the survfit object itself. It was not formally an estimand, and the plot/points/lines etc. routines could add this on themselves.
That turns out to have been a mistake, and has led to a steady proliferation of extra bits as I realized that the time axis doesn't always start at 0, and later (with multi-state) that y does not always start at 1 (though the states sum to 1), and later that the error doesn't always start at 0, and another realization with cumulative hazard, and ... The new survfit method for multi-state coxph models was going to add yet another special case. Basically every component is turning into a duplicate of "row 1" vs "all the others". (And inconsistently named.)

Three possible solutions:

1. Current working draft of survival_3.0.3: Add a 'version' element to the survfit object and a 'survfit2.3' function that converts old to new. All my downstream functions (print, plot, ...) start with an "if (old) update to new" line. This has allowed me to stage updates to the functions that create survfit objects -- I expect it to happen slowly. There will also be a survfit3.2 function to go backwards. Both the forward and backward functions leave objects alone if they are currently in the desired format.

2. Make a new class "survfit3" and the necessary 'as' functions. The package would contain plot.survfit and plot.survfit3 methods, the former a two-line "convert and call the second" function.

3. Something I haven't thought of.

A more "clean break" solution would be to start a whole new package (call it survival2) that would make these changes, and deprecate the current survival. You could add warnings about deprecation, urging users to switch, in existing survival functions. You could continue bugfixes for survival but only add new features to survival2. The new survival2 and the current survival could live side by side on CRAN for quite some time, giving maintainers of dependent packages (and just plain users) enough time to switch.
This could allow you to change/clean up other parts of the package that could perhaps also use a rethink/rewrite, without too much concern for backward compatibility. Peter

Hi, I would be cautious about going in that direction, bearing in mind that survival is a Recommended package, therefore included in the default R distribution from the R Foundation and other parties. To have two versions can/will result in substantial confusion, and I would argue against that approach. There is language in the CRAN submission policy that covers API changes, which, strictly speaking, may or may not be the case here, depending upon which direction Terry elects to go: "If an update will change the package’s API and hence affect packages depending on it, it is expected that you will contact the maintainers of affected packages and suggest changes, and give them time (at least 2 weeks, ideally more) to prepare updates before submitting your updated package. Do mention in the submission email which packages are affected a
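Option 1 above (a version element plus converter functions) amounts to a small dispatch shim at the top of every downstream method. A sketch of the idea (the function name, fields, and version numbers here are illustrative, not the actual survival code):

```r
# Convert an old-style fit object to the new layout, if needed.
# Objects already in the new format pass through unchanged, so every
# method can safely call this first: fit <- upgrade_survfit(fit)
upgrade_survfit <- function(fit) {
  if (is.null(fit$version) || fit$version < 3) {
    # old objects lacked the initial (time = 0, surv = 1) row; prepend it
    fit$time <- c(0, fit$time)
    fit$surv <- c(1, fit$surv)
    fit$version <- 3
  }
  fit
}

old <- list(time = c(1, 2), surv = c(0.9, 0.8))  # mock old-style object
new <- upgrade_survfit(old)
```

The idempotence (new-format objects are left alone) is what lets the upgrade be staged: creators of the objects can be converted one at a time while every consumer already tolerates both formats.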
Re: [R-pkg-devel] Submitting a package whose unit tests sometimes fail because of server connections
5.4 In the spirit of simple & stupid you can also use the built-in mechanism for doing this: organize some of your tests in subdirectories like inst/testWithInternet, inst/veryLongTests, inst/testsNeedingLicence, inst/testsNeedingSpecialCluster, etc. CRAN will only run the tests in the tests/ directory, but you can check them yourself using

R CMD check --test-dir=inst/testWithInternet whatever.tar.gz

> In a separate response On 4/16/19 2:06 PM, Steven Scott wrote:
> Just don't include the live fire stuff in the package.

Please do not do this. If you omit tests from your package then it cannot be properly checked by other people. Paul Gilbert

On 4/16/19 2:16 PM, Dirk Eddelbuettel wrote: On 16 April 2019 at 11:40, Will wrote:
| Some things I have considered include:
|
| 1. Skipping all unit tests on CRAN (using something like *testthat::skip_on_cran*). This would immediately fix the problem, and as a mitigating factor we report automated test results and coverage on the package's GitHub page (https://github.com/ropensci/suppdata).
| 2. Using HTTP-mocking to avoid requiring a call to a server during tests at all. I would be uncomfortable relying solely on this for all tests, since if the data hosters changed things we wouldn't know. Thus I would still want the Internet-enabled tests, which would also have to be turned off for CRAN (see 1 above). This would also be a lot of additional work.
| 3. Somehow bypassing the requirement for the unit tests to all pass before the package is checked by the CRAN maintainers. I have no idea if this is something CRAN would be willing to do, or if it is even possible. It would be the easiest option for me, but I don't want to create extra work for other people!
| 4. Slowing the tests with something like *Sys.sleep*. This might work, but would slow the tests massively and so might that cause problems for CRAN?
|
| Does anyone have any advice as to which of the above would be the best option, or who I should email directly about this?

5. Run a hybrid scheme where you have multiple levels:

5.1 Do what eg Rcpp does and only opt into 'all tests' when an overall variable is set; that variable can be set conveniently in .travis.yml and conditionally in your test runner below ~/tests/ That way you can skip tests that would fail.

5.2 Do a lot of work and wrap 3. above into try() / tryCatch() and pass if _your own aggregation of tests_ passes a threshold. Overkill to me.

5.3 Turn all tests on / off based on some other toggle. I.e. I don't think I test all features of RcppRedis on CRAN as I can't assume a redis server, but I do run those tests at home, on Travis, ...

Overall, I would recommend to 'keep it simple & stupid' (KISS) as life is too short to argue^Hdebate this with CRAN. And their time is too precious so we should try to make their life easier. Dirk
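The toggle in 5.1/5.3 is typically just an environment variable checked by the test runner (testthat's skip_on_cran() works the same way via NOT_CRAN). A minimal sketch, assuming a runner script under tests/ and an illustrative variable name RUN_FULL_TESTS (the package name and directory are hypothetical):

```r
# In tests/run-extra.R: only run the network-dependent tests when the
# developer has opted in, e.g. by exporting RUN_FULL_TESTS=true locally
# or in .travis.yml. CRAN never sets it, so CRAN skips these tests.
run_full <- identical(tolower(Sys.getenv("RUN_FULL_TESTS")), "true")

if (run_full) {
  # source the tests kept outside tests/, as in suggestion 5.4, e.g.:
  # extra <- list.files(system.file("testWithInternet", package = "mypkg"),
  #                     full.names = TRUE)
  # for (f in extra) source(f)
  cat("running full test suite\n")
} else {
  cat("skipping internet-dependent tests\n")
}
```

This keeps the tests in the shipped tarball (so others can run them), while making the default CRAN run deterministic.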
Re: [Rd] code for sum function
(I didn't see anyone else answer this, so ...)

You can probably find the R code in src/main/ but I'm not sure. You are talking about a very simple calculation, so it seems unlikely that the algorithm is the cause of the difference. I have done much more complicated things and usually get machine-precision comparisons. There are four possibilities I can think of that could cause (small) differences.

0/ Your code is wrong, but that seems unlikely for such a simple calculation.

1/ You are summing a very large number of numbers, in which case the sum can become very large compared to the numbers being added; then things can get a bit funny.

2/ You are using single precision in Fortran rather than double. Double is needed for all floating point numbers you use!

3/ You have not zeroed the double precision numbers in Fortran. (Some compilers do not do this automatically and you have to specify it.) Then if you accidentally put singles, like a constant 0.0 rather than a constant 0.0D+0, into a double, you will have small junk in the lower-precision part.

(I am assuming you are talking about a sum of reals, not integer or complex.) HTH, Paul Gilbert

On 2/14/19 2:08 PM, Rampal Etienne wrote:

Hello, I am trying to write FORTRAN code to do the same as some R code I have. I get (small) differences when using the sum function in R. I know there are numerical routines to improve precision, but I have not been able to figure out what algorithm R is using. Does anyone know this? Or where can I find the code for the sum function? Regards, Rampal Etienne
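On point 1/: a long sum loses low-order bits when the running total dwarfs the next addend, and that, rather than the algorithm, is the usual source of tiny cross-implementation differences (as I understand it, R's internal sum() also accumulates in extended precision on most platforms, which a plain Fortran DOUBLE PRECISION loop does not). A compensated (Neumaier-variant Kahan) sum, written here in R purely for illustration, recovers bits that a naive left-to-right loop drops:

```r
# Neumaier's variant of Kahan summation: carry the rounding error of each
# addition in a separate compensation term and fold it in at the end.
neumaier_sum <- function(x) {
  s <- 0; comp <- 0
  for (xi in x) {
    t <- s + xi
    comp <- comp + if (abs(s) >= abs(xi)) (s - t) + xi else (xi - t) + s
    s <- t
  }
  s + comp
}

# naive left-to-right accumulation, like a plain Fortran DO loop
naive_sum <- function(x) Reduce(`+`, x)

x <- c(1e16, 1, -1e16)
naive_sum(x)     # 0: the 1 is absorbed, since 1e16 + 1 rounds back to 1e16
neumaier_sum(x)  # 1: the compensation term recovers it
```

The same few-ulp discrepancies appear between any two summation orders or accumulator widths, which is why tolerances rather than exact equality are used when comparing across implementations.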
Re: [R-pkg-devel] package fails with parallel make - would forcing a serial version work?
(I didn't see an answer to this, so ...)

I think using .NOTPARALLEL will usually get rid of the error but, in my experience, this problem is usually caused by an incorrect or incomplete Makefile. When not done in parallel, the missing target is usually getting done first as a side-effect of something that happens before, and usually finishes before it is needed. Your luck does not hold in parallel. The better fix is to correct your Makefile. Paul

On 1/10/19 4:54 PM, Satyaprakash Nayak wrote:

Dear R package developers, I published a package on CRAN last year (sundialr) which is now failing as it is not able to compile a static library with parallel make. In this package, I compile a static library (libsundials_all.a) from source files of a third party. The specifics of compiling the static library can be found at https://github.com/sn248/sundialr/blob/master/src/Makevars Now, I got the following error message from CRAN (actually, I was informed of this before, but had neglected to fix it). Here is the message from one of the CRAN maintainers:

*** This have just failed to install for me with a parallel make:

g++ -std=gnu++98 -std=gnu++98 -shared -L/data/blackswan/ripley/extras/lib64 -L/usrlocal/lib64 -o sundialr.so cvode.o RcppExports.o -L/data/blackswan/ripley/R/R-patched/lib -lRlapack -L/data/blackswan/ripley/R/R-patched/lib -lRblas -lgfortran -lm -lquadmath -L../inst/ ../inst/libsundials_all.a
g++: error: ../inst/libsundials_all.a: No such file or directory
make[1]: *** [/data/blackswan/ripley/R/R-patched/share/make/shlib.mk:6: sundialr.so] Error 1

It seems the package fails to generate the static library with the parallel make. The easiest solution I could think of for this problem was to force a serial version of make using the .NOTPARALLEL phony target in Makevars and Makevars.win (https://github.com/sn248/sundialr/blob/master/src/Makevars).
I have made this change and it seems to work on my machine and on testing with TravisCI and Appveyor (https://github.com/sn248/sundialr). However, before I re-submit to CRAN, I wanted to get an opinion as to whether this will be enough to get rid of the error with parallel make? Any suggestions would be very much appreciated, thank you! Satyaprakash
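The Makefile fix Paul recommends, rather than .NOTPARALLEL, is to state the missing dependency explicitly: the shared library target must depend on the static library, so that make orders the two correctly even under -j. A sketch of such a src/Makevars (the source layout and recipe are illustrative, not sundialr's actual build; $(SHLIB) is the target R's build system defines for the package shared object):

```make
# Build the bundled third-party static library before R links the
# package shared object. Declaring the dependency makes the ordering
# correct even under a parallel "make -j" -- no .NOTPARALLEL needed.
STATLIB = ../inst/libsundials_all.a

PKG_LIBS = $(STATLIB)

# the key line: the package .so cannot be linked until STATLIB exists
$(SHLIB): $(STATLIB)

$(STATLIB): sundials/*.c
	$(CC) $(CFLAGS) $(CPICFLAGS) -c sundials/*.c
	$(AR) rcs $(STATLIB) *.o
```

With .NOTPARALLEL the build merely runs serially and happens to work; with the explicit prerequisite it is correct by construction.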
Re: [Rd] Extreme bunching of random values from runif with Mersenne-Twister seed
I'll point out that there is a large literature on generating pseudo-random numbers for parallel processes, and it is not as easy as one (at least me) would intuitively think. By contrapositive-like thinking, one might guess that it will not be easy to pick seeds in a way that will produce independent sequences. (I'm a bit confused about the objective but) if the objective is to produce independent sequences from different seeds, then the RNGs for parallel processing might be a good place to start. (And, BTW, if you want to reproduce parallel generated random numbers you need to keep track of both the starting seed and the number of nodes.) Paul Gilbert

On 11/05/2017 10:58 AM, peter dalgaard wrote: On 5 Nov 2017, at 15:17 , Duncan Murdoch <murdoch.dun...@gmail.com> wrote: On 04/11/2017 10:20 PM, Daniel Nordlund wrote:

Tirthankar, "random number generators" do not produce random numbers. Any given generator produces a fixed sequence of numbers that appear to meet various tests of randomness. By picking a seed you enter that sequence in a particular place and subsequent numbers in the sequence appear to be unrelated. There are no guarantees that if YOU pick a SET of seeds they won't produce a set of values that are of a similar magnitude. You can likely solve your problem by following Radford Neal's advice of not using the first number from each seed. However, you don't need to use anything more than the second number. So, you can modify your function as follows:

function(x) {
  set.seed(x, kind = "default")
  y = runif(2, 17, 26)
  return(y[2])
}

Hope this is helpful,

That's assuming that the chosen seeds are unrelated to the function output, which seems unlikely on the face of it. You can certainly choose a set of seeds that give high values on the second draw just as easily as you can choose seeds that give high draws on the first draw.
The interesting thing about this problem is that Tirthankar doesn't believe that the seed selection process is aware of the function output. I would say that it must be, and he should be investigating how that happens if he is worried about the output; he shouldn't be worrying about R's RNG.

Hmm, no. The basic issue is that RNGs are constructed so that with x_{n+1} = f(x_n), the sequence x_1, x_2, x_3, ... will look random, not so that f(s_1), f(s_2), f(s_3), ... will look random for any seeds s_1, s_2, ... . This is true even if the seeds are not chosen so as to mess with the RNG. In the present case, it seems that seeds around 86e6 tend to give similar output. On the other hand, it is not _just_ the similarity in magnitude that does it; try e.g.

s <- as.integer(runif(100, 86.54e6, 86.98e6))
r <- sapply(s, function(s){set.seed(s); runif(1,17,26)})
plot(s, r, pch=".")

and no obvious pattern emerges. My best guess is that the seeds are not only of similar magnitude, but also have other bit-pattern similarities. (Isn't there a Knuth quote to the effect that "Every random number generator will fail in at least one application"?) One remaining issue is whether it is really true that the same seeds give different output on different platforms. That shouldn't happen, I believe. Duncan Murdoch
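The standard R mechanism for the independent sequences Paul mentions is the "L'Ecuyer-CMRG" generator with parallel::nextRNGStream(), which jumps each stream far ahead in one generator rather than relying on hand-picked seeds. A minimal sketch:

```r
# Independent RNG streams via L'Ecuyer-CMRG: each stream is a guaranteed
# far-apart subsequence of a single generator, unlike ad hoc per-node seeds.
RNGkind("L'Ecuyer-CMRG")
set.seed(42)
s1 <- .Random.seed                 # seed for stream 1
s2 <- parallel::nextRNGStream(s1)  # seed for a second, independent stream

# draw from a given stream by installing its seed first
draw <- function(seed, n = 3) {
  assign(".Random.seed", seed, envir = globalenv())
  runif(n)
}
x1 <- draw(s1)
x2 <- draw(s2)
```

Reproducing a parallel run then only requires the initial seed and the stream assignment (which, as Paul notes, depends on the number of nodes).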
[Rd] parallel::detectCores() bug on Raspberry Pi B+
In R 3.3.2, detectCores() in package parallel reports 2 rather than 1 on a Raspberry Pi B+ running Raspbian. (This report is just 'for the record'. The model is superseded and I think no longer produced.) The problem seems to be caused by grepping "processor" in /proc/cpuinfo, which contains

processor	: 0
model name	: ARMv6-compatible processor rev 7 (v6l)

(On Raspberry Pi 2 and 3 there is no error because the model name lines are

model name	: ARMv7 Processor rev 5 (v7l)
model name	: ARMv7 Processor rev 4 (v7l)

)
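The over-count is consistent with counting lines of /proc/cpuinfo that merely contain "processor" rather than lines that begin with it; on the ARMv6 board the "model name" line also contains the word. A shell sketch of the difference, using the two lines quoted above as sample input:

```shell
# Reproduce the over-count with the ARMv6 /proc/cpuinfo lines quoted above.
cpuinfo=$(mktemp)
cat > "$cpuinfo" <<'EOF'
processor	: 0
model name	: ARMv6-compatible processor rev 7 (v6l)
EOF

grep -c processor "$cpuinfo"      # prints 2: matches both lines
grep -c '^processor' "$cpuinfo"   # prints 1: anchored, only the real entry
```

Anchoring the pattern (or matching `^processor[[:space:]]*:`) gives the correct core count on both ARMv6 and ARMv7.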
Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays
On 09/08/2016 05:06 PM, robin hankin wrote: Could we take a cue from min() and max()? x <- 1:10 min(x[x>7]) [1] 8 min(x[x>11]) [1] Inf Warning message: In min(x[x > 11]) : no non-missing arguments to min; returning Inf As ?min says, this is implemented to preserve transitivity, and this makes a lot of sense. I think the issuing of a warning here is a good compromise; I can always turn off warnings if I want. I fear you are thinking of this as an end user, rather than as a package developer. Warnings are for end users, when they do something they possibly should be warned about. A package really should not generate warnings unless they are for end user consumption. In package development I treat warnings the same way I treat errors: build fails, program around it. So what you call a compromise is no compromise at all as far as I am concerned. But perhaps there is a use for an end user version, maybe All() or ALL() that issues an error or warning. There are a lot of functions and operators in R that could warn about mistakes that a user may be making. Paul I find this behaviour of min() and max() to be annoying in the *right* way: it annoys me precisely when I need to be annoyed, that is, when I haven't thought through the consequences of sending zero-length arguments. On Fri, Sep 9, 2016 at 6:00 AM, Paul Gilbert <pgilbert...@gmail.com> wrote: On 09/08/2016 01:22 PM, Gabriel Becker wrote: On Thu, Sep 8, 2016 at 10:05 AM, William Dunlap <wdun...@tibco.com> wrote: Shouldn't binary operators (arithmetic and logical) should throw an error when one operand is NULL (or other type that doesn't make sense)? This is a different case than a zero-length operand of a legitimate type. E.g., any(x < 0) should return FALSE if x is number-like and length(x)==0 but give an error if x is NULL. Bill, That is a good point. I can see the argument for this in the case that the non-zero length is 1. I'm not sure which is better though. If we switch any() to all(), things get murky. 
Mathematically, all(x<0) is TRUE if x is length 0 (as are all(x==0), and all(x>0)), but the likelihood of this being a thought-bug on the author's part is exceedingly high, imho. I suspect there may be more R users than you think that understand and use vacuously true in code. I don't really like the idea of turning a perfectly good and properly documented mathematical test into an error in order to protect against a possible "thought-bug". Paul

So the desirable behavior seems to depend on the angle we look at it from. My personal opinion is that x < y with length(x)==0 should fail if length(y) > 1, at least, and I'd be for it being an error even if y is length 1, though I do acknowledge this is more likely (though still quite unlikely imho) to be the intended behavior. ~G

I.e., I think the type check should be done before the length check. Bill Dunlap TIBCO Software wdunlap tibco.com

On Thu, Sep 8, 2016 at 8:43 AM, Gabriel Becker <gmbec...@ucdavis.edu> wrote:

Martin, Like Robin and Oliver I think this type of edge-case consistency is important and that it's fantastic that R-core - and you personally - are willing to tackle some of these "gotcha" behaviors. "Little" stuff like this really does combine to go a long way to making R better and better. I do wonder a bit about the x = 1:2; y = NULL; x < y case. Returning a logical of length 0 is more backwards compatible, but is it ever what the author actually intended? I have trouble thinking of a case where that less-than didn't carry an implicit assumption that y was non-NULL. I can say that in my own code, I've never hit that behavior in a case that wasn't an error. My vote (unless someone else points out a compelling use for the behavior) is for this to throw an error.
As a developer, I'd rather things like this break so the bug in my logic is visible, rather than propagating as the 0-length logical is &'ed or |'ed with other logical vectors, or used to subset, or (in the case it should be length 1) passed to if() (if throws an error now, but the rest would silently "work"). Best, ~G On Thu, Sep 8, 2016 at 3:49 AM, Martin Maechler < maech...@stat.math.ethz.ch> wrote: robin hankin <hankin.ro...@gmail.com> on Thu, 8 Sep 2016 10:05:21 +1200 writes: > Martin I'd like to make a comment; I think that R's > behaviour on 'edge' cases like this is an important thing > and it's great that you are working on it. > I make heavy use of zero-extent arrays, chiefly because > the dimnames are an efficient and logical way to keep > track of certain types of information. > If I have, for example, > a <- array(0,c(2,0,2)) > dimnames(a) <- list(name=c('Mike','Kevin'), NULL,item=c("hat","scarf")) > Then in R-3.3.1, 70800 I get a
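The vacuous-truth behavior under discussion is easy to check at the prompt: all() over a zero-length condition is TRUE, any() is FALSE, and min()/max() warn and return infinities, which is the compromise Robin likes and Paul objects to in package code:

```r
x <- 1:10

all(x[x > 11] < 0)   # TRUE: vacuously true over zero elements
any(x[x > 11] < 0)   # FALSE: no element satisfies the condition
min(x[x > 11])       # Inf, with a warning; as ?min notes, this choice
                     # preserves transitivity, e.g. min(a, min(b)) == min(c(a, b))
```

This is why the thread splits along end-user vs. package-developer lines: at the prompt the warning is visible and helpful, while inside a package a silently-propagating Inf or a vacuous TRUE can mask a logic bug.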
Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays
On 09/08/2016 01:22 PM, Gabriel Becker wrote: On Thu, Sep 8, 2016 at 10:05 AM, William Dunlap wrote:

Shouldn't binary operators (arithmetic and logical) throw an error when one operand is NULL (or another type that doesn't make sense)? This is a different case than a zero-length operand of a legitimate type. E.g., any(x < 0) should return FALSE if x is number-like and length(x)==0, but give an error if x is NULL.

Bill, That is a good point. I can see the argument for this in the case that the non-zero length is 1. I'm not sure which is better though. If we switch any() to all(), things get murky. Mathematically, all(x<0) is TRUE if x is length 0 (as are all(x==0), and all(x>0)), but the likelihood of this being a thought-bug on the author's part is exceedingly high, imho.

I suspect there may be more R users than you think that understand and use vacuously true in code. I don't really like the idea of turning a perfectly good and properly documented mathematical test into an error in order to protect against a possible "thought-bug". Paul

So the desirable behavior seems to depend on the angle we look at it from. My personal opinion is that x < y with length(x)==0 should fail if length(y) > 1, at least, and I'd be for it being an error even if y is length 1, though I do acknowledge this is more likely (though still quite unlikely imho) to be the intended behavior. ~G

I.e., I think the type check should be done before the length check. Bill Dunlap TIBCO Software wdunlap tibco.com

On Thu, Sep 8, 2016 at 8:43 AM, Gabriel Becker wrote:

Martin, Like Robin and Oliver I think this type of edge-case consistency is important and that it's fantastic that R-core - and you personally - are willing to tackle some of these "gotcha" behaviors. "Little" stuff like this really does combine to go a long way to making R better and better. I do wonder a bit about the x = 1:2; y = NULL; x < y case.
Returning a logical of length 0 is more backwards compatible, but is it ever what the author actually intended? I have trouble thinking of a case where that less-than didn't carry an implicit assumption that y was non-NULL. I can say that in my own code, I've never hit that behavior in a case that wasn't an error. My vote (unless someone else points out a compelling use for the behavior) is for the comparison to throw an error. As a developer, I'd rather things like this break so the bug in my logic is visible, rather than propagating as the 0-length logical is &'ed or |'ed with other logical vectors, or used to subset, or (in the case it should be length 1) passed to if() (if throws an error now, but the rest would silently "work"). Best, ~G On Thu, Sep 8, 2016 at 3:49 AM, Martin Maechler < maech...@stat.math.ethz.ch> wrote: robin hankin on Thu, 8 Sep 2016 10:05:21 +1200 writes: > Martin I'd like to make a comment; I think that R's > behaviour on 'edge' cases like this is an important thing > and it's great that you are working on it. > I make heavy use of zero-extent arrays, chiefly because > the dimnames are an efficient and logical way to keep > track of certain types of information. > If I have, for example, > a <- array(0,c(2,0,2)) > dimnames(a) <- list(name=c('Mike','Kevin'), NULL, item=c("hat","scarf")) > Then in R-3.3.1, 70800, I get > a > 0 > logical(0) > But in 71219 I get > a > 0 > , , item = hat > name > Mike > Kevin > , , item = scarf > name > Mike > Kevin > (which is an empty logical array that holds the names of the people and > their clothes). I find the behaviour of 71219 very much preferable because > there is no reason to discard the information in the dimnames. Thanks a lot, Robin, (and Oliver)! Yes, the above is such a case where the new behavior makes much sense. And this behavior remains identical after the 71222 amendment.
Martin > Best wishes > Robin > On Wed, Sep 7, 2016 at 9:49 PM, Martin Maechler < maech...@stat.math.ethz.ch> > wrote: >> > Martin Maechler >> > on Tue, 6 Sep 2016 22:26:31 +0200 writes: >> >> > Yesterday, changes to R's development version were committed, >> relating >> > to arithmetic, logic ('&' and '|') and >> > comparison/relational ('<', '==') binary operators >> > which in NEWS are described as >> >> > SIGNIFICANT USER-VISIBLE CHANGES: >> >> > [.] >> >> > • Arithmetic, logic (‘&’, ‘|’) and comparison (aka >> > ‘relational’, e.g., ‘<’, ‘==’) operations with arrays now >> > behave consistently, notably for arrays of length zero. >> >> > Arithmetic between length-1 arrays and longer non-arrays had >> > silently dropped the array attributes and recycled. This >> > now gives a warning and will signal an
Re: [Rd] A bug in the R Mersenne Twister (RNG) code?
On 08/30/2016 06:29 PM, Duncan Murdoch wrote: I don't see evidence of a bug. There have been several versions of the MT; we may be using a different version than you are. Ours is the 1999/10/28 version; the web page you cite uses one from 2002. Perhaps the newer version fixes some problems, and then it would be worth considering a change. But changing the default RNG definitely introduces problems in reproducibility, Well "problems in reproducibility" is a bit vague. Results would always be reproducible by specifying kind="Mersenne-Twister" or kind="Buggy Kinderman-Ramage" for older results, so there is no problem reproducing results. The only problem is that users expecting to reproduce results twenty years later will need to know what random generator they used. (BTW, they may also need to record information about the normal or other generator, as well as the seed.) Of course, these changes are recorded pretty well for R, so the history of "default" can always be found. I think it is a mistake to encourage users into thinking they do not need to keep track of some information if they want reproducibility. Perhaps the default should be changed more often in order to encourage better user habits. More seriously, I think "default" should continue to be something that is currently considered to be good. So, if there really is a known problem, then I think "default" should be changed. (And, no, I did not get burned by the R 1.7.0 change in the default generator. I got burned by a much earlier, unadvertised, and more subtle change in the Splus generator.) Paul Gilbert so it's not obvious that we would do it. Duncan Murdoch On 30/08/2016 5:45 PM, Mark Roberts wrote: Whomever, I recently sent the "bug report" below to r-c...@r-project.org and have just been asked to instead submit it to you.
Although I am basically not an R user, I have installed version 3.3.1 and am also the author of a statistics program written in Visual Basic that contains a component which correctly implements the Mersenne Twister (MT) algorithm. I believe that it is not possible to generate the correct stream of pseudorandom numbers using the MT default random number generator in R, and am not the first person to notice this. Here is a posted 2013 entry (www.r-bloggers.com/reproducibility-and-randomness/) on an R website that asserts that the SAS computer program implementation of the MT algorithm produces different numbers than R does when using the same starting seed number. The author of this post didn’t get anyone to respond to his query about the reason for this SAS vs. R discrepancy. There are two ways of initializing the original MT computer program (written in C) so that an identical stream of numbers can be repeatedly generated: 1) with a particular integer seed number, and 2) with a particular array of integers. In the 'compilation and usage' section of this webpage (https://github.com/cslarsen/mersenne-twister) there is a listing of the first 200 random numbers the MT algorithm should produce for seed number = 1. The inventors of the Mersenne Twister random number generator provided two different sets of the first 1000 numbers produced by a correctly coded 32-bit implementation of the MT algorithm when initializing it with a particular array of integers at: www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/CODES/mt19937ar.out. [There is a link to this output at: www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html.] My statistics program obtains exactly those 200 numbers from the first site mentioned in the previous paragraph and also obtains those same numbers from the second website (though I didn't check all 2000 values). Assuming that the MT code within R uses the 32-bit MT algorithm, I suspect that the current version of R can't do that. 
If you (i.e., anyone who might knowledgeably respond to this report) are able to duplicate those reference test-values, then please send me the R code to initialize the MT code within R to successfully do that, and I apologize for having wasted your time. If you (collectively) can't do that, then R is very likely using incorrectly implemented MT code. And if this latter possibility is true, it seems to me that this is something that should be fixed. Mark Roberts, Ph.D. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
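On the R side, a sketch of the kind of comparison involved. Note, as a hedged observation rather than a definitive diagnosis, that R seeds the Mersenne Twister by scrambling the integer given to set.seed() rather than by calling the reference implementation's init_genrand(), so R's stream is not expected to match the reference C output for the same seed even when the core algorithm is correct:

```r
RNGkind("Mersenne-Twister")   # the default uniform generator in R
set.seed(1)                   # R scrambles this before filling the MT state
runif(3)                      # need not match the reference stream for seed 1
str(.Random.seed)             # kind code, position, then the 624 MT state words
```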
Re: [R-pkg-devel] Handling Not-Always-Needed Dependencies? - Part 2
On 08/04/2016 11:51 AM, Dirk Eddelbuettel wrote: On 4 August 2016 at 11:46, Paul Gilbert wrote: | If my package has a test that needs another package, but that package is | not needed in the R code of my package, then I indicate it as | "Suggests", not as "Depends" nor as "Imports". If that package is not | available when I run R CMD check, should the test pass? Wrong question. Better question: Should the test be running? My preference is for only inside of a requireNamespace() (or equivalent) block as the package is not guaranteed to be present. In theory. At the level of R CMD check throwing an error or not, I think this is arguing that it should be possible to pass the tests (not throw an error) even though they are not run, isn't it? (So your answer to my question is yes, at least the way I was thinking of the question.) Or do you mean you would just like the tests to fail with a more appropriate error message? Or do you mean, as Duncan suggests, that the person writing the test should be allowed to code in something to decide if the test is really important or not? In practice people seem to unconditionally install it anyway, and think that is a good idea. I disagree on both counts but remain in the vocal minority. Actually, I think you are in agreement with Uwe and Duncan on this point, Duncan having added the refinement that the test writer gets to decide. No one so far seems to be advocating for my position that the tests should necessarily fail if they cannot be run. So I guess I am the one in the minority. Paul Dirk __ R-package-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-package-devel
[R-pkg-devel] Handling Not-Always-Needed Dependencies? - Part 2
(One question from the thread Handling Not-Always-Needed Dependencies?) I hope not to start another long tangled thread, but I have a basic confusion which I think has a yes/no answer and I would like to know if there is agreement on this point (or is it only me that is confused as usual). If my package has a test that needs another package, but that package is not needed in the R code of my package, then I indicate it as "Suggests", not as "Depends" nor as "Imports". If that package is not available when I run R CMD check, should the test pass? Yes or no? (I realize my own answer might be different if the package was used in an example or demo in place of a test, but that is just the confusion caused by too many uses for Suggests. In the case of a test, my own thought is that the test must fail, so my own answer is no. If the test does not fail then there is no real testing being done, thus missing code coverage in the testing. If the answer is no, then the tests do not need to be run if the package is not available, because it is known that they must fail. I think that not bothering to run the tests because the result is known is even more efficient than other suggestions. I also think it is the status quo.) Hoping my confusion is cleared up, and this does not become another long tangled thread, Paul
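For contrast with Paul's position, the guarded-test pattern that lets R CMD check pass without the suggested package being installed looks roughly like this (the package and function names here are hypothetical placeholders):

```r
# tests/needsSuggested.R -- hypothetical test file in a package's tests/ dir
if (requireNamespace("suggestedPkg", quietly = TRUE)) {
  res <- suggestedPkg::someFunction(1:10)  # hypothetical function
  stopifnot(length(res) == 10L)            # the real assertion
} else {
  cat("suggestedPkg not available; test skipped\n")  # check still passes
}
```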
Re: [Rd] Suggested dependencies in context of R CMD check
On 04/04/2016 01:56 PM, Duncan Murdoch wrote: On 04/04/2016 1:35 PM, Dirk Eddelbuettel wrote: On 4 April 2016 at 07:25, Hadley Wickham wrote: | On Sat, Apr 2, 2016 at 5:33 AM, Jan Górecki <j.gore...@wit.edu.pl> wrote: | | In principle, I believe a package should pass R CMD check if no | suggested packages are installed. However, since this is not currently The relevant manual says The 'Suggests' field uses the same syntax as 'Depends' and lists packages that are not necessarily needed. This includes packages used only in examples, tests or vignettes (*note Writing package vignettes::), and packages loaded in the body of functions. E.g., suppose an example(1) from package *foo* uses a dataset from package *bar*. Then it is not necessary to have *bar* installed in order to use *foo* unless one wants to execute all the examples/tests/vignettes: it is useful to have *bar*, but not necessary. Version requirements can be specified, and will be used by 'R CMD check'. and later * All packages that are needed(2) to successfully run 'R CMD check' on the package must be listed in one of 'Depends' or 'Suggests' or 'Imports'. Packages used to run examples or tests conditionally (e.g. _via_ 'if(require(PKGNAME))') should be listed in 'Suggests' or 'Enhances'. (This allows checkers to ensure that all the packages needed for a complete check are installed.) | automatically checked, many packages will fail to cleanly pass R CMD | check if suggested packages are missing. I consider that to be a bug in those 'many packages'. It essentially takes away the usefulness of having a Suggests: to provide a more fine-grained dependency graph. So I am with Jan here. I think I agree with Jan, but not for the reason you state. Suggests is useful even if "R CMD check" treats it as Depends, because most users never need to run "R CMD check". It allows them to use a subset of the functionality of a package without installing tons of dependencies.
I agree that packages that fail on examples when Suggested packages are missing are broken. (Using if (require()) to skip particular examples isn't failing.) It would be useful to be able to detect failure; I don't think that's easy now with "R CMD check". That's why you should be able to run it with Suggested packages missing. Perhaps I'm confused, it would not be the first time, but I have the impression that some/all? of you are arguing for a different philosophy around R CMD check and Suggests/Depends. But the current design is not broken, it is working the way it has been advertised for many years now. It provides a fine-grained dependency graph for end users, not developers and testers. Being able to suggest packages for use in testing, when they are not needed for regular use is a good thing. A package failing R CMD check when the suggested packages are not available is not a bug, it is a feature following the rules as they have been designed. If you want to check a package then you need to install things that are needed to check it. If R CMD check skipped testing because suggested packages were not available then you will have many packages not being tested properly, that is, lots of broken packages passing R CMD check. (I've done this to myself sometimes using if(require()).) There are situations where some testing needs to be skipped, for example, license requirements and special databases, but this needs to be done carefully, and my impression is that if(require()) provides most of what is necessary, sometimes along with environment variables. Perhaps this is not elegant, but it does work and is not difficult. The ideal situation would be to be able to run all possible combinations of missing Suggested packages, but that's probably far too slow to be a default. But how do you decide pass/fail when you do this? I think it will only pass when all the suggested packages are available? 
Paul Gilbert BTW, I'm not completely sure it needs to be possible to run vignettes without the Suggested packages they need. Vignettes are allowed to depend on things that aren't available to all users, and adding all the require() tests could make them less clear. Duncan Murdoch
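The if(require()) idiom the manual and Duncan refer to, as it would appear in an example or vignette chunk (barPkg is a hypothetical Suggests entry, and exampleData a hypothetical object it exports):

```r
if (require("barPkg")) {        # TRUE only when barPkg can be attached
  summary(barPkg::exampleData)  # hypothetical exported dataset
}                               # silently skipped when barPkg is absent
```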
Re: [Rd] Best way to implement optional functions?
On 10/22/2015 03:55 PM, Duncan Murdoch wrote: I'm planning on adding some new WebGL functionality to the rgl package, but it will pull in a very large number of dependencies. Since many people won't need it, I'd like to make the new parts optional. The general idea I'm thinking of is to put the new stuff into a separate package, and have rgl "Suggest" it. But I'm not sure whether these functions should only be available in the new package (so users would have to attach it to use them), or whether they should be in rgl, but fail if the new package is not available for loading. Can people suggest other packages that solve this kind of problem in a good way? I do something similar in several packages. I would distinguish between the situation where the new functions have some functionality without all the extra dependencies, and the case where they really do not. In the former case it makes sense to put the functions in rgl and then fail when the extra functionality is demanded and not available. In the latter case, it "feels like" you are trying to defeat Depends: or Imports:. That route has usually gotten me in trouble. Another thing you might want to consider is that, at least for a while, the new functions in rglPlus will probably be less stable than those in rgl. Being able to change those and update rglPlus without needing to update rgl can be a real advantage (i.e. if the API for the new functions is in rgl, and you need to change it, then you are required to notify all the package maintainers that depend on rgl, do reverse testing, and you have to explain that your update of rgl is going to break rglPlus and you have a new version of that but you cannot submit it yet because it will not work until the new rgl is in place.) Paul Duncan Murdoch
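The "fail when the extra functionality is demanded" case Paul distinguishes might be sketched like this (rglPlus and the function name are hypothetical, taken from the discussion, not from the real rgl API):

```r
# In rgl: a stub that defers to the suggested package
writeExtraWebGL <- function(...) {
  if (!requireNamespace("rglPlus", quietly = TRUE))
    stop("writeExtraWebGL() needs the 'rglPlus' package; ",
         "please install it first.")
  rglPlus::writeExtraWebGL(...)  # hypothetical implementation
}
```

The stub keeps rglPlus out of Imports while still giving users a clear error rather than a missing-symbol failure.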
Re: [R-pkg-devel] download.file and https
On 07/02/2015 10:52 PM, Henrik Bengtsson wrote: From R 3.2.0, check: capabilities("libcurl") libcurl TRUE TRUE means R was built such that HTTPS is supported. If you see FALSE, make sure libcurl is available when/if you build R from source. I do have TRUE for this. The default behaviour still does not work. Paul /Henrik On Thu, Jul 2, 2015 at 7:46 PM, Paul Gilbert pgilbert...@gmail.com wrote: (This problem with download.file() affects quantmod, and possibly several other packages, e.g. getSymbols('M2',src='FRED') fails.) I think the St Louis Fed has moved to using https for connections, and I believe all the US government web sites are doing this. An http request is automatically switched to https. The default download.file method does not seem to handle this, but method="wget" does: tmp <- tempfile() download.file("http://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv", destfile = tmp) trying URL 'http://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv' Error in download.file("http://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv", : cannot open URL 'http://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv' download.file("http://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv", destfile = tmp, method="wget") --2015-07-02 22:29:49-- http://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv Resolving research.stlouisfed.org (research.stlouisfed.org)... 65.89.18.120 Connecting to research.stlouisfed.org (research.stlouisfed.org)|65.89.18.120|:80... connected. HTTP request sent, awaiting response... 301 Moved Permanently Location: https://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv [following] --2015-07-02 22:29:49-- https://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv Connecting to research.stlouisfed.org (research.stlouisfed.org)|65.89.18.120|:443... connected. HTTP request sent, awaiting response...
200 OK Length: unspecified [text/x-comma-separated-values] Saving to: ‘/tmp/RtmpOX7kA1/file1ba639d7fd0f’ [ = ] 34,519 178KB/s in 0.2s 2015-07-02 22:29:50 (178 KB/s) - ‘/tmp/RtmpOX7kA1/file1ba639d7fd0f’ saved [34519] Paul
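Since capabilities("libcurl") returns TRUE here, another option besides method="wget" is to request libcurl explicitly, which can speak HTTPS and follow the 301 redirect (the URL is the one from the thread; it may have moved since):

```r
tmp <- tempfile()
# libcurl is available from R 3.2.0 when R is built against it
download.file("https://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv",
              destfile = tmp, method = "libcurl")
```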
Re: [Rd] Defining a `show` function breaks the print-ing of S4 object -- bug or expected?
On 06/30/2015 11:33 AM, Duncan Murdoch wrote: On 30/06/2015 5:27 PM, Lorenz, David wrote: There is something I'm really missing here. The function show is a standardGeneric function, so the correct way to write it is as a method, like this: That describes methods::show. The problem is that the default print mechanism isn't calling methods::show() (or base::print() as Luke says), it's calling show() or print() in the global environment, so the user's function overrides the generic, and you get the error. These are two different problems aren't they? I can see that you might want to ensure that base::print() calls methods::show(), but forcing the default print to go to base::print(), rather than whatever print() is first on the search path, would seem like a real change of philosophy. What about all the other base functions that can be overridden by something in the global environment? Paul Luke, are you going to look at this, or should I? Duncan Murdoch setMethod("show", "Person", function(object) { ... }) for an object of class "Person", for example. Dave On Tue, Jun 30, 2015 at 10:11 AM, luke-tier...@uiowa.edu wrote: Same thing happens with S3 if you redefine print(). I thought that code was actually calculating the function to call rather than the symbol to use, but apparently not. Shouldn't be too hard to fix. luke On Tue, 30 Jun 2015, Hadley Wickham wrote: On Tue, Jun 30, 2015 at 2:20 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 30/06/2015 1:57 PM, Hadley Wickham wrote: A slightly simpler formulation of the problem is: show <- function(...) stop("My show!") methods::setClass("Person", slots = list(name = "character")) methods::new("Person", name = "Tom") # Error in (function (...) : My show! Just to be clear: the complaint is that the auto-called show() is not methods::show? I.e. after x <- methods::new("Person", name = "Tom") you would expect show(x) to give the error, but not x ??
Correct - I'd expect print() to always call methods::show(), not whatever show() is first on the search path. Hadley -- Luke Tierney, Ralph E. Wareham Professor of Mathematical Sciences, Department of Statistics and Actuarial Science, University of Iowa, 241 Schaeffer Hall, Iowa City, IA 52242; Phone: 319-335-3386; Fax: 319-335-3017; email: luke-tier...@uiowa.edu; WWW: http://www.stat.uiowa.edu
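For completeness, the pattern David describes, written out in full: define the class and then a show *method* for it, rather than a free function named show in the global environment:

```r
methods::setClass("Person", slots = list(name = "character"))
methods::setMethod("show", "Person", function(object) {
  cat("Person:", object@name, "\n")  # custom display for the class
})
methods::new("Person", name = "Tom") # auto-printing dispatches to the method
```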
Re: [R-pkg-devel] appropriate directory for data downloads in examples, demos and vignettes
Regarding alternative places for scripts, you can add a directory (eg inst/testLocalScripts) and then with a recently added R CMD feature you can do R CMD check --test-dir=inst/testLocalScripts your-package.tar.gz This will not (automatically) be checked on CRAN. Beware that you also need to run R CMD check without this option to run your regular tests. Paul On 06/29/2015 11:25 AM, Jonathan Callahan wrote: Hi, The MazamaSpatialUtils http://cran.r-project.org/package=MazamaSpatialUtils package has a required package state variable which users set to specify where they want to store large amounts of GIS data that is being downloaded and converted by the package. The implementation of this follows Hadley's advice here: http://adv-r.had.co.nz/Environments.html#explicit-envs The functionality is implemented with a package environment and getter and setter functions: spatialEnv <- new.env(parent = emptyenv()) spatialEnv$dataDir <- NULL getSpatialDataDir <- function() { if (is.null(spatialEnv$dataDir)) { stop('No data directory found. Please set a data directory with setSpatialDataDir("YOUR_DATA_DIR").', call.=FALSE) } else { return(spatialEnv$dataDir) } } setSpatialDataDir <- function(dataDir) { old <- spatialEnv$dataDir dataDir <- path.expand(dataDir) tryCatch({ if (!file.exists(dataDir)) dir.create(dataDir) spatialEnv$dataDir <- dataDir }, warning = function(warn) { warning("Invalid path name.") }, error = function(err) { stop(paste0("Error in setSpatialDataDir(", dataDir, ").")) }) return(invisible(old)) } My question is: *What is an appropriate directory to specify for vignettes (or demos or examples) that need to go through CRAN testing?* The R code in vignettes needs to specify a directory that is writable during the package build process but that will also be available to users. Should we create a /tmp/hash directory? Would that be available on all systems?
Alternatively, *What is an alternative to vignettes and demos for tutorial scripts that should not be tested upon submission to CRAN?* Thanks for any suggestions. Jon
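One answer to Jon's first question that is portable and safe under CRAN checking is to default the data directory to a session-specific location (setSpatialDataDir() here is the setter from the code quoted above):

```r
# In a vignette or example: write only under the session's temp directory
dataDir <- file.path(tempdir(), "MazamaSpatialData")
dir.create(dataDir, showWarnings = FALSE)
setSpatialDataDir(dataDir)  # cleaned up automatically when the R session ends
```

tempdir() is guaranteed writable on every platform R runs on, which a hard-coded /tmp path is not.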
Re: [Rd] Print output during long tests?
If your tests can be divided into multiple files in the tests/ directory then you will get lines like * checking tests ... Running ‘test1.R’ Running ‘test2.R’ Running ‘test3.R’ ... Paul On 05/04/2015 11:52 AM, Toby Hocking wrote: I am the author of R package animint which uses testthat for unit tests. This means that there is a single test file (animint/tests/testthat.R) and during R CMD check we will see the following output * checking tests ... Running ‘testthat.R’ I run these tests on Travis, which has a policy that if no output is received after 10 minutes, it will kill the check. Because animint's testthat tests take a total of over 10 minutes, Travis kills the R CMD check job before it has finished all the tests. This is a problem since we would like to run animint tests on Travis. One solution to this problem would be if R CMD check could output more lines other than just Running testthat.R. Can I give some command line switch to R CMD check or set some environment variable, so that some more verbose test output could be shown on R CMD check?
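Paul's suggestion of splitting the single driver into several files, so R CMD check emits one "Running" line per file, could be sketched as below; the file split and filter patterns are hypothetical, assuming testthat's test_check(filter=) argument matches test-file names:

```r
# tests/test-part1.R -- hypothetical split driver
library(testthat)
library(animint)
test_check("animint", filter = "^render")

# tests/test-part2.R -- second driver, run (and reported) separately by check
library(testthat)
library(animint)
test_check("animint", filter = "^compile")
```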
Re: [Rd] R CMD check and missing imports from base packages
On 04/29/2015 05:38 PM, William Dunlap wrote: And in general a developer would avoid masking a function in a base package, so as not to require the user to distinguish between stats::density() and igraph::density(). Maybe the example is not meant literally. The 'filter' function in the popular 'dplyr' package masks the one that has been in the stats package forever, and they have nothing in common, so that may give you an example. As I recall, several packages mask the simulate generic in stats, if you are looking for examples. Paul Gilbert Bill Dunlap TIBCO Software wdunlap tibco.com On Wed, Apr 29, 2015 at 1:24 PM, Martin Morgan mtmor...@fredhutch.org wrote: On 04/28/2015 01:04 PM, Gábor Csárdi wrote: When a symbol in a package is resolved, R looks into the package's environment, and then into the package's imports environment. Then, if the symbol is still not resolved, it looks into the base package. So far so good. If still not found, it follows the 'search()' path, starting with the global environment and then all attached packages, finishing with base and recommended packages. This can be a problem if a package uses a function from a base package, but it does not formally import it via the NAMESPACE file. If another package on the search path also defines a function with the same name, then this second function will be called. E.g. if package 'ggplot2' uses 'stats::density()', and package 'igraph' also defines 'density()', and 'igraph' is on the search path, then 'ggplot2' will call 'igraph::density()' instead of 'stats::density()'. stats::density() is an S3 generic, so igraph would define an S3 method, right? And in general a developer would avoid masking a function in a base package, so as not to require the user to distinguish between stats::density() and igraph::density(). Maybe the example is not meant literally. 
Being able to easily flag non-imported, non-base symbols would definitely improve the robustness of package code, even if not helping the end user disambiguate duplicate symbols. Martin Morgan I think that for a better solution, either 1) the search path should not be used at all to resolve symbols in packages, or 2) only base packages should be searched. I realize that this is something that is not easy to change, especially 1) would break a lot of packages. But maybe at least 'R CMD check' could report these cases. Currently it reports missing imports for non-base packages only. Is it reasonable to have a NOTE for missing imports from base packages as well? [As usual, please fix me if I am missing or misunderstood something.] Thank you, Best, Gabor -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
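The explicit-import fix Gábor and Martin are circling around is a NAMESPACE directive: importing the base-package symbol pins its resolution at load time, regardless of what is attached on the search path. The specific imports below are illustrative:

```r
# NAMESPACE
importFrom(stats, density)  # a package using density() gets stats::density,
importFrom(stats, filter)   # even when dplyr or igraph mask these names
```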
Re: [Rd] Which function can change RNG state?
On 02/08/2015 09:33 AM, Dirk Eddelbuettel wrote: On 7 February 2015 at 19:52, otoomet wrote: | random numbers. For instance, can I be sure that | set.seed(0); print(runif(1)); print(rnorm(1)) | will always print the same numbers, also in the future version of R? There Yes, pretty much. This is nearly correct. The user could change the uniform or normal generator, since there are options other than the defaults, which would mean the result would be different. And obviously if they changed print precision then the printed result may be truncated differently. I think you could prepare for future versions of R by saving information about the generators you are using. The precedent has already been set (R-1.7.0) that the default could change if there is a good reason. A good reason might be that the RNG is found not to be so good relative to others that become available. But I think the old generator would continue to be available, so people can reproduce old results. (Package setRNG has some utilities to help save and reset, but there is nothing especially difficult or fancy, just a few details that need to be remembered.) I've been lurking here over fifteen years, and while I am getting old and forgetful I can remember exactly one such change where behaviour was changed, and (one of the) generators was altered---if memory serves, in the early R 1.* days. [Goes digging...] Yes, see `help(RNGkind)` which details that R 1.7.0 made a change when "Buggy Kinderman-Ramage" was added as the old value, and Kinderman-Ramage was repaired. There once was a similar fix in the very early days of the Mersenne-Twister, which is why the GNU GSL has two variants with suffixes _1999 and _1998. I seem to recall a bit of change around R-0.49 but old and forgetful would cover this too. For me, a bigger change was an unadvertised change in Splus - they compiled against a different math library at some point.
This changed the lower bits in results, mostly insignificant but accumulated simulation results could amount to something fairly important. The amount of time I spent trying to find why results would not reproduce was one of my main motivations for starting to use R. So your issue seems like pilot error to me: don't attach the parallel package if you do not plan to work in parallel. But do if you do, and see its fine vignette on how it provides you reproducibility for multiple RNG streams. In general, you can very much trust R (and R Core) in these matters. Dirk On 02/08/2015 09:40 AM, Gábor Csárdi wrote: On Sat, Feb 7, 2015 at I don't know if there is intention to keep this reproducible across R versions, but it is already not reproducible across platforms (with the same R version): http://stackoverflow.com/questions/21212326/floating-point-arithmetic-and-reproducibility The situation is better in some respects, and worse in others, than what is described on stackoverflow. I think the point is made pretty well there that you should not be trying to reproduce results beyond machine precision. My experience is that you can compare within a fuzz of 1e-14 usually, even across platforms. (The package setRNG on CRAN has a function random.number.test() which is run in the package's tests/ and makes uniform and normal comparisons to 1e-14. It has passed checks on all R platforms since 2004. Actually, the checks have been done since about 1995 but they were part of package dse earlier.) If you accumulate lots of lower order parts (eg sum(simulated - true) in a long Monte Carlo) then the fuzz may need to get much larger, especially comparing across platforms. And you will have trouble with numerically unstable calculations. Once upon a time I was annoyed by this, but then I realized that it was better not to do unstable calculations.
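The bookkeeping Paul recommends (recording the generators along with the seed) takes only a few lines of base R, roughly what the setRNG package wraps up:

```r
invisible(runif(1))          # make sure a generator state exists
saved.kind <- RNGkind()      # uniform, normal (and, in newer R, sample) kinds
saved.seed <- .Random.seed   # the complete generator state vector

## ... run the simulation ...

do.call(RNGkind, as.list(saved.kind))                    # restore the kinds
assign(".Random.seed", saved.seed, envir = globalenv())  # restore the state
```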
In addition to not being reproducible beyond machine precision across R versions and across platforms, you really cannot be guaranteed reproducibility even on the same platform and same version of R. You may get different results if you upgrade the OS and there has been a change in the math libraries. In my experience this happens rather often. I don't think there is any specific 32 vs 64 bit issue, but math libraries sometimes do things a bit differently on different processors (e.g. processor bug fixes), so you can occasionally get differences with everything the same except the hardware.

On 02/07/2015 10:52 PM, otoomet wrote: It turned out that this is because package parallel, buried deep in my dependencies, calls runif() during its initialization and in this way changes the random number sequence.

Guessing a bit about what you are saying: 1/ you set the random seed; 2/ you did some things which included loading package parallel; 3/ you ran some things for which you expected to get results comparable to some previous run when you did 1/ and 2/ in the reverse order. If I understand this correctly, I suggest you always do everything
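The ordering point in 1/-3/ above can be made concrete: do any package loading that might touch the RNG first, and set the seed immediately before the randomness you care about. A minimal sketch:

```r
## Load anything that might consume random numbers first ...
loadNamespace("parallel")

## ... and only then set the seed, right before the work you want
## to be reproducible.
set.seed(123)
x <- rnorm(3)

## Re-seeding reproduces the stream regardless of what was loaded earlier.
set.seed(123)
stopifnot(identical(x, rnorm(3)))
```
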
Re: [Rd] unloadNamespace
Thanks Winston. That seems like a workaround that might be usefully included into unloadNamespace. Paul

On 15-01-09 12:09 PM, Winston Chang wrote: It's probably because the first thing that unloadNamespace does is this:

ns <- asNamespace(ns, base.OK = FALSE)

If you call asNamespace("tseries"), it calls getNamespace("tseries"), which has the side effect of loading that package (and its dependencies). One way to work around this is to check loadedNamespaces() before you try to unload a package. -Winston

On Thu, Jan 8, 2015 at 9:45 AM, Paul Gilbert pgilbert...@gmail.com wrote: In the documentation the closest thing I see to an explanation of this is that ?detach says "Unloading some namespaces has undesirable side effects". Can anyone explain why unloading tseries will load zoo? I don't think this behavior is specific to tseries; it's just an example. I realize one would not usually unload something that is not loaded, but I would expect it to do nothing or give an error. I only discovered this when trying to clean up to debug another problem.

R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet" and R Under development (unstable) (2015-01-02 r67308) -- "Unsuffered Consequences" ... Type 'q()' to quit R.

> loadedNamespaces()
 [1] "base"      "datasets"  "graphics"  "grDevices" "methods"   "stats"
 [7] "utils"
> unloadNamespace("tseries")   # loads zoo ?
> loadedNamespaces()
 [1] "base"      "datasets"  "graphics"  "grDevices" "grid"      "lattice"
 [7] "methods"   "quadprog"  "stats"     "utils"     "zoo"

Somewhat related, is there an easy way to get back to a clean state for loaded and attached things, as if R had just been started? I'm trying to do this in a vignette so it is not easy to stop and restart R. Paul

R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
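Winston's loadedNamespaces() workaround, wrapped as a small helper (the helper name is my own, not part of base R):

```r
## Only unload namespaces that are actually loaded, so unloadNamespace()
## never loads a package (and its dependencies) as a side effect.
unloadIfLoaded <- function(pkg) {
  if (pkg %in% loadedNamespaces()) unloadNamespace(pkg) else invisible(NULL)
}

unloadIfLoaded("tseries")   # a no-op if tseries was never loaded
stopifnot(!("tseries" %in% loadedNamespaces()))
```
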
[Rd] unloadNamespace
In the documentation the closest thing I see to an explanation of this is that ?detach says "Unloading some namespaces has undesirable side effects". Can anyone explain why unloading tseries will load zoo? I don't think this behavior is specific to tseries; it's just an example. I realize one would not usually unload something that is not loaded, but I would expect it to do nothing or give an error. I only discovered this when trying to clean up to debug another problem.

R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet" and R Under development (unstable) (2015-01-02 r67308) -- "Unsuffered Consequences" ... Type 'q()' to quit R.

> loadedNamespaces()
 [1] "base"      "datasets"  "graphics"  "grDevices" "methods"   "stats"
 [7] "utils"
> unloadNamespace("tseries")   # loads zoo ?
> loadedNamespaces()
 [1] "base"      "datasets"  "graphics"  "grDevices" "grid"      "lattice"
 [7] "methods"   "quadprog"  "stats"     "utils"     "zoo"

Somewhat related, is there an easy way to get back to a clean state for loaded and attached things, as if R had just been started? I'm trying to do this in a vignette so it is not easy to stop and restart R. Paul
Re: [Rd] testing dontrun examples
On 14-11-26 05:49 PM, Duncan Murdoch wrote: On 26/11/2014, 1:45 PM, Paul Gilbert wrote: Is there a good strategy for testing examples which should not be run by default? For instance, I have examples which get data from the Internet. If I wrap them in try() then they can be skipped if the Internet is not available, but may not be tested in cases when I would like to know about the failure. (Not to mention that the example syntax is ugly.) If I mark them \dontrun or \donttest then they are not tested. I could mark them \dontrun and then use example(), but for this, in addition to run.dontrun=TRUE, I would need to specify all topics for a package, and I don't see how to do this; a missing topic does not work. Wishlist: what I would really like is R CMD check --run-dontrun pkg

We have that in R-devel, so everyone will have it next April, but there will possibly be bugs unless people like you try it out now.

Are you anticipating my wishes now, or did you tell me this and it entered my subconscious? So far it works as advertised. Thanks, Paul

Duncan Murdoch
[Rd] testing dontrun examples
Is there a good strategy for testing examples which should not be run by default? For instance, I have examples which get data from the Internet. If I wrap them in try() then they can be skipped if the Internet is not available, but may not be tested in cases when I would like to know about the failure. (Not to mention that the example syntax is ugly.) If I mark them \dontrun or \donttest then they are not tested. I could mark them \dontrun and then use example(), but for this, in addition to run.dontrun=TRUE, I would need to specify all topics for a package, and I don't see how to do this; a missing topic does not work. Wishlist: what I would really like is R CMD check --run-dontrun pkg Suggestions? Thanks, Paul
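The "all topics" loop wished for above can be approximated by iterating example() over a package's exported names; run.dontrun is the example() argument paired with the check option. This is a rough sketch of my own, not a built-in facility: only exported names that have help topics will actually run anything (the rest just produce a warning), and "mypkg" is a placeholder.

```r
## Run example() for every exported name of an installed package,
## including \dontrun parts.
runAllExamples <- function(pkg) {
  library(pkg, character.only = TRUE)
  for (tp in ls(paste0("package:", pkg))) {
    utils::example(tp, package = pkg, character.only = TRUE,
                   run.dontrun = TRUE, ask = FALSE)
  }
}
# runAllExamples("mypkg")
```
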
Re: [Rd] testing dontrun examples
On 14-11-26 02:09 PM, Spencer Graves wrote: Hi, Paul: if(!fda::CRAN()) runs code except with R CMD check --as-cran. I use it so CRAN checks skip examples that (a) need the Internet or (b) take too long for CRAN. Spencer

fda::CRAN() gives TRUE on my home machine, I think because I use several variables like _R_CHECK_HAVE_MYSQL_=TRUE to control whether some tests get run. (Not all CRAN test servers have all resources.) But, more importantly, wouldn't this strategy prevent CRAN from automatically running more extensive testing of the examples if they decided to do that sometimes? Paul

Hope this helps. Spencer

On 11/26/2014 10:45 AM, Paul Gilbert wrote: Is there a good strategy for testing examples which should not be run by default? For instance, I have examples which get data from the Internet. If I wrap them in try() then they can be skipped if the Internet is not available, but may not be tested in cases when I would like to know about the failure. (Not to mention that the example syntax is ugly.) If I mark them \dontrun or \donttest then they are not tested. I could mark them \dontrun and then use example(), but for this, in addition to run.dontrun=TRUE, I would need to specify all topics for a package, and I don't see how to do this; a missing topic does not work. Wishlist: what I would really like is R CMD check --run-dontrun pkg Suggestions? Thanks, Paul
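The environment-variable convention mentioned above gates a resource-dependent test on a variable that a test machine opts into. Note that _R_CHECK_HAVE_MYSQL_ is the poster's own naming convention, not a standard R check variable:

```r
## Gate a resource-dependent test on an opt-in environment variable.
if (identical(Sys.getenv("_R_CHECK_HAVE_MYSQL_"), "TRUE")) {
  cat("running MySQL-dependent tests\n")
  # ... tests that need a MySQL server go here ...
} else {
  cat("skipping MySQL-dependent tests\n")
}
```
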
Re: [Rd] Changing style for the Sweave vignettes
You might also consider starting your vignettes with

\begin{Scode}{echo=FALSE,results=hide}
options(continue = "  ")
\end{Scode}

Then you get one prompt but it is still easy to cut and paste. This has been in many of my packages for many years, so I think it would be fair to assume it is acceptable. Paul

On 11/13/2014 06:56 AM, January Weiner wrote: Thank you, Søren and Brian, for your answers. Whether this is the right list -- well, I think it is, since I am developing a package and would like to create a vignette which is useful and convenient for my users. I know how to extract the vignette code. However, most of my users don't. Or if they do, they do not bother, but copy the examples from the PDF while they are reading it. At least that is my observation. I'm sorry that my e-mail was unclear -- I started my e-mail with "as a user, ...", but I did mention that it is my vignettes that I am concerned with. options(prompt=...) is an idea, though I'm still not sure as to the second part of my question - whether a vignette without a command prompt is acceptable in a package or not. Kind regards, j.

On 13 November 2014 12:36, Brian G. Peterson br...@braverock.com wrote: On 11/13/2014 05:09 AM, January Weiner wrote: As a user, I am always annoyed beyond measure that Sweave vignettes precede the code by a command line prompt. It makes running examples by simple copying of the commands from the vignette to the console a pain. I know the idea is that it is clear what is the command, and what is the output, but I'd rather precede the output with some kind of marking. Is there any other solution possible / allowed in vignettes? I would much prefer to make my vignettes easier to use for people like me.

I agree with Søren that this is not the right list, but to complete the thread... See the examples in ?vignette, starting just above "## Now let us have a closer look at the code". All vignettes are compiled.
You can trivially extract all the code used for any vignette in R, including any code not displayed in the text and hidden from the user, from within R, or save it out to an editor so you can source it line by line from RStudio (or vim or emacs or...). That's the whole point. Regards, Brian -- Brian G. Peterson http://braverock.com/brian/ Ph: 773-459-4973 IM: bgpbraverock
Re: [Rd] Problem with build and check
I certainly have longer argument lists with no problem. More likely the Rd file needs special consideration for %. Paul

On 11/12/2014 02:11 PM, Therneau, Terry M., Ph.D. wrote: I am getting failure of build and check for an Rd file that has a long argument list. Guess diagnosis: a quoted string beyond a certain point in the argument list is fatal. Example: use the function below, create an Rd file for it with prompt(), move the .Rd file to the man directory (no need to edit it), and try building.

dart.control <- function(server = c("production", "integration", "development", "http"),
                         out.poll.duration = 5, out.poll.increase = 1.1,
                         out.poll.max = 30, out.poll.timeout = 3600,
                         netrc.path, netrc.server = "ldap",
                         rtype = c("xml", "json"), dateformat = "%Y-%m-%d") {
    server <- match.arg(server)
    server
}

I created a package dummy with only this function, and get the following on my Linux box.

tmt-local2021% R CMD build dummy
* checking for file 'dummy/DESCRIPTION' ... OK
* preparing 'dummy':
* checking DESCRIPTION meta-information ... OK
Warning: newline within quoted string at dart.control.Rd:11
Warning: /tmp/RtmpjPjz9V/Rbuild398d6e382572/dummy/man/dart.control.Rd:46: unexpected section header '\value'
Warning: newline within quoted string at dart.control.Rd:11
Error in parse_Rd("/tmp/RtmpjPjz9V/Rbuild398d6e382572/dummy/man/dart.control.Rd", :
  Unexpected end of input (in quoted string opened at dart.control.Rd:88:16)
Execution halted

Session info for my version: sessionInfo() R Under development (unstable) (2014-10-30 r66907) Platform: i686-pc-linux-gnu (32-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base Terry T.
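The "special consideration for %" that Paul mentions: % begins a comment in Rd syntax, so the unescaped % signs that prompt() copies verbatim from the dateformat default swallow the rest of each line, leaving an unterminated quoted string. A sketch of the hand-edited \usage entry (abbreviated; the other arguments are unchanged):

```
% In Rd files an unescaped % starts a comment, so escape it as \% in
% the copied default value:
\usage{
dart.control(server = c("production", "integration", "development", "http"),
             ..., dateformat = "\%Y-\%m-\%d")
}
```
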
[Rd] extra package tests directory
I am trying to decide on a name for a directory where I will put some extra package tests. The main motivation for this is the need to limit the package test time on CRAN. That is, these are tests that could be in the tests/ directory and could be run on CRAN, but will take longer than CRAN likes. Scanning through names currently being used in packages on CRAN, I see a large number of inst/tests/ directories, but they seem to be instead of a tests/ directory at the top level of the package. (There are also some occurrences of inst/test and test/ at the top level.) I would prefer not to use these directories, as I don't like the possible confusion over whether these are the standard package tests or additional ones. The other name that is used a fair amount is inst/unitTests/ (plus inst/UnitTests/, inst/UnitTest/, and inst/unittests). In many cases these seem to be run by a script in the tests/ directory using a unit testing framework, so they cannot easily be distinguished from the normal package tests/ run by CRAN. I also see an occurrence each of inst/otherTests/, inst/testScripts/, and inst/test_cases. My own preference would be inst/extraTests, but no one is using that. Have I missed anything? Does anyone have suggestions or comments? Are there other reasons one might want tests that are not usually run by CRAN? Paul
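One way the inst/extraTests idea could be wired up is with a small driver run by hand (or from a Makefile), outside the CRAN-checked tests/ directory. This is a sketch of my own; "mypkg" and the helper name are placeholders.

```r
## Run every *.R script installed under inst/extraTests of a package.
## Each script follows the usual convention of calling stop() on failure.
runExtraTests <- function(pkg = "mypkg") {
  dir <- system.file("extraTests", package = pkg)
  if (dir == "") stop("no extraTests directory installed for ", pkg)
  for (f in list.files(dir, pattern = "\\.R$", full.names = TRUE)) {
    cat("running", f, "\n")
    source(f, local = new.env())   # isolate each script's workspace
  }
}
```
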
[Rd] requireNamespace() questions
I am trying to follow directions at http://cran.r-project.org/doc/manuals/r-patched/R-exts.html#Suggested-packages regarding handling suggested packages with requireNamespace() rather than require(), and I have some questions.

1/ When I do requireNamespace() in a function, is the loading of the namespace only effective within the function?

2/ At the link above the manual says "Note the use of rgl:: as that object would not necessarily be visible". When the required package is loading methods, will the method be found when I reference the generic, which is not in the package, or do I need to do something different?

3/ In some packages I have functions that return an object defined in the suggested package being required. For example, a function does require(zoo) and then returns a zoo object. So, to work with the returned object I am really expecting that zoo will be available in the session afterwards. Is it recommended that I just check if the package is available on the search path the user has set, rather than use require() or requireNamespace()? Regarding checking the path without actually attaching the package to the search path, is there something better than "package:zoo" %in% search(), or is that the best way?

4/ I have a function in a package that Depends on DBI and Suggests RMySQL, RPostgreSQL, RSQLite. The function uses dbDriver() in DBI, which uses do.call(). If I use requireNamespace() in place of require() I get

> requireNamespace("RMySQL")
Loading required namespace: RMySQL
> m <- dbDriver("MySQL")
Error in do.call(as.character(drvName), list(...)) :
  could not find function "MySQL"
> require("RMySQL")
Loading required package: RMySQL
> m <- dbDriver("MySQL")

Is there a different way to handle this without altering the search path? The function do.call() does not seem to work with an argument like do.call(RMySQL::MySQL, list()) even at the top level, and this situation may be more complicated when it is in a required package. What am I missing?
Thanks, Paul
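One possible way around the do.call() lookup problem in 4/, without attaching the package, is to fetch the function object from the namespace explicitly rather than passing its name as a string. This is a sketch only (dbDriver() itself would need the same treatment inside DBI, and the helper name is mine):

```r
## Fetch an exported function from a loaded-but-not-attached namespace,
## so do.call() receives a function object rather than a bare name.
getDriverFun <- function(pkg, fun) {
  if (!requireNamespace(pkg, quietly = TRUE))
    stop("namespace ", pkg, " not available")
  getExportedValue(pkg, fun)
}

# drv <- do.call(getDriverFun("RMySQL", "MySQL"), list())
```
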
Re: [Rd] How to test impact of candidate changes to package?
On 09/10/2014 06:12 AM, Kirill Müller wrote: If you don't intend to keep the old business logic in the long run, perhaps a version control system such as Git can help you. If you use it in single-user mode, you can think of it as a backup system where you manually create each snapshot and give it a name, but it actually can do much more. For your use case, you can open a new *branch* where you implement your changes, and implement your testing logic simultaneously in both branches (using *merge* operations). The system handles switching between branches, so you can really perform invasive changes, and revert if you find that a particular change breaks something. ... Yes, I would strongly recommend some version control system for this, probably either Git or svn (Subversion). If this is all code and test data that you can release publicly then you might choose some public repository like Github or R-forge. (You will get lots of opinions about the relative merits of different repositories if you ask, but the main point is that any one of them will be better than nothing.) If part of your code and data cannot be released then you might check if something is already supported in your place of business. Chances are that it is, but only programmers in IT have been told about it. On 09/10/2014 11:14 AM, Stephanie Locke wrote: ... Has anyone else had to do this sort of testing before on their packages? How did you do it? Am I missing an obvious package / framework that can do this? Most package maintainers would face some version of this problem, some simpler and some much more complicated. If you set up the tests as scripts in the package tests/ directory that issue stop() in the case of a problem, then R-forge pretty much does the checking for you on multiple platforms, at least when it is working properly. 
It is probably more trouble than it is worth for a single package, but if you have several packages with inter-dependencies then you might want to look at the develMake framework at http://automater.r-forge.r-project.org/ Regards, Paul Cheers, Steph -- Stephanie Locke BI Credit Risk Analyst
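The stop()-on-failure style of tests/ script described above looks like this. Everything here is a placeholder: newCalc stands in for the business logic, and the expected values stand in for output saved from the old code.

```r
## tests/regression.R sketch: R CMD check fails if this script stops.
newCalc <- function(x) cumsum(x) / seq_along(x)   # stand-in business logic

input <- 1:10
expected <- c(1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5)  # saved from old code

if (!isTRUE(all.equal(newCalc(input), expected, tolerance = 1e-8)))
  stop("regression: new results differ from the old business logic")
cat("ok\n")
```
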
Re: [Rd] Re R CMD check checking in development version of R
(Please correct me if I'm wrong. I thought I mostly understood this, finally, but I've made the mistake of thinking I understood something too many times before.)

On 08/28/2014 10:39 AM, Simon Urbanek wrote: On Aug 27, 2014, at 6:01 PM, Gavin Simpson ucfa...@gmail.com wrote: On 27 August 2014 15:24, Hadley Wickham h.wick...@gmail.com wrote: Is that the cause of these NOTEs? Is the expectation that if I am using a function from a package, even a package that I have in Depends:, that I have to explicitly declare these imports in NAMESPACE? Yes. (Otherwise your package won't work if it's only attached and not loaded, i.e. if someone does analogue::foo() only the imported functions are available, not the functions in packages you depend on.) Cheers Hadley. Thanks for the confirmation, but... ...I don't get this; what is the point of Depends? I thought it was "my package needs these other packages to work, i.e. be loaded". Hence it is user error (IMHO ;-) to do `analogue::foo()` without having the dependencies loaded too.

No. The point of Depends is that if your package is attached, it also attaches the other packages to make them available for the user. Essentially you're saying "if you want to use my package interactively, you will also want to use those other packages interactively".

I agree that "interactively" catches the flavour of what Depends does, but technically that is the wrong word. The important point is whether the functions in a Depended-upon package should be available to the user directly, without them needing to use library() or require() to make them available, in an interactive session or a batch job.

You still need to use import() to define what exactly is used by your package -

Amplifying a bit: by import() in the NAMESPACE, which you need whether you have Depends or Imports in the DESCRIPTION file, you ensure that the functions in your package use the ones in the package imported and do not get clobbered by anything the user might do.
The user might redefine functions available to the interactive session, or require() another package with functions having the same names, and those are the ones his interactive direct calls will find, but your package functions will not use those. People are sure to have differences of opinion about the trade-off between the annoyance of having to specifically attach packages being used and the clarity this provides. At first I was really annoyed, but have eventually decided I do like the clarity. In my experience it turns out to be surprisingly rare that you need packages in Depends, but there are legitimate cases beyond the annoyance case mentioned above. I think if you are putting packages in Depends you really do want to have a very good understanding of why you are doing that. If you use Depends then you are inviting support difficulties. Users will contact you about bugs in the package you attach, even though your package may not use the broken functions. If they attach the package themselves then they are much more likely to understand who they should contact. Core seems to have forgotten to take credit for trying to make life easier for package developers. Paul

as opposed to what you want to be available to the user in case it is attached. Cheers, Simon

This check (whilst having found some things I should have imported and didn't - which is a good thing!) seems to be circumventing the intention of having something in Depends. Is Depends going to go away? (And really you shouldn't have any packages in Depends; they should all be in Imports.) I disagree with *any*; having say vegan loaded when one is using analogue is a design decision, as the latter borrows heavily from and builds upon vegan. In general I have moved packages that didn't need to be in Depends into Imports; in the version I am currently doing final tweaks on before it goes to CRAN I have removed all but vegan from Depends. Or am I thinking about this in the wrong way?
Thanks again Gavin Hadley -- http://had.co.nz/ -- Gavin Simpson, PhD
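The Depends/Imports distinction discussed in this thread can be summarized in a NAMESPACE sketch: the import() directive is needed regardless of which DESCRIPTION field the package appears in. The exported name here is a placeholder, and decostand/vegdist are just examples of vegan functions one might import selectively.

```
# NAMESPACE sketch: this import is required whether DESCRIPTION lists
# vegan under Depends or under Imports.
import(vegan)
# ... or, more surgically:
# importFrom(vegan, decostand, vegdist)
export(myFunction)        # "myFunction" is a placeholder
```
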
Re: [Rd] R CMD check for the R code from vignettes
On 06/02/2014 12:16 AM, Gabriel Becker wrote: Carl, I don't really have a horse in this race other than a strong feeling that whatever check does should be mandatory. That having been said, I think it can be argued that the fact that check does this means that it IS in the R package vignette specification that all vignettes must be such that their tangled code will run without errors.

My understanding of this is that the package maintainer can turn off building the vignette (--no-vignettes), but R CMD check and CRAN still check that the tangled code runs, and the check fails if it does not. Running the tangled code can be turned off, just not by the package maintainer. You have to make a special appeal to the CRAN maintainers, and give reasons they are prepared to accept. I think the intention is that the tangled code should run without errors. I doubt they would accept "it doesn't work" as an acceptable reason. But there are reasons, like the vignette requires access to a commercial database engine. Also, I think, turning this off means they just do not run it regularly, in the daily checks. I don't think it necessarily means the code is never tested. The testing may need to be done on machines with special resources. Thus, --no-vignettes provides a mechanism to avoid running the tangled code twice but, without special exemption, it is still run once.

Some package maintainers may not think of several features of 'R CMD check' as 'aids'. I think of it as having more to do with maintaining some quality assurance, which I think of as an aid but not a debugging aid. I believe the CRAN maintainers have intentionally, and successfully, made disabling the running of tangled code more trouble than it is generally worth. Effectively, a package should have tangled code that runs without errors. (Of course, I could be wrong about all this, it has happened before.)
Paul ~G

On Sun, Jun 1, 2014 at 8:43 PM, Carl Boettiger cboet...@gmail.com wrote: Yihui, list, Focusing on the behavior of R CMD check, the only reason I have seen put forward in the discussion for having check tangle and then source as well as knit/weave the very same vignette is to assist the package maintainer in debugging R errors vs pdflatex errors. As tangle (and many other tools) are already available to an author needing extra help debugging, and as the error messages are usually clear on whether errors come from the R code or whatever format compiling (pdflatex, markdown html, etc), this seems like a poor reason for R CMD check to be wasting time doing two versions of almost (but not literally) the same check. As has already been discussed, it is possible to write vignettes that can be Sweave'd but not source'd, due to the different treatments of inline chunks. While I see the advantages of this property, I don't see why R CMD check should be enforcing it through the arbitrary mechanism of running both Sweave and tangle+source. If that is the desired behavior for all Sweave documents, it should be part of the Sweave specification not to be able to write/change values in inline expressions, or part of the tangle definition to include inline chunks. In any event, I don't see any reason for R CMD check doing both. Perhaps someone can fill in whatever I've overlooked? Carl

On Sat, May 31, 2014 at 8:17 PM, Yihui Xie x...@yihui.name wrote: 1. The starting point of this discussion is package vignettes, instead of R scripts. I'm not saying we should abandon R scripts, or all people should write R code to generate reports. Starting from a package vignette, you can evaluate it using a weave function, or evaluate its derivative, namely an R script. I was saying the former might not be a bad idea, although the latter sounds more familiar to most R users. For a package vignette, within the context of R CMD check, is it necessary to do tangle + evaluate _besides_ weave? 2.
If you are comfortable with reading pure code without narratives, I'm totally fine with that. I guess there is nothing to argue on this point, since it is pretty much personal taste. 3. Yes, you are absolutely correct -- Sweave()/knit() does more than source(), but let me repeat the issue to be discussed: what harm does it bring if we disable tangle for R package vignettes? Sorry if I did not make it clear enough, my priority of this discussion is the necessity of tangle for package vignettes. After we finish this issue, I'll be happy to extend the discussion towards tangle in general. Regards, Yihui -- Yihui Xie xieyi...@gmail.com Web: http://yihui.name On Sat, May 31, 2014 at 9:20 PM, Gabriel Becker gmbec...@ucdavis.edu wrote: On Sat, May 31, 2014 at 6:54 PM, Yihui Xie x...@yihui.name wrote: I agree that fully evaluating the code is valuable, but it is not a problem since the weave functions do fully evaluate the code. If there is a reason for why source() an R script is preferred, I guess it is users' familiarity with .R instead of
Re: [Rd] type.convert and doubles
On 04/17/2014 02:21 PM, Murray Stokely wrote: On Thu, Apr 17, 2014 at 6:42 AM, McGehee, Robert robert.mcge...@geodecapital.com wrote: Here's my use case: I have a function that pulls arbitrary financial data from a web service call, such as a stock's industry, price, volume, etc., by reading the web output as a text table. The data may be either character (industry, stock name, etc.) or numeric (price, volume, etc.), and the function generally doesn't know the class in advance. The problem is that we frequently get numeric values represented with more precision than actually exists, for instance a price of 2.6999 rather than 2.70. The numeric representation is exactly one digit too much for type.convert, which (in R 3.1.0) converts it to character instead of numeric (not what I want). This caused a bunch of "non-numeric argument to binary operator" errors to appear today, as numeric data was now being represented as characters. I have no doubt that this probably will cause some unwanted RODBC side effects for us as well. IMO, getting the class right is more important than infinite precision. What use is a character representation of a number anyway if you can't perform arithmetic on it? I would favor at least making the new behavior optional, but I think many packages (like RODBC) potentially need to be patched to code around the new feature if it's left in.

The uses of character representation of a number are many: unique identifiers/user ids, hash codes, timestamps, or other values where rounding results to the nearest value that can be represented as a numeric type would completely change the results of any data analysis performed on that data. Database join operations are certainly an area where R's previous behavior of silently dropping precision of numbers with type.convert can get you into trouble.
For example, things like join operations or group-by operations performed in R code would produce erroneous results if you are joining/grouping by a key without the full precision of your underlying data. Records can get joined up incorrectly or aggregated with the wrong groups.

I don't understand this. Assuming you are sending the SQL statement to the database engine, none of this erroneous matching is happening in R. The calculations all happen on the database. But, for the case where the database does know that numbers are double precision, it would be nice if they got transmitted by ODBC to R as numerics (the usual translation), just as they are by the native interfaces like RPostgreSQL. Do you get the erroneous results when you use a native interface?

(from second response:) You want a casting operation in your SQL query or similar if you want a rounded type that will always fit in a double. Cast or Convert operators in SQL, or similar for however you are getting the data you want to use with type.convert(). This is all application specific and sort of beyond the scope of type.convert(), which now behaves as it has been documented to behave.

This seems to suggest I need to use different SQL statements depending on which interface I use to talk to the database. If you do 1/3 in a database calculation and that ends up being represented as something more accurate than double precision on the database, then it needs to be transmitted as something with higher precision (character/factor?). If the result is double precision it should be sent as double precision, not as something pretending to be more accurate. I suspect the difficulty with ODBC may be that type.convert() really should not be called when both ends of the communication know that a double precision number is being exchanged. Paul

If you later want to do arithmetic on them, you can choose to lose precision by using as.numeric(), or use one of the large number packages on CRAN (GMP, int64, bit64, etc.).
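For readers following this thread later: the behaviour being argued about became an explicit option when type.convert() gained a numerals argument (in R 3.1.1), so callers can choose what happens when a decimal string cannot be represented exactly as a double. A small illustration:

```r
## A decimal string with more digits than a double can represent exactly.
x <- "0.30000000000000000000001"

## "allow.loss" (the default) converts, accepting the precision loss:
class(type.convert(x, numerals = "allow.loss", as.is = TRUE))  # "numeric"

## "no.loss" keeps the full-precision string as character instead:
class(type.convert(x, numerals = "no.loss", as.is = TRUE))     # "character"
```
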
But once you've dropped the precision with as.numeric you can never get it back, which is why the previous behavior was clearly dangerous. I think I had some additional examples in the original bug/patch I filed about this issue a few years ago, but I'm unable to find it on bugs.r-project.org and it's not referenced in the cl descriptions or news file. - Murray
Re: [Rd] NOTE when detecting mismatch in output, and codes for NOTEs, WARNINGs and ERRORs
On 04/10/2014 04:34 AM, Kirill Müller wrote: On 03/26/2014 06:46 PM, Paul Gilbert wrote: On 03/26/2014 04:58 AM, Kirill Müller wrote: Dear list It is possible to store expected output for tests and examples. From the manual: If tests has a subdirectory Examples containing a file pkg-Ex.Rout.save, this is compared to the output file for running the examples when the latter are checked. And, earlier (written in the context of test output, but apparently applies here as well): ..., these two are compared, with differences being reported but not causing an error. I think a NOTE would be appropriate here, in order to be able to detect this by only looking at the summary. Is there a reason for not flagging differences here? The problem is that differences occur too often because this is a comparison of characters in the output files (a diff). Any output that is affected by locale, node name, Internet downloads, time, host, or OS is likely to cause a difference. Also, if you print results to a high precision you will get differences on different systems, depending on OS, 32 vs 64 bit, numerical libraries, etc. A better test strategy when it is numerical results that you want to compare is to do a numerical comparison and throw an error if the result is not good, something like: r <- result from your function; rGood <- known good value; fuzz <- 1e-12 # tolerance; if (fuzz < max(abs(r - rGood))) stop('Test xxx failed.') It is more work to set up, but the maintenance will be less, especially when you consider that your tests need to run on different OSes on CRAN. You can also use try() and catch error codes if you want to check those. Thanks for your input. To me, this is a different kind of test, Yes, if you meant that you intended to compare character output, it is a different kind of test.
With a file in the tests/ directory of a package you can construct a test of character differences in individual commands with something like: z1 <- as.character(rnorm(5)); z2 <- as.character(type.convert(z1)); if (any(z1 != z2)) stop("character differences exist.") for which no one would be required to make any changes to the existing package checking system. One caveat is output that is done as a side effect. For longer output streams from multiple commands you might construct your own testing with R CMD Rdiff. As you point out, adding something to flag different levels of severity for differences from a .Rout.save file would require some work by someone. HTH, Paul for which I'd rather use the facilities provided by the testthat package. Imagine a function that operates on, say, strings, vectors, or data frames, and that is expected to produce completely identical results on all platforms -- here, a character-by-character comparison of the output is appropriate, and I'd rather see a WARNING or ERROR if something fails. Perhaps this functionality can be provided by external packages like roxygen and testthat: roxygen could create the good output (if asked for) and set up a testthat test that compares the example run with the good output. This would duplicate part of the work already done by base R; the duplication could be avoided if there was a way to specify the severity of a character-level difference between output and expected output, perhaps by means of an .Rout.cfg file in DCF format: OnDifference: mute|note|warning|error Normalize: [R expression] Fuzziness: [number of different lines that are tolerated] On that note: Is there a convenient way to create the .Rout.save files in base R? By convenient I mean a single function call, not checking and manually copying as suggested here: https://stat.ethz.ch/pipermail/r-help/2004-November/060310.html . Cheers Kirill __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] type.convert and doubles
On 04/11/2014 01:43 PM, Simon Urbanek wrote: Greg, On Apr 11, 2014, at 11:50 AM, Gregory R. Warnes g...@warnes.net wrote: Hi All, I see this in the NEWS for R 3.1.0: type.convert() (and hence by default read.table()) returns a character vector or factor when representing a numeric input as a double would lose accuracy. Similarly for complex inputs. This behavior seems likely to surprise users. Can you elaborate why that would be surprising? It is consistent with the intention of type.convert() to determine the correct type to represent the value - it has always used character/factor as a fallback where the native type doesn't match. Strictly speaking, I don't think this is true. If it were, it would not have been necessary to make the change so that it does now fall back to using character/factor. It may, however, have always been the intent. I don't really think a warning is necessary, but there are some surprises: str(type.convert(format(1/3, digits=17))) # R-3.0.3: num 0.333 str(type.convert(format(1/3, digits=17))) # R-3.1.0: Factor w/ 1 level "0.33333333333333331": 1 Now you could say that one should never do that, and the change is just flushing out a bug that was always there. But the point is that in serialization situations there can be some surprises. So, for example, RODBC talking to PostgreSQL databases is now returning factors rather than numerics for double precision fields, whereas with RPostgreSQL the behaviour has not changed. Paul It has never issued any warning in that case historically, so IMHO it would be rather surprising if it did now… Cheers, Simon Would it be possible to issue a warning when this occurs? Aside: I’m very happy to see the new ’s’ and ‘f’ browser (debugger) commands!
-Greg [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Varying results of package checks due to random seed
On 03/22/2014 01:32 PM, Radford Neal wrote: From: Philippe GROSJEAN philippe.grosj...@umons.ac.be ... for latest CRAN version, we have successfully installed 4999 packages among the 5321 CRAN packages on our platform. ... It is strange that a large portion of R CMD check errors on CRAN occur and disappear *without any version update* of a package or any of its direct or indirect dependencies! That is, a fraction of errors or warnings seem to appear and disappear without any code update. Some of these are likely the result of packages running tests using random number generation without setting the random number seed, in which case the seed is set based on the current time and process id, with an obvious possibility of results varying from run to run. In the current development version of pqR (in branch 19-mods, found at https://github.com/radfordneal/pqR/tree/19-mods), I have implemented a change so that if the R_SEED environment variable is set, the random seed is initialized to its value, rather than from the time and process id. This was motivated by exactly this problem - I can now just set R_SEED to something before running all the package checks. Beware, if you are serious about reproducing things, that you really need to save information about the uniform and other generators you use, such as the normal generator. The defaults do not change often, but they have in the past, and could in the future if something better comes along. There are some small utilities and examples in the package setRNG which can help. Also remember that you need to beware of a side effect of the environment variable approach. It is great for reproducing things, as you would want to do in package tests, but be careful how you use it in functions as it may mess up the randomness if you always set the seed to the same starting value.
Paul Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
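As a sketch of the bookkeeping Paul describes (base R only, not using the setRNG package), recording the generator kinds along with the seed lets a test be replayed exactly even if the defaults change:

```r
## Pin the generators explicitly rather than relying on defaults,
## which have changed across R versions.
RNGkind("Mersenne-Twister", normal.kind = "Inversion")
set.seed(42)
saved <- .Random.seed     # the full state, including generator kinds
x1 <- rnorm(3)
.Random.seed <- saved     # restore (assignment at top level)
x2 <- rnorm(3)
stopifnot(identical(x1, x2))   # the draws are reproduced exactly
```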
[Rd] No repository set, so cyclic dependency check skipped
When checking a package I am getting * checking package dependencies ... NOTE No repository set, so cyclic dependency check skipped How/where do I set the repository so I don't get this note? No doubt this is explained in Writing R Extensions, but I have not found it. Paul __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] No repository set, so cyclic dependency check skipped
On 01/26/2014 12:31 PM, Uwe Ligges wrote: On 26.01.2014 17:52, Paul Gilbert wrote: When checking a package I am getting * checking package dependencies ... NOTE No repository set, so cyclic dependency check skipped How/where do I set the repository so I don't get this note? Set a repository (e.g., via options(repos=...)) in your .Rprofile. I'm getting this note when I check in R-devel on your win-builder site. Does that mean I need to set .Rprofile or you do? Best, Paul Best, Uwe Ligges No doubt this is explained in Writing R Extensions, but I have not found it. Paul __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Depending/Importing data only packages
Would Suggests not work in this situation? I don't understand why you would need Depends. In what sense do you rely on the data-only package? Paul On 13-12-06 04:20 PM, Hadley Wickham wrote: Hi all, What should you do when you rely on a data-only package? If you just Depend on it, you get the following from R CMD check: Package in Depends field not imported from: 'hflights' These packages need to be imported from for the case when this namespace is loaded but not attached. But there's nothing in the namespace to import, so adding it to Imports doesn't seem like the right answer. Is that just a spurious note? Hadley __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Depending/Importing data only packages
On 13-12-07 01:47 PM, Gabor Grothendieck wrote: On Sat, Dec 7, 2013 at 1:35 PM, Paul Gilbert pgilbert...@gmail.com wrote: On 13-12-07 12:19 PM, Gábor Csárdi wrote: I don't know about this particular case, but in general it makes sense to rely on a data package. E.g. I am creating a package that does Bayesian inference for a particular problem, potentially relying on prior knowledge. I think it makes sense to put the data that is used to calculate the prior into another package, because it will be larger than the code, and it does not change that often. Gabor On Sat, Dec 7, 2013 at 11:51 AM, Paul Gilbert pgilbert...@gmail.com wrote: Would Suggests not work in this situation? I don't understand why you would need Depends. In what sense do you rely on the data only package? HW Because I want someone who downloads the package to be able to run HW the examples without having to take additional action. HW HW Hadley I went through this myself, including thinking it was a nuisance for users to need to attach other packages to run examples. In the end I decided it is not so bad to be explicit about what package the example data comes from, so illustrate it in the examples. Users may not always want this data, and other packages that build on yours probably do not want it. Even in the Bayesian inference case pointed out by Gábor, I am not convinced. It means the prior knowledge base cannot be exchanged for another one. The package would be more general if it allowed the possibility of attaching a different database of prior information. But this is clearly a more important case, since the code probably does not work without some database. (There are a few other situations where something like RequireOneOf: would be useful.) Requiring users to load packages which could be loaded automatically seems to go against ease of use. It's just one more thing that they have to remember to do.
It really should be possible to write a batteries included package while leveraging off of other packages. Just to be clear, I distinguish the batteries included situation from the spare batteries included situation. I think it should be possible to automatically load everything that is really needed, that is why I think the Bayesian database is a more important case. But it strikes me as bad to attach everything that could ever possibly be wanted by a user. After all, it would be possible to automatically attach all packages. Some packages seemed to be headed in that direction before the new rules started to be enforced. There is certainly a trade-off here between ease of use, not needing the user to attach packages, and namespace conflicts, which will result in time and difficulty debugging. For packages that no one ever uses in other packages, there would be a tendency to lean toward ease of use. But as soon as anyone starts building on top of a package with another one, I think that avoiding potential conflicts will dominate. Paul __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Depending/Importing data only packages
On 13-12-07 05:21 PM, Hadley Wickham wrote: The Writing R Extensions manual says that Suggests is for packages which are required only for examples, which I believe matches Hadley's original question. Yes, but without this package they won't be able to run the majority of examples, which I think delivers a poor experience to the user. It also means I have to litter my examples with if(require(x)), I think you just need require(x) or library(x). If it is in Suggests then it is available whenever examples are tested, so you don't need the if(). In my opinion, this increases the signal by indicating to the reader where the data comes from. decreasing the signal to noise ratio in the examples. But we're getting a bit far from my original question about the NOTE: Package in Depends field not imported from: 'hflights' These packages need to be imported from for the case when this namespace is loaded but not attached. Depending on (or linking to) a package is not just about making the functions in the package available. Several of us used to think that, but the modern interpretation seems to be just about making things in the package yours depends on available to users of your package. Exports: might be a better term than Depends:, at least if Depends: was not trying to mean both Imports: and Exports:. Paul Hadley __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
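The pattern Paul recommends can be sketched as follows, with the data package in Suggests and named explicitly in the example (hflights is the package from this thread):

```r
## DESCRIPTION:
##   Suggests: hflights
## In the example section of a help page; since Suggests packages are
## available when examples are checked, no if (require(...)) guard is
## needed, and the reader sees where the data comes from:
library(hflights)
head(hflights)
```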
Re: [Rd] Where to drop a python script?
Jonathan, Below is a python script I have been playing with for extracting some system information and additional information about modules that I need to import in my TSjson package. My intention is to put R code in tests/ that will check this before going on to do other tests, but I have not yet done that. The reason for this is that error messages are not especially enlightening when other tests fail because python is not working as needed. Please let me know if you find improvements. Paul

def test():
    try:
        import sys
        have_sys = True
    except:
        have_sys = False
    if sys.version_info >= (3, 0):
        return dict(error="TSjson requires Python 2. Running " + str(sys.version_info))
    # mechanize is not (yet) available for Python 3.
    # Also, urllib2 is split into urllib.request, urllib.error in Python 3.
    try:
        import urllib2
        have_urllib2 = True
    except:
        have_urllib2 = False
    try:
        import re
        have_re = True
    except:
        have_re = False
    try:
        import csv
        have_csv = True
    except:
        have_csv = False
    try:
        import mechanize
        have_mechanize = True
    except:
        have_mechanize = False
    if have_sys and have_urllib2 and have_re and have_csv and have_mechanize:
        err = 0
    else:
        err = 1
    return dict(exit=err, have_sys=have_sys, have_urllib2=have_urllib2,
                have_re=have_re, have_csv=have_csv, have_mechanize=have_mechanize)

try:
    import json
    print(json.JSONEncoder().encode(test()))
except:
    print(dict(exit=1, have_json=False))

On 13-11-01 10:17 AM, Jonathan Greenberg wrote: This was actually the little script I was going to include (prompting me to ask the question): a test for the python version number. Save this (between the ***s) as e.g. python_version.py:
***
import sys
print(sys.version_info)
***
I've done almost no python coding, so I was going to call this with a system("/pathto/python /pathto/python_version.py", intern=TRUE) call and post-process the one-line text output.
--j On Thu, Oct 31, 2013 at 12:45 PM, Paul Gilbert pgilbert...@gmail.com wrote: On 13-10-31 01:16 PM, Prof Brian Ripley wrote: On 31/10/2013 15:33, Paul Gilbert wrote: On 13-10-31 03:01 AM, Prof Brian Ripley wrote: On 31/10/2013 00:40, Paul Gilbert wrote: The old convention was that it went in the exec/ directory, but as you can see at http://cran.at.r-project.org/doc/manuals/r-devel/R-exts.html#Non_002dR-scripts-in-packages it can be in inst/anyName/. A minor convenience of exec/ is that the directory has the same name in source and when installed, whereas inst/anyName gets moved to anyName/, so debugging can be a tiny bit easier with exec/. Having just put a package (TSjson) on CRAN with a python script, here are a few other pointers for getting it on CRAN: -SystemRequirements: should indicate if a particular version of python is needed, and any non-default modules that are needed. (My package does not work with Python 3 because some modules are not available.) Some of the libraries have changed, so it could be a bit tricky to make something work easily with both 2 and 3. -You need a README to explain how to install Python. (If you look at or use mine, please let me know if you find problems.) Better to describe exactly what you need: installation instructions go stale very easily. -The Linux and Sun CRAN test machines have Python 2 whereas winbuilder has Python 3. Be prepared to explain that the package will not work on one or the other. Not true. Linux and Solaris (sic) have both: the Solaris machines have 2.6 and 3.3. For an R package how does one go about specifying which should be used? You ask the user to tell you the path or at least the command name, e.g. by an environment variable or R function argument. Just like any other external
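As a side note on the post-processing step Jonathan mentions, printing one line of JSON instead of the raw repr of sys.version_info makes the R side's job easier. This is a sketch, not the script actually used in TSjson:

```python
# Emit the interpreter version as a single JSON line, which R can parse
# with any JSON reader instead of string-munging repr() output.
import json
import sys

info = {"major": sys.version_info[0],
        "minor": sys.version_info[1],
        "micro": sys.version_info[2]}
print(json.dumps(info))
```

On the R side this would pair with something like fromJSON(system(..., intern = TRUE)) from a JSON package, rather than regular-expression parsing of the version string.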
Re: [Rd] Where to drop a python script?
On 13-10-31 03:01 AM, Prof Brian Ripley wrote: On 31/10/2013 00:40, Paul Gilbert wrote: The old convention was that it went in the exec/ directory, but as you can see at http://cran.at.r-project.org/doc/manuals/r-devel/R-exts.html#Non_002dR-scripts-in-packages it can be in inst/anyName/. A minor convenience of exec/ is that the directory has the same name in source and when installed, whereas inst/anyName gets moved to anyName/, so debugging can be a tiny bit easier with exec/. Having just put a package (TSjson) on CRAN with a python script, here are a few other pointers for getting it on CRAN: -SystemRequirements: should indicate if a particular version of python is needed, and any non-default modules that are needed. (My package does not work with Python 3 because some modules are not available.) Some of the libraries have changed, so it could be a bit tricky to make something work easily with both 2 and 3. -You need a README to explain how to install Python. (If you look at or use mine, please let me know if you find problems.) Better to describe exactly what you need: installation instructions go stale very easily. -The Linux and Sun CRAN test machines have Python 2 whereas winbuilder has Python 3. Be prepared to explain that the package will not work on one or the other. Not true. Linux and Solaris (sic) have both: the Solaris machines have 2.6 and 3.3. For an R package how does one go about specifying which should be used? Please do not spread misinformation about machines you do not have any access to. Another option to system() is pipe() Paul On 13-10-30 03:15 PM, Dirk Eddelbuettel wrote: On 30 October 2013 at 13:54, Jonathan Greenberg wrote: | R-developers: | | I have a small python script that I'd like to include in an R package I'm | developing, but I'm a bit unclear about which subfolder it should go in. R | will be calling the script via a system() call. Thanks! Up to you as you control the path. 
As Writing R Extensions explains, everything below the (source) directory inst/ will get installed. I like inst/extScripts/ (or similar) as it denotes that it is an external script. As an example, the gdata package has Perl code for xls reading/writing below a directory inst/perl/ -- and I think there are more packages doing this. Dirk __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Where to drop a python script?
On 13-10-31 01:16 PM, Prof Brian Ripley wrote: On 31/10/2013 15:33, Paul Gilbert wrote: On 13-10-31 03:01 AM, Prof Brian Ripley wrote: On 31/10/2013 00:40, Paul Gilbert wrote: The old convention was that it went in the exec/ directory, but as you can see at http://cran.at.r-project.org/doc/manuals/r-devel/R-exts.html#Non_002dR-scripts-in-packages it can be in inst/anyName/. A minor convenience of exec/ is that the directory has the same name in source and when installed, whereas inst/anyName gets moved to anyName/, so debugging can be a tiny bit easier with exec/. Having just put a package (TSjson) on CRAN with a python script, here are a few other pointers for getting it on CRAN: -SystemRequirements: should indicate if a particular version of python is needed, and any non-default modules that are needed. (My package does not work with Python 3 because some modules are not available.) Some of the libraries have changed, so it could be a bit tricky to make something work easily with both 2 and 3. -You need a README to explain how to install Python. (If you look at or use mine, please let me know if you find problems.) Better to describe exactly what you need: installation instructions go stale very easily. -The Linux and Sun CRAN test machines have Python 2 whereas winbuilder has Python 3. Be prepared to explain that the package will not work on one or the other. Not true. Linux and Solaris (sic) have both: the Solaris machines have 2.6 and 3.3. For an R package how does one go about specifying which should be used? You ask the user to tell you the path or at least the command name, e.g. by an environment variable or R function argument. Just like any other external program such as GhostScript. Yes, but since I don't have direct access to the CRAN test machines, specifically, on the CRAN test machines, how do I specify to use Python 2 or Python 3? (That is, I think you are the user when CRAN tests are done on Solaris, so I am asking you.) 
Please do not spread misinformation about machines you do not have any access to. Another option to system() is pipe() Paul On 13-10-30 03:15 PM, Dirk Eddelbuettel wrote: On 30 October 2013 at 13:54, Jonathan Greenberg wrote: | R-developers: | | I have a small python script that I'd like to include in an R package I'm | developing, but I'm a bit unclear about which subfolder it should go in. R | will be calling the script via a system() call. Thanks! Up to you as you control the path. As Writing R Extensions explains, everything below the (source) directory inst/ will get installed. I like inst/extScripts/ (or similar) as it denotes that it is an external script. As an example, the gdata package has Perl code for xls reading/writing below a directory inst/perl/ -- and I think there are more packages doing this. Dirk __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Where to drop a python script?
The old convention was that it went in the exec/ directory, but as you can see at http://cran.at.r-project.org/doc/manuals/r-devel/R-exts.html#Non_002dR-scripts-in-packages it can be in inst/anyName/. A minor convenience of exec/ is that the directory has the same name in source and when installed, whereas inst/anyName gets moved to anyName/, so debugging can be a tiny bit easier with exec/. Having just put a package (TSjson) on CRAN with a python script, here are a few other pointers for getting it on CRAN: -SystemRequirements: should indicate if a particular version of python is needed, and any non-default modules that are needed. (My package does not work with Python 3 because some modules are not available.) Some of the libraries have changed, so it could be a bit tricky to make something work easily with both 2 and 3. -You need a README to explain how to install Python. (If you look at or use mine, please let me know if you find problems.) -The Linux and Sun CRAN test machines have Python 2 whereas winbuilder has Python 3. Be prepared to explain that the package will not work on one or the other. Another option to system() is pipe() Paul On 13-10-30 03:15 PM, Dirk Eddelbuettel wrote: On 30 October 2013 at 13:54, Jonathan Greenberg wrote: | R-developers: | | I have a small python script that I'd like to include in an R package I'm | developing, but I'm a bit unclear about which subfolder it should go in. R | will be calling the script via a system() call. Thanks! Up to you as you control the path. As Writing R Extensions explains, everything below the (source) directory inst/ will get installed. I like inst/extScripts/ (or similar) as it denotes that it is an external script. As an example, the gdata package has Perl code for xls reading/writing below a directory inst/perl/ -- and I think there are more packages doing this. Dirk __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] advise on Depends
On 13-10-25 05:21 PM, Henrik Bengtsson wrote: On Fri, Oct 25, 2013 at 1:39 PM, John Chambers j...@r-project.org wrote: One additional point to Michael's summary: The methods package itself should stay in Depends:, to be safe. It would be nice to have more detail about when this is necessary, rather than suggested as a general workaround. I thought the principle of putting things in Imports was that it is safer. I have methods listed in Imports rather than Depends in 16 of my packages, doing roughly what was the basis for the original question, and I am not aware of a problem, yet. Paul There are a number of function calls to the methods package that may be included in generated methods for user classes. These have not been revised to work when the methods package is not attached, so importing the package only may run into problems. This has been an issue, for example, in using Rscript. To clarify that last sentence for those not aware (and hopefully spare someone having to troubleshoot this), executing R scripts/expressions using 'Rscript' rather than 'R' differs by which packages are attached by default. Example:
% Rscript -e 'search()'
[1] ".GlobalEnv"        "package:stats"     "package:graphics"
[4] "package:grDevices" "package:utils"     "package:datasets"
[7] "Autoloads"         "package:base"
% R --quiet -e 'search()'
> search()
[1] ".GlobalEnv"        "package:stats"     "package:graphics"
[4] "package:grDevices" "package:utils"     "package:datasets"
[7] "package:methods"   "Autoloads"         "package:base"
Note how 'methods' is not attached when using Rscript. This is explained in help(Rscript), help(options), and in 'R Installation and Administration'.
/Henrik John On Oct 25, 2013, at 11:26 AM, Michael Lawrence lawrence.mich...@gene.com wrote: On Wed, Oct 23, 2013 at 8:33 PM, Kasper Daniel Hansen kasperdanielhan...@gmail.com wrote: This is about the new note Depends: includes the non-default packages: ‘BiocGenerics’ ‘Biobase’ ‘lattice’ ‘reshape’ ‘GenomicRanges’ ‘Biostrings’ ‘bumphunter’ Adding so many packages to the search path is excessive and importing selectively is preferable. Let us say my package A either uses a class B (by producing an object that has B embedded as a slot) from another package or provides a specific method for a generic defined in another package (both examples using S4). In both case my impression is that best practices is I ought to Depend on such a package, so it is a available at run time to the user. For classes, you just need to import the class with importClassesFrom(). For generics, as long as your package exports the method with exportMethods(), the generic will also be exported from your package, regardless of whether the defining package is attached. And the methods from the loaded-but-not-attached packages are available for the generic. So neither of these two is really a problem. The rationale for Depends is that the user might always want to use functions defined by another package with objects consumed/produced by your package, such as generics for which your package has not defined any methods. For example, rtracklayer Depends on GenomicRanges, because it imports objects from files as GenomicRanges objects. So just consider what the user sees when looking at your API. What's private, what's public? Michael Comments? 
Best, Kasper [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Question about selective importing of package functions...
On 13-10-20 04:58 PM, Gabor Grothendieck wrote: On Sun, Oct 20, 2013 at 4:49 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 13-10-20 4:43 PM, Jonathan Greenberg wrote: I'm working on an update for my CRAN package spatial.tools and I noticed a new warning when running R CMD CHECK --as-cran: * checking CRAN incoming feasibility ... NOTE Maintainer: 'Jonathan Asher Greenberg spatial-to...@estarcion.net' Depends: includes the non-default packages: 'sp' 'raster' 'rgdal' 'mmap' 'abind' 'parallel' 'foreach' 'doParallel' 'rgeos' Adding so many packages to the search path is excessive and importing selectively is preferable. Is this a warning that would need to be fixed pre-CRAN (not really sure how, since I need functions from all of those packages)? Is there a way to import only a single function from a package, if that function is a dependency? You really want to use imports. Those are defined in the NAMESPACE file; you can import everything from a package if you want, but the best style is in fact to just import exactly what you need. This is more robust than using Depends, and it doesn't add so much to the user's search path, so it's less likely to break something else (e.g. by putting a package on the path that masks some function the user already had there.) That may answer the specific case of the poster but how does one handle the case where one wants the user to be able to access the functions in the dependent package. There are two answers to this, depending on how much of the dependent package you want to make available to the user. If you want most of that package to be available then this is the (only?) exception to the rule. 
From Writing R Extensions: Field ‘Depends’ should nowadays be used rarely, only for packages which are intended to be put on the search path to make their facilities available to the end user (and not to the package itself): for example it makes sense that a user of package latticeExtra would want the functions of package lattice made available. If you really only want to make a couple of functions available then you can import and export the functions. Currently this has the unfortunate side effect that you need to document the functions, you cannot just re-direct to the documentation in the imported package, at least, I have not figured out how to do that. Paul For example, sqldf depends on gsubfn which provides fn which is used with sqldf to perform substitutions in the SQL string. library(sqldf); tt <- 3; fn$sqldf("select * from BOD where Time > $tt") I don't want to ask the user to tediously issue a library(gsubfn) too since fn is frequently needed and for literally years this has not been necessary. Also I don't want to duplicate fn's code in sqldf since that makes the whole thing less modular -- it would imply having to change fn in two places if anything in fn changed. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
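The import-and-re-export route mentioned above would look like this in sqldf's NAMESPACE (a sketch; as Paul notes, the re-exported fn then also needs its own help page in sqldf):

```r
## NAMESPACE (sketch)
importFrom(gsubfn, fn)
export(fn)   # users get fn without an explicit library(gsubfn)
```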
Re: [Rd] Problems when moving to Imports from Depends
On 13-09-27 06:05 PM, Peter Langfelder wrote: On Fri, Sep 27, 2013 at 2:50 PM, Kasper Daniel Hansen kasperdanielhan...@gmail.com wrote: Peter, This is a relatively new warning from R CMD check (for some definition of new). The authors of Hmisc have clearly not yet gone through the process of cleaning it up, as you are doing right now (and there are many other packages that still need to address this, including several of mine). Given who the authors are of Hmisc, I would suggest writing to them and ask them to look into this, and ask for a time estimate. thanks for the suggestion, but I must be missing something: since Hmisc imports survival (as well as Depends: on it), what can Hmisc change to make the survival functionality visible to my package? The terminology around imports has had many of us confused. (My copy of) Hmisc has survival in both Imports: and Depends: in the DESCRIPTION file (for which they will now be getting flagged) but it does not have it in the NAMESPACE file, which it needs, whether it is in Depends: or Imports: (and for which they are getting another flag). When this is fixed then the Hmisc function rcorr.cens will look at its own NAMESPACE-determined path for finding functions, and find is.Surv. As Kasper pointed out, this is not really your problem, except of course that you need to work around the Hmisc problem. Until Hmisc is fixed, I think you have the option of adding survival to Depends:, or leaving Hmisc in Depends:. (I would be inclined to leave it the way you had it until packages further down the chain are fixed.) Paul In the meantime, you may have to do something about this, and whatever you do I would suggest following the Hmisc package and undo it as soon as possible, as the right thing is to fix Hmisc. 
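Concretely, the fix described above would have Hmisc declare its use of survival in its NAMESPACE; a minimal sketch (the exact set of functions to import is an assumption — is.Surv is the one named in the check failure):

```
# NAMESPACE of Hmisc (sketch): make survival's is.Surv
# resolvable on Hmisc's own import path, independent of
# what the user happens to have attached
importFrom(survival, is.Surv)
```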
Having said that, it is not clear to me that you can easily solve this yourself, because I don't think that putting survival into your own imports will make the package available to Hmisc functions, but it is not impossible there is some way around it. Well, as I said, things work fine if I leave Hmisc in the Depends: field, which, however, is against CRAN policy. The trouble is that I don't have a good way of checking whether something breaks by moving a package from Depends into Imports... Peter
Re: [Rd] Capture output of install.packages (pipe system2)
On 13-09-23 08:20 PM, Hadley Wickham wrote: Brian Ripley's reply describes how it is done in the tools package. For example, as I sent privately to Jeroen, x <- system2("Rscript", "-e \"install.packages('MASS', repos='http://probability.ca/cran')\"", stdout=TRUE, stderr=TRUE) captures all of the output from installing MASS. As Jeroen pointed out, that isn't identical to running install.packages() in the current session; a real version of it should fill in more of the arguments, not leave them at their defaults. It does seem a little crazy that you're in an R process, then open another one, which then opens a 3rd session! (often indirectly by calling R CMD INSTALL which then calls an internal function in tools) It does seem very much more straightforward to do this in the process above R: R --vanilla --slave -e "install.packages('whatever', repo='http://cran.r-project.org')" > R.out 2>&1 (Omit mailer wrap.) Your mileage may vary depending on your OS. Paul Hadley
Re: [Rd] Design for classes with database connection
Simon, Your idea to use SQLite and the nature of some of the sorting and extracting you are suggesting makes me wonder why you are thinking of R data structures as the home for the data storage. I would be inclined to put the data in an SQL database as the prime repository, then extract parts you want with SQL queries and bring them into R for analysis and graphics. If the full data set is large, and the parts you want to analyze in R at any one time are relatively small, then this will be much faster. After all, SQL is primarily for databases, whereas R's strength is more in statistics and graphics. In the project http://tsdbi.r-forge.r-project.org/ I have code that does some of the things you probably want. There the focus is on a single identifier for a series, and various observation frequencies are supported. Tick data is supported (as time stamped data) but not extensively tested as I do not work with tick data much. There is a function TSquery, currently in TSdbi on CRAN but very shortly being split out, along with the SQL-specific parts of the interface, into a package TSsql. It is very much like the queries you seem to have in mind, but I have not used it with tick data. It is used to generate a time series by formulating a query to a database with several possible sorting fields, very much like you describe, and then ordering the data according to the time index. If your data set is large, then you need to think carefully about which fields you index. You certainly do not want to be building the indexes on the fly, as you would need to do if you dump all the data out of R into an SQL db just to do a sort. If the data set is small then indexing does not matter too much. Also, for a small data set there will be much less advantage to keeping the data in an SQL db rather than in R. You do need to be a bit more specific about what "huge" means. (Tick data for 5 days or 20 years? 100 IDs or 10 million?) Large for an R structure is not necessarily large for an SQL db. 
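The point about prebuilt indexes can be illustrated with a schema sketch in SQLite syntax; the table and column names here are hypothetical, loosely following the slots discussed in this thread:

```sql
-- Sketch: keep tick data in SQLite with indexes on the fields
-- used for sorting and range extraction, so sorts are not
-- rebuilt on the fly for every query
CREATE TABLE ticks (
  secID     TEXT,   -- security identifier
  tradetime TEXT,   -- timestamp stored as ISO-8601 text
  price     REAL,
  vol       REAL
);
CREATE INDEX idx_ticks_sec_time ON ticks (secID, tradetime);

-- A typical extraction into R would then be a query such as:
-- SELECT tradetime, price, vol FROM ticks
--   WHERE secID = 'XYZ' ORDER BY tradetime;
```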
With more specifics I might be able to give more suggestions. (R-SIG-DB may be a better forum for this discussion.) HTH, Paul On 13-09-18 01:06 PM, Simon Zehnder wrote: Dear R-Devels, I am right now designing a package intended to simplify the handling of market microstructure data (tick data, order data, etc). As these data are most times pretty huge and need to be reordered quite often (e.g. if several security data is batched together or if only a certain time range should be considered) - the package needs to handle this. Before I start, I would like to mention some facts which made me decide to construct my own package instead of using e.g. the packages bigmemory, highfrequency, zoo or xts: AFAIK bigmemory does not provide the opportunity to handle data with different types (timestamp, string and numerics) and their appropriate sorting; for this task databases offer better tools. Package highfrequency is designed to work specifically with a certain data structure and the data in market microstructure has much greater versatility. Packages zoo and xts offer a lot of versatility but do not offer the data sorting ability needed for such big data. I would like to get some feedback in regard to my decision and in regard to the short design overview following. My design idea is now: 1. Base the package on S4 classes, with one class that handles data-reading from external sources, structuring and reordering. Structuring is done in regard to specific data variables, i.e. security ID, company ID, timestamp, price, volume (not all have to be provided, but some surely exist in market microstructure data). The less important variables are considered as a slot @other and are only ordered in regard to the other variables. Something like this: .mmstruct <- setClass('mmstruct', representation(name = "character", index = "array", N = "integer", K = "integer", compID = "array", secID = "array", tradetime = "POSIXlt", flag = "array", price = "array", vol = "array", other = "data.frame")) 2. 
To enable a lightweight ordering function, the class should basically create an SQLite database on construction and delete it if 'rm()' is called. Throughout its life an object holds the database path and can execute queries on the database tables. By this, I can use the table sorting of SQLite (e.g. by constructing an index for each important variable). I assume this is faster and more efficient than programming something on my own - why reinvent the wheel? For this I would use VIRTUAL classes like: .mmstructBASE <- setClass('mmstructBASE', representation(dbName = "character", dbTable = "character")) .mmstructDB <- setClass('mmstructDB', representation(conn = "SQLiteConnection"), contains = c("mmstructBASE")) .mmstruct <- setClass('mmstruct', representation(name = "character", index = "array", N = "integer", K
[Rd] helping R-forge build
(subject changed from Re: [Rd] declaring package dependencies ) ... Yes useful. But that includes a package build system (which is what breaks on R-Forge). If you could do that on a six-pack then could you fix R-Forge on a three-pack first please? The R-Forge build system is itself an open source package on R-Forge. Anyone can look at it, understand it and change it to be more stable. That build system is here: https://r-forge.r-project.org/R/?group_id=34 (I only know this because Stefan told me once. So I suspect others don't know either, or it hasn't sunk in that we're pushing on an open door.) Matthew Open code is necessary, but to debug one needs access to logs, etc., to see where it is breaking. Do you know how to find that information? (And, BTW, there are also tools to help automatically build R and test packages at http://automater.r-forge.r-project.org/ .) Paul
Re: [Rd] FOSS licence with BuildVignettes: false
On 13-09-16 05:19 AM, Uwe Ligges wrote: ... Yes, and I could see really rare circumstances where vignette building takes a long time and the maintainer decides not to build vignettes as part of the daily checks. ... I thought 'BuildVignettes: FALSE' only turns off assembling the pdf; all the code is still run. I don't think that would affect the time very much. Am I wrong (again)? Paul Uwe Ligges
Re: [Rd] declaring package dependencies
On 13-09-14 07:20 PM, Duncan Murdoch wrote: On 13-09-14 12:19 PM, Paul Gilbert wrote: On 13-09-14 09:04 AM, Duncan Murdoch wrote: On 13-09-13 12:00 PM, Dirk Eddelbuettel wrote: On 13 September 2013 at 11:42, Paul Gilbert wrote: | On 13-09-13 11:02 AM, Dirk Eddelbuettel wrote: | It's not so much Rcpp itself or my 20-ish packages but the fact that we (as | in the Rcpp authors) now stand behind an API that also has to accomodate | changes in R CMD check. Case in point is current (unannounced) change that | makes all Depends: Rcpp become Imports: Rcpp because of the NAMESPACE checks. | | I am a bit confused by this Dirk, so maybe I am missing something. I | think this is still a Note in R-devel so you do have some time to make | the change, at least several months, maybe more. It is not quite what I | think of as an announcement, more like a shot across the bow, but it | is also not unannounced. One package author [as in user of Rcpp and not an author of it] was told by CRAN this week to change his package and came to me for help -- so in that small way the CRAN non-communication policy is already creating more work for me, and makes me look silly as I don't document what Rcpp-using packages need as I sadly still lack the time machine or psychic powers to infer what may get changed this weekend. | More importantly, I don't think that the requirement is necessarily to | change Depends: Rcpp to Imports: Rcpp, the requirement is to put | imports(Rcpp) in the NAMESPACE file. I think this is so that the package | continues to work even if the user does something with the search path. | The decision to change Depends: Rcpp to Imports: Rcpp really depends on | whether the package author wants Rcpp functions to be available directly Rcpp is a bit of an odd-ball as you mostly need it at compile-time, and you require very few R-level functions (but there is package initialization etc pp). 
We also only use about two handfuls of functions, and those are for functionality not all 135 packages use (e.g. Modules etc.). But the focus here should not be on my hobby package. The focus needs to be on how four CRAN maintainers (who do a boatload of amazing work which is _truly_ appreciated in its thoroughness and reach) could make the life of authors of 4800+ packages easier by communicating and planning a tad more. Let me paraphrase that: The CRAN maintainers do a lot of work, and it helps me a lot, but if they only did a little bit more work it would help me even more. I suspect they'd be more receptive to suggestions that had them doing less work, not more. Actually, this is one of the parts that I do not understand. It seems to me that it would be a lot less work for CRAN maintainers if the implications and necessary changes to packages were explained a bit more clearly in a forum like R-devel that many package developers actually read regularly. Then why don't you explain them? They aren't secret. Well, I have been trying to do that on this and related threads over the past few weeks. But there is a large credibility difference between my explanation of something I am just learning about myself and an explanation by a core member or CRAN maintainer of something they have implemented. (At least, I hope most readers of this list know there is a difference.) I may not fully understand how much of the response to package submission gets done automatically, but I do get the sense that there is a fairly large amount of actual human time spent dealing with just my submissions alone. If that is representative of all developers, then CRAN maintainers don't have time to do much else. (The fact that they do much more suggests I may not be representative.) Two specific points have already been mentioned implicitly. CRAN submission testing is often done at a higher/newer level using the latest devel version. 
This results in lots of rejections for things that I would fix before submission, if I knew about them. Then why don't you test against R-devel before submitting? I have been relying on R-forge to provide that testing. One practical suggestion in this thread (Matthew Dowle) was to test with win-builder R-devel. This needs to be amplified. I had thought of win-builder as a mechanism to test on Windows, since I rarely work on that platform. Following the CRAN submission guidelines I test on win-builder if I am not doing the Windows testing on my own machine and the R-forge results are not available. (I think for a single package they are equivalent when R-forge is working.) But on win-builder I have usually used the R-release directory. Using the R-devel directory has the advantage that it gives an as-cran test that is almost up-to-date with the one against which the package is tested when it is submitted. Another feature of win-builder that I had not recognized is that submitted packages are available in its library for a short time, so packages with version dependencies can
Re: [Rd] declaring package dependencies
On 13-09-14 09:04 AM, Duncan Murdoch wrote: On 13-09-13 12:00 PM, Dirk Eddelbuettel wrote: On 13 September 2013 at 11:42, Paul Gilbert wrote: | On 13-09-13 11:02 AM, Dirk Eddelbuettel wrote: | It's not so much Rcpp itself or my 20-ish packages but the fact that we (as | in the Rcpp authors) now stand behind an API that also has to accomodate | changes in R CMD check. Case in point is current (unannounced) change that | makes all Depends: Rcpp become Imports: Rcpp because of the NAMESPACE checks. | | I am a bit confused by this Dirk, so maybe I am missing something. I | think this is still a Note in R-devel so you do have some time to make | the change, at least several months, maybe more. It is not quite what I | think of as an announcement, more like a shot across the bow, but it | is also not unannounced. One package author [as in user of Rcpp and not an author of it] was told by CRAN this week to change his package and came to me for help -- so in that small way the CRAN non-communication policy is already creating more work for me, and makes me look silly as I don't document what Rcpp-using packages need as I sadly still lack the time machine or psychic powers to infer what may get changed this weekend. | More importantly, I don't think that the requirement is necessarily to | change Depends: Rcpp to Imports: Rcpp, the requirement is to put | imports(Rcpp) in the NAMESPACE file. I think this is so that the package | continues to work even if the user does something with the search path. | The decision to change Depends: Rcpp to Imports: Rcpp really depends on | whether the package author wants Rcpp functions to be available directly Rcpp is a bit of an odd-ball as you mostly need it at compile-time, and you require very few R-level functions (but there is package initialization etc pp). We also only about two handful of functions, and those are for functionality not all 135 packages use (eg Modules etc). But the focus here should not be on my hobby package. 
The focus needs to be on how four CRAN maintainers (who do a boatload of amazing work which is _truly_ appreciated in its thoroughness and reach) could make the life of authors of 4800+ packages easier by communicating and planning a tad more. Let me paraphrase that: The CRAN maintainers do a lot of work, and it helps me a lot, but if they only did a little bit more work it would help me even more. I suspect they'd be more receptive to suggestions that had them doing less work, not more. Actually, this is one of the parts that I do not understand. It seems to me that it would be a lot less work for CRAN maintainers if the implications and necessary changes to packages were explained a bit more clearly in a forum like R-devel that many package developers actually read regularly. I may not fully understand how much of the response to package submission gets done automatically, but I do get the sense that there is a fairly large amount of actual human time spent dealing with just my submissions alone. If that is representative of all developers, then CRAN maintainers don't have time to do much else. (The fact that they do much more suggests I may not be representative.) Two specific points have already been mentioned implicitly. CRAN submission testing is often done at a higher/newer level using the latest devel version. This results in lots of rejections for things that I would fix before submission, if I knew about them. If the tests were rolled out with R, and only later incorporated into CRAN submission testing, I think there would be a lot less work for the CRAN maintainers. (This is ignoring the possibility that CRAN submission is really the testing ground for the tests, and to prove the tests requires a fair amount of manual involvement. I'm happy to continue contributing to this -- I've often felt my main contribution is an endless supply of bugs for the checkers to catch.) 
The second point is that a facility like R-forge that runs the latest checks, on many platforms, is really useful in order to reduce work for both package developers and CRAN maintainers. With R-forge broken, the implication for additional work for CRAN maintainers seems enormous. But even with it working, not all packages are kept on R-forge, and with package version dependencies R-forge does not really work. (i.e. I have to get new versions of some packages onto CRAN before the new versions of other packages will build on R-forge.) Perhaps the package checking part of R-forge should be separated into a pre-submission clearing house to which packages are submitted. If they pass checks there then the package developer could click on a submit button to do the actual submission to CRAN. (Of course there needs to be a mechanism to plead for the fact that the test systems do not have needed resources.) Something like the daily, but with new pre-release versions of packages might actually be better than the R-forge
Re: [Rd] declaring package dependencies
On 13-09-13 11:02 AM, Dirk Eddelbuettel wrote: On 13 September 2013 at 10:38, Duncan Murdoch wrote: | On 13/09/2013 10:18 AM, Dirk Eddelbuettel wrote: | On 13 September 2013 at 09:51, Duncan Murdoch wrote: | | Changes are generally announced in the NEWS.Rd file long before release, | | but R-devel is an unreleased version, so you won't see the news until it | | is there. Announcing things that nobody can try leads to fewer useful | | comments than putting them into R-devel where at least people can see | | what is really happening. | | That comment makes sense _in theory_. | | Yet _in practice_ it does not as many of us have been shot down by tests in | R-devel which had been implemented within a 48 hour window of the package | submission. | | It sounds as though you are talking about CRAN here, not R. I can't | speak for CRAN. Hah :) -- in practice you actually do as the service you built to create RSS summaries of R NEWS changes (ie R Core) is one good way to learn about CRAN changes as the CRAN folks use the R Core access to R itself (via R CMD check) to effect change. And yes: we all want change for the better. But we also want a more grown-up process. | Absent a time machine or psychic powers, I do not see how package developers | can reasonably be expected to cope with this. | | I'm a CRAN user as a package developer, and I do get emails about | changes, but I don't find them overwhelming, and I don't recall | receiving any that were irrational. Generally the package is improved | when I follow their advice. It has happened that I have been slower | than they liked in responding, but the world didn't end. Of course they improve. The long arc of history points to progress. Packages are better than they used to be (cf NAMESPACE discussion). Nobody disputes that. But what we take exception with is the _process_ and the manner in which changes are (NOT REALLY) communicated, or even announced within a window. 
| I imagine Rcpp pushes the limits more than my packages do, but I think | most developers can cope. After all, the number of packages on CRAN is | increasing, not decreasing. It's not so much Rcpp itself or my 20-ish packages but the fact that we (as in the Rcpp authors) now stand behind an API that also has to accommodate changes in R CMD check. Case in point is the current (unannounced) change that makes all Depends: Rcpp become Imports: Rcpp because of the NAMESPACE checks. I am a bit confused by this Dirk, so maybe I am missing something. I think this is still a Note in R-devel so you do have some time to make the change, at least several months, maybe more. It is not quite what I think of as an announcement, more like a shot across the bow, but it is also not unannounced. More importantly, I don't think that the requirement is necessarily to change Depends: Rcpp to Imports: Rcpp; the requirement is to put import(Rcpp) in the NAMESPACE file. I think this is so that the package continues to work even if the user does something with the search path. The decision to change Depends: Rcpp to Imports: Rcpp really depends on whether the package author wants Rcpp functions to be available directly to users without them needing to specifically attach Rcpp. They are available with Depends but with Imports they are just used internally in the package. So, one of us is confused. Usually it is me. Paul Yet I cannot really talk to 135 packages using Rcpp as I have CRAN Policy document to point to. Dirk
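The distinction drawn above can be made concrete with the two files side by side; a sketch for a hypothetical package that uses Rcpp internally only:

```
## DESCRIPTION (sketch): Rcpp is a dependency but is not
## attached to the user's search path
Imports: Rcpp

## NAMESPACE (sketch): Rcpp sits on the package's own import
## path, unaffected by what the user attaches or detaches
import(Rcpp)
```

Switching the DESCRIPTION line from Depends: to Imports: on top of the same NAMESPACE changes only whether end users see Rcpp functions directly; the package's own access is identical either way.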
Re: [Rd] declaring package dependencies
Michael (Several of us are struggling with these changes, so my comments are from the newly initiated point of view, rather than the fully knowledgeable.) On 13-09-12 09:38 AM, Michael Friendly wrote: I received the following email note re: the vcdExtra package A vcd update has shown that packages TIMP and vcdExtra are not declaring their dependence on colorspace/MASS: see http://cran.r-project.org/web/checks/check_results_vcdExtra.html But, I can't see what to do to avoid this, nor understand what has changed in R devel. Lots in this respect. Sure enough, CRAN now reports errors in examples using MASS::loglm(), using R Under development (unstable) (2013-09-11 r63906) Caesar.mod0 <- loglm(~Infection + (Risk*Antibiotics*Planned), data=Caesar) Error: could not find function loglm In DESCRIPTION I have Depends: R (>= 2.10), vcd, gnm (>= 1.0.3) The modern way of thinking about this is that the Depends line should not have much in it, only things from other packages that you want directly available to the user. (There are a few other exceptions necessary for packages that have not themselves embraced the modern way.) Since you may want users of vcdExtra to automatically have access to functions in vcd, without needing to execute library(vcd), this classifies as one of the official exceptions and you probably want vcd in the Depends line. However, chances are that gnm should be in Imports:. If vcd is in the Depends line then it is automatically attached and your examples do not need library(vcd) or require(vcd). The Note Unexported object imported by a ‘:::’ call: ‘vcd:::rootogram.default’ is harder to decide how to deal with. (This is still just a note, but it looks to me like a note that will soon become a warning or error.) The simple solution is to export rootogram.default from vcd, but that exposes it to all users, and really you may just want to expose it to packages like vcdExtra. There was some recent discussion about this on R-devel. 
I suggested one possibility would be some sort of limited export. Since that was a suggestion that required work by someone else, it probably went the same place as most of those suggestions do. The solution I have adopted for the main case where this causes me problems is to split the classes, generics, and methods into one package, and the user functions into another. For example, if you had rootogram.default in a package called vcdClasses and exported it, then both vcd and vcdExtra could import it, but if it is not in their Depends line then it will not be visible to a user that executes library(vcd) or library(vcdExtra). Beware that there is currently a small gotcha if the generics are S3, which was discussed recently and a patch submitted by Henrik Bengtsson (See Re: [Rd] False warning on replacing previous import when re-exporting identical object .) Although there has been much moaning about these changes, including my own, I think the general logic is a real improvement. The way I think of it, the namespace imports for a package provide the equivalent of a search path for functions in the package, which is not changed by what packages a user or other packages attach or import. Thus a package developer has much more certain control over where the functions used by the package will come from. This is a trade-off for safety rather than convenience, thus the moaning. I am a complete newbie on this, but there seems to be a pretty good unofficial description at http://obeautifulcode.com/R/How-R-Searches-And-Finds-Stuff/. Suggests: ca,gmodels,Fahrmeir,effects,VGAM,plyr,rgl,lmtest,MASS,nnet,ggplot2,Sleuth2,car If it is only in Suggests you can refer to it in the example by MASS::loglm(), or require(MASS)/library(MASS). (I might have that wrong, at least one works but I'm not certain of both.) 
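The Suggests usage mentioned above can be sketched in R (a sketch only, not Michael's actual example code; it assumes the Caesar data from vcdExtra is available):

```r
# Sketch: using a Suggests: package in an example.
# Either attach it conditionally ...
if (require(MASS)) {
  fit <- loglm(~ Infection + (Risk * Antibiotics * Planned), data = Caesar)
}
# ... or call it without attaching: :: loads the MASS
# namespace but does not put it on the search path
fit <- MASS::loglm(~ Infection + (Risk * Antibiotics * Planned), data = Caesar)
```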
and the vcd DESCRIPTION has Depends: R (>= 2.4.0), grid, stats Suggests: KernSmooth, mvtnorm, kernlab, HSAUR, coin Imports: utils, MASS, grDevices, colorspace Probably grid and stats should be in Imports. so, in an R 3.0.0 console, library(vcdExtra) loads vcd and its dependencies: library(vcdExtra) Loading required package: vcd Loading required package: MASS Loading required package: grid Loading required package: colorspace Loading required package: gnm Warning messages: 1: package ‘vcd’ was built under R version 3.0.1 2: package ‘MASS’ was built under R version 3.0.1 Note: these CRAN errors do not occur on R-Forge, using R version 3.0.1 Are you actually getting anything to build on R-forge? All my packages have been stuck for a couple of weeks, as have many others. Paul Patched (2013-08-21 r63645) and the latest devel version (0.5-11) of vcdExtra. -Michael
Re: [Rd] False warning on replacing previous import when re-exporting identical object
This is related to the recent thread on correct NAMESPACE approach when writing S3 methods. If your methods are S4 I think pkgB does not need to export the generic. Just export the method and everything works magically and your problem disappears. For S3 methods there seems to be the difficulty you describe. Of course, the difference between S3 and S4 on this appears somewhat bug-like. (I have not tested all this very carefully so I may have something wrong.) Paul Henrik Bengtsson h...@biostat.ucsf.edu wrote: Hi, SETUP: Consider three packages PkgA, PkgB and PkgC. PkgA defines a generic function foo() and exports it; export(foo) PkgB imports PkgA::foo() and re-exports it; importFrom(PkgA, foo) export(foo) PkgC imports everything from PkgA and PkgB: import(PkgA, PkgB) PROBLEM: Loading or attaching the namespace of PkgC will generate a warning: replacing previous import by 'PkgA::foo' when loading 'PkgC' This in turn causes 'R CMD check' on PkgC to generate a WARNING (no-go at CRAN): * checking whether package 'PkgC' can be installed ... WARNING Found the following significant warnings: Warning: replacing previous import by 'PkgA::foo' when loading 'CellularAutomaton' FALSE? Isn't it valid to argue that this is a false warning, because identical(PkgB::foo, PkgA::foo) is TRUE and therefore has no effect? /Henrik PS. The above can be avoided by using explicit importFrom() on PkgA and PkgB, but that's really tedious. In my case this is out of my reach, because I'm the author of PkgA and PkgB but not many of the PkgC packages.
[Rd] ‘:::’ call
I have a package (TSdbi) which provides end user functions that I export, and several utilities for plugin packages (e.g. TSMySQL) that I do not export because I do not intend them to be exposed to end users. I call these from the plugin packages using TSdbi::: but that now produces a note in the checks: * checking dependencies in R code ... NOTE Namespace imported from by a ‘:::’ call: ‘TSdbi’ See the note in ?`:::` about the use of this operator. :: should be used rather than ::: if the function is exported, and a package almost never needs to use ::: for its own functions. Is there a preferred method to accomplish this in a way that does not produce a note? Thanks, Paul
Re: [Rd] ‘:::’ call
On 13-08-28 12:29 PM, Marc Schwartz wrote: On Aug 28, 2013, at 11:15 AM, Paul Gilbert pgilbert...@gmail.com wrote: I have a package (TSdbi) which provides end user functions that I export, and several utilities for plugin packages (e.g. TSMySQL) that I do not export because I do not intend them to be exposed to end users. I call these from the plugin packages using TSdbi::: but that now produces a note in the checks: * checking dependencies in R code ... NOTE Namespace imported from by a ‘:::’ call: ‘TSdbi’ See the note in ?`:::` about the use of this operator. :: should be used rather than ::: if the function is exported, and a package almost never needs to use ::: for its own functions. Is there a preferred method to accomplish this in a way that does not produce a note? Thanks, Paul Paul, See this rather lengthy discussion that occurred within the past week: https://stat.ethz.ch/pipermail/r-devel/2013-August/067180.html Regards, Marc Schwartz I did follow the recent discussion, but no one answered the question "Is there a preferred method to accomplish this?" (I suppose the answer is that there is no other way, given that no one actually suggested anything else.) Most of the on-topic discussion in that thread was about how to subvert the CRAN checks, which is not what I am trying to do and was also pointed out as a bad idea by Duncan. The substantive response was r63654 has fixed this particular issue, and R-devel will no longer warn against the use of ::: on packages of the same maintainer. Regards, Yihui but that strikes me as a temporary workaround rather than a real solution: suppose plugins are provided by a package from another maintainer. Since CRAN notes have a habit of becoming warnings and then errors, it seems useful to identify the preferred legitimate approach while this is still a note. That would save work for both package developers and CRAN maintainers. 
My thinking is that there is a need for a NAMESPACE directive something like limitedExport() that allows ::: for identified functions without provoking a CRAN complaint when packages use those functions. But there may already be a better way I don't know about. Or perhaps the solution is to split the end user functions and the utilities for plugin packages into two separate packages? Paul
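Absent a limitedExport()-style directive, the closest existing pattern may be to export the plug-in hooks under dot-prefixed names and have plug-ins call them with ::. A sketch only; the function names below are hypothetical, not TSdbi's actual API:

```r
## Sketch (hypothetical names). In TSdbi's NAMESPACE, export the end-user
## functions and, separately, dot-prefixed hooks meant only for plug-ins:
export(TSconnect, TSget)   # end-user API
export(.TSwriteHelper)     # plug-in hook, "hidden" by the leading dot

## In a plug-in such as TSMySQL the hook is then formally exported, so it
## can be reached with :: and no ':::' NOTE is produced:
## NAMESPACE:  importFrom(TSdbi, .TSwriteHelper)
## R code:     val <- TSdbi::.TSwriteHelper(con, x)
```

The dot prefix only hides the function from casual discovery; it is still part of the exported namespace, which is exactly the two-level compromise under discussion.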
Re: [Rd] ‘:::’ call
I may have confused things by referring to ':::' which everyone reads as not exported, not documented, not part of the API, constantly changing, ... In my mind, the real question is about two levels of exporting, one to other package developers, and another to end users. In both cases they are part of the API, relatively constant, and documented. (I try to document even internal functions, otherwise I can't remember what they do.) So far, I see three possible solutions:

1/ R adds another namespace directive allowing certain functions to be exported differently, possibly just by causing the checks to be silent about ::: when those functions are used in that way by other packages.

2/ The package gets split in two, one for use by other packages and one for use by end users.

3/ Some functions are exported normally but hidden by using . in the beginning of their names. Other package maintainers would know they exist, but end users would not so easily find them.

(Duncan's other suggestion of using \keyword{internal} in the .Rd file strikes me as problematic. I'm surprised CRAN checks do not already object to functions exported and documented with \keyword{internal}.)

Paul

On 13-08-28 03:44 PM, Yihui Xie wrote:
If this issue is going to be solved at all, it might end up as yet another hack like utils::globalVariables, just to fix R CMD check, which was trying to fix things that were not necessarily broken. To be clear, I was not suggesting subverting this check. What I was hoping for is a way to tell CRAN that "Yes, I have read the documentation; I understand the risk, and I want to take it like a moth flying into the flames." Many people have been talking about this risk, but how about some evidence? Who was bitten by :::? How many real cases in which a package was broken by :::? Yes, unexported functions may change, but so may exported functions (they may change API, be deprecated, add new arguments, change defaults, and so on).
Almost everything in a package is constantly evolving, and I believe the correct way (and the only way) to stop things from being broken is to write enough test cases. When something is broken, we will be able to know that. Yes, we may not have control over other people's packages, but we always have control over our own test cases. IMHO, testing is the justification of CRAN's reputation and quality, and that is a part of what CRAN does. In God we trust, and everyone else should bring tests. Regards, Yihui -- Yihui Xie xieyi...@gmail.com Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Wed, Aug 28, 2013 at 1:50 PM, Paul Gilbert pgilbert...@gmail.com wrote: On 13-08-28 12:29 PM, Marc Schwartz wrote: On Aug 28, 2013, at 11:15 AM, Paul Gilbert pgilbert...@gmail.com wrote: I have a package (TSdbi) which provides end user functions that I export, and several utilities for plugin packages (e.g. TSMySQL) that I do not export because I do not intend them to be exposed to end users. I call these from the plugin packages using TSdbi::: but that now produces a note in the checks: * checking dependencies in R code ... NOTE Namespace imported from by a ‘:::’ call: ‘TSdbi’ See the note in ?`:::` about the use of this operator. :: should be used rather than ::: if the function is exported, and a package almost never needs to use ::: for its own functions. Is there a preferred method to accomplish this in a way that does not produce a note? Thanks, Paul Paul, See this rather lengthy discussion that occurred within the past week: https://stat.ethz.ch/pipermail/r-devel/2013-August/067180.html Regards, Marc Schwartz I did follow the recent discussion, but no one answered the question Is there a preferred method to accomplish this? (I suppose the answer is that there is no other way, given that no one actually suggested anything else.) 
Most of the on topic discussion in that thread was about how to subvert the CRAN checks, which is not what I am trying to do and was also pointed out as a bad idea by Duncan. The substantive response was r63654 has fixed this particular issue, and R-devel will no longer warn against the use of ::: on packages of the same maintainer. Regards, Yihui but that strikes me as a temporary work around rather than a real solution: suppose plugins are provided by a package from another maintainer. Since CRAN notes have a habit of becoming warnings and then errors, it seems useful to identify the preferred legitimate approach while this is still a note. That would save work for both package developers and CRAN maintainers. My thinking is that there is a need for a NAMESPACE directive something like limitedExport() that allows ::: for identified functions without provoking a CRAN complaint when packages use those functions. But there may already be a better way I don't know about. Or perhaps the solution is to split the end user functions and the utilities for plugin
Re: [Rd] ‘:::’ call
On 13-08-28 05:13 PM, Hadley Wickham wrote:
3/ Some functions are exported normally but hidden by using . in the beginning of their names. Other package maintainers would know they exist, but end users would not so easily find them. (Duncan's other suggestion of using \keyword{internal} in the .Rd file strikes me as problematic. I'm surprised CRAN checks do not already object to functions exported and documented with \keyword{internal}.)

Why? I think this is exactly the use case of \keyword{internal}. From Writing R Extensions: "The special keyword ‘internal’ marks a page of internal objects that are not part of the package’s API"

which suggests to me that a function with \keyword{internal} should not be exported, since that makes it part of the API. And, if it is really for internal use in a package, why would you export it? I think you are interpreting internal to mean internal to a group of packages, not internal to a package. But that is just the complement of what I am saying: there may be a need for two levels of export. (Also, if you export it then you should document it, but for many maintainers \keyword{internal} is shorthand for "I don't need to document this properly because no one is supposed to use it outside the package.")

Paul
Re: [Rd] Correct NAMESPACE approach when writing an S3 method for a generic in another package
On 13-08-26 12:04 PM, Gavin Simpson wrote: Right Henrik, but then you have to document it or R CMD check raises a Warning, which is less likely to pass muster when submitting to CRAN. So you document that method on your existing method's Rd page (just via an \alias{}), which is fine until the user does end up attaching the original source of the method, and then you get the annoying warnings about masking and `?plot3d` will bring up a dialogue asking which version of the help you want to read. Part of me thinks it would be better if there was a mechanism whereby a generic will just work if package foo imports that generic and exports a method for it. Either I am messing up something again (reasonably likely) or it does just work with S4 methods. I can import the namespace that has the generic and the methods work, I do not seem to need to export the generic. Is S3 working differently? I do have the documentation problem when I try to export other imported functions that I would like available to users. Paul Cheers, G On 26 August 2013 09:42, Henrik Bengtsson h...@biostat.ucsf.edu wrote: On Mon, Aug 26, 2013 at 1:28 AM, Martyn Plummer plumm...@iarc.fr wrote: I think rgl should be in Depends. You are providing a method for a generic function from another package. In order to use your method, you want the user to be able to call the generic function without scoping (i.e. without calling rgl::plot3d), so the generic should be on the search path, so the package that provides it should be listed in Depends in the NAMESPACE file. You can re-export an imported object, but it has to be done via an explicit export(), cf. It is possible to export variables from a namespace which it has imported from other namespaces: this has to be done explicitly and not via exportPattern [Writing R Extensions]. /H Martyn On Fri, 2013-08-23 at 22:01 -0600, Gavin Simpson wrote: Dear List, In one of my packages I have an S3 method for the plot3d generic function from package rgl. 
I am trying to streamline my Depends entries but don't know how to have plot3d(foo) in the examples section for the plot3d method in my package, without rgl being in Depends. Note that I importFrom(rgl, plot3d) and register my S3 method via S3method() in the NAMESPACE. If rgl is not in Depends but in Imports, I see this when checking the package:

## 3D plot of data with curve superimposed
plot3d(aber.pc, abernethy2)
Error: could not find function "plot3d"

I presume this is because rgl's namespace is only loaded but the package is not attached to the search path. Writing R Extensions indicates that one can export from a namespace something that was imported from another package namespace. I thought that might help the situation, and now the code doesn't raise an error, but I get

* checking for missing documentation entries ... WARNING
Undocumented code objects: ‘plot3d’
All user-level objects in a package should have documentation entries.
See the chapter ‘Writing R documentation files’ in the ‘Writing R Extensions’ manual.

as I don't document plot3d() itself. What is the recommended combination of Depends and Imports plus NAMESPACE directives etc. that one should use in this situation? Or am I missing something else? I have a similar issue with my package including an S3 method for a generic in the lattice package, so if possible I could get rid of both of these from Depends if I can solve the above issue. Thanks in advance.

Gavin
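Putting the pieces of this thread together, one combination that avoids both the missing-function error and the undocumented-object warning is roughly the following sketch. The class name is invented (Gavin's post doesn't show it); the re-export plus an \alias{plot3d} on an existing help page is what quiets the WARNING:

```r
## NAMESPACE sketch for providing an S3 method for rgl's plot3d generic
importFrom(rgl, plot3d)      # import the generic from rgl
S3method(plot3d, prcurve)    # register the method; "prcurve" is a made-up class
export(plot3d)               # re-export the imported generic so users can call
                             # plot3d(foo) without attaching rgl

## In the method's Rd file, also add:  \alias{plot3d}
## so the re-exported generic has a documentation entry.
```

With this, rgl can stay in Imports rather than Depends, at the cost of the masking message and the help-page disambiguation dialogue Gavin mentions when rgl is later attached.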
[Rd] Depends vs Imports
I am being asked to modernize the Depends line in the DESCRIPTION file of some packages. Writing R Extensions says: "The general rules are: Packages whose namespace only is needed to load the package using library(pkgname) must be listed in the ‘Imports’ field and not in the ‘Depends’ field. Packages listed in imports or importFrom directives in the NAMESPACE file should almost always be in ‘Imports’ and not ‘Depends’. Packages that need to be attached to successfully load the package using library(pkgname) must be listed in the ‘Depends’ field, only."

Could someone please explain a few points I thought I understood but obviously do not, or point to where these are explained:

- What does it mean for the namespace only to be needed? I thought the namespace was needed if the package or some of its functions were mentioned in the NAMESPACE file, and that only the namespace was needed if only the generics were called, and not other functions. The above suggests that I may be wrong about this. If so, that is, Imports will usually suffice, then when would Depends ever be needed when a package is mentioned in the NAMESPACE file?

- Should the package DESCRIPTION make any accommodation for the situation where users will probably need to directly call functions in the imported package, even though the package itself does not?

- What does "need to be attached" mean? Is there a distinction between a package being attached and a namespace being attached?

- Does "successfully load" mean something different from actually using the package? That is, can we assume that if the package loads then all the functions to run things will actually be found?

- If pkg1 uses a function foo in pkg3 indirectly, by a call to a function in pkg2 which then uses foo, how should pkg1 indicate the relationship with foo's pkg3, or is there no need to indicate any relationship with pkg3 because that is all looked after by pkg2?
Thanks, Paul
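As a concrete (hypothetical) reading of the rules quoted above: a package that only calls DBI functions from its own code lists DBI in Imports, while a package whose users must be able to call rgl functions unqualified at the prompt lists rgl in Depends:

```
Package: myPkg
Version: 0.1-0
Depends: R (>= 2.15.0), rgl
Imports: DBI
```

together with a matching importFrom(DBI, dbDriver) (and any needed rgl imports) in the NAMESPACE file. Package and version numbers here are stand-ins for illustration.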
Re: [Rd] Depends vs Imports
Simon Thanks, that helps a lot, but see below .. On 13-07-31 08:35 PM, Simon Urbanek wrote: On Jul 31, 2013, at 7:14 PM, Paul Gilbert wrote: I am being asked to modernize the Depends line in the DESCRIPTION file of some packages. Writing R Extensions says: The general rules are Packages whose namespace only is needed to load the package using library(pkgname) must be listed in the ‘Imports’ field and not in the ‘Depends’ field. Packages listed in imports or importFrom directives in the NAMESPACE file should almost always be in ‘Imports’ and not ‘Depends’. Packages that need to be attached to successfully load the package using library(pkgname) must be listed in the ‘Depends’ field, only. Could someone please explain a few points I thought I understood but obviously do not, or point to where these are explained: -What does it mean for the namespace only to be needed? I thought the namespace was needed if the package or some of its functions were mentioned in the NAMESPACE file, and that only the namespace was needed if only the generics were called, and not other functions. The above suggests that I may be wrong about this. If so, that is, Imports will usually suffice, then when would Depends ever be needed when a package is mentioned in the NAMESPACE file? In the namespace era Depends is never really needed. All modern packages have no technical need for Depends anymore. Loosely speaking the only purpose of Depends today is to expose other package's functions to the user without re-exporting them. This seems to mostly work, except in the situation where a package is used that enhances an imported package. For example, I Import DBI but the call dbDriver(MySQL) fails looking for MySQL in package RMySQL if I only import that and do not list it in Depends. Am I missing something? Similarly, I have a package tframePlus that provides extra methods (for zoo and xts) for my package tframe. 
Since tframe does not depend or import tframePlus (in fact, the reverse), I seem to need tframePlus in Depends not Imports of another package that Imports tframe. Does this sound right or am I missing something else? Also, I have a package TSMySQL which enhances my package TSdbi. When a user uses TSMySQL they will want to use many functions in TSdbi. Here again, I seem to need TSMySQL to Depend on TSdbi, for the reason you mention, exposing all the functions to the user. (I'm glad this is simple, I have trouble when things are difficult.) Thanks again, Paul -Should the package DESCRIPTION make any accommodation for the situation where users will probably need to directly call functions in the imported package, even though the package itself does not? -What does need to be attached mean? Is there a distinction between a package being attached and a namespace being attached. No, the distinction is between loaded and attached (namespace/package is synonymous here). -Does successfully load mean something different from actually using the package? That is, can we assume that if the package loads then all the functions to run things will actually be found? Define found - they will not be attached to the search path, so they will be found if you address them fully via myPackage::myFn but not just via myFn (except for another package that imports myPackage). -If pkg1 uses a function foo in pkg3 indirectly, by a call to a function in pkg2 which then uses foo, how should pkg1 indicate the relationship with foo's pkg3, or is there no need to indicate any relationship with pkg3 because that is all looked after by pkg2? There is no need - how would you imagine being responsible for code that you did not write? pkg2 will import function from pkg1, but you're not importing them in pkg3, you don't even care about them so you have no direct relationship with pkg1 (imagine pkg2 switched to use pkg4 instead of pkg1). 
IMHO it's all really simple:

load = functions exported in myPkg are available to interested parties as myPkg::foo or via direct imports; essentially this means the package can now be used.

attach = the namespace (and thus all exported functions) is attached to the search path; the only effect is that you have now added the exported functions to the global pool of functions, sort of like dumping them in the workspace (for all practical purposes, not technically).

import a function into a package = make sure that this function works in my package regardless of the search path (so I can write fn1 instead of pkg1::fn1 and still know it will come from pkg1 and not someone's workspace or another package that chose the same name).

Cheers, Simon
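Simon's load/attach distinction can be seen interactively; a sketch using the base distribution's stats4 package as a stand-in (any installed package with exports would do):

```r
## load only: the namespace is loaded, nothing goes on the search path
loadNamespace("stats4")
exists("mle")        # FALSE in a fresh session: exports are not attached
stats4::mle          # but fully qualified access works

## load AND attach: library() also puts the exports on the search path
library(stats4)
exists("mle")        # TRUE now: mle is found without qualification
```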
Re: [Rd] R-3.0.1 - transient make check failure in splines-EX.r
Avraham,

I resolved this only by switching to a different BLAS on the 32-bit machine. Since no one else seemed to be having problems, I considered it possible that there was a hardware issue on my old 32-bit machine. The R check test failed somewhat randomly, but often. Most disconcertingly, it failed because it gives different answers. If you source the code in an R session a few times you have no trouble reproducing this. It gives the impression of an improperly zeroed matrix. (All this from memory, I'm on the road.)

Paul

On 13-05-28 06:36 PM, Adler, Avraham wrote:
Hello. I seem to be having the same problem that Paul had in the thread titled "[Rd] R 2.15.2 make check failure on 32-bit --with-blas=-lgoto2" from October of last year: https://stat.ethz.ch/pipermail/r-devel/2012-October/065103.html Unfortunately, that thread ended without an answer to his last question.

Briefly, I am trying to compile an Rblas for Windows NT 32-bit using OpenBLAS (successor to GotoBlas) (Nehalem - corei7), and the compiled version passes all tests except for the splines-Ex test, in the exact same place that Paul had issues:

stopifnot(identical(ns(x), ns(x, df = 1)),
          identical(ns(x, df = 2), ns(x, df = 2, knots = NULL)), # not true till 2.15.2
          !is.null(kk <- attr(ns(x), "knots")), # not true till 1.5.1
          length(kk) == 0)
Error: identical(ns(x, df = 2), ns(x, df = 2, knots = NULL)) is not TRUE

Yet, opening up R and running the actual code shows that the error is transient; ten repetitions of

identical(ns(x, df = 2), ns(x, df = 2, knots = NULL))

gave [1] TRUE for most runs but [1] FALSE on the fourth and tenth.

This is the only error I have on the 32-bit version, I believe (trying to build a blas for 64-bit on SandyBridge is a completely different kettle of fish that is causing me to pull out what little hair I have left), and if it can be solved, that would be great.

Thank you, Avraham
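One way to reproduce the transience Avraham describes without rerunning the whole check is to repeat the failing comparison many times in one session; a sketch, where x is a stand-in since the real splines-Ex code defines its own:

```r
library(splines)
x <- c(1:3, 5:6)   # hypothetical data; splines-Ex uses its own x
res <- replicate(500, identical(ns(x, df = 2), ns(x, df = 2, knots = NULL)))
table(res)   # any FALSE counts here reproduce the transient failure
```

If FALSE ever appears, the two calls are returning bit-different results from identical inputs, which points at the BLAS (or hardware) rather than the splines code.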
Re: [Rd] R 3.0, Rtools3.0,l Windows7 64-bit, and permission agony
Being generally uninformed about Windows, I have to admit to almost total confusion trying to follow this thread. However, since I have recently been trying to do something in Windows, I would appreciate a newbie-friendly explanation of a few points:

- Rtools is used to build R and to build (some?) R packages. If you make Rtools an R package, how do you bootstrap the R build process?

- In unix-like OSes, configure is used before make to set things similar to the question of where to find Rtools, and what version of various tools are available, and give warnings and errors if these are not adequate. Is there a reason configure cannot be used in Windows, or is there not something similar?

- Or am I really confused and should not consider the possibility that people actually build R, so the discussion is just about packages?

Thanks, Paul

On 13-04-22 11:16 AM, Gabor Grothendieck wrote:
On Mon, Apr 22, 2013 at 10:27 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:
On 21/04/2013 6:57 PM, Hadley Wickham wrote:
PS. Hadley, is this what you meant when you wrote "Better solutions (e.g. Rstudio and devtools) temporarily set the path on when you're calling R CMD *", or those approaches are only when you call 'R CMD' from the R prompt? I believe the latter, but I just want to make sure I didn't miss something.

Well, both devtools and RStudio allow you to do package development without leaving R, so neither do anything to your path when you're not using them. In teaching Windows users to develop R packages, I found the use of the command line to be a substantial road-block, and if you can develop packages without leaving R, why not?

The idea of temporary additions to the path during the INSTALL/build/check code sounds reasonable. R could probably do it more accurately than devtools or RStudio can (since we know the requirements, and you have to guess at them), but could hopefully do it in a way that isn't incompatible with those.
The code called by install.packages() and related functions within R is essentially the same code as called by R CMD INSTALL etc. from the command line, so this would help both cases.

I would like to comment on this as I have had to implement similar facilities myself as part of R.bat in the batchfiles. There is an issue of keeping R and Rtools in sync. Currently different Rtools versions will work with the same R version. For example, I have used both Rtools 1927 and 1930 with the current version of R. It's necessary to determine the relative paths that the version of Rtools in use requires, since in principle the relative Rtools paths can vary from one version of Rtools to the next if the gcc version changes, say. Ideally the system would be able to figure this out, even if registry entries and environment variables are not set, by looking in standard locations for the Rtools root and finding the relative paths by querying some file in the Rtools installation itself. devtools does this by querying the Rtools version and uses an internal database of relative paths keyed by version. R.bat in batchfiles does it by scanning the Rtools unins000.dat file and extracting the relative paths directly from it. This has the advantage that no database need be maintained, and it also automatically adapts to new versions of Rtools without any foreknowledge of them. Of course, since you have control of both ends, you could alternatively add the relative paths to an expanded version of the VERSION file, or add some additional text file into Rtools for the purpose of identifying the relative paths. Another possibility, if significant changes were to be considered, would be to make Rtools into an R package, thereby leveraging existing facilities and much simplifying any synchronization.

-- Statistics Software Consulting GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
Re: [Rd] package file permissions problem R 3.0.0/Windows OS
On 13-04-15 03:19 PM, Prof Brian Ripley wrote: On 15/04/2013 14:11, John Fox wrote: Dear Brian, On Mon, 15 Apr 2013 06:56:26 +0100 Prof Brian Ripley rip...@stats.ox.ac.uk wrote: POSIX-style execute permission isn't a Windows concept, so it was fortuitous this ever worked. One possibility is that Cygwin was involved, and a Cygwin emulation got set when tar unpacked the file and converted back to the tar representation when Cygwin tar produced the tarball. (The tar in Rtools is a fixed version of Cygwin tar, fixed to use Windows file paths.) Recall that the problem was first detected when I submitted to CRAN a new version of the sem package that I built on one of my Windows systems. I'm guessing that you unpacked that on a Linux system. Perhaps I misunderstand the point, but if the problem is in unpacking, then shouldn't I see it when the package is built on R 2.15.2 (not 2.5.2 -- sorry, my typo)? The puzzle is how you got execute permissions recorded for files on your Windows system. They are not part of the Windows file system: Cygwin uses ACLs to emulate them. Once the ACLs are there, a Cygwin-based tar will put them as permissions into the tarball. But a native Windows tool would not (it might or might not capture the ACLs using a tar extension, but those would be ignored by most unpacking tools on a Unix-alike). The issue is not really Windows: if you use a FAT file system on a Unix-alike you have the same problem -- this is why SMB mounts at least did not work on OS X for building R (and much else), and you need to be careful transferring directories via USB sticks (which are usually FAT-formatted). That route usually makes the opposite compromise: to assume everything is executable. What are those screen shots of? 7zip, which I use on Windows to manage file archives. Ah, so that's a listing of the .tar.gz, a graphical form of tar -tvf. R 2.5.2 was a very long time ago. A recent change is Indeed. Again, that is my unfortunate typo -- I used 2.15.2. 
I wanted to confirm that I can build packages with the correct permissions on my Windows systems using an older (but recent) version of R.

• R CMD build by default uses the internal method of tar() to prepare the tarball. This is more likely to produce a tarball compatible with R CMD INSTALL and R CMD check: an external tar program, including options, can be specified _via_ the environment variable R_BUILD_TAR.

I saw that but didn't understand its import. That makes sense of a difference between R 2.15.2 and 3.0.0, though I'm not sure why this change would introduce a problem with the permissions. Can you try using an external tar? (Using the internal tar on Windows was first trialled in 2.15.3.) Yes, when I set R_BUILD_TAR=tar on my Windows 8 system, the tarball for the package is built with the correct permissions under R 3.0.0. The tar should be found in the Rtools\bin directory, which is first on my path. I don't have Cygwin installed on this machine independently of Rtools. What's curious to me is that I'm seeing the problem on two different Windows systems but, AFAIK, no one else has experienced a similar problem. Very few Windows users will ever get a file that appears to 'tar' to have execute permissions. For example, svn checkouts on Windows lose execute permissions, something which has caught me from time to time over the years.

I am just having the opposite problem: sliksvn is adding x permission on checkout, to some but not all files. Not sure why, and I don't want it to, so I would be happy to hear suggestions.

Paul

Thanks for your help, John

On 14/04/2013 22:17, John Fox wrote:
Dear list members, I'm experiencing a file permissions problem with a package built under Windows with R 3.0.0. I've encountered the problem on two Windows computers, one running Windows 7 and the other Windows 8, and both when I build the package under RStudio or directly in a Windows console via R CMD build.
In particular, the cleanup file for the package, which as I understand it should have permissions set at rwxr--r--, instead has permissions rw-rw-rw-. I've attached two .png screen shots showing how the permissions are set when the package is built under R 2.15.2 and R 3.0.0. I think that my two Windows systems are reasonably vanilla. Here are the system and session info from R 3.0.0 run from a Windows console:

> Sys.info()
  sysname: Windows
  release: 7 x64
  version: build 7601, Service Pack 1
  nodename: JOHN-DELL-XPS
  machine: x86
  login/user/effective_user: User

> sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: i386-w64-mingw32/i386 (32-bit)
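The workaround John confirmed above can be captured in one environment variable; a sketch (R_BUILD_TAR is documented in Writing R Extensions; the actual build line is commented out since it needs an R installation, and "sem" here stands in for any package directory):

```shell
# Tell R CMD build to use an external tar, e.g. the fixed Cygwin tar that
# Rtools\bin puts first on PATH, instead of R's internal tar() method.
export R_BUILD_TAR=tar
echo "R_BUILD_TAR is set to: $R_BUILD_TAR"
# R CMD build sem    # then build as usual
```

On Windows cmd.exe the equivalent would be `set R_BUILD_TAR=tar` before running `R CMD build`.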
[Rd] R-3.0.0 reg-tests-3.R / survival
make check is failing on reg-tests-3.R with a message that survival was built with an older version of R (on my Ubuntu 32-bit and Ubuntu 64-bit machines). Why would make check be looking anywhere that it would find something built with an older version of R?

~/RoboAdmin/R-3.0.0/tests$ tail reg-tests-3.Rout.fail
> print(1.001, digits=16)
[1] 1.001
## 2.4.1 gave 1.001
## 2.5.0 errs on the side of caution.
## as.matrix.data.frame with coercion
> library(survival)
Error: package 'survival' was built before R 3.0.0: please re-install it
Execution halted
Re: [Rd] R-3.0.0 reg-tests-3.R / survival
make check seems to be picking up the setting of my R_LIBS_SITE, which is still set for 2.15.3 and had survival in it. At least, when I set that to empty then the check passes. I'm not sure if checking worked like that before; I don't think so. I have not usually had base packages in my site-library. In any case, it seems like a bad idea for make check to use an existing setting of R_LIBS_SITE. At least, I think the idea is that it should be checking the just-built library.

Paul

On 13-04-03 11:36 AM, peter dalgaard wrote:
Any chance that you might have a personal library, which isn't versioned? If you do and you for some reason installed survival into it, it would explain it. E.g., I have, with the system-wide R:

> .libPaths()
[1] "/Users/pd/Library/R/2.15/library"
[2] "/opt/local/Library/Frameworks/R.framework/Versions/2.15/Resources/library"
> lapply(.libPaths(), list.files)
[[1]]
 [1] abind      aplpack    car        colorspace e1071
 [6] effects    ellipse    Hmisc      ISwR       leaps
[11] lmtest     matrixcalc mclust     multcomp   mvtnorm
[16] pcaPP      Rcmdr      relimp     represent  rgl
[21] robustbase rrcov      sem        xtable     zoo

[[2]]
 [1] base       boot       class      cluster    codetools
 [6] compiler   datasets   foreign    graphics   grDevices
[11] grid       KernSmooth lattice    MASS       Matrix
[16] methods    mgcv       nlme       nnet       parallel
[21] rpart      spatial    splines    stats      stats4
[26] survival   tcltk      tools      utils

but the one in my development build tree of 3.0.0 has

> .libPaths()
[1] "/Users/pd/r-release-branch/BUILD-dist/library"

If I explicitly set R_LIBS, I can easily reproduce your error.

On Apr 3, 2013, at 17:00 , Paul Gilbert wrote:
make check is failing on reg-tests-3.R with a message that survival was built with an older version of R (on my Ubuntu 32-bit and Ubuntu 64-bit machines). Why would make check be looking anywhere that it would find something built with an older version of R?

~/RoboAdmin/R-3.0.0/tests$ tail reg-tests-3.Rout.fail
> print(1.001, digits=16)
[1] 1.001
## 2.4.1 gave 1.001
## 2.5.0 errs on the side of caution.
## as.matrix.data.frame with coercion
> library(survival)
Error: package 'survival' was built before R 3.0.0: please re-install it
Execution halted
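Given Paul's diagnosis, a defensive way to run the tests is to clear all library-path overrides for just that one command, so only the freshly built library is searched (variable names as documented in R Installation and Administration):

```shell
# Run with user/site library overrides emptied; `env` scopes the change to
# this single invocation and leaves the login environment untouched.
env R_LIBS= R_LIBS_USER= R_LIBS_SITE= sh -c 'echo "R_LIBS_SITE=<$R_LIBS_SITE>"'
# the real invocation would be:
# env R_LIBS= R_LIBS_USER= R_LIBS_SITE= make check
```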
Re: [Rd] [BioC] enabling reproducible research R package management install.package.version BiocLite
(More on the original question further below.) On 13-03-05 09:48 AM, Cook, Malcolm wrote: All, What got me started on this line of inquiry was my attempt at balancing the advantages of performing a periodic (daily or weekly) update to the 'release' version of locally installed R/Bioconductor packages on our institute-wide installation of R with the disadvantages of potentially changing the result of an analyst's workflow in mid-project. I have implemented a strategy to try to address this as follows: 1/ Install a new version of R when it is released, and packages in the R version's site-library with package versions as available at the time the R version is installed. Only upgrade these package versions in the case they are severely broken. 2/ Install the same packages in site-library-fresh and upgrade these package versions on a regular basis (e.g. daily). 3/ When a new version of R is released, freeze but do not remove the old R version, at least not for a fairly long time, and freeze site-library-fresh for the old version. Begin with the new version as in 1/ and 2/. The old version remains available, so reverting is trivial. The analysts are then responsible for choosing the R version they use, and the library they use. This means they do not have to change R and package version mid-project, but they can if they wish. I think the above two libraries will cover most cases, but it is possible that a few projects will need their own special library with a combination of package versions. In this case the user could create their own library, or you might prefer some more official mechanism. The idea of the above strategy is to provide the stability one might want for an ongoing project, and the possibility of an upgraded package if necessary, but not encourage analysts to remain indefinitely with old versions (by say, putting new packages in an old R version library). 
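The frozen/fresh layout in steps 1/ to 3/ above can be sketched as a directory scheme; all paths here are hypothetical stand-ins, not a claim about how any particular site lays things out:

```shell
# Per-R-version pair of site libraries: one frozen at install time, one
# refreshed on a regular schedule (e.g. nightly by cron). Paths are stand-ins.
ROOT=$(mktemp -d)                         # e.g. /usr/local/lib/R in practice
for v in 2.15.3 3.0.0; do
  mkdir -p "$ROOT/$v/site-library"        # frozen: only fix severe breakage
  mkdir -p "$ROOT/$v/site-library-fresh"  # upgraded regularly
done
ls "$ROOT/3.0.0"
```

An analyst then opts in per project by pointing R at the chosen library, e.g. `R_LIBS_SITE=$ROOT/3.0.0/site-library-fresh` before starting R.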
This strategy has been implemented in a set of make files in the project RoboAdmin, available at http://automater.r-forge.r-project.org/. It can be run entirely automatically with a cron job. Constructive comments are always appreciated. (IT departments sometimes think that there should be only one version of everything available, which they test and approve, so the initial reaction to this approach could be negative. I think they have not really thought about the advantages: they usually cannot test/approve an upgrade without user input, and timing is often extremely complicated because of ongoing user needs. This strategy simply shifts responsibility and timing to the users, or user departments, that can actually do the testing and approving.)

Regarding NFS mounts, it is relatively robust. There can be occasional problems, especially for users who have a habit of keeping an R session open for days at a time while using site-library-fresh packages. In my experience this did not happen often enough to warrant a blackout period.

Regarding the original question, I would like to think it could be possible to keep enough information to reproduce the exact environment, but for potentially sensitive numerical problems I think that is optimistic. As others have pointed out, results can depend not only on R and package versions, configuration, OS versions, and library and compiler versions, but also on the underlying hardware. You might have some hope using something like an Amazon core instance. (BTW, this problem is not specific to R.) It is true that restricting to a fixed computing environment at your institution may ease things somewhat, but if you occasionally upgrade hardware or the OS then you will probably lose reproducibility. An alternative that I recommend is to produce a set of tests that confirm the results of any important project. 
These can be conveniently put in the tests/ directory of an R package, which is then maintained locally, not on CRAN, and built/tested whenever a new R and packages are installed. (Tools for this are also available at the above-indicated web site.) This approach means that you continue to reproduce the old results, or, if not, discover differences/problems in the old or new version of R and/or packages that may be important to you. I have been successfully using a variant of this since about 1993, using R and package tests/ since they became available. Paul

I just got the green light to institute such periodic updates, which I have been arguing is in our collective best interest. In return, I promised my best effort to provide a means for preserving or reverting to a working R library configuration. Please note that the reproducibility I am most eager to provide is limited to reproducibility within the computing environment of our institute, which perhaps takes away some of the dragon's nests, though certainly not all. There are technical issues of updating package installations on an NFS mount that might have
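At the shell level, the frozen/fresh library split described above can be sketched roughly as follows. The directory layout and the Renviron line are illustrative assumptions, not RoboAdmin's exact scheme:

```shell
#!/bin/sh
# Sketch of the frozen / fresh library split (illustrative paths only,
# not RoboAdmin's actual layout).
R_VERSION=2.15.2
BASE="$PWD/opt/R/$R_VERSION"

# 1/ frozen library: populated once, when this R version is installed
mkdir -p "$BASE/site-library"
# 2/ fresh library: upgraded on a regular basis (e.g. from a cron job)
mkdir -p "$BASE/site-library-fresh"

# Analysts choose a library by pointing R_LIBS_SITE at one of the two,
# e.g. via a line like this in Renviron.site or their own ~/.Renviron:
echo "R_LIBS_SITE=$BASE/site-library-fresh" > Renviron.example
cat Renviron.example
```

Freezing the old version then amounts to simply no longer running the upgrade job against its site-library-fresh directory.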
Re: [Rd] maintaining multiple R versions
Aaron For the problem I had in mind, changing a couple of environment variables does not seem like more work than this, but it may solve a bigger problem than the one I was thinking about. If I understand correctly, you can use this to switch among versions of R, similar to what I am doing and still with versions in different directories found by a PATH setting. But, in addition, it is also possible that the R versions were compiled with different gcc and other tools, as long as those are still installed on the system. Does it also work if you upgrade the OS and have newer versions of system libraries, etc, or do you then need to recompile the R versions? Thanks, Paul

On 13-01-18 02:58 PM, Aaron A. King wrote: Have you looked at Environment Modules (http://modules.sourceforge.net/)? I use it to maintain multiple versions of R. Users can choose their default and switch among them at the command line. Aaron

On Fri, Jan 18, 2013 at 02:04:13PM -0500, Paul Gilbert wrote: (somewhat related to thread [Rd] R CMD check not reading R_LIBS ) For many years I have maintained R versions by building R (./configure ; make) in a directory indicating the version number, putting the directory/bin on my path, and setting R_LIBS_SITE. It seems only one version can easily be installed in /usr/bin, and in any case that requires root, so I do not do that. There may be an advantage to installing somewhere in a directory with the version number, but that does not remove the need to set my path. (If there is an advantage to installing I would appreciate someone explaining briefly what it is.) My main question is whether there is a better way of maintaining multiple versions, in some way that lets users choose which one they are using? (The only problem I am aware of with my current way of doing this is: if the system has some R in /usr/bin then I have to set my preferred version first, which means shell commands like man find R's pager first and do not work.) 
Thanks, Paul __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] maintaining multiple R versions
(somewhat related to thread [Rd] R CMD check not reading R_LIBS ) For many years I have maintained R versions by building R (./configure ; make) in a directory indicating the version number, putting the directory/bin on my path, and setting R_LIBS_SITE. It seems only one version can easily be installed in /usr/bin, and in any case that requires root, so I do not do that. There may be an advantage to installing somewhere in a directory with the version number, but that does not remove the need to set my path. (If there is an advantage to installing I would appreciate someone explaining briefly what it is.) My main question is whether there is a better way of maintaining multiple versions, in some way that lets users choose which one they are using? (The only problem I am aware of with my current way of doing this is: if the system has some R in /usr/bin then I have to set my preferred version first, which means shell commands like man find R's pager first and do not work.) Thanks, Paul
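The PATH-based scheme described here can be sketched as a small shell helper. The install prefix below is an assumed example (e.g. each version built with ./configure --prefix=$HOME/R/R-2.15.2), not the poster's actual layout:

```shell
#!/bin/sh
# Minimal sketch of selecting among side-by-side R builds, each installed
# under an assumed prefix like $HOME/R/R-<version>.
use_R() {
  dir="$HOME/R/R-$1"
  PATH="$dir/bin:$PATH"            # this version's R and Rscript are found first
  R_LIBS_SITE="$dir/site-library"  # and its own package library is used
  export PATH R_LIBS_SITE
}

use_R 2.15.2
echo "$R_LIBS_SITE"
```

This is essentially what Environment Modules automates: a `module load R/2.15.2` performs the same kind of PATH and environment manipulation, with the added ability to unload cleanly.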
Re: [Rd] R-forge, package dependencies
I'm surprised this works on Windows and Mac, since RMonetDB does not seem to be on CRAN. I thought it was still a requirement that dependencies need to be on CRAN (which makes development difficult for related packages like this). A related long-standing request is that R-forge checking look for required newer versions on R-forge rather than just on CRAN. Does anyone know if that works on Windows and/or Mac? Paul

On 13-01-15 03:09 PM, Uwe Ligges wrote: On 15.01.2013 20:47, Thomas Lumley wrote: I have a project on R-forge (sqlsurvey.r-forge.r-project.org) with two packages, RMonetDB and sqlsurvey. At the moment, sqlsurvey is listed as failing to build. The error is on the Linux package check, which says that RMonetDB is not available: * checking package dependencies ... ERROR Package required but not available: ‘RMonetDB’ RMonetDB has built successfully: r-forge lists its status as 'current', with Linux, Windows, and Mac packages available for download. The package check for sqlsurvey on Windows and Mac finds RMonetDB without any problems, it's just on Linux that it appears to be unavailable. Any suggestions for how to fix this? I've tried uploading a new version of RMonetDB, but the situation didn't change: it built successfully, but the Linux check of sqlsurvey still couldn't find it. I think you have to ask Stefan to check the details. Best, Uwe -thomas
[Rd] SystemRequirements’ field
Am I correct in thinking that the ‘SystemRequirements’ field in a package DESCRIPTION file is purely descriptive, that there are no standard elements that can be extracted by parsing it and used automatically? This field does not seem to be widely used, even for some obvious cases like backend database driver requirements, perl, perl modules, etc. It might help to have a list of possibilities. Some I think of immediately are SQLite, MySQL, PostgreSQL, ODBC, Perl, Perl_CSVXS, MPI, rpcgen, Oracle-license, Bloomberg-license and Fame-license. Maybe there could be a generic OTHER_* for things not in a standard list? Paul
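Since the field is free text, any automatic use today means ad-hoc text extraction. A minimal sketch, with a made-up package and requirements list:

```shell
#!/bin/sh
# SystemRequirements is a free-text field in the DCF-format DESCRIPTION
# file, so "parsing" it currently amounts to plain text extraction.
# The example file below is made up.
cat > DESCRIPTION <<'EOF'
Package: example
Version: 0.1
SystemRequirements: MySQL, Perl, MPI
EOF

grep '^SystemRequirements:' DESCRIPTION | sed 's/^SystemRequirements: *//'
```

A real parser would also have to handle DCF continuation lines (fields may wrap onto indented lines) and completely free-form wording, which is exactly why a standard vocabulary would help.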
Re: [Rd] SystemRequirements’ field
On 12-12-12 02:19 PM, Prof Brian Ripley wrote: On 12/12/2012 18:33, Paul Gilbert wrote: Am I correct in thinking that the ‘SystemRequirements’ field in a package DESCRIPTION file is purely descriptive, there are no standard elements that can be extracted by parsing it and used automatically? No. Where can I find more details? The section The DESCRIPTION file in Writing R Extensions says only: Other dependencies (external to the R system) should be listed in the ‘SystemRequirements’ field, possibly amplified in a separate README file. Thanks, Paul
Re: [Rd] An idea: Extend mclapply's mc.set.seed with an initial seed value?
I appreciate your problem, and getting reproducible random generator results on a parallel system is something to be careful about. However, I would avoid making it too easy to use a fixed seed. In earlier days, mistakes were too often made by users inadvertently using the same seed over and over again (on simple single-processor systems), for example by reloading a session with the seed set. Paul

On 12-11-01 08:46 PM, Ivan Popivanov wrote: Hello, I have been thinking that sometimes users may want each process to initialize its random seed with a specific value rather than the current seed. This could be keyed off depending on whether mc.set.seed is logical, preserving the current behaviour, or numerical, using the value in a call to set.seed. Does this make sense? If you wonder how I came up with the idea: I spent a couple of hours debugging unstable results from parallel tuning of svms, which was caused by the parallel execution. In my case I can simply do the set.seed in the FUN argument function, but that may not always be the case. Ivan
Re: [Rd] Retrieving data from aspx pages
I must be really dense. I know RCurl provides a POST capability, but I don't see how this allows interaction. Suppose the example actually worked, which it does not. (Unfortunately many of the examples in RCurl seem to be marked \dontrun{} or disabled with some if() condition.) When you post to a page like this you will often get something back that has a dynamically generated URI, and you will need to post more information to that page. But how do you find out the URI of that next dynamically generated page? Even when you know what you will need to post, you need the URI to do it. If RCurl provided interaction you would be able to get the URI so you could post to the next page. Maybe you can do that, but I have not discovered how. If you know how, I would appreciate a real working example. Paul

On 12-10-31 12:14 PM, jose ramon mazaira wrote: I'd like to make you note that I've discovered that package RCurl already provides a utility that allows interaction via POST requests with servers. In fact, the FAQ for RCurl contains specifically an example with an aspx page:

x = postForm("http://www.fas.usda.gov/psdonline/psdResult.aspx", style = "post",
    .params = list(visited = "1", lstGroup = "all", lstCommodity = "2631000",
                   lstAttribute = "88", lstCountry = "**", lstDate = "2011",
                   lstColumn = "Year", lstOrder = "Commodity%2FAttribute%2FCountry"))

Check this link: http://www.omegahat.org/RCurl However, I think that it would be more useful to automate the interaction with servers, retrieving automatically the name-value pairs required by the server (parsing the page source code) instead of examining the appropriate fields in each web page. 2012/10/30, Paul Gilbert pgilbert...@gmail.com: Jose As far as getting to the data, I think the best way to do this sort of thing would be if the site supports a SOAP or REST interface. When they don't (yet), then one is faced with clicking through some pages. Python or Java is one way to automate the process of clicking through the pages. 
I don't know how to do that in R, but would like to know if it is possible. But I guess I was confused about the part you want to improve. What I have works fairly smoothly, parsing and passing back JSON data, converted from a csv file, into R. The downside is that this approach requires more than R to be installed on the client machine. But if the object you get back is ASPX, then you either need to parse it directly, or convert it to JSON, or something else you can deal with. I suspect that will be fairly specific to a particular web site, but I don't really know enough about ASPX to be sure. Paul On 12-10-30 01:12 PM, jose ramon mazaira wrote: Thanks for your interest, Paul. I've checked the source code of TSjson and I've seen that what it does is call a Python script to retrieve the data. In fact, I've already done this with Java using the URLConnection class and sending the requested values to fill the form. However, I think it would be more useful to open a connection with R and to send the requested values within R, and not through an external program. The application I've designed, like yours, is also page-specific (i.e., designed for http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx), but I think that our applications would be more powerful if they were able to parse the name-value pairs generated from ASPX (or any other dynamically generated web page) and ask the user to select the appropriate values. 2012/10/30, Paul Gilbert pgilbert...@gmail.com: I think RHTMLForms works if you have a single form, but I have not been able to see how to use it when you need to go through a sequence of dynamically generated forms (like you can do with Python mechanize). Paul On 12-10-30 09:08 AM, Gabriel Becker wrote: I haven't used it extensively myself, and can't speak to its current state, but on quick inspection RHTMLForms seems worth a look for what you want. 
http://www.omegahat.org/RHTMLForms/ ~G On Tue, Oct 30, 2012 at 5:38 AM, Paul Gilbert pgilbert...@gmail.com wrote: I don't know of an easy way to do this in R. I've been doing something similar with python scripts called from R. If anyone knows how to do this with just R, I would appreciate hearing too. Paul On 12-10-29 04:11 PM, jose ramon mazaira wrote: Hi. I'm trying to write an application to retrieve financial data (specially bonds data) from FINRA. The web page is served dynamically from an asp.net application: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx I'd like to know if it's possible to fill
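For reference, the kind of POST that the postForm() example quoted in this thread issues can be reproduced with command-line curl. The form fields below come from that RCurl FAQ example; the request is only echoed, not sent, and the remark about hidden fields is a general ASP.NET detail, not something confirmed for this particular site:

```shell
#!/bin/sh
# Sketch of the curl equivalent of the RCurl postForm() example.
# ASP.NET pages typically also require hidden fields (__VIEWSTATE,
# __EVENTVALIDATION) scraped from the form's HTML first -- which is exactly
# the "find the next dynamically generated URI" problem discussed here.
FORM_DATA='visited=1&lstGroup=all&lstCommodity=2631000&lstAttribute=88&lstCountry=**&lstDate=2011&lstColumn=Year'

# Echoed rather than executed, since running it needs network access:
echo "curl -s -d '$FORM_DATA' 'http://www.fas.usda.gov/psdonline/psdResult.aspx'"
```

The follow-up problem raised above remains: the response to such a POST often embeds the dynamically generated URI for the next step, which has to be parsed out of the returned HTML before a second request can be made.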
Re: [Rd] Retrieving data from aspx pages
I don't know of an easy way to do this in R. I've been doing something similar with python scripts called from R. If anyone knows how to do this with just R, I would appreciate hearing too. Paul

On 12-10-29 04:11 PM, jose ramon mazaira wrote: Hi. I'm trying to write an application to retrieve financial data (specially bonds data) from FINRA. The web page is served dynamically from an asp.net application: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx I'd like to know if it's possible to fill dynamically the web page form from R and, after filling it (with the issuer name), retrieve the web page, parse the data, and convert it to appropriate R objects. For example, suppose I want to search data for ATT bonds. I'd like to know if it's possible, within R, to fill the page served from: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx select the corporate option and fill with ATT the field for Issuer name, ask the page to display the results, and retrieve the results for each of the bonds issued by ATT (for example: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/BondDetail.aspx?ID=MDAxOTU3Qko3) and parse the data from the web page. Thanks in advance.
Re: [Rd] Retrieving data from aspx pages
I think RHTMLForms works if you have a single form, but I have not been able to see how to use it when you need to go through a sequence of dynamically generated forms (like you can do with Python mechanize). Paul

On 12-10-30 09:08 AM, Gabriel Becker wrote: I haven't used it extensively myself, and can't speak to its current state, but on quick inspection RHTMLForms seems worth a look for what you want. http://www.omegahat.org/RHTMLForms/ ~G On Tue, Oct 30, 2012 at 5:38 AM, Paul Gilbert pgilbert...@gmail.com wrote: I don't know of an easy way to do this in R. I've been doing something similar with python scripts called from R. If anyone knows how to do this with just R, I would appreciate hearing too. Paul On 12-10-29 04:11 PM, jose ramon mazaira wrote: Hi. I'm trying to write an application to retrieve financial data (specially bonds data) from FINRA. The web page is served dynamically from an asp.net application: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx I'd like to know if it's possible to fill dynamically the web page form from R and, after filling it (with the issuer name), retrieve the web page, parse the data, and convert it to appropriate R objects. For example, suppose I want to search data for ATT bonds. 
I'd like to know if it's possible, within R, to fill the page served from: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx select the corporate option and fill with ATT the field for Issuer name, ask the page to display the results, and retrieve the results for each of the bonds issued by ATT (for example: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/BondDetail.aspx?ID=MDAxOTU3Qko3) and parsing the data from the web page. Thanks in advance.

-- Gabriel Becker Graduate Student Statistics Department University of California, Davis
Re: [Rd] Retrieving data from aspx pages
Jose As far as getting to the data, I think the best way to do this sort of thing would be if the site supports a SOAP or REST interface. When they don't (yet), then one is faced with clicking through some pages. Python or Java is one way to automate the process of clicking through the pages. I don't know how to do that in R, but would like to know if it is possible. But I guess I was confused about the part you want to improve. What I have works fairly smoothly, parsing and passing back JSON data, converted from a csv file, into R. The downside is that this approach requires more than R to be installed on the client machine. But if the object you get back is ASPX, then you either need to parse it directly, or convert it to JSON, or something else you can deal with. I suspect that will be fairly specific to a particular web site, but I don't really know enough about ASPX to be sure. Paul

On 12-10-30 01:12 PM, jose ramon mazaira wrote: Thanks for your interest, Paul. I've checked the source code of TSjson and I've seen that what it does is call a Python script to retrieve the data. In fact, I've already done this with Java using the URLConnection class and sending the requested values to fill the form. However, I think it would be more useful to open a connection with R and to send the requested values within R, and not through an external program. The application I've designed, like yours, is also page-specific (i.e., designed for http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx), but I think that our applications would be more powerful if they were able to parse the name-value pairs generated from ASPX (or any other dynamically generated web page) and ask the user to select the appropriate values. 
2012/10/30, Paul Gilbert pgilbert...@gmail.com: I think RHTMLForms works if you have a single form, but I have not been able to see how to use it when you need to go through a sequence of dynamically generated forms (like you can do with Python mechanize). Paul On 12-10-30 09:08 AM, Gabriel Becker wrote: I haven't used it extensively myself, and can't speak to its current state, but on quick inspection RHTMLForms seems worth a look for what you want. http://www.omegahat.org/RHTMLForms/ ~G On Tue, Oct 30, 2012 at 5:38 AM, Paul Gilbert pgilbert...@gmail.com wrote: I don't know of an easy way to do this in R. I've been doing something similar with python scripts called from R. If anyone knows how to do this with just R, I would appreciate hearing too. Paul On 12-10-29 04:11 PM, jose ramon mazaira wrote: Hi. I'm trying to write an application to retrieve financial data (specially bonds data) from FINRA. The web page is served dynamically from an asp.net application: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx I'd like to know if it's possible to fill dynamically the web page form from R and, after filling it (with the issuer name), retrieve the web page, parse the data, and convert it to appropriate R objects. For example, suppose I want to search data for ATT bonds. 
I'd like to know if it's possible, within R, to fill the page served from: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx select the corporate option and fill with ATT the field for Issuer name, ask the page to display the results, and retrieve the results for each of the bonds issued by ATT (for example: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/BondDetail.aspx?ID=MDAxOTU3Qko3) and parsing the data from the web page. Thanks in advance.

-- Gabriel Becker Graduate Student Statistics Department University of California, Davis
[Rd] R 2.15.2 make check failure on 32-bit --with-blas=-lgoto2
Is --with-blas=-lgoto2 a known problem (other than possibly not being the preferred choice)? I thought I had been testing RC with the same setup I regularly use, but I now see there was a slight difference. I am now getting the following failure in make check on 32-bit Ubuntu 12.04, configuring with --with-blas=-lgoto2. (These may not be surprising statistically or numerically, but it is a bit disconcerting when make check fails.)

...
Testing examples for package ‘stats’
comparing ‘stats-Ex.Rout’ to ‘stats-Ex.Rout.save’ ...
2959c2959
N:K1 33.13 33.13 2.146 0.16865
---
N:K1 33.14 33.14 2.146 0.16865
12782c12782
Murder -0.536 0.418 0.341 0.649
---
Murder -0.536 0.418 -0.341 0.649
12783c12783
Assault -0.583 0.188 0.268 -0.743
---
Assault -0.583 0.188 -0.268 -0.743
12784c12784
UrbanPop -0.278 -0.873 0.378 0.134
---
UrbanPop -0.278 -0.873 -0.378 0.134
12785c12785
Rape -0.543 -0.167 -0.818
---
Rape -0.543 -0.167 0.818
12943c12943
6 -0.5412 20.482886 -0.845157
---
6 -0.5412 20.482887 -0.845157
14481c14481
Sum of Squares 780.1250 276.1250 2556.1250 112.5000 774.0937
---
Sum of Squares 780.1250 276.1250 2556.1250 112.5000 774.0938
15571c15571
Murder -0.54 0.42 0.34 0.65
---
Murder -0.54 0.42 -0.34 0.65
15572c15572
Assault -0.58 0.27 -0.74
---
Assault -0.58 -0.27 -0.74
15573c15573
UrbanPop -0.28 -0.87 0.38
---
UrbanPop -0.28 -0.87 -0.38
15574c15574
Rape -0.54 -0.82
---
Rape -0.54 0.82
Testing examples for package ‘datasets’
comparing ‘datasets-Ex.Rout’ to ‘datasets-Ex.Rout.save’ ... OK
...

I inadvertently seem to have set things slightly differently while testing RC. While testing the RC, I was using

./configure --prefix=/home/paul/RoboRC/R-test/ --enable-R-shlib

and configure gave

...
External libraries: readline
Additional capabilities: PNG, NLS
Options enabled: shared R library, shared BLAS, R profiling, Java

whereas with the release I used

./configure --prefix=/home/paul/RoboAdmin/R-2.15.2 --enable-R-shlib --with-blas=-lgoto2

and configure gave

... 
External libraries: readline, BLAS(generic)
Additional capabilities: PNG, NLS
Options enabled: shared R library, R profiling, Java

Thanks, Paul
Re: [Rd] R 2.15.2 make check failure on 32-bit --with-blas=-lgoto2
On 12-10-26 12:15 PM, Prof Brian Ripley wrote: On 26/10/2012 16:37, Paul Gilbert wrote: Is --with-blas=-lgoto2 a known problem (other than possibly not being the preferred choice)? And what precisely is it? And what chipset are you using? I thought I had been testing RC with the same setup I regularly use, but I now see there was a slight difference. I am now getting the following failure in make check on 32-bit Ubuntu 12.04, configuring with --with-blas=-lgoto2. (These may not be surprising statistically or numerically, but it is a bit disconcerting when make check fails.) No failure shown here ... surely you know that the signs of principal components are not determined?

I apologize, I missed the real error. (But yes, I am aware of this, and also that I should expect precision differences with different libraries and different architectures.) I thought for a moment that make was throwing an error because of the differences, but in fact it was later:

Testing examples for package ‘grid’
comparing ‘grid-Ex.Rout’ to ‘grid-Ex.Rout.save’ ... OK
Testing examples for package ‘splines’
Error: testing 'splines' failed
Execution halted
make[3]: *** [test-Examples-Base] Error 1
make[3]: Leaving directory `/home/paul/RoboAdmin/R-2.15.2/tests/Examples'
make[2]: *** [test-Examples] Error 2
make[2]: Leaving directory `/home/paul/RoboAdmin/R-2.15.2/tests'
make[1]: *** [test-all-basics] Error 1
make[1]: Leaving directory `/home/paul/RoboAdmin/R-2.15.2/tests'
make: *** [check] Error 2
paul@toaster:~/RoboAdmin/R-2.15.2$

The problem seems to be here:

source("~/RoboAdmin/R-2.15.2/tests/Examples/splines-Ex.R")
List of 2
 $ x: num [1:51] 58 58.3 58.6 58.8 59.1 ...
 $ y: num [1:51] 115 115 116 117 117 ... 
 - attr(*, "class")= chr "xyVector"
Warning in bs(height, degree = 3L, knots = c(62.7, 67.3 :
  some 'x' values beyond boundary knots may cause ill-conditioned bases
Error: identical(ns(x, df = 2), ns(x, df = 2, knots = NULL)) is not TRUE

traceback()
6: stop(paste0(ch, " is not ", if (length(r) > 1L) "all ", "TRUE"), call. = FALSE)
5: stopifnot(identical(ns(x), ns(x, df = 1)), identical(ns(x, df = 2),
   ns(x, df = 2, knots = NULL)), !is.null(kk <- attr(ns(x), "knots")),
   length(kk) == 0) at splines-Ex.R#130
4: eval(expr, envir, enclos)
3: eval(ei, envir)
2: withVisible(eval(ei, envir))
1: source("~/RoboAdmin/R-2.15.2/tests/Examples/splines-Ex.R")

It also seems that this error is transient. If I rerun several times, it does not always happen. Is anyone aware of other cases of transient problems with 32-bit goto2? Here is the cpu info:

paul@toaster:~$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T7300 @ 2.00GHz
stepping : 10
microcode : 0x92
cpu MHz : 800.000
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm ida dtherm tpr_shadow vnmi flexpriority
bogomips : 3989.99
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T7300 @ 2.00GHz
stepping : 10
microcode : 0x92
cpu MHz : 800.000
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm ida dtherm tpr_shadow vnmi flexpriority
bogomips : 3989.97
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

Paul

And 32-bit platforms seem prone to round in different ways (to each other and to 64-bit platforms): the differences in the last digit are typical of 32-bit platforms.

...
Testing examples for package ‘stats’
comparing ‘stats-Ex.Rout’ to ‘stats-Ex.Rout.save’ ...
2959c2959
N:K1 33.13 33.13 2.146 0.16865
---
N:K1 33.14 33.14 2.146 0.16865
12782c12782
Murder
Re: [Rd] Is there an automatic method for updating an existing CRAN package from R-forge?
Yes, the button is still there if you are logged in and unhide it. (You probably also need the appropriate developer permission.) Beware that R-forge needs to indicate the package build status is current, but also, if you have just made svn updates it may falsely indicate current. Check that the revision number is accurate, or wait a couple of days, or submit by hand. If you do submit by ftp to incoming, beware that the CRAN policy now requires that your email state that you have read and agree with the CRAN policies. (Probably everyone else noticed that, but there is no need for others to generate extra work for CRAN maintainers, like I have just done.) Paul

On 12-10-01 11:09 AM, Spencer Graves wrote: On 10/1/2012 4:47 AM, Duncan Murdoch wrote: On 12-10-01 7:38 AM, S Ellison wrote: I have a package on CRAN and now have a modest update that's passing build checks on R-forge. Is there a mechanism on R-forge for updating an existing CRAN package, analogous to the 'submit to cran' link on the R-forge package page, or should I just follow the instructions at http://cran.r-project.org/web/packages/policies.html for FTP upload? If there were a Submit to CRAN button, that would be the method. But I think that button has gone away, so the description on that page is the way to go. The Submit to CRAN button on that page is hidden by default but is exposed by clicking on the Show/Hide extra info button right below where it gives Build Status and R install command. (I had trouble finding the Submit to CRAN button for a while after it became hidden. You need Build status: Current. Also, at least for the last submissions I made, the CRAN maintainers did NOT accept updates if there were Warnings or Notes in the Package Checks; these warnings are also hidden until you click Show/Hide extra info.) Spencer Duncan Murdoch
[Rd] CRAN test / avoidance
( subject changed from Re: [Rd] R-devel Digest, Vol 115, Issue 18 )

I have the impression from this, and previous discussions on the subject, that package developers and CRAN maintainers are talking at cross-purposes. Many package maintainers are thinking that they should be responsible for choosing which tests are run and which are not run by CRAN, whereas CRAN maintainers may want to run all possible tests sometimes, or a trimmed-down set when time constraints demand this. With good reason, CRAN may want to run all possible tests sometimes. There are too many packages on CRAN that remain there because they don't have any testing or vignettes, and very few examples. Encouraging more of that is a bad thing.

If I understand correctly, the --as-cran option was introduced to help developers specify options that CRAN uses, so they would find problems that CRAN would notice, and correct them before submitting. The R-devel discussions of this have morphed into a discussion of how package developers can use --as-cran to control which tests are run by CRAN.

I tend to be more sympathetic with what I call the CRAN maintainer view above, even though I am a package developer. I think packages should have extensive testing and that all the tests should go in the source package on CRAN, so the testing is available for CRAN and everyone else. (Although, it is sometimes not clear if CRAN maintainers like me doing this, because they are torn between time demands and maintaining quality - that is part of the confusion.)

The question becomes: how does information get passed along to indicate things that may take a long time to run? The discussion so far has focused on developers setting, or using, some flags to indicate tests and examples that take a long time. Another option would be to have the check/build process generate a file with information about the time it took to run tests, vignettes, and examples, probably with some information about the speed of the machine it was run on.
Then CRAN and anyone else that wants to run tests can take this information into consideration. Paul

On 12-09-19 10:08 AM, Terry Therneau wrote: In general, as a package user, I don't want people to be able to suppress checks on CRAN. I want things fixed. So I am pretty sure there won't ever be a reliable CRAN-detector put into R. It would devalue the brand. Duncan Murdoch

My problem is that CRAN demands that I suppress a large fraction of my checks, in order to fit within time constraints. This leaves me with 3 choices.

1. Add lines to my code that try to guess if CRAN is the invoker. A cat-and-mouse game, per your desire above.

2. Remove large portions of my test suite. I consider the survival package to be one of the pre-eminent current code sets in the world precisely because of its extensive validations; this action would change it to a second-class citizen.

3. Add a magic environment variable to my local world, only do the full tests if it is present, and make the dumbed-down version the default. Others who want to run the full set are then SOL, which I very much don't like.

I agree that CRAN avoidance, other than for the time constraint, should be verboten. But I don't think that security through obscurity is the answer. And note that under scenario 3, which is essentially what is currently being forced on us, I can do such mischief as easily as under number 1. Terry Therneau
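Terry's third option can be sketched in a few lines of R in a tests/ file. This is only an illustration; the environment variable name here is hypothetical, not anything the survival package actually uses.

```r
# Sketch of option 3: run the long validation suite only when a "magic"
# environment variable is set (variable name is made up for illustration).
run_full <- nzchar(Sys.getenv("MYPKG_FULL_TESTS"))

if (run_full) {
  # ... the full, slow validation suite would go here ...
  cat("running full test suite\n")
} else {
  # default: the trimmed-down checks that fit within CRAN's time budget
  cat("quick tests only; set MYPKG_FULL_TESTS=1 for the full suite\n")
}
```

As Terry notes, the drawback is that anyone wanting the full suite must know about the variable, since the dumbed-down version becomes the default.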
Re: [Rd] problem with vignettes when S4 classes in packages overlap
On 12-09-18 07:23 PM, Duncan Murdoch wrote: On 12-09-18 5:40 PM, Paul Gilbert wrote: ( A similar problem is also reported by Sebastian P. Luque with library(maptools) library(trip) in the vignette as below ). I am writing a vignette which loads RMySQL and RPostgreSQL. This produces the warning:

Loading required package: DBI
Warning in .simpleDuplicateClass(def, prev) :
  A specification for class “dbObjectId” in package ‘RPostgreSQL’ seems equivalent to one from package ‘RMySQL’ and is not turning on duplicate class definitions for this class

This can be reproduced by running R CMD Sweave --pdf Atest.Stex where the file Atest.Stex has the lines

\documentclass{article}
\usepackage{Sweave}
\begin{document}
\begin{Scode}
library(RMySQL)
library(RPostgreSQL)
\end{Scode}
\end{document}

These warnings only happen in a vignette. They are not produced if the lines are entered in an R session. (Using R version 2.15.1 (2012-06-22) -- Roasted Marshmallows on Ubuntu.)

You'll get the warning in a regular session if you set options(warn=1). I think Sweave is probably doing this so that warnings show up around the time of the chunk they correspond to. It does it in the command-line version, but not in the Sweave() function (which would save them up to the end). I don't know if the warning is something you should worry about or not.

It doesn't interfere with producing the vignette, but for submitting to CRAN it is better not to have warnings coming from my package, even though they are caused by a problem with other packages. Now that I know why it only happens in the vignette, I guess I can suppress it (but it would be nice to see the other packages fixed). Thanks, Paul

Duncan Murdoch
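The difference Duncan describes can be demonstrated without the database packages. A small sketch (not from the thread) showing how options(warn = 1) changes where a warning appears:

```r
# With the default warn = 0, warnings are collected and reported after the
# top-level call returns; with warn = 1 (which command-line Sweave
# effectively uses here), they print immediately where they occur.
f <- function() { warning("duplicate class definition"); invisible(NULL) }

options(warn = 0); f()  # reported afterwards as "Warning message:"
options(warn = 1); f()  # printed at once as "Warning in f() : ..."
```

This is why the duplicate-class warning surfaced next to the library() chunk in the vignette but not in an ordinary interactive session.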
Re: [Rd] Requests for vignette clarification (re: Writing R Extensions)
I'll make a guess at some parts of this.

On 12-06-01 02:53 PM, Paul Johnson wrote: I apologize that these questions are stupid and literal. I write to ask for clarification of comments in the R extensions manual about vignettes. I'm not great at LaTeX, but I'm not a complete novice either, and some of the comments are puzzling to me.

1. I'm stumbling over this line: Make sure all files needed to run the R code in the vignette (data sets, ...) are accessible by either placing them in the inst/doc hierarchy of the source package or by using calls to system.file(). Where it says inst/doc, can I interpret it to mean vignettes? The vignette files are under vignettes. Why wouldn't those other files be in there? Or does that mean I'm supposed to copy the style and bib files from the vignettes folder to the inst/doc folder? Or none of the above :)

I think the idea is that a user looking at an installed version of the package will be able to see things that are in the doc/ directory of the installed package. This automatically includes the source files (eg *.Stex) from vignettes/ and also the generated *.pdf and the *.R files stripped from the *.Stex files. If you want users to have access to other files then you should put those somewhere so they get installed, such as in the source package inst/doc directory, so they get put in the doc/ directory of the installed package. That should probably include anything else that is important to reproduce the results in the vignette, but I do not count the .bib file in that list (so I have it in vignettes/ and users would need to look at the package source to find it).

2. I'm also curious about the implications of the parenthesized section of this comment: By default R CMD build will run Sweave on all files in Sweave format in vignettes, or if that does not exist, inst/doc (but not in sub-directories).
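The system.file() route mentioned in the manual can be sketched like this; the package and file names are hypothetical, but the "doc" component reflects the installed location of inst/doc files described above.

```r
# Locate a data file that the source package shipped in inst/doc, which is
# installed under the package's doc/ directory ("mypkg" and
# "example-data.csv" are made-up names for illustration).
path <- system.file("doc", "example-data.csv", package = "mypkg")
if (nzchar(path)) {
  dat <- read.csv(path)
} else {
  cat("file not found; is mypkg installed with its doc/ files?\n")
}
```

system.file() returns "" when the file is absent, so vignette code can check for that rather than hard-coding an install path.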
At first I thought that meant it will search vignettes and subdirectories under vignettes, or it will look under inst/doc, but no subdirectories under inst/doc. So I created vignettes in subdirectories under vignettes and they are ignored by the build process, so that was obviously wrong. For clarification, it would help me if the manual said: By default R CMD build will run Sweave on all files in Sweave format in vignettes (but not in sub-directories), or if that does not exist, inst/doc.

In this list I've read several questions/complaints from people who don't want their vignettes rebuilt during the package check or build process, and I wondered if there is a benefit to having vignettes in subdirectories. Could inclusion of troublesome vignettes in subdirectories be a way that people can circumvent the rebuilding and re-checking of vignettes during build, check, or install? If I build my vignettes manually and copy the pdf output over to inst/doc, will those pdf files be legitimate vignette files as far as CRAN is concerned? The writeup in R Extensions is a little bit confusing on that point: By including the PDF version in the package sources it is not necessary that the vignette PDFs can be re-built at install time, i.e., the package author can use private R packages, screen snapshots and LaTeX extensions which are only available on his machine. It's just confusing, that's all I can say about it.

There was at least one earlier R-devel discussion of this, in which I contributed an incorrect understanding, but was generally straightened out by Uwe. I hope I have a correct understanding now. You can put a pdf file in inst/doc and specify BuildVignettes: false in the DESCRIPTION file, in which case the already-constructed pdf from inst/doc will be used. The purpose of this is to allow vignettes which cannot be completely constructed from sources, for example, because certain data or other resources may not be generally available.
However, R CMD check will still try to parse the Sweave file and run the R code, and fail if it does not run. So, when the resources to build the vignette are not generally available, this does require some special attention, often with try(), in the code for your vignette.

It is possible to claim a special exemption for a vignette. If the reasons seem valid then that package will be put on a special list which allows skipping the vignette when the package is tested for CRAN. The reason for somewhat tight control on this by the CRAN maintainers is that the vignettes have proven to be a good check on problems with packages, so skipping them will reduce quality, and so the CRAN maintainers do not want to provide an easy option to skip this check.

There have been a variety of mechanisms suggested on R-devel for subverting the CRAN checks of the vignette code. My interpretation is that these should generally be considered contrary to the spirit of what the CRAN maintainers are attempting to do, and package maintainers should expect continuing problems as the loopholes are removed. Paul Gilbert
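A hedged sketch of the try() pattern mentioned above, for a vignette chunk whose external resource may be absent on check machines. The use of a MySQL database here, and the connection arguments, are assumptions for illustration, not Paul's actual code.

```r
# Attempt the resource-dependent step; fall back gracefully if it fails,
# so that the code R CMD check extracts from the vignette still runs.
con <- try(DBI::dbConnect(RMySQL::MySQL(), dbname = "test"), silent = TRUE)
if (inherits(con, "try-error")) {
  cat("Database not available; skipping the examples in this section.\n")
} else {
  # ... queries that produce the vignette's results would go here ...
  DBI::dbDisconnect(con)
}
```

The chunk then degrades to a printed note on machines without the database, instead of an error that fails the whole check.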
Re: [Rd] equivalent to source() inside a package
Is there a reason for not using a vignette or putting a file in the demo/ directory? This seems like the sort of thing for which they are intended. Paul

On 12-05-25 03:33 PM, Wei Hao wrote: Hi all: I'm working on a project that I have packaged for ease of distribution. The different simulations in the package share code, so obviously I have those parts organized as functions. Now, I want to show people my code, but the structure with the internal functions might be a little confusing to follow. One thing I tried was to have the code of the functions as their own R files in the R/ folder, and then using source() instead of calling the functions (with consistent variable names and such), but this didn't work.

The goal is for the user to be able to see the entirety of the code in the interactive R session, i.e. with a standard package implementation:

library(wei.simulations)
sim1
function (seed=)
{
  [stuff]
  a = internal_function1(data)
  [stuff]
}

I would like the user to see:

sim1
function (seed=)
{
  [stuff]
  tmp = apply(data,1,mean)
  a = sum(tmp)  # or whatever, this is just an example
  [stuff]
}

where I can change those two lines in their own file, and have the changes apply for all the simulation functions. I know this seems like a weird question to ask, but it would be useful for me to make it as foolproof as possible for the user to see all the simulation code (I'm presuming the user is a casual R user and not familiar with looking through package sources). Thanks Wei
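Paul's demo/ suggestion fits Wei's goal because demo() sources a script with echoing turned on, so a casual user sees every line of simulation code as it runs, rather than function bodies. A minimal sketch (the package and demo names are Wei's hypothetical ones):

```r
# demo() sources an R script from the package's demo/ directory with
# echo = TRUE, displaying each line of code as it executes.
demo(package = "wei.simulations")          # list the demos available
demo("sim1", package = "wei.simulations")  # run one, echoing its code
```

The shared lines could live at the top of the demo script (or in a small helper file it source()s), so changing them in one place changes every simulation the user watches run.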
Re: [Rd] Vignette questions
On 12-04-12 03:15 AM, Uwe Ligges wrote: On 12.04.2012 01:16, Paul Gilbert wrote: On 12-04-11 04:41 PM, Terry Therneau wrote: Context: R 2.15-0 on Ubuntu.

1. I get a WARNING from CMD check for Package vignette(s) without corresponding PDF: In this case the vignettes directory had both the pdf and Rnw; do I need to move the pdf to inst/doc?

Yes, you need to put the pdf in the inst/doc directory if it cannot be built by the R-forge and CRAN check machines, but leave the Rnw in the vignettes directory.

No, this is all done automatically by R CMD build, hence you do not need to worry.

Now I am not sure if I am confused or if you missed the if it cannot be built by R-forge and CRAN part of my sentence. I understand that this is done automatically by R CMD build for vignettes that can be built on all, or most, R platforms. In the situation where R CMD build on R-forge will fail, or not result in a complete vignette pdf, I think it is necessary to put a good pdf in inst/doc in order to get a build on R-forge that can be submitted to CRAN. That is, in situations like:
- the vignette requires databases or drivers not generally available
- the vignette (legitimately) takes forever to run
- the vignette requires a cluster

I am now wondering what the recommended practice is. What I have been doing, which I thought was the recommended practice, is to put the vignette Rnw (Stex) file in vignettes/ and put a pdf, constructed on a machine that has appropriate resources, into inst/doc. Is that the recommended way to proceed? Related, some have commented that they put a pdf in inst/doc and then leave out the vignette Rnw file to avoid error messages. Is that discouraged or encouraged? Paul

I'm reluctant to add the pdf to the svn source on R-forge, per the usual rule that a code management system should not have both a primary source and an object derived from it under version control. However, if this is the suggested norm I could do so.
Yes, I think this is the norm if the vignette cannot be built on CRAN and R-forge.

Well, yours are that specific that they rely on third-party software. Vignettes only depending on R and installed packages that are declared as dependencies can be built by CRAN.

Even though it does seem a bit strange. However, you do not necessarily need to update the vignette pdf in inst/doc every time you make a change to the package, even though, in my opinion, the correct logic is to test remaking the vignette when you make a change to the package. You should do this testing, of course; you just do not need to put the new pdf in inst/doc and commit it to svn each time. (But you should probably do that before you build the final package to put on CRAN.)

R CMD build will rebuild vignettes unless you ask R not to do so. Uwe

2. Close reading of the paragraph about vignette sources shows the following -- I think? If I have a vignette that should not be rebuilt by check or BUILD I should put the .Rnw source and pdf in /inst/doc, and have the others that should be rebuilt in /vignettes. This would include any that use private R packages, screen snapshots, ..., or in my case one that takes just a little short of forever to run.

I don't think it is intended to say that, and I didn't read it that way. I think putting the Rnw in inst/doc is supported (temporarily?) for historical reasons only. If it is not in vignettes/ and is found in inst/doc/, it is treated the same way as if it were in vignettes/. You can include screen snapshots, etc, in either case. For your situation, what you probably do need to do is specify BuildVignettes: false in the DESCRIPTION file. This prevents the pdf for inst/doc from being generated from the Rnw. However, it does not prevent R CMD check from checking that the R code extracted from the Rnw actually runs, and generating an error if it does not.
To prevent testing of the R code, you have to appeal directly to the CRAN and R-forge maintainers, and they will put the package on a special list. You do need to give them a good reason why the code should not be tested. I think they are sympathetic with takes forever to run and not very sympathetic with does not work anymore. Generally, I think they want to consider doing this only in exceptional cases, so they do not get into a situation of having lots of broken vignettes. (One should stick with journal articles for recording broken code.)

3. Do these unprocessed packages also contribute to the index via \VignetteIndexEntry lines, or will I need to create a custom index?

I'm not sure of the answer to this, but would be curious to know. You may need to rely on voodoo. Paul

Terry Therneau
Re: [Rd] Vignette questions
On 12-04-11 04:41 PM, Terry Therneau wrote: Context: R 2.15-0 on Ubuntu.

1. I get a WARNING from CMD check for Package vignette(s) without corresponding PDF: In this case the vignettes directory had both the pdf and Rnw; do I need to move the pdf to inst/doc?

Yes, you need to put the pdf in the inst/doc directory if it cannot be built by the R-forge and CRAN check machines, but leave the Rnw in the vignettes directory.

I'm reluctant to add the pdf to the svn source on R-forge, per the usual rule that a code management system should not have both a primary source and an object derived from it under version control. However, if this is the suggested norm I could do so.

Yes, I think this is the norm if the vignette cannot be built on CRAN and R-forge, even though it does seem a bit strange. However, you do not necessarily need to update the vignette pdf in inst/doc every time you make a change to the package, even though, in my opinion, the correct logic is to test remaking the vignette when you make a change to the package. You should do this testing, of course; you just do not need to put the new pdf in inst/doc and commit it to svn each time. (But you should probably do that before you build the final package to put on CRAN.)

2. Close reading of the paragraph about vignette sources shows the following -- I think? If I have a vignette that should not be rebuilt by check or BUILD I should put the .Rnw source and pdf in /inst/doc, and have the others that should be rebuilt in /vignettes. This would include any that use private R packages, screen snapshots, ..., or in my case one that takes just a little short of forever to run.

I don't think it is intended to say that, and I didn't read it that way. I think putting the Rnw in inst/doc is supported (temporarily?) for historical reasons only. If it is not in vignettes/ and is found in inst/doc/, it is treated the same way as if it were in vignettes/. You can include screen snapshots, etc, in either case.
For your situation, what you probably do need to do is specify BuildVignettes: false in the DESCRIPTION file. This prevents the pdf for inst/doc from being generated from the Rnw. However, it does not prevent R CMD check from checking that the R code extracted from the Rnw actually runs, and generating an error if it does not. To prevent testing of the R code, you have to appeal directly to the CRAN and R-forge maintainers, and they will put the package on a special list. You do need to give them a good reason why the code should not be tested. I think they are sympathetic with takes forever to run and not very sympathetic with does not work anymore. Generally, I think they want to consider doing this only in exceptional cases, so they do not get into a situation of having lots of broken vignettes. (One should stick with journal articles for recording broken code.)

3. Do these unprocessed packages also contribute to the index via \VignetteIndexEntry lines, or will I need to create a custom index?

I'm not sure of the answer to this, but would be curious to know. You may need to rely on voodoo. Paul

Terry Therneau
Re: [Rd] CRAN policies
Mark, I would like to clarify two specific points.

On 12-03-31 04:41 AM, mark.braving...@csiro.au wrote: ... Someone has subsequently decided that code should look a certain way, and has added a check that isn't in the language itself-- but they haven't thought of everything, and of course they never could.

There is a large overlap between the people writing the checks and the people writing the interpreter. Even though your code may have been working, if your understanding of the language definition is not consistent with that of the people writing the interpreter, there is no guarantee that it will continue to work, and in some cases the way in which it fails could be that it produces spurious results. I am inclined to think of code checks as an additional way to be sure my understanding of the R language is close to that of the people writing the interpreter.

It depends on how Notes are being interpreted, which from this thread is no longer clear. The R-core line used to be Notes are just notes but now we seem to have significant Notes and ...

My understanding, and I think that of a few other people, was incorrect, in that I thought some notes were intended always to remain as notes, and others were more serious in that they would eventually become warnings or errors. I think Uwe addressed this misunderstanding by saying that all notes are intended to become warnings or errors. In several cases the reason they are not yet warnings or errors is that the checks are not yet good enough; they produce too many false positives. So, this means that it is very important for us to look at the notes and to point out the reasons for the false positives, otherwise they may become warnings or errors without being recognised as such. ... Paul
[Rd] R-forge --as-cran
(Renamed from Re: [Rd] CRAN policies because of the multi-threading of that subject.)

Claudia, Actually, my version numbers are year-month dates, e.g. 2012.3-1, although I don't set them automatically. I have had some additional off-line discussion on this. The problem is this: Now when I submit version 2012.3-1 to CRAN, any checks of that package on R-forge will fail, until I change the version number. This is by specific request of the CRAN maintainers to the R-forge maintainers, the reason being, understandably, that the CRAN maintainers do not like getting submissions without the version number changed. One implication of this is that I should change the R-forge version number as soon as I make any changes to the package, even if I am going to change it again before I actually release to CRAN. This seems like a reasonable practice, even if I have not always done that.

The case where the code on R-forge remains unchanged for some time after it is released to CRAN is more subtle. If R-forge does not re-run the checks until I make a change, as is the current situation, then the package will still be indicated as ok on the R-forge package page. However, when R is upgraded, I would like the checks to be re-run on all platforms, not just on my own testing platform. But when that is done, the R-forge indication is going to be that the package failed, because the version number is the same as on CRAN. The information I want is actually available on the CRAN daily check. I just need to know that when my package is unchanged from the version on CRAN, I should look at CRAN daily rather than at the R-forge result. Paul

On 12-03-30 10:38 AM, Claudia Beleites wrote: Paul, One of the things I have noticed with the R 2.15.0 RC and --as-cran is that I have to bump the version number of the working copy of my [snip] I am curious how other developers approach this. Regardless of --as-cran I find it very useful to use the date as minor part of the version number (e.g.
hyperSpec 0.98-20120320), which I set automatically. Claudia
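Claudia's date-as-minor-version scheme can be generated in one line of R; the "0.98" major part here is just her example, and the exact build script she uses is not shown in the thread.

```r
# Build a version string like "0.98-20120320" from today's date,
# suitable for pasting into the Version: field of DESCRIPTION.
ver <- sprintf("0.98-%s", format(Sys.Date(), "%Y%m%d"))
ver
```

Because the date always increases, every rebuild automatically carries a version number newer than the one on CRAN, which sidesteps the bump-immediately problem Paul describes.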
Re: [Rd] CRAN policies
On 12-03-29 09:29 PM, mark.braving...@csiro.au wrote: I'm concerned this thread is heading the wrong way, towards techno-fixes for imaginary problems. R package-building is already encumbered with a huge set of complicated rules, and more instructions/rules, e.g. for metadata, would make things worse, not better. R CMD check on the 'mvbutils' package generates over 300 Notes about no visible binding..., which inevitably I just ignore. They arise because R CMD check is too stupid to understand one of my preferred coding idioms (I'm not going to explain what-- that's beside the point).

Actually, I think that is the point. If your code is generating that many notes then I think you should explain your idiom, so the checks can be made to accommodate it if it really is good. Otherwise, I'd be worried about the quality of your code.

And R CMD check always will be too stupid to understand everything that a rich language like R might quite reasonably cause experienced coders to do.

Possibly the interpreter is too stupid to understand it too?

It should not be CRAN's business how I write my code, or even whether my code does what it is supposed to. It might be CRAN's business to try to work out whether my code breaks CRAN's policies, e.g. by causing R to crash horribly-- that's presumably what Warnings are for (but see below). And maybe there could be circumstances where an automatic check might be worried enough to alert the CRANia and require manual explanation and emails etc from a developer, but even that seems doomed given the growing deluge of packages. R CMD check currently functions both as a sanitizer for CRAN and as a developer tool. But the fact that the one program does both things seems accidental to me, and I think this dual use is muddying the discussion.
There's a big distinction between (i) code-checks that developers themselves might or might not find useful-- which should be left to the developer, and will vary from person to person--

I think this is a case of two heads are better than one. I did lots of checks before the CRAN checks existed, but the CRAN checks still found bugs in code that I considered very mature, including bugs in code that had been running without noticeable problems for over 15 years. Despite all the noise today, most of us are only talking about a small inconvenience around the intended meaning of note, not about whether quality control is a bad thing. I've found the errors and warnings are always valid, even though I do not always like having to fix the bugs, and the notes are most often valid too. But there are a few false positives, so the checks that give notes are not yet reliable enough to give warnings or errors. But they should be sometime, so one should usually consider fixing the package code.

and (ii) code-checks that CRAN enforces for its own peace-of-mind.

I think of this as being for the peace-of-mind of your package users.

Maybe it's convenient to have both functions in the same place, and it'd be fine to use Notes for one and Warnings for the other, but the different purposes should surely be kept clear. Personally, in building over 10 packages (only 2 on CRAN), I haven't found R CMD check to be of any use, except for the code-documentation and example-running bits. I know other people have different opinions, but that's the point: one-size-does-not-fit-all when it comes to coding tools. And with regard to the Warnings themselves: I feel compelled to point out that it's logically impossible to fully check whether R code will do bad things. One has to wonder at what point adding new checks becomes futile or counterproductive.
There must be over 2000 people who have written CRAN packages by now; every extra check and non-back-compatible additional requirement runs the risk of generating false negatives and incurring many extra person-hours to fix non-problems. Plus someone needs to document and explain the check (adding to the rule mountain), plus there is the time spent in discussions like this..!

Bugs in your packages will require users to waste a lot of time too, and possibly reach faulty results with much more serious consequences. Just because perfection may never be attained, this does not mean that progress should not be attempted, in small steps. Compared to Statlib, which basically followed your recommended approach, CRAN is a vast improvement. Paul

Mark Bravington CSIRO CMIS Marine Lab Hobart Australia

From: r-devel-boun...@r-project.org [r-devel-boun...@r-project.org] On Behalf Of Hadley Wickham [had...@rice.edu] Sent: 30 March 2012 07:42 To: William Dunlap Cc: r-de...@stat.math.ethz.ch; Spencer Graves Subject: Re: [Rd] CRAN policies

Most of that stuff is already in codetools, at least when it is checking functions with checkUsage(). E.g., arguments of ~ are not checked. The expr argument to with() will not be checked if
[Rd] --as-cran / BuildVignettes: false
I have packages where I know CRAN and other test platforms do not have all the resources to build the vignettes, for example, access to databases. Previously I think putting BuildVignettes: false in the DESCRIPTION file resolved this, by preventing CRAN checks from attempting to run the vignette code. (If it was not this, then there was some other magic I don't understand.) Now, when I specify --as-cran, the checks fail when attempting to check R code from vignettes, even though I have BuildVignettes: false in the DESCRIPTION file. What is the mechanism for indicating that CRAN should not attempt to check this code? Perhaps it is intentionally difficult - I can see an argument for that. (For running tests there are environment variables, e.g. _R_CHECK_HAVE_MYSQL_, but using these really clutters up a vignette, and it did not seem necessary to use them before.) (The difficulty also occurs on R-forge, possibly because it is using --as-cran like settings.) Paul
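The environment-variable approach Paul mentions (and finds cluttered) would look roughly like this inside a vignette chunk. _R_CHECK_HAVE_MYSQL_ is the variable named in his message; the expected value "true" and the surrounding structure are assumptions in this sketch.

```r
# Only run the database-dependent code when the check environment says a
# MySQL server is available; otherwise leave a note in the vignette output.
if (identical(Sys.getenv("_R_CHECK_HAVE_MYSQL_"), "true")) {
  library(RMySQL)
  # ... database-backed examples would go here ...
} else {
  cat("MySQL not available on this machine; results shown are pre-computed.\n")
}
```

Repeating this guard around every database chunk is exactly the clutter Paul objects to, which is why he preferred the single BuildVignettes: false switch in DESCRIPTION.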
Re: [Rd] CRAN policies
One of the things I have noticed with the R 2.15.0 RC and --as-cran is that I have to bump the version number of the working copy of my packages immediately after putting a version on CRAN, or I get a message about version suitability. This is probably a good thing for packages that I have changed, compared with my old habit of bumping the version number at arbitrary times, although the mechanics are a nuisance because I do not actually want to commit to the next version number at that point. For packages that I have not changed it is a bit worse, because I have to change the version number even though I have not yet made any changes to the package. This will mean, for example, that on R-forge it will look like there is a slightly newer version, even though there is not really. I am curious how other developers approach this. Is it better to not specify --as-cran most of the time? My feeling is that it is better to specify it all of the time so that I catch errors sooner rather than later, but maybe there is a better solution? Paul

On 12-03-27 07:52 AM, Prof Brian Ripley wrote: CRAN has for some time had a policies page at http://cran.r-project.org/web/packages/policies.html and we would like to draw this to the attention of package maintainers. In particular, please
- always send a submission email to c...@r-project.org with the package name and version on the subject line. Emails sent to individual members of the team will result in delays at best.
- run R CMD check --as-cran on the tarball before you submit it. Do this with the latest version of R possible: definitely R 2.14.2, preferably R 2.15.0 RC or a recent R-devel. (Later versions of R are able to give better diagnostics, e.g. for compiled code and especially on Windows. They may also have extra checks for recently uncovered problems.)
Also, please note that CRAN has a very heavy workload (186 packages were published last week) and to remain viable needs package maintainers to make its life as easy as possible. Kurt Hornik Uwe Ligges Brian Ripley
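For what it's worth, the version bump itself can be scripted. Here is a small illustrative helper (the name bumpVersion is mine, not an R utility) that increments the last component of a Version string in the formats DESCRIPTION files use:

```r
## Sketch: increment the last component of a package Version string,
## e.g. "1.2-3" -> "1.2-4", preserving whatever "." / "-" separators
## the original used.
bumpVersion <- function(v) {
    parts <- strsplit(v, "[.-]")[[1]]                  # numeric components
    seps  <- regmatches(v, gregexpr("[.-]", v))[[1]]   # separators, in order
    parts[length(parts)] <- as.integer(parts[length(parts)]) + 1L
    paste0(parts, c(seps, ""), collapse = "")
}

bumpVersion("1.2-3")   # "1.2-4"
bumpVersion("0.9.1")   # "0.9.2"
```

A script like this could rewrite the Version field right after a CRAN submission, so the working copy always differs from the released version.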
Re: [Rd] CRAN policies
On 12-03-27 10:59 AM, Uwe Ligges wrote: On 27.03.2012 16:17, Paul Gilbert wrote: One of the things I have noticed with the R 2.15.0 RC and --as-cran is that I have to bump the version number of the working copy of my packages immediately after putting a version on CRAN, or I get a message about version suitability. This is probably a good thing for packages that I have changed, compared with my old habit of bumping the version number at arbitrary times, although the mechanics are a nuisance because I do not actually want to commit to the next version number at that point. For packages that I have not changed it is a bit worse, because I have to change the version number even though I have not yet made any changes to the package. This will mean, for example, that on R-forge it will look like there is a slightly newer version, even though there is not really. I am curious how other developers approach this. Is it better to not specify --as-cran most of the time? My feeling is that it is better to specify it all of the time so that I catch errors sooner rather than later, but maybe there is a better solution? --as-cran is modelled rather closely after the CRAN incoming checks. CRAN checks if a new version has a new version number. Of course, you can ignore its result if you do not want to submit. The idea of using --as-cran is to apply it before you actually submit. Some parts require network connection etc. Uwe Yes but, for example, will R-forge run checks with --as-cran, and thus give warnings for any package unchanged from the one on CRAN, or run without --as-cran, and thus not give a true indication of whether the package is good to submit? (No doubt R-forge will customise more, but I am trying to work out a strategy for my own automatic testing.)
Paul On 12-03-27 07:52 AM, Prof Brian Ripley wrote: CRAN has for some time had a policies page at http://cran.r-project.org/web/packages/policies.html and we would like to draw this to the attention of package maintainers. In particular, please - always send a submission email to c...@r-project.org with the package name and version on the subject line. Emails sent to individual members of the team will result in delays at best. - run R CMD check --as-cran on the tarball before you submit it. Do this with the latest version of R possible: definitely R 2.14.2, preferably R 2.15.0 RC or a recent R-devel. (Later versions of R are able to give better diagnostics, e.g. for compiled code and especially on Windows. They may also have extra checks for recently uncovered problems.) Also, please note that CRAN has a very heavy workload (186 packages were published last week) and to remain viable needs package maintainers to make its life as easy as possible. Kurt Hornik Uwe Ligges Brian Ripley
Re: [Rd] CRAN policies
An associated problem, for the wish list, is that it would be nice for package developers to have a way to automatically distinguish between NOTEs that can usually be ignored (e.g. a package suggests a package that is not available for cross reference checks - I have several cases where the suggested package depends on the package being built, so this NOTE occurs all the time), and NOTEs that are really pre-WARNINGs, so that one can flag these and spend time fixing them before they become a WARNING or ERROR. Perhaps two different kinds of notes? (And, BTW, having been responsible for a certain amount of the motivation for this [*], I think --as-cran is great. [*] Since answering several emails a day about why their results were different was taking up far too much time.) Paul On 12-03-27 02:19 PM, Uwe Ligges wrote: On 27.03.2012 19:10, Jeffrey Ryan wrote: Is there a distinction as to NOTE vs. WARNING that is documented? I've always assumed (wrongly?) that NOTEs weren't an issue with publishing on CRAN, but that they may change to WARNINGs at some point. We won't kick packages off CRAN for Notes (but we will if Warnings are not fixed), but we may not accept new submissions with significant Notes. Best, Uwe Ligges Is the process by which this happens documented somewhere? Jeff On 3/27/12 11:09 AM, Gabor Grothendieck <ggrothendi...@gmail.com> wrote: 2012/3/27 Uwe Ligges <lig...@statistik.tu-dortmund.de>: On 27.03.2012 17:09, Gabor Grothendieck wrote: On Tue, Mar 27, 2012 at 7:52 AM, Prof Brian Ripley <rip...@stats.ox.ac.uk> wrote: CRAN has for some time had a policies page at http://cran.r-project.org/web/packages/policies.html and we would like to draw this to the attention of package maintainers. In particular, please - always send a submission email to c...@r-project.org with the package name and version on the subject line. Emails sent to individual members of the team will result in delays at best. - run R CMD check --as-cran on the tarball before you submit it.
Do this with the latest version of R possible: definitely R 2.14.2, preferably R 2.15.0 RC or a recent R-devel. (Later versions of R are able to give better diagnostics, e.g. for compiled code and especially on Windows. They may also have extra checks for recently uncovered problems.) Also, please note that CRAN has a very heavy workload (186 packages were published last week) and to remain viable needs package maintainers to make its life as easy as possible. Regarding the part about warnings or significant notes in that page, it's impossible to know which notes are significant and which ones are not significant except by trial and error. Right, it needs human inspection to identify false positives. We believe most package maintainers are able to see whether he or she is hit by such a false positive. The problem is that a note is generated and the note is correct. It's not a false positive. But that does not tell you whether it's significant or not. There is no way to know. One can either try to remove all notes (which may not be feasible) or just upload it and by trial and error find out if it's accepted or not. -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
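Pending any official distinction between kinds of NOTEs, one stopgap is to post-process the check output oneself. A sketch, using an inline sample of illustrative check-log text (the log lines and the "benign" pattern are mine, not real output from any particular package):

```r
## Sketch: classify NOTE headings in "R CMD check"-style output against a
## personal whitelist of patterns known to be benign for this package.
log <- c("* checking package dependencies ... NOTE",
         "Package suggested but not available for checking: 'tframePlus'",
         "* checking R code for possible problems ... NOTE",
         "possible error in somefun(): unused argument")

noteHeads <- grep("\\.\\.\\. NOTE$", log)               # lines ending "... NOTE"
benign    <- grepl("dependencies", log[noteHeads])      # whitelist: my own choice

data.frame(check = log[noteHeads], benign = benign)
```

This does not remove the need for human judgement, which is Uwe's point above, but it lets the recurring, already-inspected NOTEs be flagged automatically in one's own test runs.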
Re: [Rd] RC / methods package
On 12-03-25 05:29 PM, Paul Gilbert wrote: John, Here is the definition of the TSMySQLConnection class, and a few other things. This is a simplified example that produces the message, but unfortunately will not work unless you have a MySQL database to connect to. (I do get the same problem with PostgreSQL, and may with SQLite, but I have not tested the last yet.)

require("methods")
require("DBI")
require("RMySQL")
setClassUnion("OptionalPOSIXct", c("POSIXct", "logical"))
setClass("conType", representation(drv = "character", "VIRTUAL"))
setClass("TSdb", representation(dbname = "character",
    hasVintages = "logical", hasPanels = "logical", "VIRTUAL"))
setClass("TSMySQLConnection",
    contains = c("MySQLConnection", "conType", "TSdb"))
setGeneric("TSconnect",
    def = function(drv, dbname, ...) standardGeneric("TSconnect"))
setMethod("TSconnect", signature(drv = "MySQLDriver", dbname = "character"),
    definition = function(drv, dbname, ...) {
        con <- dbConnect(drv, dbname = dbname, ...)
        new("TSMySQLConnection", con, drv = "MySQL", dbname = dbname,
            hasVintages = dbExistsTable(con, "vintageAlias"),
            hasPanels   = dbExistsTable(con, "panels"))
    })

con <- TSconnect(dbDriver("MySQL"), "test")
dbGetQuery(con, "show tables")

Note: Method with signature "MySQLConnection#integer" chosen for function "coerce", target signature "TSMySQLConnection#integer". "dbObjectId#integer" would also be valid
  Tables_in_test
1              A
2              B

The message also seems to go away, even quitting R and restarting to clear the cache, if I change the TSconnect method as follows:

setMethod("TSconnect", signature(drv = "MySQLDriver", dbname = "character"),
    definition = function(drv, dbname, ...) {
        con <- dbConnect(drv, dbname = dbname, ...)
        new("TSMySQLConnection", con, drv = "MySQL", dbname = dbname,
            hasVintages = FALSE, hasPanels = FALSE)
    })

Why this would happen makes absolutely no sense to me. In the first version, is dbExistsTable(con, "vintageAlias") left unevaluated in the result from new()? This is very strange. With

setMethod("TSconnect", signature(drv = "MySQLDriver", dbname = "character"),
    definition = function(drv, dbname, ...)
    {
        con <- dbConnect(drv, dbname = dbname, ...)
        hasVintages <- as.logical(dbExistsTable(con, "vintageAlias"))
        hasPanels   <- as.logical(dbExistsTable(con, "panels"))
        new("TSMySQLConnection", con, drv = "MySQL", dbname = dbname,
            hasVintages = FALSE, hasPanels = FALSE)
    })

I get the note, but if I remove the two lines that appear to do nothing:

setMethod("TSconnect", signature(drv = "MySQLDriver", dbname = "character"),
    definition = function(drv, dbname, ...) {
        con <- dbConnect(drv, dbname = dbname, ...)
        new("TSMySQLConnection", con, drv = "MySQL", dbname = dbname,
            hasVintages = FALSE, hasPanels = FALSE)
    })

I no longer get the note. I am restarting R each time to be sure nothing is cached. [R version 2.15.0 RC (2012-03-25 r58832)] Paul As you can tell, I'm struggling a bit with interpreting the information from the note. Also, if it were a warning I could set it to stop, and then traceback to what was causing the problem. As it is, it took me a fairly long time just to get to the fact that the call to dbGetQuery() was generating the message. And caching the methods may be good for performance, but when things change the second time you call them it sure makes debugging difficult. Best, Paul On 12-03-25 03:24 PM, John Chambers wrote: On 3/24/12 5:43 PM, Paul Gilbert wrote: On 12-03-24 08:11 PM, John Chambers wrote: On 3/24/12 1:29 PM, Paul Gilbert wrote: (I think this is being caused by the new methods package in RC.) Possibly, but the methods package isn't particularly new in its method selection. We need to see the definition of the class. Is there a way to know which class it is that we need to see the definition for? It's in the note: 'target signature "TSMySQLConnection#integer"'. In functional OOP with multiple dispatch, it's all the classes that matter in general, but in this and most cases, one class is likely the relevant one: TSMySQLConnection. That was why I said what I did before.
(We could go to a bit more effort and back-translate the dispatch string "TSMySQLConnection#integer" into the corresponding formal arguments. Would be more natural with the INSTALL-time tool I mentioned before. That's the real challenge here -- to give information about this to the package developer, not the poor user.) John Paul The note implies that it inherits from both MySQLConnection and dbObjectId, both of which have methods for coercing to integer. Hence the ambiguity. In the RC (March 24) some of my packages are generating a Note:

Note: Method with signature "MySQLConnection#integer" chosen for function "coerce", target signature "TSMySQLConnection#integer". "dbObjectId#integer" would also be valid

This is coming from a call to dbGetQuery() in package DBI. The method with the signature "TSMySQLConnection#integer" is generated automatically because TSMySQLConnection inherits from MySQLConnection. (More details below.) Is there a way
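The ambiguity the note reports can be reproduced without any database. A self-contained sketch with the DBI classes replaced by toy classes (the names A, B, and C are mine; whether a selection note is actually printed depends on the R version):

```r
library(methods)

## Two unrelated parent classes, each with its own coerce method to integer.
setClass("A", representation(x = "numeric"))
setClass("B", representation(y = "numeric"))
setAs("A", "integer", function(from) 1L)
setAs("B", "integer", function(from) 2L)

## A class inheriting from both: the coerce target "C#integer" now has two
## valid inherited candidates ("A#integer" and "B#integer"), which is the
## situation the note describes for TSMySQLConnection.
setClass("C", contains = c("A", "B"))

as(new("C"), "integer")  # dispatch must pick one of the two parent methods
```

This mirrors the TSMySQLConnection case: the class inherits coerce-to-integer methods from both MySQLConnection and (via it) dbObjectId, and method selection has to choose between them.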