Re: [Rd] Apple M1 CRAN checks
Simon,

Yes, I can imagine it is not trivial testing. I hope you have a stack of minis in a cluster. It looks like a trivial transgression, max.error = 1.065814e-14 where my tolerance is set at 1e-14. Many of the other tests already have a larger tolerance. Very possibly it is not actually the tolerance that is the problem. The test value itself is just determined by a run on another machine, so that may not be in the middle of the result distribution. I'll fix it shortly.

Thanks, Paul

On 2021-02-28 4:23 p.m., Simon Urbanek wrote:

Paul, this is being worked on. As you can imagine, testing over 17,000 packages on an M1 Mac mini isn't quite trivial. The first priority was to get the nightly R builds to work. Second was to get CRAN package builds to work. Third is to provide checks. The first two were finished last week and the checks have been running for the past two days. Unfortunately, some pieces (like XQuartz) are still not quite stable, so it takes more manual intervention than one would expect. We are at close to 16k packages checked, so we're getting there. As for EvalEst, the check has finished, so I have:

Running ‘dse2tstgd2.R’ [13s/14s]
Running the tests in ‘tests/dse2tstgd2.R’ failed.
Last 13 lines of output:
  > ok <- fuzz.large > error
  > if (!ok) {if (is.na(max.error)) max.error <- error
  +           else max.error <- max(error, max.error)}
  > all.ok <- all.ok & ok
  > {if (ok) cat("ok\n") else cat("failed! error= ", error,"\n") }
  ok
  >
  > cat("All Brief User Guide example tests part 2 completed")
  All Brief User Guide example tests part 2 completed> if (all.ok) cat(" OK\n") else
  +     cat(", some FAILED! max.error = ", max.error,"\n")
  , some FAILED! max.error =  1.065814e-14
  >
  > if (!all.ok) stop("Some tests FAILED")
  Error: Some tests FAILED
  Execution halted

When I run it by hand I get ok for all but:

Guide part 2 test 10... failed!
error= 1.065814e-14

sum(fc1$forecastCov[[1]])
[1] 14.933660144821400806
sum(fc2$forecastCov[[1]])
[1] 14.933660144821400806
sum(fc2$forecastCov.zero)
[1] 31.654672476928304548
sum(fc2$forecastCov.trend)
[1] 18.324461923341957004
c(14.933660144821400806 - sum(fc1$forecastCov[[1]]),
+ 14.933660144821400806 - sum(fc2$forecastCov[[1]]),
+ 31.654672476928297442 - sum(fc2$forecastCov.zero),
+ 18.324461923341953451 - sum(fc2$forecastCov.trend) )
[1]  0.000e+00  0.000e+00 -1.0658141036401502788e-14
[4] -3.5527136788005009294e-15

I hope this helps you to track it down. Cheers, Simon

On Mar 1, 2021, at 4:50 AM, Paul Gilbert wrote:

If there was a response to the "how can I test it out" part of this question then I missed it. Can anyone point to a Win-builder-like site for testing on M1mac, or to the M1mac results from testing packages already on CRAN? They still do not seem to be on the CRAN daily site. Even a link to the 'Additional issues' on M1 Mac on the results pages would be helpful, because it does not seem to be in an obvious place. I am trying to respond to a demand to relax or remove some package testing that fails because M1mac gives results outside my specified tolerances. The tests in question (in package EvalEst) have been used since very early R versions (0.16, circa 1995), and were used on Splus prior to that. There has been a need to adjust tolerances occasionally, but they have been stable for a long time (more than 20 years, I believe). Since these tests date from a time when simple double precision was the norm, the tolerances are already fairly relaxed, so I hesitate to adjust them without actually examining the results. Paul Gilbert

On 2021-02-22 3:30 a.m., Travers Ching wrote:

I noticed CRAN is now doing checks against Apple M1, and some packages are failing including a dependency I use. Is building on M1 now a requirement, or can the check be ignored? If it's a requirement, how can one test it out?
Travers

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
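The failing check above comes from a tolerance-based regression test: a reference value recorded on one machine is compared against the locally computed value, and the run fails when the absolute error exceeds a fuzz threshold. A minimal sketch of that pattern (the function name, fuzz value, and reference constant here are illustrative, not EvalEst's actual code):

```r
# Compare a computed value against a stored reference, within a tolerance.
# The reference was recorded on one machine, so on other hardware the error
# may legitimately be a few ulps larger -- the reference value is not
# necessarily in the middle of the cross-platform result distribution.
check_with_fuzz <- function(value, reference, fuzz = 1e-14) {
  error <- max(abs(value - reference))
  ok <- fuzz > error
  if (ok) cat("ok\n") else cat("failed! error= ", error, "\n")
  ok
}

all.ok <- TRUE
# reference below is the double-precision value of sum(1/(1:10))
all.ok <- all.ok & check_with_fuzz(sum(1 / (1:10)), 2.9289682539682538)
if (!all.ok) stop("Some tests FAILED")
```

The failure in the thread is exactly the marginal case: a different machine lands 1.065814e-14 away from a reference recorded elsewhere, just past a 1e-14 fuzz.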
Re: [R-pkg-devel] CRAN check texi2dvi failure
Thanks Enrico for the great guess, and Georgi for the details. If I omit the space, as seems to be implied in some documentation, changing

\verb <https://www.bankofcanada.ca/2006/03/working-paper-2006-3> .

to

\verb<https://www.bankofcanada.ca/2006/03/working-paper-2006-3> .

then the R CMD check error "\verb ended by end of line" happens on my linux machine. I did not try replacing the space with another delimiter, which I guess would now be the correct way to use \verb. The solution of adding \usepackage{url} and changing to

\url{https://www.bankofcanada.ca/2006/03/working-paper-2006-3}.

does seem to work. (No "on CRAN" confirmation yet, but I have not had the immediate pre-test rejection that I got previously.) Paul

On 2021-01-10 8:04 a.m., Georgi Boshnakov wrote:
> The problem is not in the Warning from the example but from the \verb commands in the references. You use space to delimit the argument of \verb, and I was surprised that it worked, since TeX ignores spaces after commands. Apparently this has been an exception for \verb, but now this feature is considered a bug and has been recently fixed; see the stackexchange question below and the relevant paragraph from LaTeX News. Probably the linux machines have updated their TeX installations.
>
> In short, changing the space to, say, a + delimiter for the \verb command should fix the issue.
>
> Georgi Boshnakov

On 2021-01-09 6:52 p.m., Enrico Schumann wrote:

When I run R CMD check on my Linux machine [*], I also do not get an error. But here is a guess: the error mentions \verb, and the LaTeX manual says that \verb should be followed by a nonspace character. But in the vignette it is followed by a space. Maybe using \url in the vignette could fix the error?
kind regards Enrico

[*] R version 4.0.3 (2020-10-10) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 20.10

> On Sat, 09 Jan 2021, Paul Gilbert writes:

I am trying to debug a problem that is appearing in the linux and Solaris checks, but not the Windows or Mac checks, of my package tsfa, as reported at https://cran.r-project.org/web/checks/check_results_tsfa.html The problem is with re-building the vignette:

... this is package 'tsfa' version '2014.10-1' ...
checking re-building of vignette outputs ... [6s/9s] WARNING
Error(s) in re-building vignettes: ...
Running 'texi2dvi' on 'Guide.tex' failed.
LaTeX errors:
! LaTeX Error: \verb ended by end of line.
...

In responding to the threat of removal I have also fixed some long-standing warnings about adding imports to the NAMESPACE. The new version builds with --as-cran giving no errors or warnings with both R-devel on win-builder (2021-01-07 r79806) and on my linux machine (R 2021-01-08 r79812 on Linux Mint 19.3 Tricia). When I submit it to CRAN the Windows build is OK, but the same error happens at the 'texi2dvi' step in the debian vignette re-build. This seems to happen after an example that correctly gives a warning message (about Heywood cases). In my linux build the warning happens but the message does not appear in the pdf output, so one possibility is that the handling of the warning on the CRAN Unix check machines fails to produce clean tex or suppress output. Another possibility is that my build using --as-cran is different from the actual CRAN build options. For example, my 00check.log shows

... * checking package vignettes in ‘inst/doc’ ... OK
* checking re-building of vignette outputs ... OK
* checking PDF version of manual ... OK
* checking for non-standard things in the check directory ... OK ...

so I am not sure if it uses texi2dvi. (I haven't used dvi myself for a long time.) I'm not sure how to debug this when I can't reproduce the error. Suggestions would be appreciated.
Paul Gilbert __ R-package-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-package-devel
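For reference, the fix the thread converges on looks like this in the vignette source (a sketch; the URL is the one from the thread, and the \verb line is shown only as the construct being replaced):

```latex
% In the preamble:
\usepackage{url}

% In the references, instead of relying on \verb with a space delimiter,
% which newer LaTeX rejects with "\verb ended by end of line":
%   \verb <https://www.bankofcanada.ca/2006/03/working-paper-2006-3> .
% use \url, which handles the special characters in URLs and allows
% line breaking:
\url{https://www.bankofcanada.ca/2006/03/working-paper-2006-3}.
```

\url also removes the need to pick a \verb delimiter character that cannot occur in the URL itself.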
[R-pkg-devel] translation .mo files
I have been sent .po and .mo files with message translations for one of my packages. The .po file, I know, goes in the source package po/ directory, but I have not had .mo files previously. The translator thinks the .mo file goes in inst/po. The .mo file seems to be generated from the .po file, but I am not sure if that happens in the install of the source package, or in some pre-process. I thought I could determine this by looking at an installed package, but I don't see .po or .mo files in installed packages. So far I have had no luck finding documentation on these details. So I have three questions.

- Should the .mo file be included in the package, and if so, where?
- When a package is installed, where does the translation information go in the directory structure of the library?
- Is this documented somewhere? (Please, not a vague reference to 'Writing R Extensions'; I've looked there and many other places. I need a section or page reference.)

Thanks, Paul Gilbert
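As I understand the convention (this layout is my reading of current practice, so treat the details as an assumption to verify against 'Writing R Extensions'), the compiled .mo catalogs do ship in the source package, under inst/po/, in a language/LC_MESSAGES hierarchy, and inst/po is copied to the installed package's po/ directory where gettext looks for it:

```
mypkg/                             # source package
├── po/
│   ├── R-mypkg.pot                # template extracted from the R sources
│   └── R-fr.po                    # translator-edited catalog (French, say)
└── inst/
    └── po/
        └── fr/
            └── LC_MESSAGES/
                └── R-mypkg.mo     # compiled catalog, shipped in the package
```

The .mo is regenerated from the .po by the maintainer before building (e.g. with tools::update_pkg_po() or GNU gettext's msgfmt), not at install time; install simply copies inst/po into the installed tree.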
Re: [Rd] survival changes
Terry,

Let me call this 'things to think about', rather than advice. I went through a similar process twice, once about 30 years ago and once about 20 years ago. I had fewer dependent packages of course, but still enough to cause headaches. I don't recommend doing it often.

- I think you need to consider where you would like to end up before deciding how to get there. If you end up having to maintain a lot of legacy stuff I don't think you will be very happy. So then the problem becomes how to help people get off the part you want to abandon, rather than how to help them stay on it.

- I know you are very experienced, but I will be really impressed if you get the new approach perfect on the first shot. That argues for having a new package with hardly any users, so you can fiddle with the API more easily, and not deprecating the old one until you are really happy with the new one.

- There may be a part which is common to both old and new, and/or there may be a part which is what most dependent packages use. If you can separate that out as something like survivalBase it would make your life easier. That will be especially true if that part is more stable, so don't put in anything you are experimenting with.

Good luck, Paul Gilbert

On 6/1/19 8:02 PM, Therneau, Terry M., Ph.D. via R-devel wrote: On 6/1/19 1:32 PM, Marc Schwartz wrote: On Jun 1, 2019, at 12:59 PM, Peter Langfelder wrote: On Sat, Jun 1, 2019 at 3:22 AM Therneau, Terry M., Ph.D. via R-devel wrote:

In the next version of the survival package I intend to make a non-upwardly compatible change to the survfit object. With over 600 dependent packages this is not something to take lightly, and I am currently undecided about the best way to go about it. I'm looking for advice. The change: 20+ years ago I had decided not to include the initial x=0, y=1 data point in the survfit object itself. It was not formally an estimand, and the plot/points/lines etc. routines could add this on themselves.
That turns out to have been a mistake, and has led to a steady proliferation of extra bits as I realized that the time axis doesn't always start at 0, and later (with multi-state) that y does not always start at 1 (though the states sum to 1), and later that the error doesn't always start at 0, and another realization with cumulative hazard, and ... The new survfit method for multi-state coxph models was going to add yet another special case. Basically every component is turning into a duplicate of "row 1" vs "all the others". (And inconsistently named.)

Three possible solutions:

1. Current working draft of survival_3.0.3: Add a 'version' element to the survfit object and a 'survfit2.3' function that converts old to new. All my downstream functions (print, plot, ...) start with an "if (old) update to new" line. This has allowed me to stage updates to the functions that create survfit objects -- I expect it to happen slowly. There will also be a survfit3.2 function to go backwards. Both the forward and backward functions leave objects alone if they are currently in the desired format.

2. Make a new class "survfit3" and the necessary 'as' functions. The package would contain plot.survfit and plot.survfit3 methods, the former a two-line "convert and call the second" function.

3. Something I haven't thought of.

A more "clean break" solution would be to start a whole new package (call it survival2) that would make these changes, and deprecate the current survival. You could add warnings about deprecation, urging users to switch, in existing survival functions. You could continue bugfixes for survival but only add new features to survival2. The new survival2 and the current survival could live side by side on CRAN for quite some time, giving maintainers of dependent packages (and just plain users) enough time to switch.
This could allow you to change/clean up other parts of the package that could perhaps also use a rethink/rewrite, without too much concern for backward compatibility. Peter

Hi, I would be cautious about going in that direction, bearing in mind that survival is a Recommended package, therefore included in the default R distribution from the R Foundation and other parties. To have two versions can/will result in substantial confusion, and I would argue against that approach. There is language in the CRAN submission policy that covers API changes, which, strictly speaking, may or may not be the case here, depending upon which direction Terry elects to go: "If an update will change the package’s API and hence affect packages depending on it, it is expected that you will contact the maintainers of affected packages and suggest changes, and give them time (at least 2 weeks, ideally more) to prepare updates before submitting your updated package. Do mention in the submission email which packages are affected a
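Option 1 above (a version element plus converter functions) amounts to a small dispatch shim at the top of every downstream method. A sketch of the idea (the function name, fields, and version numbers here are illustrative, not the actual survival code):

```r
# Convert an old-style fit object to the new layout, if needed.
# Objects already in the new format pass through unchanged, so every
# method can safely call this first: fit <- upgrade_survfit(fit)
upgrade_survfit <- function(fit) {
  if (is.null(fit$version) || fit$version < 3) {
    # old objects lacked the initial (time = 0, surv = 1) row; prepend it
    fit$time <- c(0, fit$time)
    fit$surv <- c(1, fit$surv)
    fit$version <- 3
  }
  fit
}

old <- list(time = c(1, 2), surv = c(0.9, 0.8))  # mock old-style object
new <- upgrade_survfit(old)
```

The idempotence (new-format objects are left alone) is what lets the upgrade be staged: creators of the objects can be converted one at a time while every consumer already tolerates both formats.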
Re: [R-pkg-devel] Submitting a package whose unit tests sometimes fail because of server connections
5.4 In the spirit of simple & stupid you can also use the built-in mechanism for doing this: organize some of your tests in subdirectories like inst/testWithInternet, inst/veryLongTests, inst/testsNeedingLicence, inst/testsNeedingSpecialCluster, etc. CRAN will only run the tests in the tests/ directory, but you can check them yourself using

R CMD check --test-dir=inst/testWithInternet whatever.tar.gz

> In a separate response On 4/16/19 2:06 PM, Steven Scott wrote:
> Just don't include the live fire stuff in the package.

Please do not do this. If you omit tests from your package then it cannot be properly checked by other people. Paul Gilbert

On 4/16/19 2:16 PM, Dirk Eddelbuettel wrote: On 16 April 2019 at 11:40, Will wrote:
| Some things I have considered include:
|
| 1. Skipping all unit tests on CRAN (using something like *testthat::skip_on_cran*). This would immediately fix the problem, and as a mitigating factor we report automated test results and coverage on the package's GitHub page (https://github.com/ropensci/suppdata).
| 2. Using HTTP-mocking to avoid requiring a call to a server during tests at all. I would be uncomfortable relying solely on this for all tests, since if the data hosters changed things we wouldn't know. Thus I would still want the Internet-enabled tests, which would also have to be turned off for CRAN (see 1 above). This would also be a lot of additional work.
| 3. Somehow bypassing the requirement for the unit tests to all pass before the package is checked by the CRAN maintainers. I have no idea if this is something CRAN would be willing to do, or if it is even possible. It would be the easiest option for me, but I don't want to create extra work for other people!
| 4. Slowing the tests with something like *Sys.sleep*. This might work, but would slow the tests massively and so might that cause problems for CRAN?
|
| Does anyone have any advice as to which of the above would be the best option, or who I should email directly about this?

5. Run a hybrid scheme where you have multiple levels:

5.1 Do what eg Rcpp does and only opt into 'all tests' when an overall variable is set; that variable can be set conveniently in .travis.yml and conditionally in your test runner below ~/tests/ That way you can skip tests that would fail.

5.2 Do a lot of work and wrap 3. above into try() / tryCatch() and pass if _your own aggregation of tests_ passes a threshold. Overkill to me.

5.3 Turn all tests on / off based on some other toggle. I.e. I don't think I test all features of RcppRedis on CRAN as I can't assume a redis server, but I do run those tests at home, on Travis, ...

Overall, I would recommend to 'keep it simple & stupid' (KISS) as life is too short to argue^Hdebate this with CRAN. And their time is too precious so we should try to make their life easier. Dirk
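The toggle in 5.1/5.3 is typically just an environment variable checked by the test runner (testthat's skip_on_cran() works the same way via NOT_CRAN). A minimal sketch, assuming a runner script under tests/ and an illustrative variable name RUN_FULL_TESTS (the package name and directory are hypothetical):

```r
# In tests/run-extra.R: only run the network-dependent tests when the
# developer has opted in, e.g. by exporting RUN_FULL_TESTS=true locally
# or in .travis.yml. CRAN never sets it, so CRAN skips these tests.
run_full <- identical(tolower(Sys.getenv("RUN_FULL_TESTS")), "true")

if (run_full) {
  # source the tests kept outside tests/, as in suggestion 5.4, e.g.:
  # extra <- list.files(system.file("testWithInternet", package = "mypkg"),
  #                     full.names = TRUE)
  # for (f in extra) source(f)
  cat("running full test suite\n")
} else {
  cat("skipping internet-dependent tests\n")
}
```

This keeps the tests in the shipped tarball (so others can run them), while making the default CRAN run deterministic.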
Re: [Rd] code for sum function
(I didn't see anyone else answer this, so ...)

You can probably find the R code in src/main/ but I'm not sure. You are talking about a very simple calculation, so it seems unlikely that the algorithm is the cause of the difference. I have done much more complicated things and usually get machine-precision comparisons. There are four possibilities I can think of that could cause (small) differences.

0/ Your code is wrong, but that seems unlikely for such a simple calculation.

1/ You are summing a very large number of numbers, in which case the sum can become very large compared to the numbers being added; then things can get a bit funny.

2/ You are using single precision in Fortran rather than double. Double is needed for all floating point numbers you use!

3/ You have not zeroed the double precision numbers in Fortran. (Some compilers do not do this automatically and you have to specify it.) Then if you accidentally put singles, like a constant 0.0 rather than a constant 0.0D+0, into a double, you will have small junk in the lower-precision part.

(I am assuming you are talking about a sum of reals, not integer or complex.) HTH, Paul Gilbert

On 2/14/19 2:08 PM, Rampal Etienne wrote:

Hello, I am trying to write FORTRAN code to do the same as some R code I have. I get (small) differences when using the sum function in R. I know there are numerical routines to improve precision, but I have not been able to figure out what algorithm R is using. Does anyone know this? Or where can I find the code for the sum function? Regards, Rampal Etienne
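On point 1/: a long sum loses low-order bits when the running total dwarfs the next addend, and that, rather than the algorithm, is the usual source of tiny cross-implementation differences (as I understand it, R's internal sum() also accumulates in extended precision on most platforms, which a plain Fortran DOUBLE PRECISION loop does not). A compensated (Neumaier-variant Kahan) sum, written here in R purely for illustration, recovers bits that a naive left-to-right loop drops:

```r
# Neumaier's variant of Kahan summation: carry the rounding error of each
# addition in a separate compensation term and fold it in at the end.
neumaier_sum <- function(x) {
  s <- 0; comp <- 0
  for (xi in x) {
    t <- s + xi
    comp <- comp + if (abs(s) >= abs(xi)) (s - t) + xi else (xi - t) + s
    s <- t
  }
  s + comp
}

# naive left-to-right accumulation, like a plain Fortran DO loop
naive_sum <- function(x) Reduce(`+`, x)

x <- c(1e16, 1, -1e16)
naive_sum(x)     # 0: the 1 is absorbed, since 1e16 + 1 rounds back to 1e16
neumaier_sum(x)  # 1: the compensation term recovers it
```

The same few-ulp discrepancies appear between any two summation orders or accumulator widths, which is why tolerances rather than exact equality are used when comparing across implementations.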
Re: [R-pkg-devel] package fails with parallel make - would forcing a serial version work?
(I didn't see an answer to this, so ...)

I think using .NOTPARALLEL will usually get rid of the error but, in my experience, this problem is usually caused by an incorrect or incomplete Makefile. When not done in parallel, the missing target is usually getting done first as a side-effect of something that happens before, and usually finishes before it is needed. Your luck does not hold in parallel. The better fix is to correct your Makefile. Paul

On 1/10/19 4:54 PM, Satyaprakash Nayak wrote:

Dear R package developers, I published a package on CRAN last year (sundialr) which is now failing as it is not able to compile a static library with parallel make. In this package, I compile a static library (libsundials_all.a) from source files of a third party. The specifics of compiling the static library can be found at https://github.com/sn248/sundialr/blob/master/src/Makevars Now, I got the following error message from CRAN (actually, I was informed of this before, but had neglected to fix it). Here is the message from one of the CRAN maintainers:

*** This have just failed to install for me with a parallel make:

g++ -std=gnu++98 -std=gnu++98 -shared -L/data/blackswan/ripley/extras/lib64 -L/usrlocal/lib64 -o sundialr.so cvode.o RcppExports.o -L/data/blackswan/ripley/R/R-patched/lib -lRlapack -L/data/blackswan/ripley/R/R-patched/lib -lRblas -lgfortran -lm -lquadmath -L../inst/ ../inst/libsundials_all.a
g++: error: ../inst/libsundials_all.a: No such file or directory
make[1]: *** [/data/blackswan/ripley/R/R-patched/share/make/shlib.mk:6: sundialr.so] Error 1

It seems the package fails to generate the static library with the parallel make. The easiest solution I could think of for this problem was to force a serial version of make using the .NOTPARALLEL phony target in Makevars and Makevars.win (https://github.com/sn248/sundialr/blob/master/src/Makevars).
I have made this change and it seems to work on my machine and on testing with TravisCI and Appveyor (https://github.com/sn248/sundialr). However, before I re-submit to CRAN, I wanted to get an opinion as to whether this will be enough to get rid of the error with parallel make? Any suggestions would be very much appreciated, thank you! Satyaprakash
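The Makefile fix Paul recommends, rather than .NOTPARALLEL, is to state the missing dependency explicitly: the shared library target must depend on the static library, so that make orders the two correctly even under -j. A sketch of such a src/Makevars (the source layout and recipe are illustrative, not sundialr's actual build; $(SHLIB) is the target R's build system defines for the package shared object):

```make
# Build the bundled third-party static library before R links the
# package shared object. Declaring the dependency makes the ordering
# correct even under a parallel "make -j" -- no .NOTPARALLEL needed.
STATLIB = ../inst/libsundials_all.a

PKG_LIBS = $(STATLIB)

# the key line: the package .so cannot be linked until STATLIB exists
$(SHLIB): $(STATLIB)

$(STATLIB): sundials/*.c
	$(CC) $(CFLAGS) $(CPICFLAGS) -c sundials/*.c
	$(AR) rcs $(STATLIB) *.o
```

With .NOTPARALLEL the build merely runs serially and happens to work; with the explicit prerequisite it is correct by construction.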
Re: [Rd] Extreme bunching of random values from runif with Mersenne-Twister seed
I'll point out that there is a large literature on generating pseudo-random numbers for parallel processes, and it is not as easy as one (at least me) would intuitively think. By contrapositive-like thinking, one might guess that it will not be easy to pick seeds in a way that will produce independent sequences. (I'm a bit confused about the objective but) if the objective is to produce independent sequences from different seeds, then the RNGs for parallel processing might be a good place to start. (And, BTW, if you want to reproduce parallel generated random numbers you need to keep track of both the starting seed and the number of nodes.) Paul Gilbert

On 11/05/2017 10:58 AM, peter dalgaard wrote: On 5 Nov 2017, at 15:17 , Duncan Murdoch <murdoch.dun...@gmail.com> wrote: On 04/11/2017 10:20 PM, Daniel Nordlund wrote:

Tirthankar, "random number generators" do not produce random numbers. Any given generator produces a fixed sequence of numbers that appear to meet various tests of randomness. By picking a seed you enter that sequence in a particular place and subsequent numbers in the sequence appear to be unrelated. There are no guarantees that if YOU pick a SET of seeds they won't produce a set of values that are of a similar magnitude. You can likely solve your problem by following Radford Neal's advice of not using the first number from each seed. However, you don't need to use anything more than the second number. So, you can modify your function as follows:

function(x) {
  set.seed(x, kind = "default")
  y = runif(2, 17, 26)
  return(y[2])
}

Hope this is helpful,

That's assuming that the chosen seeds are unrelated to the function output, which seems unlikely on the face of it. You can certainly choose a set of seeds that give high values on the second draw just as easily as you can choose seeds that give high draws on the first draw.
The interesting thing about this problem is that Tirthankar doesn't believe that the seed selection process is aware of the function output. I would say that it must be, and he should be investigating how that happens if he is worried about the output; he shouldn't be worrying about R's RNG.

Hmm, no. The basic issue is that RNGs are constructed so that with x_{n+1} = f(x_n), the sequence x_1, x_2, x_3, ... will look random, not so that f(s_1), f(s_2), f(s_3), ... will look random for any seeds s_1, s_2, ... . This is true even if the seeds are not chosen so as to mess with the RNG. In the present case, it seems that seeds around 86e6 tend to give similar output. On the other hand, it is not _just_ the similarity in magnitude that does it; try e.g.

s <- as.integer(runif(100, 86.54e6, 86.98e6))
r <- sapply(s, function(s){set.seed(s); runif(1,17,26)})
plot(s, r, pch=".")

and no obvious pattern emerges. My best guess is that the seeds are not only of similar magnitude, but also have other bit-pattern similarities. (Isn't there a Knuth quote to the effect that "Every random number generator will fail in at least one application"?) One remaining issue is whether it is really true that the same seeds give different output on different platforms. That shouldn't happen, I believe. Duncan Murdoch
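The standard R mechanism for the independent sequences Paul mentions is the "L'Ecuyer-CMRG" generator with parallel::nextRNGStream(), which jumps each stream far ahead in one generator rather than relying on hand-picked seeds. A minimal sketch:

```r
# Independent RNG streams via L'Ecuyer-CMRG: each stream is a guaranteed
# far-apart subsequence of a single generator, unlike ad hoc per-node seeds.
RNGkind("L'Ecuyer-CMRG")
set.seed(42)
s1 <- .Random.seed                 # seed for stream 1
s2 <- parallel::nextRNGStream(s1)  # seed for a second, independent stream

# draw from a given stream by installing its seed first
draw <- function(seed, n = 3) {
  assign(".Random.seed", seed, envir = globalenv())
  runif(n)
}
x1 <- draw(s1)
x2 <- draw(s2)
```

Reproducing a parallel run then only requires the initial seed and the stream assignment (which, as Paul notes, depends on the number of nodes).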
[Rd] parallel::detectCores() bug on Raspberry Pi B+
In R 3.3.2, detectCores() in package parallel reports 2 rather than 1 on a Raspberry Pi B+ running Raspbian. (This report is just 'for the record'. The model is superseded and I think no longer produced.) The problem seems to be caused by grepping "processor" in /proc/cpuinfo, which contains

processor	: 0
model name	: ARMv6-compatible processor rev 7 (v6l)

(On Raspberry Pi 2 and 3 there is no error because the model name lines are

model name	: ARMv7 Processor rev 5 (v7l)
model name	: ARMv7 Processor rev 4 (v7l)

)
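The over-count is consistent with counting lines of /proc/cpuinfo that merely contain "processor" rather than lines that begin with it; on the ARMv6 board the "model name" line also contains the word. A shell sketch of the difference, using the two lines quoted above as sample input:

```shell
# Reproduce the over-count with the ARMv6 /proc/cpuinfo lines quoted above.
cpuinfo=$(mktemp)
cat > "$cpuinfo" <<'EOF'
processor	: 0
model name	: ARMv6-compatible processor rev 7 (v6l)
EOF

grep -c processor "$cpuinfo"      # prints 2: matches both lines
grep -c '^processor' "$cpuinfo"   # prints 1: anchored, only the real entry
```

Anchoring the pattern (or matching `^processor[[:space:]]*:`) gives the correct core count on both ARMv6 and ARMv7.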
Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays
On 09/08/2016 05:06 PM, robin hankin wrote: Could we take a cue from min() and max()? x <- 1:10 min(x[x>7]) [1] 8 min(x[x>11]) [1] Inf Warning message: In min(x[x > 11]) : no non-missing arguments to min; returning Inf As ?min says, this is implemented to preserve transitivity, and this makes a lot of sense. I think the issuing of a warning here is a good compromise; I can always turn off warnings if I want. I fear you are thinking of this as an end user, rather than as a package developer. Warnings are for end users, when they do something they possibly should be warned about. A package really should not generate warnings unless they are for end user consumption. In package development I treat warnings the same way I treat errors: build fails, program around it. So what you call a compromise is no compromise at all as far as I am concerned. But perhaps there is a use for an end user version, maybe All() or ALL() that issues an error or warning. There are a lot of functions and operators in R that could warn about mistakes that a user may be making. Paul I find this behaviour of min() and max() to be annoying in the *right* way: it annoys me precisely when I need to be annoyed, that is, when I haven't thought through the consequences of sending zero-length arguments. On Fri, Sep 9, 2016 at 6:00 AM, Paul Gilbert <pgilbert...@gmail.com> wrote: On 09/08/2016 01:22 PM, Gabriel Becker wrote: On Thu, Sep 8, 2016 at 10:05 AM, William Dunlap <wdun...@tibco.com> wrote: Shouldn't binary operators (arithmetic and logical) should throw an error when one operand is NULL (or other type that doesn't make sense)? This is a different case than a zero-length operand of a legitimate type. E.g., any(x < 0) should return FALSE if x is number-like and length(x)==0 but give an error if x is NULL. Bill, That is a good point. I can see the argument for this in the case that the non-zero length is 1. I'm not sure which is better though. If we switch any() to all(), things get murky. 
Mathematically, all(x<0) is TRUE if x is length 0 (as are all(x==0), and all(x>0)), but the likelihood of this being a thought-bug on the author's part is exceedingly high, imho. I suspect there may be more R users than you think that understand and use vacuously true in code. I don't really like the idea of turning a perfectly good and properly documented mathematical test into an error in order to protect against a possible "thought-bug". Paul

So the desirable behavior seems to depend on the angle we look at it from. My personal opinion is that x < y with length(x)==0 should fail if length(y) > 1, at least, and I'd be for it being an error even if y is length 1, though I do acknowledge this is more likely (though still quite unlikely imho) to be the intended behavior. ~G

I.e., I think the type check should be done before the length check. Bill Dunlap TIBCO Software wdunlap tibco.com

On Thu, Sep 8, 2016 at 8:43 AM, Gabriel Becker <gmbec...@ucdavis.edu> wrote:

Martin, Like Robin and Oliver I think this type of edge-case consistency is important and that it's fantastic that R-core - and you personally - are willing to tackle some of these "gotcha" behaviors. "Little" stuff like this really does combine to go a long way to making R better and better. I do wonder a bit about the x = 1:2; y = NULL; x < y case. Returning a logical of length 0 is more backwards compatible, but is it ever what the author actually intended? I have trouble thinking of a case where that less-than didn't carry an implicit assumption that y was non-NULL. I can say that in my own code, I've never hit that behavior in a case that wasn't an error. My vote (unless someone else points out a compelling use for the behavior) is for this to throw an error.
As a developer, I'd rather things like this break so the bug in my logic is visible, rather than propagating as the 0-length logical is &'ed or |'ed with other logical vectors, or used to subset, or (in the case it should be length 1) passed to if() (if throws an error now, but the rest would silently "work"). Best, ~G On Thu, Sep 8, 2016 at 3:49 AM, Martin Maechler < maech...@stat.math.ethz.ch> wrote: robin hankin <hankin.ro...@gmail.com> on Thu, 8 Sep 2016 10:05:21 +1200 writes: > Martin I'd like to make a comment; I think that R's > behaviour on 'edge' cases like this is an important thing > and it's great that you are working on it. > I make heavy use of zero-extent arrays, chiefly because > the dimnames are an efficient and logical way to keep > track of certain types of information. > If I have, for example, > a <- array(0,c(2,0,2)) > dimnames(a) <- list(name=c('Mike','Kevin'), NULL,item=c("hat","scarf")) > Then in R-3.3.1, 70800 I get a
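The vacuous-truth behavior under discussion is easy to check at the prompt: all() over a zero-length condition is TRUE, any() is FALSE, and min()/max() warn and return infinities, which is the compromise Robin likes and Paul objects to in package code:

```r
x <- 1:10

all(x[x > 11] < 0)   # TRUE: vacuously true over zero elements
any(x[x > 11] < 0)   # FALSE: no element satisfies the condition
min(x[x > 11])       # Inf, with a warning; as ?min notes, this choice
                     # preserves transitivity, e.g. min(a, min(b)) == min(c(a, b))
```

This is why the thread splits along end-user vs. package-developer lines: at the prompt the warning is visible and helpful, while inside a package a silently-propagating Inf or a vacuous TRUE can mask a logic bug.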
Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays
On 09/08/2016 01:22 PM, Gabriel Becker wrote: On Thu, Sep 8, 2016 at 10:05 AM, William Dunlap wrote:

Shouldn't binary operators (arithmetic and logical) throw an error when one operand is NULL (or another type that doesn't make sense)? This is a different case than a zero-length operand of a legitimate type. E.g., any(x < 0) should return FALSE if x is number-like and length(x)==0, but give an error if x is NULL.

Bill, That is a good point. I can see the argument for this in the case that the non-zero length is 1. I'm not sure which is better though. If we switch any() to all(), things get murky. Mathematically, all(x<0) is TRUE if x is length 0 (as are all(x==0), and all(x>0)), but the likelihood of this being a thought-bug on the author's part is exceedingly high, imho.

I suspect there may be more R users than you think that understand and use vacuously true in code. I don't really like the idea of turning a perfectly good and properly documented mathematical test into an error in order to protect against a possible "thought-bug". Paul

So the desirable behavior seems to depend on the angle we look at it from. My personal opinion is that x < y with length(x)==0 should fail if length(y) > 1, at least, and I'd be for it being an error even if y is length 1, though I do acknowledge this is more likely (though still quite unlikely imho) to be the intended behavior. ~G

I.e., I think the type check should be done before the length check. Bill Dunlap TIBCO Software wdunlap tibco.com

On Thu, Sep 8, 2016 at 8:43 AM, Gabriel Becker wrote:

Martin, Like Robin and Oliver I think this type of edge-case consistency is important and that it's fantastic that R-core - and you personally - are willing to tackle some of these "gotcha" behaviors. "Little" stuff like this really does combine to go a long way to making R better and better. I do wonder a bit about the x = 1:2; y = NULL; x < y case.
Returning a logical of length 0 is more backwards compatible, but is it ever what the author actually intended? I have trouble thinking of a case where that less-than didn't carry an implicit assumption that y was non-NULL. I can say that in my own code, I've never hit that behavior in a case that wasn't an error. My vote (unless someone else points out a compelling use for the behavior) is for the comparison to throw an error. As a developer, I'd rather things like this break so the bug in my logic is visible, rather than propagating as the 0-length logical is &'ed or |'ed with other logical vectors, or used to subset, or (in the case it should be length 1) passed to if() (if throws an error now, but the rest would silently "work"). Best, ~G On Thu, Sep 8, 2016 at 3:49 AM, Martin Maechler < maech...@stat.math.ethz.ch> wrote: robin hankin on Thu, 8 Sep 2016 10:05:21 +1200 writes: > Martin I'd like to make a comment; I think that R's > behaviour on 'edge' cases like this is an important thing > and it's great that you are working on it. > I make heavy use of zero-extent arrays, chiefly because > the dimnames are an efficient and logical way to keep > track of certain types of information. > If I have, for example, > a <- array(0,c(2,0,2)) > dimnames(a) <- list(name=c('Mike','Kevin'), NULL, item=c("hat","scarf")) > Then in R-3.3.1, 70800, I get > a > 0 > logical(0) > But in 71219 I get > a > 0 > , , item = hat > name > Mike > Kevin > , , item = scarf > name > Mike > Kevin > (which is an empty logical array that holds the names of the people and > their clothes). I find the behaviour of 71219 very much preferable because > there is no reason to discard the information in the dimnames. Thanks a lot, Robin, (and Oliver)! Yes, the above is such a case where the new behavior makes much sense. And this behavior remains identical after the 71222 amendment.
Martin > Best wishes > Robin > On Wed, Sep 7, 2016 at 9:49 PM, Martin Maechler < maech...@stat.math.ethz.ch> > wrote: >> > Martin Maechler >> > on Tue, 6 Sep 2016 22:26:31 +0200 writes: >> >> > Yesterday, changes to R's development version were committed, >> relating >> > to arithmetic, logic ('&' and '|') and >> > comparison/relational ('<', '==') binary operators >> > which in NEWS are described as >> >> > SIGNIFICANT USER-VISIBLE CHANGES: >> >> > [.] >> >> > • Arithmetic, logic (‘&’, ‘|’) and comparison (aka >> > ‘relational’, e.g., ‘<’, ‘==’) operations with arrays now >> > behave consistently, notably for arrays of length zero. >> >> > Arithmetic between length-1 arrays and longer non-arrays had >> > silently dropped the array attributes and recycled. This >> > now gives a warning and will signal an
Re: [Rd] A bug in the R Mersenne Twister (RNG) code?
On 08/30/2016 06:29 PM, Duncan Murdoch wrote: I don't see evidence of a bug. There have been several versions of the MT; we may be using a different version than you are. Ours is the 1999/10/28 version; the web page you cite uses one from 2002. Perhaps the newer version fixes some problems, and then it would be worth considering a change. But changing the default RNG definitely introduces problems in reproducibility, Well "problems in reproducibility" is a bit vague. Results would always be reproducible by specifying kind="Mersenne-Twister" or kind="Buggy Kinderman-Ramage" for older results, so there is no problem reproducing results. The only problem is that users expecting to reproduce results twenty years later will need to know what random generator they used. (BTW, they may also need to record information about the normal or other generator, as well as the seed.) Of course, these changes are recorded pretty well for R, so the history of "default" can always be found. I think it is a mistake to encourage users into thinking they do not need to keep track of some information if they want reproducibility. Perhaps the default should be changed more often in order to encourage better user habits. More seriously, I think "default" should continue to be something that is currently considered to be good. So, if there really is a known problem, then I think "default" should be changed. (And, no, I did not get burned by the R 1.7.0 change in the default generator. I got burned by a much earlier, unadvertised, and more subtle change in the Splus generator.) Paul Gilbert so it's not obvious that we would do it. Duncan Murdoch On 30/08/2016 5:45 PM, Mark Roberts wrote: Whomever, I recently sent the "bug report" below to r-c...@r-project.org and have just been asked to instead submit it to you.
Although I am basically not an R user, I have installed version 3.3.1 and am also the author of a statistics program written in Visual Basic that contains a component which correctly implements the Mersenne Twister (MT) algorithm. I believe that it is not possible to generate the correct stream of pseudorandom numbers using the MT default random number generator in R, and am not the first person to notice this. Here is a posted 2013 entry (www.r-bloggers.com/reproducibility-and-randomness/) on an R website that asserts that the SAS computer program implementation of the MT algorithm produces different numbers than R does when using the same starting seed number. The author of this post didn’t get anyone to respond to his query about the reason for this SAS vs. R discrepancy. There are two ways of initializing the original MT computer program (written in C) so that an identical stream of numbers can be repeatedly generated: 1) with a particular integer seed number, and 2) with a particular array of integers. In the 'compilation and usage' section of this webpage (https://github.com/cslarsen/mersenne-twister) there is a listing of the first 200 random numbers the MT algorithm should produce for seed number = 1. The inventors of the Mersenne Twister random number generator provided two different sets of the first 1000 numbers produced by a correctly coded 32-bit implementation of the MT algorithm when initializing it with a particular array of integers at: www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/CODES/mt19937ar.out. [There is a link to this output at: www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html.] My statistics program obtains exactly those 200 numbers from the first site mentioned in the previous paragraph and also obtains those same numbers from the second website (though I didn't check all 2000 values). Assuming that the MT code within R uses the 32-bit MT algorithm, I suspect that the current version of R can't do that. 
If you (i.e., anyone who might knowledgeably respond to this report) are able to duplicate those reference test-values, then please send me the R code to initialize the MT code within R to successfully do that, and I apologize for having wasted your time. If you (collectively) can't do that, then R is very likely using incorrectly implemented MT code. And if this latter possibility is true, it seems to me that this is something that should be fixed. Mark Roberts, Ph.D. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
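On the R side, a sketch of the kind of comparison involved. Note, as a hedged observation rather than a definitive diagnosis, that R seeds the Mersenne Twister by scrambling the integer given to set.seed() rather than by calling the reference implementation's init_genrand(), so R's stream is not expected to match the reference C output for the same seed even when the core algorithm is correct:

```r
RNGkind("Mersenne-Twister")   # the default uniform generator in R
set.seed(1)                   # R scrambles this before filling the MT state
runif(3)                      # need not match the reference stream for seed 1
str(.Random.seed)             # kind code, position, then the 624 MT state words
```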
Re: [R-pkg-devel] Handling Not-Always-Needed Dependencies? - Part 2
On 08/04/2016 11:51 AM, Dirk Eddelbuettel wrote: On 4 August 2016 at 11:46, Paul Gilbert wrote: | If my package has a test that needs another package, but that package is | not needed in the R code of my package, then I indicate it as | "Suggests", not as "Depends" nor as "Imports". If that package is not | available when I run R CMD check, should the test pass? Wrong question. Better question: Should the test be running? My preference is for only inside of a requireNamespace() (or equivalent) block as the package is not guaranteed to be present. In theory. At the level of R CMD check throwing an error or not, I think this is arguing that it should be possible to pass the tests (not throw an error) even though they are not run, isn't it? (So your answer to my question is yes, at least the way I was thinking of the question.) Or do you mean you would just like the tests to fail with a more appropriate error message? Or do you mean, as Duncan suggests, that the person writing the test should be allowed to code in something to decide if the test is really important or not? In practice people seem to unconditionally install it anyway, and think that is a good idea. I disagree on both counts but remain in the vocal minority. Actually, I think you are in agreement with Uwe and Duncan on this point, Duncan having added the refinement that the test writer gets to decide. No one so far seems to be advocating for my position that the tests should necessarily fail if they cannot be run. So I guess I am the one in the minority. Paul Dirk __ R-package-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-package-devel
[R-pkg-devel] Handling Not-Always-Needed Dependencies? - Part 2
(One question from the thread Handling Not-Always-Needed Dependencies?) I hope not to start another long tangled thread, but I have a basic confusion which I think has a yes/no answer and I would like to know if there is agreement on this point (or is it only me that is confused as usual). If my package has a test that needs another package, but that package is not needed in the R code of my package, then I indicate it as "Suggests", not as "Depends" nor as "Imports". If that package is not available when I run R CMD check, should the test pass? Yes or no? (I realize my own answer might be different if the package was used in an example or demo in place of a test, but that is just the confusion caused by too many uses for Suggests. In the case of a test, my own thought is that the test must fail, so my own answer is no. If the test does not fail then there is no real testing being done, thus missing code coverage in the testing. If the answer is no, then the tests do not need to be run if the package is not available, because it is known that they must fail. I think that not bothering to run the tests because the result is known is even more efficient than other suggestions. I also think it is the status quo.) Hoping my confusion is cleared up, and this does not become another long tangled thread, Paul
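For contrast with Paul's position, the guarded-test pattern that lets R CMD check pass without the suggested package being installed looks roughly like this (the package and function names here are hypothetical placeholders):

```r
# tests/needsSuggested.R -- hypothetical test file in a package's tests/ dir
if (requireNamespace("suggestedPkg", quietly = TRUE)) {
  res <- suggestedPkg::someFunction(1:10)  # hypothetical function
  stopifnot(length(res) == 10L)            # the real assertion
} else {
  cat("suggestedPkg not available; test skipped\n")  # check still passes
}
```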
Re: [Rd] Suggested dependencies in context of R CMD check
On 04/04/2016 01:56 PM, Duncan Murdoch wrote: On 04/04/2016 1:35 PM, Dirk Eddelbuettel wrote: On 4 April 2016 at 07:25, Hadley Wickham wrote: | On Sat, Apr 2, 2016 at 5:33 AM, Jan Górecki <j.gore...@wit.edu.pl> wrote: | | In principle, I believe a package should pass R CMD check if no | suggested packages are installed. However, since this is not currently The relevant manual says The 'Suggests' field uses the same syntax as 'Depends' and lists packages that are not necessarily needed. This includes packages used only in examples, tests or vignettes (*note Writing package vignettes::), and packages loaded in the body of functions. E.g., suppose an example(1) from package *foo* uses a dataset from package *bar*. Then it is not necessary to have *bar* installed in order to use *foo* unless one wants to execute all the examples/tests/vignettes: it is useful to have *bar*, but not necessary. Version requirements can be specified, and will be used by 'R CMD check'. and later * All packages that are needed(2) to successfully run 'R CMD check' on the package must be listed in one of 'Depends' or 'Suggests' or 'Imports'. Packages used to run examples or tests conditionally (e.g. _via_ 'if(require(PKGNAME))') should be listed in 'Suggests' or 'Enhances'. (This allows checkers to ensure that all the packages needed for a complete check are installed.) | automatically checked, many packages will fail to cleanly pass R CMD | check if suggested packages are missing. I consider that to be a bug in those 'many packages'. It essentially takes away the usefulness of having a Suggests: to provide a more fine-grained dependency graph. So I am with Jan here. I think I agree with Jan, but not for the reason you state. Suggests is useful even if "R CMD check" treats it as Depends, because most users never need to run "R CMD check". It allows them to use a subset of the functionality of a package without installing tons of dependencies.
I agree that packages that fail on examples when Suggested packages are missing are broken. (Using if (require()) to skip particular examples isn't failing.) It would be useful to be able to detect failure; I don't think that's easy now with "R CMD check". That's why you should be able to run it with Suggested packages missing. Perhaps I'm confused, it would not be the first time, but I have the impression that some/all? of you are arguing for a different philosophy around R CMD check and Suggests/Depends. But the current design is not broken, it is working the way it has been advertised for many years now. It provides a fine-grained dependency graph for end users, not developers and testers. Being able to suggest packages for use in testing, when they are not needed for regular use is a good thing. A package failing R CMD check when the suggested packages are not available is not a bug, it is a feature following the rules as they have been designed. If you want to check a package then you need to install things that are needed to check it. If R CMD check skipped testing because suggested packages were not available then you will have many packages not being tested properly, that is, lots of broken packages passing R CMD check. (I've done this to myself sometimes using if(require()).) There are situations where some testing needs to be skipped, for example, license requirements and special databases, but this needs to be done carefully, and my impression is that if(require()) provides most of what is necessary, sometimes along with environment variables. Perhaps this is not elegant, but it does work and is not difficult. The ideal situation would be to be able to run all possible combinations of missing Suggested packages, but that's probably far too slow to be a default. But how do you decide pass/fail when you do this? I think it will only pass when all the suggested packages are available? 
Paul Gilbert BTW, I'm not completely sure it needs to be possible to run vignettes without the Suggested packages they need. Vignettes are allowed to depend on things that aren't available to all users, and adding all the require() tests could make them less clear. Duncan Murdoch
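The if(require()) idiom the manual and Duncan refer to, as it would appear in an example or vignette chunk (barPkg is a hypothetical Suggests entry, and exampleData a hypothetical object it exports):

```r
if (require("barPkg")) {        # TRUE only when barPkg can be attached
  summary(barPkg::exampleData)  # hypothetical exported dataset
}                               # silently skipped when barPkg is absent
```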
Re: [Rd] Best way to implement optional functions?
On 10/22/2015 03:55 PM, Duncan Murdoch wrote: I'm planning on adding some new WebGL functionality to the rgl package, but it will pull in a very large number of dependencies. Since many people won't need it, I'd like to make the new parts optional. The general idea I'm thinking of is to put the new stuff into a separate package, and have rgl "Suggest" it. But I'm not sure whether these functions should only be available in the new package (so users would have to attach it to use them), or whether they should be in rgl, but fail if the new package is not available for loading. Can people suggest other packages that solve this kind of problem in a good way? I do something similar in several packages. I would distinguish between the situation where the new functions have some functionality without all the extra dependencies, and the case where they really do not. In the former case it makes sense to put the functions in rgl and then fail when the extra functionality is demanded and not available. In the latter case, it "feels like" you are trying to defeat Depends: or Imports:. That route has usually gotten me in trouble. Another thing you might want to consider is that, at least for a while, the new functions in rglPlus will probably be less stable than those in rgl. Being able to change those and update rglPlus without needing to update rgl can be a real advantage (i.e. if the API for the new functions is in rgl, and you need to change it, then you are required to notify all the package maintainers that depend on rgl, do reverse testing, and you have to explain that your update of rgl is going to break rglPlus and you have a new version of that but you cannot submit it yet because it will not work until the new rgl is in place.) Paul Duncan Murdoch
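The "fail when the extra functionality is demanded" case Paul distinguishes might be sketched like this (rglPlus and the function name are hypothetical, taken from the discussion, not from the real rgl API):

```r
# In rgl: a stub that defers to the suggested package
writeExtraWebGL <- function(...) {
  if (!requireNamespace("rglPlus", quietly = TRUE))
    stop("writeExtraWebGL() needs the 'rglPlus' package; ",
         "please install it first.")
  rglPlus::writeExtraWebGL(...)  # hypothetical implementation
}
```

The stub keeps rglPlus out of Imports while still giving users a clear error rather than a missing-symbol failure.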
Re: [R-pkg-devel] download.file and https
On 07/02/2015 10:52 PM, Henrik Bengtsson wrote: From R 3.2.0, check: capabilities("libcurl") libcurl TRUE TRUE means R was built such that HTTPS is supported. If you see FALSE, make sure libcurl is available when/if you build R from source. I do have TRUE for this. The default behaviour still does not work. Paul /Henrik On Thu, Jul 2, 2015 at 7:46 PM, Paul Gilbert pgilbert...@gmail.com wrote: (This problem with download.file() affects quantmod, and possibly several other packages, e.g. getSymbols('M2',src='FRED') fails.) I think the St Louis Fed has moved to using https for connections, and I believe all the US government web sites are doing this. An http request is automatically switched to https. The default download.file method does not seem to handle this, but method="wget" does: tmp <- tempfile() download.file("http://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv", destfile = tmp) trying URL 'http://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv' Error in download.file("http://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv", : cannot open URL 'http://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv' download.file("http://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv", destfile = tmp, method="wget") --2015-07-02 22:29:49-- http://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv Resolving research.stlouisfed.org (research.stlouisfed.org)... 65.89.18.120 Connecting to research.stlouisfed.org (research.stlouisfed.org)|65.89.18.120|:80... connected. HTTP request sent, awaiting response... 301 Moved Permanently Location: https://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv [following] --2015-07-02 22:29:49-- https://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv Connecting to research.stlouisfed.org (research.stlouisfed.org)|65.89.18.120|:443... connected. HTTP request sent, awaiting response...
200 OK Length: unspecified [text/x-comma-separated-values] Saving to: ‘/tmp/RtmpOX7kA1/file1ba639d7fd0f’ [ = ] 34,519 178KB/s in 0.2s 2015-07-02 22:29:50 (178 KB/s) - ‘/tmp/RtmpOX7kA1/file1ba639d7fd0f’ saved [34519] Paul
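Since capabilities("libcurl") returns TRUE here, another option besides method="wget" is to request libcurl explicitly, which can speak HTTPS and follow the 301 redirect (the URL is the one from the thread; it may have moved since):

```r
tmp <- tempfile()
# libcurl is available from R 3.2.0 when R is built against it
download.file("https://research.stlouisfed.org/fred2/series/M2/downloaddata/M2.csv",
              destfile = tmp, method = "libcurl")
```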
Re: [Rd] Defining a `show` function breaks the print-ing of S4 object -- bug or expected?
On 06/30/2015 11:33 AM, Duncan Murdoch wrote: On 30/06/2015 5:27 PM, Lorenz, David wrote: There is something I'm really missing here. The function show is a standardGeneric function, so the correct way to write it is as a method, like this: That describes methods::show. The problem is that the default print mechanism isn't calling methods::show() (or base::print() as Luke says), it's calling show() or print() in the global environment, so the user's function overrides the generic, and you get the error. These are two different problems aren't they? I can see that you might want to ensure that base::print() calls methods::show(), but forcing the default print to go to base::print(), rather than whatever print() is first on the search path, would seem like a real change of philosophy. What about all the other base functions that can be overridden by something in the global environment? Paul Luke, are you going to look at this, or should I? Duncan Murdoch setMethod("show", "Person", function(object) { ... }) for an object of class "Person", for example. Dave On Tue, Jun 30, 2015 at 10:11 AM, luke-tier...@uiowa.edu wrote: Same thing happens with S3 if you redefine print(). I thought that code was actually calculating the function to call rather than the symbol to use, but apparently not. Shouldn't be too hard to fix. luke On Tue, 30 Jun 2015, Hadley Wickham wrote: On Tue, Jun 30, 2015 at 2:20 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 30/06/2015 1:57 PM, Hadley Wickham wrote: A slightly simpler formulation of the problem is: show <- function(...) stop("My show!") methods::setClass("Person", slots = list(name = "character")) methods::new("Person", name = "Tom") # Error in (function (...) : My show! Just to be clear: the complaint is that the auto-called show() is not methods::show? I.e. after x <- methods::new("Person", name = "Tom") you would expect show(x) to give the error, but not x ??
Correct - I'd expect print() to always call methods::show(), not whatever show() is first on the search path. Hadley -- Luke Tierney, Ralph E. Wareham Professor of Mathematical Sciences, Department of Statistics and Actuarial Science, University of Iowa, 241 Schaeffer Hall, Iowa City, IA 52242; Phone: 319-335-3386; Fax: 319-335-3017; email: luke-tier...@uiowa.edu; WWW: http://www.stat.uiowa.edu
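For completeness, the pattern David describes, written out in full: define the class and then a show *method* for it, rather than a free function named show in the global environment:

```r
methods::setClass("Person", slots = list(name = "character"))
methods::setMethod("show", "Person", function(object) {
  cat("Person:", object@name, "\n")  # custom display for the class
})
methods::new("Person", name = "Tom") # auto-printing dispatches to the method
```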
Re: [R-pkg-devel] appropriate directory for data downloads in examples, demos and vignettes
Regarding alternative places for scripts, you can add a directory (eg inst/testLocalScripts) and then with a recently added R CMD feature you can do R CMD check --test-dir=inst/testLocalScripts your-package.tar.gz This will not (automatically) be checked on CRAN. Beware that you also need to run R CMD check without this option to run your regular tests. Paul On 06/29/2015 11:25 AM, Jonathan Callahan wrote: Hi, The MazamaSpatialUtils http://cran.r-project.org/package=MazamaSpatialUtils package has a required package state variable which users set to specify where they want to store large amounts of GIS data that is being downloaded and converted by the package. The implementation of this follows Hadley's advice here: http://adv-r.had.co.nz/Environments.html#explicit-envs The functionality is implemented with a package environment and getter and setter functions: spatialEnv <- new.env(parent = emptyenv()) spatialEnv$dataDir <- NULL getSpatialDataDir <- function() { if (is.null(spatialEnv$dataDir)) { stop('No data directory found. Please set a data directory with setSpatialDataDir("YOUR_DATA_DIR").', call.=FALSE) } else { return(spatialEnv$dataDir) } } setSpatialDataDir <- function(dataDir) { old <- spatialEnv$dataDir dataDir <- path.expand(dataDir) tryCatch({ if (!file.exists(dataDir)) dir.create(dataDir) spatialEnv$dataDir <- dataDir }, warning = function(warn) { warning("Invalid path name.") }, error = function(err) { stop(paste0("Error in setSpatialDataDir(", dataDir, ").")) }) return(invisible(old)) } My question is: *What is an appropriate directory to specify for vignettes (or demos or examples) that need to go through CRAN testing?* The R code in vignettes needs to specify a directory that is writable during the package build process but that will also be available to users. Should we create a /tmp/hash directory? Would that be available on all systems?
Alternatively, *What is an alternative to vignettes and demos for tutorial scripts that should not be tested upon submission to CRAN?* Thanks for any suggestions. Jon
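One answer to Jon's first question that is portable and safe under CRAN checking is to default the data directory to a session-specific location (setSpatialDataDir() here is the setter from the code quoted above):

```r
# In a vignette or example: write only under the session's temp directory
dataDir <- file.path(tempdir(), "MazamaSpatialData")
dir.create(dataDir, showWarnings = FALSE)
setSpatialDataDir(dataDir)  # cleaned up automatically when the R session ends
```

tempdir() is guaranteed writable on every platform R runs on, which a hard-coded /tmp path is not.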
Re: [Rd] Print output during long tests?
If your tests can be divided into multiple files in the tests/ directory then you will get lines like * checking tests ... Running ‘test1.R’ Running ‘test2.R’ Running ‘test3.R’ ... Paul On 05/04/2015 11:52 AM, Toby Hocking wrote: I am the author of R package animint which uses testthat for unit tests. This means that there is a single test file (animint/tests/testthat.R) and during R CMD check we will see the following output * checking tests ... Running ‘testthat.R’ I run these tests on Travis, which has a policy that if no output is received after 10 minutes, it will kill the check. Because animint's testthat tests take a total of over 10 minutes, Travis kills the R CMD check job before it has finished all the tests. This is a problem since we would like to run animint tests on Travis. One solution to this problem would be if R CMD check could output more lines other than just Running testthat.R. Can I give some command line switch to R CMD check or set some environment variable, so that some more verbose test output could be shown on R CMD check?
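Paul's suggestion of splitting the single driver into several files, so R CMD check emits one "Running" line per file, could be sketched as below; the file split and filter patterns are hypothetical, assuming testthat's test_check(filter=) argument matches test-file names:

```r
# tests/test-part1.R -- hypothetical split driver
library(testthat)
library(animint)
test_check("animint", filter = "^render")

# tests/test-part2.R -- second driver, run (and reported) separately by check
library(testthat)
library(animint)
test_check("animint", filter = "^compile")
```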
Re: [Rd] R CMD check and missing imports from base packages
On 04/29/2015 05:38 PM, William Dunlap wrote: And in general a developer would avoid masking a function in a base package, so as not to require the user to distinguish between stats::density() and igraph::density(). Maybe the example is not meant literally. The 'filter' function in the popular 'dplyr' package masks the one that has been in the stats package forever, and they have nothing in common, so that may give you an example. As I recall, several packages mask the simulate generic in stats, if you are looking for examples. Paul Gilbert Bill Dunlap TIBCO Software wdunlap tibco.com On Wed, Apr 29, 2015 at 1:24 PM, Martin Morgan mtmor...@fredhutch.org wrote: On 04/28/2015 01:04 PM, Gábor Csárdi wrote: When a symbol in a package is resolved, R looks into the package's environment, and then into the package's imports environment. Then, if the symbol is still not resolved, it looks into the base package. So far so good. If still not found, it follows the 'search()' path, starting with the global environment and then all attached packages, finishing with base and recommended packages. This can be a problem if a package uses a function from a base package, but it does not formally import it via the NAMESPACE file. If another package on the search path also defines a function with the same name, then this second function will be called. E.g. if package 'ggplot2' uses 'stats::density()', and package 'igraph' also defines 'density()', and 'igraph' is on the search path, then 'ggplot2' will call 'igraph::density()' instead of 'stats::density()'. stats::density() is an S3 generic, so igraph would define an S3 method, right? And in general a developer would avoid masking a function in a base package, so as not to require the user to distinguish between stats::density() and igraph::density(). Maybe the example is not meant literally. 
Being able to easily flag non-imported, non-base symbols would definitely improve the robustness of package code, even if not helping the end user disambiguate duplicate symbols. Martin Morgan I think that for a better solution, either 1) the search path should not be used at all to resolve symbols in packages, or 2) only base packages should be searched. I realize that this is something that is not easy to change, especially 1) would break a lot of packages. But maybe at least 'R CMD check' could report these cases. Currently it reports missing imports for non-base packages only. Is it reasonable to have a NOTE for missing imports from base packages as well? [As usual, please fix me if I am missing or misunderstood something.] Thank you, Best, Gabor -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
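The explicit-import fix Gábor and Martin are circling around is a NAMESPACE directive: importing the base-package symbol pins its resolution at load time, regardless of what is attached on the search path. The specific imports below are illustrative:

```r
# NAMESPACE
importFrom(stats, density)  # a package using density() gets stats::density,
importFrom(stats, filter)   # even when dplyr or igraph mask these names
```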
Re: [Rd] Which function can change RNG state?
On 02/08/2015 09:33 AM, Dirk Eddelbuettel wrote: On 7 February 2015 at 19:52, otoomet wrote: | random numbers. For instance, can I be sure that | set.seed(0); print(runif(1)); print(rnorm(1)) | will always print the same numbers, also in the future version of R? There Yes, pretty much. This is nearly correct. The user could change the uniform or normal generator, since there are options other than the defaults, which would mean the result would be different. And obviously if they changed print precision then the printed result may be truncated differently. I think you could prepare for future versions of R by saving information about the generators you are using. The precedent has already been set (R-1.7.0) that the default could change if there is a good reason. A good reason might be that the RNG is found not to be so good relative to others that become available. But I think the old generator would continue to be available, so people can reproduce old results. (Package setRNG has some utilities to help save and reset, but there is nothing especially difficult or fancy, just a few details that need to be remembered.) I've been lurking here over fifteen years, and while I am getting old and forgetful I can remember exactly one such change where behaviour was changed, and (one of the) generators was altered---if memory serves, in the early R 1.* days. [Goes digging...] Yes, see `help(RNGkind)` which details that R 1.7.0 made a change when "Buggy Kinderman-Ramage" was added as the old value, and Kinderman-Ramage was repaired. There once was a similar fix in the very early days of the Mersenne-Twister, which is why the GNU GSL has two variants with suffixes _1999 and _1998. I seem to recall a bit of change around R-0.49 but old and forgetful would cover this too. For me, a bigger change was an unadvertised change in Splus - they compiled against a different math library at some point.
This changed the lower bits in results, mostly insignificant but accumulated simulation results could amount to something fairly important. The amount of time I spent trying to find why results would not reproduce was one of my main motivations for starting to use R. So your issue seems like pilot error to me: don't attach the parallel package if you do not plan to work in parallel. But do if you do, and see its fine vignette on how it provides you reproducibility for multiple RNG streams. In general, you can very much trust R (and R Core) in these matters. Dirk On 02/08/2015 09:40 AM, Gábor Csárdi wrote: On Sat, Feb 7, 2015 at I don't know if there is intention to keep this reproducible across R versions, but it is already not reproducible across platforms (with the same R version): http://stackoverflow.com/questions/21212326/floating-point-arithmetic-and-reproducibility The situation is better in some respects, and worse in others, than what is described on stackoverflow. I think the point is made pretty well there that you should not be trying to reproduce results beyond machine precision. My experience is that you can compare within a fuzz of 1e-14 usually, even across platforms. (The package setRNG on CRAN has a function random.number.test() which is run in the package's tests/ and makes uniform and normal comparisons to 1e-14. It has passed checks on all R platforms since 2004. Actually, the checks have been done since about 1995 but they were part of package dse earlier.) If you accumulate lots of lower order parts (eg sum(simulated - true) in a long Monte Carlo) then the fuzz may need to get much larger, especially comparing across platforms. And you will have trouble with numerically unstable calculations. Once upon a time I was annoyed by this, but then I realized that it was better not to do unstable calculations.
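The bookkeeping Paul recommends (recording the generators along with the seed) takes only a few lines of base R, roughly what the setRNG package wraps up:

```r
invisible(runif(1))          # make sure a generator state exists
saved.kind <- RNGkind()      # uniform, normal (and, in newer R, sample) kinds
saved.seed <- .Random.seed   # the complete generator state vector

## ... run the simulation ...

do.call(RNGkind, as.list(saved.kind))                    # restore the kinds
assign(".Random.seed", saved.seed, envir = globalenv())  # restore the state
```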
In addition to not being reproducible beyond machine precision across R versions and across platforms, you really cannot be guaranteed reproducibility even on the same platform and same version of R. You may get different results if you upgrade the OS and there has been a change in the math libraries. In my experience this happens rather often. I don't think there is any specific 32 vs 64 bit issue, but math libraries sometimes do things a bit differently on different processors (e.g. processor bug fixes), so you can occasionally get differences with everything the same except the hardware.

On 02/07/2015 10:52 PM, otoomet wrote: It turned out that this is because package parallel, buried deep in my dependencies, calls runif() during its initialization and in this way changes the random number sequence.

Guessing a bit about what you are saying: 1/ you set the random seed; 2/ you did some things which included loading package parallel; 3/ you ran some things for which you expected to get results comparable to some previous run when you did 1/ and 2/ in the reverse order. If I understand this correctly, I suggest you always do everything
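The ordering point in 1/-3/ above can be made concrete: do any package loading that might touch the RNG first, and set the seed immediately before the randomness you care about. A minimal sketch:

```r
## Load anything that might consume random numbers first ...
loadNamespace("parallel")

## ... and only then set the seed, right before the work you want
## to be reproducible.
set.seed(123)
x <- rnorm(3)

## Re-seeding reproduces the stream regardless of what was loaded earlier.
set.seed(123)
stopifnot(identical(x, rnorm(3)))
```
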
Re: [Rd] unloadNamespace
Thanks Winston. That seems like a workaround that might be usefully included into unloadNamespace. Paul

On 15-01-09 12:09 PM, Winston Chang wrote: It's probably because the first thing that unloadNamespace does is this:

ns <- asNamespace(ns, base.OK = FALSE)

If you call asNamespace("tseries"), it calls getNamespace("tseries"), which has the side effect of loading that package (and its dependencies). One way to work around this is to check loadedNamespaces() before you try to unload a package. -Winston

On Thu, Jan 8, 2015 at 9:45 AM, Paul Gilbert pgilbert...@gmail.com wrote: In the documentation the closest thing I see to an explanation of this is that ?detach says "Unloading some namespaces has undesirable side effects". Can anyone explain why unloading tseries will load zoo? I don't think this behavior is specific to tseries; it's just an example. I realize one would not usually unload something that is not loaded, but I would expect it to do nothing or give an error. I only discovered this when trying to clean up to debug another problem.

R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet" and R Under development (unstable) (2015-01-02 r67308) -- "Unsuffered Consequences" ... Type 'q()' to quit R.

> loadedNamespaces()
 [1] "base"      "datasets"  "graphics"  "grDevices" "methods"   "stats"
 [7] "utils"
> unloadNamespace("tseries")   # loads zoo ?
> loadedNamespaces()
 [1] "base"      "datasets"  "graphics"  "grDevices" "grid"      "lattice"
 [7] "methods"   "quadprog"  "stats"     "utils"     "zoo"

Somewhat related, is there an easy way to get back to a clean state for loaded and attached things, as if R had just been started? I'm trying to do this in a vignette so it is not easy to stop and restart R. Paul

R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
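Winston's loadedNamespaces() workaround, wrapped as a small helper (the helper name is my own, not part of base R):

```r
## Only unload namespaces that are actually loaded, so unloadNamespace()
## never loads a package (and its dependencies) as a side effect.
unloadIfLoaded <- function(pkg) {
  if (pkg %in% loadedNamespaces()) unloadNamespace(pkg) else invisible(NULL)
}

unloadIfLoaded("tseries")   # a no-op if tseries was never loaded
stopifnot(!("tseries" %in% loadedNamespaces()))
```
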
[Rd] unloadNamespace
In the documentation the closest thing I see to an explanation of this is that ?detach says "Unloading some namespaces has undesirable side effects". Can anyone explain why unloading tseries will load zoo? I don't think this behavior is specific to tseries; it's just an example. I realize one would not usually unload something that is not loaded, but I would expect it to do nothing or give an error. I only discovered this when trying to clean up to debug another problem.

R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet" and R Under development (unstable) (2015-01-02 r67308) -- "Unsuffered Consequences" ... Type 'q()' to quit R.

> loadedNamespaces()
 [1] "base"      "datasets"  "graphics"  "grDevices" "methods"   "stats"
 [7] "utils"
> unloadNamespace("tseries")   # loads zoo ?
> loadedNamespaces()
 [1] "base"      "datasets"  "graphics"  "grDevices" "grid"      "lattice"
 [7] "methods"   "quadprog"  "stats"     "utils"     "zoo"

Somewhat related, is there an easy way to get back to a clean state for loaded and attached things, as if R had just been started? I'm trying to do this in a vignette so it is not easy to stop and restart R. Paul
Re: [Rd] testing dontrun examples
On 14-11-26 05:49 PM, Duncan Murdoch wrote: On 26/11/2014, 1:45 PM, Paul Gilbert wrote: Is there a good strategy for testing examples which should not be run by default? For instance, I have examples which get data from the Internet. If I wrap them in try() then they can be skipped if the Internet is not available, but may not be tested in cases when I would like to know about the failure. (Not to mention that the example syntax is ugly.) If I mark them \dontrun or \donttest then they are not tested. I could mark them \dontrun and then use example(), but for this, in addition to run.dontrun=TRUE, I would need to specify all topics for a package, and I don't see how to do this; a missing topic does not work. Wishlist: what I would really like is R CMD check --run-dontrun pkg

We have that in R-devel, so everyone will have it next April, but there will possibly be bugs unless people like you try it out now.

Are you anticipating my wishes now, or did you tell me this and it entered my subconscious? So far it works as advertised. Thanks, Paul

Duncan Murdoch
[Rd] testing dontrun examples
Is there a good strategy for testing examples which should not be run by default? For instance, I have examples which get data from the Internet. If I wrap them in try() then they can be skipped if the Internet is not available, but may not be tested in cases when I would like to know about the failure. (Not to mention that the example syntax is ugly.) If I mark them \dontrun or \donttest then they are not tested. I could mark them \dontrun and then use example(), but for this, in addition to run.dontrun=TRUE, I would need to specify all topics for a package, and I don't see how to do this; a missing topic does not work. Wishlist: what I would really like is R CMD check --run-dontrun pkg Suggestions? Thanks, Paul
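The "all topics" loop wished for above can be approximated by iterating example() over a package's exported names; run.dontrun is the example() argument paired with the check option. This is a rough sketch of my own, not a built-in facility: only exported names that have help topics will actually run anything (the rest just produce a warning), and "mypkg" is a placeholder.

```r
## Run example() for every exported name of an installed package,
## including \dontrun parts.
runAllExamples <- function(pkg) {
  library(pkg, character.only = TRUE)
  for (tp in ls(paste0("package:", pkg))) {
    utils::example(tp, package = pkg, character.only = TRUE,
                   run.dontrun = TRUE, ask = FALSE)
  }
}
# runAllExamples("mypkg")
```
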
Re: [Rd] testing dontrun examples
On 14-11-26 02:09 PM, Spencer Graves wrote: Hi, Paul: if(!fda::CRAN()) runs code except with R CMD check --as-cran. I use it so CRAN checks skip examples that (a) need the Internet or (b) take too long for CRAN. Spencer

fda::CRAN() gives TRUE on my home machine, I think because I use several variables like _R_CHECK_HAVE_MYSQL_=TRUE to control whether some tests get run. (Not all CRAN test servers have all resources.) But, more importantly, wouldn't this strategy prevent CRAN from automatically running more extensive testing of the examples if they decided to do that sometimes? Paul

Hope this helps. Spencer

On 11/26/2014 10:45 AM, Paul Gilbert wrote: Is there a good strategy for testing examples which should not be run by default? For instance, I have examples which get data from the Internet. If I wrap them in try() then they can be skipped if the Internet is not available, but may not be tested in cases when I would like to know about the failure. (Not to mention that the example syntax is ugly.) If I mark them \dontrun or \donttest then they are not tested. I could mark them \dontrun and then use example(), but for this, in addition to run.dontrun=TRUE, I would need to specify all topics for a package, and I don't see how to do this; a missing topic does not work. Wishlist: what I would really like is R CMD check --run-dontrun pkg Suggestions? Thanks, Paul
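The environment-variable convention mentioned above gates a resource-dependent test on a variable that a test machine opts into. Note that _R_CHECK_HAVE_MYSQL_ is the poster's own naming convention, not a standard R check variable:

```r
## Gate a resource-dependent test on an opt-in environment variable.
if (identical(Sys.getenv("_R_CHECK_HAVE_MYSQL_"), "TRUE")) {
  cat("running MySQL-dependent tests\n")
  # ... tests that need a MySQL server go here ...
} else {
  cat("skipping MySQL-dependent tests\n")
}
```
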
Re: [Rd] Changing style for the Sweave vignettes
You might also consider starting your vignettes with

\begin{Scode}{echo=FALSE,results=hide}
options(continue = "  ")
\end{Scode}

Then you get one prompt but it is still easy to cut and paste. This has been in many of my packages for many years, so I think it would be fair to assume it is acceptable. Paul

On 11/13/2014 06:56 AM, January Weiner wrote: Thank you, Søren and Brian, for your answers. Whether this is the right list -- well, I think it is, since I am developing a package and would like to create a vignette which is useful and convenient for my users. I know how to extract the vignette code. However, most of my users don't. Or if they do, they do not bother, but copy the examples from the PDF while they are reading it. At least that is my observation. I'm sorry that my e-mail was unclear -- I started my e-mail with "as a user, ...", but I did mention that it is my vignettes that I am concerned with. options(prompt=...) is an idea, though I'm still not sure as to the second part of my question - whether a vignette without a command prompt is acceptable in a package or not. Kind regards, j.

On 13 November 2014 12:36, Brian G. Peterson br...@braverock.com wrote: On 11/13/2014 05:09 AM, January Weiner wrote: As a user, I am always annoyed beyond measure that Sweave vignettes precede the code by a command line prompt. It makes running examples by simple copying of the commands from the vignette to the console a pain. I know the idea is that it is clear what is the command, and what is the output, but I'd rather precede the output with some kind of marking. Is there any other solution possible / allowed in vignettes? I would much prefer to make my vignettes easier to use for people like me.

I agree with Søren that this is not the right list, but to complete the thread... See the examples in ?vignette, starting just above "## Now let us have a closer look at the code". All vignettes are compiled.
You can trivially extract all the code used for any vignette in R, including any code not displayed in the text and hidden from the user, from within R, or save it out to an editor so you can source it line by line from RStudio (or vim or emacs or...). That's the whole point. Regards, Brian -- Brian G. Peterson http://braverock.com/brian/ Ph: 773-459-4973 IM: bgpbraverock
Re: [Rd] Problem with build and check
I certainly have longer argument lists with no problem. More likely the Rd file needs special consideration for %. Paul

On 11/12/2014 02:11 PM, Therneau, Terry M., Ph.D. wrote: I am getting failure of build and check for an Rd file that has a long argument list. Guess diagnosis: a quoted string beyond a certain point in the argument list is fatal. Example: use the function below, create an Rd file for it with prompt(), move the .Rd file to the man directory (no need to edit it), and try building.

dart.control <- function(server = c("production", "integration", "development", "http"),
                         out.poll.duration = 5, out.poll.increase = 1.1,
                         out.poll.max = 30, out.poll.timeout = 3600,
                         netrc.path, netrc.server = "ldap",
                         rtype = c("xml", "json"), dateformat = "%Y-%m-%d") {
    server <- match.arg(server)
    server
}

I created a package dummy with only this function, and get the following on my Linux box.

tmt-local2021% R CMD build dummy
* checking for file 'dummy/DESCRIPTION' ... OK
* preparing 'dummy':
* checking DESCRIPTION meta-information ... OK
Warning: newline within quoted string at dart.control.Rd:11
Warning: /tmp/RtmpjPjz9V/Rbuild398d6e382572/dummy/man/dart.control.Rd:46: unexpected section header '\value'
Warning: newline within quoted string at dart.control.Rd:11
Error in parse_Rd("/tmp/RtmpjPjz9V/Rbuild398d6e382572/dummy/man/dart.control.Rd", :
  Unexpected end of input (in quoted string opened at dart.control.Rd:88:16)
Execution halted

Session info for my version: sessionInfo() R Under development (unstable) (2014-10-30 r66907) Platform: i686-pc-linux-gnu (32-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base Terry T.
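The "special consideration for %" that Paul mentions: % begins a comment in Rd syntax, so the unescaped % signs that prompt() copies verbatim from the dateformat default swallow the rest of each line, leaving an unterminated quoted string. A sketch of the hand-edited \usage entry (abbreviated; the other arguments are unchanged):

```
% In Rd files an unescaped % starts a comment, so escape it as \% in
% the copied default value:
\usage{
dart.control(server = c("production", "integration", "development", "http"),
             ..., dateformat = "\%Y-\%m-\%d")
}
```
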
[Rd] extra package tests directory
I am trying to decide on a name for a directory where I will put some extra package tests. The main motivation for this is the need to limit the package test time on CRAN. That is, these are tests that could be in the tests/ directory and could be run on CRAN, but will take longer than CRAN likes. Scanning through names currently being used in packages on CRAN, I see a large number of inst/tests/ directories, but they seem to be instead of a tests/ directory at the top level of the package. (There are also some occurrences of inst/test and test/ at the top level.) I would prefer not to use these directories, as I don't like the possible confusion over whether these are the standard package tests or additional ones. The other name that is used a fair amount is inst/unitTests/ (plus inst/UnitTests/, inst/UnitTest/, and inst/unittests). In many cases these seem to be run by a script in the tests/ directory using a unit testing framework, so they cannot easily be distinguished from the normal package tests/ run by CRAN. I also see an occurrence each of inst/otherTests/, inst/testScripts/, and inst/test_cases. My own preference would be inst/extraTests, but no one is using that. Have I missed anything? Does anyone have suggestions or comments? Are there other reasons one might want tests that are not usually run by CRAN? Paul
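One way the inst/extraTests idea could be wired up is with a small driver run by hand (or from a Makefile), outside the CRAN-checked tests/ directory. This is a sketch of my own; "mypkg" and the helper name are placeholders.

```r
## Run every *.R script installed under inst/extraTests of a package.
## Each script follows the usual convention of calling stop() on failure.
runExtraTests <- function(pkg = "mypkg") {
  dir <- system.file("extraTests", package = pkg)
  if (dir == "") stop("no extraTests directory installed for ", pkg)
  for (f in list.files(dir, pattern = "\\.R$", full.names = TRUE)) {
    cat("running", f, "\n")
    source(f, local = new.env())   # isolate each script's workspace
  }
}
```
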
[Rd] requireNamespace() questions
I am trying to follow directions at http://cran.r-project.org/doc/manuals/r-patched/R-exts.html#Suggested-packages regarding handling suggested packages with requireNamespace() rather than require(), and I have some questions.

1/ When I do requireNamespace() in a function, is the loading of the namespace only effective within the function?

2/ At the link above the manual says "Note the use of rgl:: as that object would not necessarily be visible". When the required package is loading methods, will the method be found when I reference the generic, which is not in the package, or do I need to do something different?

3/ In some packages I have functions that return an object defined in the suggested package being required. For example, a function does require(zoo) and then returns a zoo object. So, to work with the returned object I am really expecting that zoo will be available in the session afterwards. Is it recommended that I just check if the package is available on the search path the user has set, rather than use require() or requireNamespace()? Regarding checking the path without actually attaching the package to the search path, is there something better than "package:zoo" %in% search(), or is that the best way?

4/ I have a function in a package that Depends on DBI and Suggests RMySQL, RPostgreSQL, RSQLite. The function uses dbDriver() in DBI, which uses do.call(). If I use requireNamespace() in place of require() I get

> requireNamespace("RMySQL")
Loading required namespace: RMySQL
> m <- dbDriver("MySQL")
Error in do.call(as.character(drvName), list(...)) :
  could not find function "MySQL"
> require("RMySQL")
Loading required package: RMySQL
> m <- dbDriver("MySQL")

Is there a different way to handle this without altering the search path? The function do.call() does not seem to work with an argument like do.call(RMySQL::MySQL, list()) even at the top level, and this situation may be more complicated when it is in a required package. What am I missing?
Thanks, Paul
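One possible way around the do.call() lookup problem in 4/, without attaching the package, is to fetch the function object from the namespace explicitly rather than passing its name as a string. This is a sketch only (dbDriver() itself would need the same treatment inside DBI, and the helper name is mine):

```r
## Fetch an exported function from a loaded-but-not-attached namespace,
## so do.call() receives a function object rather than a bare name.
getDriverFun <- function(pkg, fun) {
  if (!requireNamespace(pkg, quietly = TRUE))
    stop("namespace ", pkg, " not available")
  getExportedValue(pkg, fun)
}

# drv <- do.call(getDriverFun("RMySQL", "MySQL"), list())
```
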
Re: [Rd] How to test impact of candidate changes to package?
On 09/10/2014 06:12 AM, Kirill Müller wrote: If you don't intend to keep the old business logic in the long run, perhaps a version control system such as Git can help you. If you use it in single-user mode, you can think of it as a backup system where you manually create each snapshot and give it a name, but it actually can do much more. For your use case, you can open a new *branch* where you implement your changes, and implement your testing logic simultaneously in both branches (using *merge* operations). The system handles switching between branches, so you can really perform invasive changes, and revert if you find that a particular change breaks something. ... Yes, I would strongly recommend some version control system for this, probably either Git or svn (Subversion). If this is all code and test data that you can release publicly then you might choose some public repository like Github or R-forge. (You will get lots of opinions about the relative merits of different repositories if you ask, but the main point is that any one of them will be better than nothing.) If part of your code and data cannot be released then you might check if something is already supported in your place of business. Chances are that it is, but only programmers in IT have been told about it. On 09/10/2014 11:14 AM, Stephanie Locke wrote: ... Has anyone else had to do this sort of testing before on their packages? How did you do it? Am I missing an obvious package / framework that can do this? Most package maintainers would face some version of this problem, some simpler and some much more complicated. If you set up the tests as scripts in the package tests/ directory that issue stop() in the case of a problem, then R-forge pretty much does the checking for you on multiple platforms, at least when it is working properly. 
It is probably more trouble than it is worth for a single package, but if you have several packages with inter-dependencies then you might want to look at the develMake framework at http://automater.r-forge.r-project.org/ Regards, Paul Cheers, Steph -- Stephanie Locke BI Credit Risk Analyst
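The stop()-on-failure style of tests/ script described above looks like this. Everything here is a placeholder: newCalc stands in for the business logic, and the expected values stand in for output saved from the old code.

```r
## tests/regression.R sketch: R CMD check fails if this script stops.
newCalc <- function(x) cumsum(x) / seq_along(x)   # stand-in business logic

input <- 1:10
expected <- c(1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5)  # saved from old code

if (!isTRUE(all.equal(newCalc(input), expected, tolerance = 1e-8)))
  stop("regression: new results differ from the old business logic")
cat("ok\n")
```
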
Re: [Rd] Re R CMD check checking in development version of R
(Please correct me if I'm wrong. I thought I mostly understood this, finally, but I've made the mistake of thinking I understood something too many times before.)

On 08/28/2014 10:39 AM, Simon Urbanek wrote: On Aug 27, 2014, at 6:01 PM, Gavin Simpson ucfa...@gmail.com wrote: On 27 August 2014 15:24, Hadley Wickham h.wick...@gmail.com wrote: Is that the cause of these NOTEs? Is the expectation that if I am using a function from a package, even a package that I have in Depends:, that I have to explicitly declare these imports in NAMESPACE? Yes. (Otherwise your package won't work if it's only attached and not loaded, i.e. if someone does analogue::foo() only the imported functions are available, not the functions in packages you depend on.) Cheers Hadley. Thanks for the confirmation, but... ...I don't get this; what is the point of Depends? I thought it was "my package needs these other packages to work, i.e. be loaded". Hence it is user error (IMHO ;-) to do `analogue::foo()` without having the dependencies loaded too.

No. The point of Depends is that if your package is attached, it also attaches the other packages to make them available for the user. Essentially you're saying "if you want to use my package interactively, you will also want to use those other packages interactively".

I agree that "interactively" catches the flavour of what Depends does, but technically that is the wrong word. The important point is whether the functions in a Depended-upon package should be available to the user directly, without them needing to use library() or require() to make them available, in an interactive session or a batch job.

You still need to use import() to define what exactly is used by your package -

Amplifying a bit: by import() in the NAMESPACE, which you need whether you have Depends or Imports in the DESCRIPTION file, you ensure that the functions in your package use the ones in the package imported and do not get clobbered by anything the user might do.
The user might redefine functions available to the interactive session, or require() another package with functions having the same names, and those are the ones his interactive direct calls will find, but your package functions will not use those. People are sure to have differences of opinion about the trade-off between the annoyance of having to specifically attach packages being used and the clarity this provides. At first I was really annoyed, but have eventually decided I do like the clarity. In my experience it turns out to be surprisingly rare that you need packages in Depends, but there are legitimate cases beyond the annoyance case mentioned above. I think if you are putting packages in Depends you really do want to have a very good understanding of why you are doing that. If you use Depends then you are inviting support difficulties. Users will contact you about bugs in the package you attach, even though your package may not use the broken functions. If they attach the package themselves then they are much more likely to understand who they should contact. Core seems to have forgotten to take credit for trying to make life easier for package developers. Paul

as opposed to what you want to be available to the user in case it is attached. Cheers, Simon

This check (whilst having found some things I should have imported and didn't - which is a good thing!) seems to be circumventing the intention of having something in Depends. Is Depends going to go away? (And really you shouldn't have any packages in Depends; they should all be in Imports.) I disagree with *any*; having say vegan loaded when one is using analogue is a design decision, as the latter borrows heavily from and builds upon vegan. In general I have moved packages that didn't need to be in Depends into Imports; in the version I am currently doing final tweaks on before it goes to CRAN I have removed all but vegan from Depends. Or am I thinking about this in the wrong way?
Thanks again Gavin Hadley -- http://had.co.nz/ -- Gavin Simpson, PhD
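The Depends/Imports distinction discussed in this thread can be summarized in a NAMESPACE sketch: the import() directive is needed regardless of which DESCRIPTION field the package appears in. The exported name here is a placeholder, and decostand/vegdist are just examples of vegan functions one might import selectively.

```
# NAMESPACE sketch: this import is required whether DESCRIPTION lists
# vegan under Depends or under Imports.
import(vegan)
# ... or, more surgically:
# importFrom(vegan, decostand, vegdist)
export(myFunction)        # "myFunction" is a placeholder
```
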
Re: [Rd] R CMD check for the R code from vignettes
On 06/02/2014 12:16 AM, Gabriel Becker wrote: Carl, I don't really have a horse in this race other than a strong feeling that whatever check does should be mandatory. That having been said, I think it can be argued that the fact that check does this means that it IS in the R package vignette specification that all vignettes must be such that their tangled code will run without errors.

My understanding of this is that the package maintainer can turn off building the vignette (--no-vignettes), but R CMD check and CRAN still check that the tangled code runs, and the check fails if it does not. Running the tangled code can be turned off, just not by the package maintainer. You have to make a special appeal to the CRAN maintainers, and give reasons they are prepared to accept. I think the intention is that the tangled code should run without errors. I doubt they would accept "it doesn't work" as an acceptable reason. But there are reasons, like the vignette requires access to a commercial database engine. Also, I think, turning this off means they just do not run it regularly, in the daily checks. I don't think it necessarily means the code is never tested. The testing may need to be done on machines with special resources. Thus, --no-vignettes provides a mechanism to avoid running the tangled code twice but, without special exemption, it is still run once.

Some package maintainers may not think of several features of 'R CMD check' as 'aids'. I think of it as having more to do with maintaining some quality assurance, which I think of as an aid but not a debugging aid. I believe the CRAN maintainers have intentionally, and successfully, made disabling the running of tangled code more trouble than it is generally worth. Effectively, a package should have tangled code that runs without errors. (Of course, I could be wrong about all this, it has happened before.)
Paul ~G

On Sun, Jun 1, 2014 at 8:43 PM, Carl Boettiger cboet...@gmail.com wrote: Yihui, list, Focusing on the behavior of R CMD check, the only reason I have seen put forward in the discussion for having check tangle and then source as well as knit/weave the very same vignette is to assist the package maintainer in debugging R errors vs pdflatex errors. As tangle (and many other tools) are already available to an author needing extra help debugging, and as the error messages are usually clear on whether errors come from the R code or whatever format compiling (pdflatex, markdown html, etc), this seems like a poor reason for R CMD check to be wasting time doing two versions of almost (but not literally) the same check. As has already been discussed, it is possible to write vignettes that can be Sweave'd but not source'd, due to the different treatments of inline chunks. While I see the advantages of this property, I don't see why R CMD check should be enforcing it through the arbitrary mechanism of running both Sweave and tangle+source. If that is the desired behavior for all Sweave documents, it should be part of the Sweave specification not to be able to write/change values in inline expressions, or part of the tangle definition to include inline chunks. In any event, I don't see any reason for R CMD check doing both. Perhaps someone can fill in whatever I've overlooked? Carl

On Sat, May 31, 2014 at 8:17 PM, Yihui Xie x...@yihui.name wrote: 1. The starting point of this discussion is package vignettes, instead of R scripts. I'm not saying we should abandon R scripts, or all people should write R code to generate reports. Starting from a package vignette, you can evaluate it using a weave function, or evaluate its derivative, namely an R script. I was saying the former might not be a bad idea, although the latter sounds more familiar to most R users. For a package vignette, within the context of R CMD check, is it necessary to do tangle + evaluate _besides_ weave? 2.
If you are comfortable with reading pure code without narratives, I'm totally fine with that. I guess there is nothing to argue on this point, since it is pretty much personal taste. 3. Yes, you are absolutely correct -- Sweave()/knit() does more than source(), but let me repeat the issue to be discussed: what harm does it bring if we disable tangle for R package vignettes? Sorry if I did not make it clear enough, my priority of this discussion is the necessity of tangle for package vignettes. After we finish this issue, I'll be happy to extend the discussion towards tangle in general. Regards, Yihui -- Yihui Xie xieyi...@gmail.com Web: http://yihui.name On Sat, May 31, 2014 at 9:20 PM, Gabriel Becker gmbec...@ucdavis.edu wrote: On Sat, May 31, 2014 at 6:54 PM, Yihui Xie x...@yihui.name wrote: I agree that fully evaluating the code is valuable, but it is not a problem since the weave functions do fully evaluate the code. If there is a reason for why source() an R script is preferred, I guess it is users' familiarity with .R instead of
Re: [Rd] type.convert and doubles
On 04/17/2014 02:21 PM, Murray Stokely wrote: On Thu, Apr 17, 2014 at 6:42 AM, McGehee, Robert robert.mcge...@geodecapital.com wrote: Here's my use case: I have a function that pulls arbitrary financial data from a web service call, such as a stock's industry, price, volume, etc., by reading the web output as a text table. The data may be either character (industry, stock name, etc.) or numeric (price, volume, etc.), and the function generally doesn't know the class in advance. The problem is that we frequently get numeric values represented with more precision than actually exists, for instance a price of 2.6999 rather than 2.70. The numeric representation is exactly one digit too much for type.convert, which (in R 3.1.0) converts it to character instead of numeric (not what I want). This caused a bunch of "non-numeric argument to binary operator" errors to appear today, as numeric data was now being represented as characters. I have no doubt that this probably will cause some unwanted RODBC side effects for us as well. IMO, getting the class right is more important than infinite precision. What use is a character representation of a number anyway if you can't perform arithmetic on it? I would favor at least making the new behavior optional, but I think many packages (like RODBC) potentially need to be patched to code around the new feature if it's left in.

The uses of character representation of a number are many: unique identifiers/user ids, hash codes, timestamps, or other values where rounding results to the nearest value that can be represented as a numeric type would completely change the results of any data analysis performed on that data. Database join operations are certainly an area where R's previous behavior of silently dropping precision of numbers with type.convert can get you into trouble.
For example, things like join operations or group-by operations performed in R code would produce erroneous results if you are joining/grouping by a key without the full precision of your underlying data. Records can get joined up incorrectly or aggregated with the wrong groups.

I don't understand this. Assuming you are sending the SQL statement to the database engine, none of this erroneous matching is happening in R. The calculations all happen on the database. But, for the case where the database does know that numbers are double precision, it would be nice if they got transmitted by ODBC to R as numerics (the usual translation), just as they are by the native interfaces like RPostgreSQL. Do you get the erroneous results when you use a native interface?

(from second response:) You want a casting operation in your SQL query or similar if you want a rounded type that will always fit in a double. Cast or Convert operators in SQL, or similar for however you are getting the data you want to use with type.convert(). This is all application specific and sort of beyond the scope of type.convert(), which now behaves as it has been documented to behave.

This seems to suggest I need to use different SQL statements depending on which interface I use to talk to the database. If you do 1/3 in a database calculation and that ends up being represented as something more accurate than double precision on the database, then it needs to be transmitted as something with higher precision (character/factor?). If the result is double precision it should be sent as double precision, not as something pretending to be more accurate. I suspect the difficulty with ODBC may be that type.convert() really should not be called when both ends of the communication know that a double precision number is being exchanged. Paul

If you later want to do arithmetic on them, you can choose to lose precision by using as.numeric(), or use one of the large number packages on CRAN (GMP, int64, bit64, etc.).
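For readers following this thread later: the behaviour being argued about became an explicit option when type.convert() gained a numerals argument (in R 3.1.1), so callers can choose what happens when a decimal string cannot be represented exactly as a double. A small illustration:

```r
## A decimal string with more digits than a double can represent exactly.
x <- "0.30000000000000000000001"

## "allow.loss" (the default) converts, accepting the precision loss:
class(type.convert(x, numerals = "allow.loss", as.is = TRUE))  # "numeric"

## "no.loss" keeps the full-precision string as character instead:
class(type.convert(x, numerals = "no.loss", as.is = TRUE))     # "character"
```
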
But once you've dropped the precision with as.numeric you can never get it back, which is why the previous behavior was clearly dangerous. I think I had some additional examples in the original bug/patch I filed about this issue a few years ago, but I'm unable to find it on bugs.r-project.org and it's not referenced in the cl descriptions or news file. - Murray
Re: [Rd] NOTE when detecting mismatch in output, and codes for NOTEs, WARNINGs and ERRORs
On 04/10/2014 04:34 AM, Kirill Müller wrote: On 03/26/2014 06:46 PM, Paul Gilbert wrote: On 03/26/2014 04:58 AM, Kirill Müller wrote: Dear list It is possible to store expected output for tests and examples. From the manual: If tests has a subdirectory Examples containing a file pkg-Ex.Rout.save, this is compared to the output file for running the examples when the latter are checked. And, earlier (written in the context of test output, but apparently applies here as well): ..., these two are compared, with differences being reported but not causing an error. I think a NOTE would be appropriate here, in order to be able to detect this by only looking at the summary. Is there a reason for not flagging differences here? The problem is that differences occur too often because this is a comparison of characters in the output files (a diff). Any output that is affected by locale, node name, Internet downloads, time, host, or OS is likely to cause a difference. Also, if you print results to a high precision you will get differences on different systems, depending on OS, 32 vs 64 bit, numerical libraries, etc. A better test strategy when it is numerical results that you want to compare is to do a numerical comparison and throw an error if the result is not good, something like: r <- result from your function; rGood <- known good value; fuzz <- 1e-12 # tolerance; if (fuzz < max(abs(r - rGood))) stop('Test xxx failed.') It is more work to set up, but the maintenance will be less, especially when you consider that your tests need to run on different OSes on CRAN. You can also use try() and catch error codes if you want to check those. Thanks for your input. To me, this is a different kind of test, Yes, if you meant that you intended to compare character output, it is a different kind of test.
With a file in the tests/ directory of a package you can construct a test of character differences in individual commands with something like: z1 <- as.character(rnorm(5)); z2 <- as.character(type.convert(z1)); if (any(z1 != z2)) stop("character differences exist.") for which no one would be required to make any changes to the existing package checking system. One caveat is output that is done as a side effect. For longer output streams from multiple commands you might construct your own testing with R CMD Rdiff. As you point out, adding something to flag different levels of severity for differences from a .Rout.save file would require some work by someone. HTH, Paul for which I'd rather use the facilities provided by the testthat package. Imagine a function that operates on, say, strings, vectors, or data frames, and that is expected to produce completely identical results on all platforms -- here, a character-by-character comparison of the output is appropriate, and I'd rather see a WARNING or ERROR if something fails. Perhaps this functionality can be provided by external packages like roxygen and testthat: roxygen could create the good output (if asked for) and set up a testthat test that compares the example run with the good output. This would duplicate part of the work already done by base R; the duplication could be avoided if there was a way to specify the severity of a character-level difference between output and expected output, perhaps by means of an .Rout.cfg file in DCF format: OnDifference: mute|note|warning|error Normalize: [R expression] Fuzziness: [number of different lines that are tolerated] On that note: Is there a convenient way to create the .Rout.save files in base R? By convenient I mean a single function call, not checking and manually copying as suggested here: https://stat.ethz.ch/pipermail/r-help/2004-November/060310.html . Cheers Kirill __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] type.convert and doubles
On 04/11/2014 01:43 PM, Simon Urbanek wrote: Greg, On Apr 11, 2014, at 11:50 AM, Gregory R. Warnes g...@warnes.net wrote: Hi All, I see this in the NEWS for R 3.1.0: type.convert() (and hence by default read.table()) returns a character vector or factor when representing a numeric input as a double would lose accuracy. Similarly for complex inputs. This behavior seems likely to surprise users. Can you elaborate why that would be surprising? It is consistent with the intention of type.convert() to determine the correct type to represent the value - it has always used character/factor as a fallback where the native type doesn't match. Strictly speaking, I don't think this is true. If it were, it would not have been necessary to make the change so that it does now fall back to using character/factor. It may, however, have always been the intent. I don't really think a warning is necessary, but there are some surprises: str(type.convert(format(1/3, digits=17))) # R-3.0.3: num 0.333 str(type.convert(format(1/3, digits=17))) # R-3.1.0: Factor w/ 1 level "0.33333333333333331": 1 Now you could say that one should never do that, and the change is just flushing out a bug that was always there. But the point is that in serialization situations there can be some surprises. So, for example, RODBC talking to PostgreSQL databases is now returning factors rather than numerics for double precision fields, whereas with RPostgreSQL the behaviour has not changed. Paul It has never issued any warning in that case historically, so IMHO it would be rather surprising if it did now… Cheers, Simon Would it be possible to issue a warning when this occurs? Aside: I’m very happy to see the new ’s’ and ‘f’ browser (debugger) commands!
-Greg [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Varying results of package checks due to random seed
On 03/22/2014 01:32 PM, Radford Neal wrote: From: Philippe GROSJEAN philippe.grosj...@umons.ac.be ... for latest CRAN version, we have successfully installed 4999 packages among the 5321 CRAN packages on our platform. ... It is strange that a large portion of R CMD check errors on CRAN occur and disappear *without any version update* of a package or any of its direct or indirect dependencies! That is, a fraction of errors or warnings seem to appear and disappear without any code update. Some of these are likely the result of packages running tests using random number generation without setting the random number seed, in which case the seed is set based on the current time and process id, with an obvious possibility of results varying from run to run. In the current development version of pqR (in branch 19-mods, found at https://github.com/radfordneal/pqR/tree/19-mods), I have implemented a change so that if the R_SEED environment variable is set, the random seed is initialized to its value, rather than from the time and process id. This was motivated by exactly this problem - I can now just set R_SEED to something before running all the package checks. Beware, if you are serious about reproducing things, that you really need to save information about the uniform and other generators you use, such as the normal generator. The defaults do not change often, but they have in the past, and could in the future if something better comes along. There are some small utilities and examples in the package setRNG which can help. Also remember that you need to beware of a side effect of the environment variable approach. It is great for reproducing things, as you would want to do in package tests, but be careful how you use it in functions as it may mess up the randomness if you always set the seed to the same starting value.
Paul Radford Neal __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
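As a sketch of the bookkeeping Paul describes (base R only, not using the setRNG package), recording the generator kinds along with the seed lets a test be replayed exactly even if the defaults change:

```r
## Pin the generators explicitly rather than relying on defaults,
## which have changed across R versions.
RNGkind("Mersenne-Twister", normal.kind = "Inversion")
set.seed(42)
saved <- .Random.seed     # the full state, including generator kinds
x1 <- rnorm(3)
.Random.seed <- saved     # restore (assignment at top level)
x2 <- rnorm(3)
stopifnot(identical(x1, x2))   # the draws are reproduced exactly
```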
[Rd] No repository set, so cyclic dependency check skipped
When checking a package I am getting * checking package dependencies ... NOTE No repository set, so cyclic dependency check skipped How/where do I set the repository so I don't get this note? No doubt this is explained in Writing R Extensions, but I have not found it. Paul __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] No repository set, so cyclic dependency check skipped
On 01/26/2014 12:31 PM, Uwe Ligges wrote: On 26.01.2014 17:52, Paul Gilbert wrote: When checking a package I am getting * checking package dependencies ... NOTE No repository set, so cyclic dependency check skipped How/where do I set the repository so I don't get this note? Set a repository (e.g., via options(repos=...)) in your .Rprofile. I'm getting this note when I check in R-devel on your win-builder site. Does that mean I need to set .Rprofile or you do? Best, Paul Best, Uwe Ligges No doubt this is explained in Writing R Extensions, but I have not found it. Paul __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Depending/Importing data only packages
Would Suggests not work in this situation? I don't understand why you would need Depends. In what sense do you rely on the data-only package? Paul On 13-12-06 04:20 PM, Hadley Wickham wrote: Hi all, What should you do when you rely on a data-only package? If you just Depend on it, you get the following from R CMD check: Package in Depends field not imported from: 'hflights' These packages need to be imported from for the case when this namespace is loaded but not attached. But there's nothing in the namespace to import, so adding it to Imports doesn't seem like the right answer. Is that just a spurious note? Hadley __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Depending/Importing data only packages
On 13-12-07 01:47 PM, Gabor Grothendieck wrote: On Sat, Dec 7, 2013 at 1:35 PM, Paul Gilbert pgilbert...@gmail.com wrote: On 13-12-07 12:19 PM, Gábor Csárdi wrote: I don't know about this particular case, but in general it makes sense to rely on a data package. E.g. I am creating a package that does Bayesian inference for a particular problem, potentially relying on prior knowledge. I think it makes sense to put the data that is used to calculate the prior into another package, because it will be larger than the code, and it does not change that often. Gabor On Sat, Dec 7, 2013 at 11:51 AM, Paul Gilbert pgilbert...@gmail.com wrote: Would Suggests not work in this situation? I don't understand why you would need Depends. In what sense do you rely on the data only package? HW Because I want someone who downloads the package to be able to run HW the examples without having to take additional action. HW HW Hadley I went through this myself, including thinking it was a nuisance for users to need to attach other packages to run examples. In the end I decided it is not so bad to be explicit about what package the example data comes from, so illustrate it in the examples. Users may not always want this data, and other packages that build on yours probably do not want it. Even in the Bayesian inference case pointed out by Gábor, I am not convinced. It means the prior knowledge base cannot be exchanged for another one. The package would be more general if it allowed the possibility of attaching a different database of prior information. But this is clearly a more important case, since the code probably does not work without some database. (There are a few other situations where something like RequireOneOf: would be useful.) Requiring users to load packages which could be loaded automatically seems to go against ease of use. It's just one more thing that they have to remember to do.
It really should be possible to write a batteries included package while leveraging off of other packages. Just to be clear, I distinguish the batteries included situation from the spare batteries included situation. I think it should be possible to automatically load everything that is really needed, that is why I think the Bayesian database is a more important case. But it strikes me as bad to attach everything that could ever possibly be wanted by a user. After all, it would be possible to automatically attach all packages. Some packages seemed to be headed in that direction before the new rules started to be enforced. There is certainly a trade-off here between ease of use, not needing the user to attach packages, and namespace conflicts, which will result in time and difficulty debugging. For packages that no one ever uses in other packages, there would be a tendency to lean toward ease of use. But as soon as anyone starts building on top of a package with another one, I think that avoiding potential conflicts will dominate. Paul __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Depending/Importing data only packages
On 13-12-07 05:21 PM, Hadley Wickham wrote: The Writing R Extensions manual says that Suggests is for packages which are required only for examples, which I believe matches Hadley's original question. Yes, but without this package they won't be able to run the majority of examples, which I think delivers a poor experience to the user. It also means I have to litter my examples with if(require(x)), I think you just need require(x) or library(x). If it is in Suggests then it is available whenever examples are tested, so you don't need the if(). In my opinion, this increases the signal by indicating to the reader where the data comes from. decreasing the signal to noise ratio in the examples. But we're getting a bit far from my original question about the NOTE: Package in Depends field not imported from: 'hflights' These packages need to be imported from for the case when this namespace is loaded but not attached. Depending on (or linking to) a package is not just about making the functions in the package available. Several of us used to think that, but the modern interpretation seems to be just about making things in the package yours depends on available to users of your package. Exports: might be a better term than Depends:, at least if Depends: was not trying to mean both Imports: and Exports:. Paul Hadley __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
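The pattern Paul recommends can be sketched as follows, with the data package in Suggests and named explicitly in the example (hflights is the package from this thread):

```r
## DESCRIPTION:
##   Suggests: hflights
## In the example section of a help page; since Suggests packages are
## available when examples are checked, no if (require(...)) guard is
## needed, and the reader sees where the data comes from:
library(hflights)
head(hflights)
```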
Re: [Rd] Where to drop a python script?
Jonathan, Below is a python script I have been playing with for extracting some system information and additional information about modules that I need to import in my TSjson package. My intention is to put R code in tests/ that will check this before going on to do other tests, but I have not yet done that. The reason for this is that error messages are not especially enlightening when other tests fail because python is not working as needed. Please let me know if you find improvements. Paul

def test():
    try:
        import sys
        have_sys = True
    except:
        have_sys = False
    if sys.version_info >= (3, 0):
        return dict(error="TSjson requires Python 2. Running " + str(sys.version_info))
    # mechanize is not (yet) available for Python 3.
    # Also, urllib2 is split into urllib.request, urllib.error in Python 3.
    try:
        import urllib2
        have_urllib2 = True
    except:
        have_urllib2 = False
    try:
        import re
        have_re = True
    except:
        have_re = False
    try:
        import csv
        have_csv = True
    except:
        have_csv = False
    try:
        import mechanize
        have_mechanize = True
    except:
        have_mechanize = False
    if have_sys and have_urllib2 and have_re and have_csv and have_mechanize:
        err = 0
    else:
        err = 1
    return dict(exit=err, have_sys=have_sys, have_urllib2=have_urllib2,
                have_re=have_re, have_csv=have_csv, have_mechanize=have_mechanize)

try:
    import json
    print(json.JSONEncoder().encode(test()))
except:
    print(dict(exit=1, have_json=False))

On 13-11-01 10:17 AM, Jonathan Greenberg wrote: This was actually the little script I was going to include (prompting me to ask the question): a test for the python version number. Save this (between the ***s) as e.g. python_version.py:
***
import sys
print(sys.version_info)
***
I've done almost no python coding, so I was going to call this with a system("/pathto/python /pathto/python_version.py", intern=TRUE) call and post-process the one-line text output.
--j On Thu, Oct 31, 2013 at 12:45 PM, Paul Gilbert pgilbert...@gmail.com wrote: On 13-10-31 01:16 PM, Prof Brian Ripley wrote: On 31/10/2013 15:33, Paul Gilbert wrote: On 13-10-31 03:01 AM, Prof Brian Ripley wrote: On 31/10/2013 00:40, Paul Gilbert wrote: The old convention was that it went in the exec/ directory, but as you can see at http://cran.at.r-project.org/doc/manuals/r-devel/R-exts.html#Non_002dR-scripts-in-packages it can be in inst/anyName/. A minor convenience of exec/ is that the directory has the same name in source and when installed, whereas inst/anyName gets moved to anyName/, so debugging can be a tiny bit easier with exec/. Having just put a package (TSjson) on CRAN with a python script, here are a few other pointers for getting it on CRAN: -SystemRequirements: should indicate if a particular version of python is needed, and any non-default modules that are needed. (My package does not work with Python 3 because some modules are not available.) Some of the libraries have changed, so it could be a bit tricky to make something work easily with both 2 and 3. -You need a README to explain how to install Python. (If you look at or use mine, please let me know if you find problems.) Better to describe exactly what you need: installation instructions go stale very easily. -The Linux and Sun CRAN test machines have Python 2 whereas winbuilder has Python 3. Be prepared to explain that the package will not work on one or the other. Not true. Linux and Solaris (sic) have both: the Solaris machines have 2.6 and 3.3. For an R package how does one go about specifying which should be used? You ask the user to tell you the path or at least the command name, e.g. by an environment variable or R function argument. Just like any other external
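As a side note on the post-processing step Jonathan mentions, printing one line of JSON instead of the raw repr of sys.version_info makes the R side's job easier. This is a sketch, not the script actually used in TSjson:

```python
# Emit the interpreter version as a single JSON line, which R can parse
# with any JSON reader instead of string-munging repr() output.
import json
import sys

info = {"major": sys.version_info[0],
        "minor": sys.version_info[1],
        "micro": sys.version_info[2]}
print(json.dumps(info))
```

On the R side this would pair with something like fromJSON(system(..., intern = TRUE)) from a JSON package, rather than regular-expression parsing of the version string.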
Re: [Rd] Where to drop a python script?
On 13-10-31 03:01 AM, Prof Brian Ripley wrote: On 31/10/2013 00:40, Paul Gilbert wrote: The old convention was that it went in the exec/ directory, but as you can see at http://cran.at.r-project.org/doc/manuals/r-devel/R-exts.html#Non_002dR-scripts-in-packages it can be in inst/anyName/. A minor convenience of exec/ is that the directory has the same name in source and when installed, whereas inst/anyName gets moved to anyName/, so debugging can be a tiny bit easier with exec/. Having just put a package (TSjson) on CRAN with a python script, here are a few other pointers for getting it on CRAN: -SystemRequirements: should indicate if a particular version of python is needed, and any non-default modules that are needed. (My package does not work with Python 3 because some modules are not available.) Some of the libraries have changed, so it could be a bit tricky to make something work easily with both 2 and 3. -You need a README to explain how to install Python. (If you look at or use mine, please let me know if you find problems.) Better to describe exactly what you need: installation instructions go stale very easily. -The Linux and Sun CRAN test machines have Python 2 whereas winbuilder has Python 3. Be prepared to explain that the package will not work on one or the other. Not true. Linux and Solaris (sic) have both: the Solaris machines have 2.6 and 3.3. For an R package how does one go about specifying which should be used? Please do not spread misinformation about machines you do not have any access to. Another option to system() is pipe() Paul On 13-10-30 03:15 PM, Dirk Eddelbuettel wrote: On 30 October 2013 at 13:54, Jonathan Greenberg wrote: | R-developers: | | I have a small python script that I'd like to include in an R package I'm | developing, but I'm a bit unclear about which subfolder it should go in. R | will be calling the script via a system() call. Thanks! Up to you as you control the path. 
As Writing R Extensions explains, everything below the (source) directory inst/ will get installed. I like inst/extScripts/ (or similar) as it denotes that it is an external script. As an example, the gdata package has Perl code for xls reading/writing below a directory inst/perl/ -- and I think there are more packages doing this. Dirk __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Where to drop a python script?
On 13-10-31 01:16 PM, Prof Brian Ripley wrote: On 31/10/2013 15:33, Paul Gilbert wrote: On 13-10-31 03:01 AM, Prof Brian Ripley wrote: On 31/10/2013 00:40, Paul Gilbert wrote: The old convention was that it went in the exec/ directory, but as you can see at http://cran.at.r-project.org/doc/manuals/r-devel/R-exts.html#Non_002dR-scripts-in-packages it can be in inst/anyName/. A minor convenience of exec/ is that the directory has the same name in source and when installed, whereas inst/anyName gets moved to anyName/, so debugging can be a tiny bit easier with exec/. Having just put a package (TSjson) on CRAN with a python script, here are a few other pointers for getting it on CRAN: -SystemRequirements: should indicate if a particular version of python is needed, and any non-default modules that are needed. (My package does not work with Python 3 because some modules are not available.) Some of the libraries have changed, so it could be a bit tricky to make something work easily with both 2 and 3. -You need a README to explain how to install Python. (If you look at or use mine, please let me know if you find problems.) Better to describe exactly what you need: installation instructions go stale very easily. -The Linux and Sun CRAN test machines have Python 2 whereas winbuilder has Python 3. Be prepared to explain that the package will not work on one or the other. Not true. Linux and Solaris (sic) have both: the Solaris machines have 2.6 and 3.3. For an R package how does one go about specifying which should be used? You ask the user to tell you the path or at least the command name, e.g. by an environment variable or R function argument. Just like any other external program such as GhostScript. Yes, but since I don't have direct access to the CRAN test machines, specifically, on the CRAN test machines, how do I specify to use Python 2 or Python 3? (That is, I think you are the user when CRAN tests are done on Solaris, so I am asking you.) 
Please do not spread misinformation about machines you do not have any access to. Another option to system() is pipe() Paul On 13-10-30 03:15 PM, Dirk Eddelbuettel wrote: On 30 October 2013 at 13:54, Jonathan Greenberg wrote: | R-developers: | | I have a small python script that I'd like to include in an R package I'm | developing, but I'm a bit unclear about which subfolder it should go in. R | will be calling the script via a system() call. Thanks! Up to you as you control the path. As Writing R Extensions explains, everything below the (source) directory inst/ will get installed. I like inst/extScripts/ (or similar) as it denotes that it is an external script. As an example, the gdata package has Perl code for xls reading/writing below a directory inst/perl/ -- and I think there are more packages doing this. Dirk __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Where to drop a python script?
The old convention was that it went in the exec/ directory, but as you can see at http://cran.at.r-project.org/doc/manuals/r-devel/R-exts.html#Non_002dR-scripts-in-packages it can be in inst/anyName/. A minor convenience of exec/ is that the directory has the same name in source and when installed, whereas inst/anyName gets moved to anyName/, so debugging can be a tiny bit easier with exec/. Having just put a package (TSjson) on CRAN with a python script, here are a few other pointers for getting it on CRAN: -SystemRequirements: should indicate if a particular version of python is needed, and any non-default modules that are needed. (My package does not work with Python 3 because some modules are not available.) Some of the libraries have changed, so it could be a bit tricky to make something work easily with both 2 and 3. -You need a README to explain how to install Python. (If you look at or use mine, please let me know if you find problems.) -The Linux and Sun CRAN test machines have Python 2 whereas winbuilder has Python 3. Be prepared to explain that the package will not work on one or the other. Another option to system() is pipe() Paul On 13-10-30 03:15 PM, Dirk Eddelbuettel wrote: On 30 October 2013 at 13:54, Jonathan Greenberg wrote: | R-developers: | | I have a small python script that I'd like to include in an R package I'm | developing, but I'm a bit unclear about which subfolder it should go in. R | will be calling the script via a system() call. Thanks! Up to you as you control the path. As Writing R Extensions explains, everything below the (source) directory inst/ will get installed. I like inst/extScripts/ (or similar) as it denotes that it is an external script. As an example, the gdata package has Perl code for xls reading/writing below a directory inst/perl/ -- and I think there are more packages doing this. Dirk __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] advise on Depends
On 13-10-25 05:21 PM, Henrik Bengtsson wrote: On Fri, Oct 25, 2013 at 1:39 PM, John Chambers j...@r-project.org wrote: One additional point to Michael's summary: The methods package itself should stay in Depends:, to be safe. It would be nice to have more detail about when this is necessary, rather than suggested as a general workaround. I thought the principle of putting things in Imports was that it is safer. I have methods listed in Imports rather than Depends in 16 of my packages, doing roughly what was the basis for the original question, and I am not aware of a problem, yet. Paul There are a number of function calls to the methods package that may be included in generated methods for user classes. These have not been revised to work when the methods package is not attached, so importing the package only may run into problems. This has been an issue, for example, in using Rscript. To clarify that last sentence for those not aware (and hopefully spare someone having to troubleshoot this), executing R scripts/expressions using 'Rscript' rather than 'R' differs by which packages are attached by default. Example:
% Rscript -e 'search()'
[1] ".GlobalEnv"        "package:stats"     "package:graphics"
[4] "package:grDevices" "package:utils"     "package:datasets"
[7] "Autoloads"         "package:base"
% R --quiet -e 'search()'
> search()
[1] ".GlobalEnv"        "package:stats"     "package:graphics"
[4] "package:grDevices" "package:utils"     "package:datasets"
[7] "package:methods"   "Autoloads"         "package:base"
Note how 'methods' is not attached when using Rscript. This is explained in help(Rscript), help(options), and in 'R Installation and Administration'.
/Henrik John On Oct 25, 2013, at 11:26 AM, Michael Lawrence lawrence.mich...@gene.com wrote: On Wed, Oct 23, 2013 at 8:33 PM, Kasper Daniel Hansen kasperdanielhan...@gmail.com wrote: This is about the new note Depends: includes the non-default packages: ‘BiocGenerics’ ‘Biobase’ ‘lattice’ ‘reshape’ ‘GenomicRanges’ ‘Biostrings’ ‘bumphunter’ Adding so many packages to the search path is excessive and importing selectively is preferable. Let us say my package A either uses a class B (by producing an object that has B embedded as a slot) from another package or provides a specific method for a generic defined in another package (both examples using S4). In both case my impression is that best practices is I ought to Depend on such a package, so it is a available at run time to the user. For classes, you just need to import the class with importClassesFrom(). For generics, as long as your package exports the method with exportMethods(), the generic will also be exported from your package, regardless of whether the defining package is attached. And the methods from the loaded-but-not-attached packages are available for the generic. So neither of these two is really a problem. The rationale for Depends is that the user might always want to use functions defined by another package with objects consumed/produced by your package, such as generics for which your package has not defined any methods. For example, rtracklayer Depends on GenomicRanges, because it imports objects from files as GenomicRanges objects. So just consider what the user sees when looking at your API. What's private, what's public? Michael Comments? 
Best, Kasper [[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Question about selective importing of package functions...
On 13-10-20 04:58 PM, Gabor Grothendieck wrote: On Sun, Oct 20, 2013 at 4:49 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 13-10-20 4:43 PM, Jonathan Greenberg wrote: I'm working on an update for my CRAN package spatial.tools and I noticed a new warning when running R CMD CHECK --as-cran: * checking CRAN incoming feasibility ... NOTE Maintainer: 'Jonathan Asher Greenberg spatial-to...@estarcion.net' Depends: includes the non-default packages: 'sp' 'raster' 'rgdal' 'mmap' 'abind' 'parallel' 'foreach' 'doParallel' 'rgeos' Adding so many packages to the search path is excessive and importing selectively is preferable. Is this a warning that would need to be fixed pre-CRAN (not really sure how, since I need functions from all of those packages)? Is there a way to import only a single function from a package, if that function is a dependency? You really want to use imports. Those are defined in the NAMESPACE file; you can import everything from a package if you want, but the best style is in fact to just import exactly what you need. This is more robust than using Depends, and it doesn't add so much to the user's search path, so it's less likely to break something else (e.g. by putting a package on the path that masks some function the user already had there.) That may answer the specific case of the poster but how does one handle the case where one wants the user to be able to access the functions in the dependent package. There are two answers to this, depending on how much of the dependent package you want to make available to the user. If you want most of that package to be available then this is the (only?) exception to the rule. 
From Writing R Extensions: Field ‘Depends’ should nowadays be used rarely, only for packages which are intended to be put on the search path to make their facilities available to the end user (and not to the package itself): for example it makes sense that a user of package latticeExtra would want the functions of package lattice made available. If you really only want to make a couple of functions available then you can import and export the functions. Currently this has the unfortunate side effect that you need to document the functions, you cannot just re-direct to the documentation in the imported package, at least, I have not figured out how to do that. Paul For example, sqldf depends on gsubfn which provides fn which is used with sqldf to perform substitutions in the SQL string. library(sqldf); tt <- 3; fn$sqldf("select * from BOD where Time > $tt") I don't want to ask the user to tediously issue a library(gsubfn) too since fn is frequently needed and for literally years this has not been necessary. Also I don't want to duplicate fn's code in sqldf since that makes the whole thing less modular -- it would imply having to change fn in two places if anything in fn changed. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
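The import-and-re-export route mentioned above would look like this in sqldf's NAMESPACE (a sketch; as Paul notes, the re-exported fn then also needs its own help page in sqldf):

```r
## NAMESPACE (sketch)
importFrom(gsubfn, fn)
export(fn)   # users get fn without an explicit library(gsubfn)
```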
Re: [Rd] Problems when moving to Imports from Depends
On 13-09-27 06:05 PM, Peter Langfelder wrote: On Fri, Sep 27, 2013 at 2:50 PM, Kasper Daniel Hansen kasperdanielhan...@gmail.com wrote: Peter, This is a relatively new warning from R CMD check (for some definition of new). The authors of Hmisc have clearly not yet gone through the process of cleaning it up, as you are doing right now (and there are many other packages that still need to address this, including several of mine). Given who the authors are of Hmisc, I would suggest writing to them and ask them to look into this, and ask for a time estimate. thanks for the suggestion, but I must be missing something: since Hmisc imports survival (as well as Depends: on it), what can Hmisc change to make the survival functionality visible to my package? The terminology around imports has had many of us confused. (My copy of) Hmisc has survival in both Imports: and Depends: in the DESCRIPTION file (for which they will now be getting flagged) but it does not have it in the NAMESPACE file, which it needs, whether it is in Depends: or Imports: (and for which they are getting another flag). When this is fixed then the Hmisc function rcorr.cens will look at its own NAMESPACE-determined path for finding functions, and find is.Surv. As Kasper pointed out, this is not really your problem, except of course that you need to work around the Hmisc problem. Until Hmisc is fixed, I think you have the option of adding survival to Depends:, or leaving Hmisc in Depends:. (I would be inclined to leave it the way you had it until packages further down the chain are fixed.) Paul In the meantime, you may have to do something about this, and whatever you do I would suggest following the Hmisc package and undo it as soon as possible, as the right thing is to fix Hmisc. 
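Concretely, the fix described above would have Hmisc declare its use of survival in its NAMESPACE; a minimal sketch (the exact set of functions to import is an assumption — is.Surv is the one named in the check failure):

```
# NAMESPACE of Hmisc (sketch): make survival's is.Surv
# resolvable on Hmisc's own import path, independent of
# what the user happens to have attached
importFrom(survival, is.Surv)
```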
Having said that, it is not clear to me that you can easily solve this yourself, because I don't think that putting survival into your own imports will make the package available to Hmisc functions, but it is not impossible there is some way around it. Well, as I said, things work fine if I leave Hmisc in the Depends: field, which, however, is against CRAN policy. The trouble is that I don't have a good way of checking whether something breaks by moving a package from Depends into Imports... Peter
Re: [Rd] Capture output of install.packages (pipe system2)
On 13-09-23 08:20 PM, Hadley Wickham wrote: Brian Ripley's reply describes how it is done in the tools package. For example, as I sent privately to Jeroen, x <- system2("Rscript", "-e \"install.packages('MASS', repos='http://probability.ca/cran')\"", stdout=TRUE, stderr=TRUE) captures all of the output from installing MASS. As Jeroen pointed out, that isn't identical to running install.packages() in the current session; a real version of it should fill in more of the arguments, not leave them at their defaults. It does seem a little crazy that you're in an R process, then open another one, which then opens a 3rd session! (often indirectly by calling R CMD INSTALL which then calls an internal function in tools) It does seem very much more straightforward to do this in the process above R: R --vanilla --slave -e "install.packages('whatever', repo='http://cran.r-project.org')" > R.out 2>&1 (Omit mailer wrap.) Your mileage may vary depending on your OS. Paul Hadley
Re: [Rd] Design for classes with database connection
Simon, Your idea to use SQLite and the nature of some of the sorting and extracting you are suggesting makes me wonder why you are thinking of R data structures as the home for the data storage. I would be inclined to put the data in an SQL database as the prime repository, then extract parts you want with SQL queries and bring them into R for analysis and graphics. If the full data set is large, and the parts you want to analyze in R at any one time are relatively small, then this will be much faster. After all, SQL is primarily for databases, whereas R's strength is more in statistics and graphics. In the project http://tsdbi.r-forge.r-project.org/ I have code that does some of the things you probably want. There the focus is on a single identifier for a series, and various observation frequencies are supported. Tick data is supported (as time stamped data) but not extensively tested as I do not work with tick data much. There is a function TSquery, currently in TSdbi on CRAN but very shortly being split out, along with the SQL-specific parts of the interface, into a package TSsql. It is very much like the queries you seem to have in mind, but I have not used it with tick data. It is used to generate a time series by formulating a query to a database with several possible sorting fields, very much like you describe, and then ordering the data according to the time index. If your data set is large, then you need to think carefully about which fields you index. You certainly do not want to be building the indexes on the fly, as you would need to do if you dump all the data out of R into an SQL db just to do a sort. If the data set is small then indexing does not matter too much. Also, for a small data set there will be much less advantage to keeping the data in an SQL db rather than in R. You do need to be a bit more specific about what "huge" means. (Tick data for 5 days or 20 years? 100 IDs or 10 million?) Large for an R structure is not necessarily large for an SQL db. 
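The point about prebuilt indexes can be illustrated with a schema sketch in SQLite syntax; the table and column names here are hypothetical, loosely following the slots discussed in this thread:

```sql
-- Sketch: keep tick data in SQLite with indexes on the fields
-- used for sorting and range extraction, so sorts are not
-- rebuilt on the fly for every query
CREATE TABLE ticks (
  secID     TEXT,   -- security identifier
  tradetime TEXT,   -- timestamp stored as ISO-8601 text
  price     REAL,
  vol       REAL
);
CREATE INDEX idx_ticks_sec_time ON ticks (secID, tradetime);

-- A typical extraction into R would then be a query such as:
-- SELECT tradetime, price, vol FROM ticks
--   WHERE secID = 'XYZ' ORDER BY tradetime;
```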
With more specifics I might be able to give more suggestions. (R-SIG-DB may be a better forum for this discussion.) HTH, Paul On 13-09-18 01:06 PM, Simon Zehnder wrote: Dear R-Devels, I am right now designing a package intended to simplify the handling of market microstructure data (tick data, order data, etc). As these data are most times pretty huge and need to be reordered quite often (e.g. if several security data is batched together or if only a certain time range should be considered) - the package needs to handle this. Before I start, I would like to mention some facts which made me decide to construct my own package instead of using e.g. the packages bigmemory, highfrequency, zoo or xts: AFAIK bigmemory does not provide the opportunity to handle data with different types (timestamp, string and numerics) and their appropriate sorting; for this task databases offer better tools. Package highfrequency is designed to work specifically with a certain data structure and the data in market microstructure has much greater versatility. Packages zoo and xts offer a lot of versatility but do not offer the data sorting ability needed for such big data. I would like to get some feedback in regard to my decision and in regard to the short design overview following. My design idea is now: 1. Base the package on S4 classes, with one class that handles data-reading from external sources, structuring and reordering. Structuring is done in regard to specific data variables, i.e. security ID, company ID, timestamp, price, volume (not all have to be provided, but some surely exist in market microstructure data). The less important variables are considered as a slot @other and are only ordered in regard to the other variables. Something like this: .mmstruct <- setClass('mmstruct', representation(name = "character", index = "array", N = "integer", K = "integer", compID = "array", secID = "array", tradetime = "POSIXlt", flag = "array", price = "array", vol = "array", other = "data.frame")) 2. 
To enable a lightweight ordering function, the class should basically create an SQLite database on construction and delete it if 'rm()' is called. Throughout its life an object holds the database path and can execute queries on the database tables. By this, I can use the table sorting of SQLite (e.g. by constructing an index for each important variable). I assume this is faster and more efficient than programming something on my own - why reinvent the wheel? For this I would use VIRTUAL classes like: .mmstructBASE <- setClass('mmstructBASE', representation(dbName = "character", dbTable = "character")) .mmstructDB <- setClass('mmstructDB', representation(conn = "SQLiteConnection"), contains = c("mmstructBASE")) .mmstruct <- setClass('mmstruct', representation(name = "character", index = "array", N = "integer", K
[Rd] helping R-forge build
(subject changed from Re: [Rd] declaring package dependencies ) ... Yes useful. But that includes a package build system (which is what breaks on R-Forge). If you could do that on a six-pack then could you fix R-Forge on a three-pack first please? The R-Forge build system is itself an open source package on R-Forge. Anyone can look at it, understand it and change it to be more stable. That build system is here: https://r-forge.r-project.org/R/?group_id=34 (I only know this because Stefan told me once. So I suspect others don't know either, or it hasn't sunk in that we're pushing on an open door.) Matthew Open code is necessary, but to debug one needs access to logs, etc., to see where it is breaking. Do you know how to find that information? (And, BTW, there are also tools to help automatically build R and test packages at http://automater.r-forge.r-project.org/ .) Paul
Re: [Rd] FOSS licence with BuildVignettes: false
On 13-09-16 05:19 AM, Uwe Ligges wrote: ... Yes, and I could see really rare circumstances where vignette building takes a long time and the maintainer decides not to build vignettes as part of the daily checks. ... I thought 'BuildVignettes: FALSE' only turns off assembling the pdf; all the code is still run. I don't think that would affect the time very much. Am I wrong (again)? Paul Uwe Ligges
Re: [Rd] declaring package dependencies
On 13-09-14 07:20 PM, Duncan Murdoch wrote: On 13-09-14 12:19 PM, Paul Gilbert wrote: On 13-09-14 09:04 AM, Duncan Murdoch wrote: On 13-09-13 12:00 PM, Dirk Eddelbuettel wrote: On 13 September 2013 at 11:42, Paul Gilbert wrote: | On 13-09-13 11:02 AM, Dirk Eddelbuettel wrote: | It's not so much Rcpp itself or my 20-ish packages but the fact that we (as | in the Rcpp authors) now stand behind an API that also has to accomodate | changes in R CMD check. Case in point is current (unannounced) change that | makes all Depends: Rcpp become Imports: Rcpp because of the NAMESPACE checks. | | I am a bit confused by this Dirk, so maybe I am missing something. I | think this is still a Note in R-devel so you do have some time to make | the change, at least several months, maybe more. It is not quite what I | think of as an announcement, more like a shot across the bow, but it | is also not unannounced. One package author [as in user of Rcpp and not an author of it] was told by CRAN this week to change his package and came to me for help -- so in that small way the CRAN non-communication policy is already creating more work for me, and makes me look silly as I don't document what Rcpp-using packages need as I sadly still lack the time machine or psychic powers to infer what may get changed this weekend. | More importantly, I don't think that the requirement is necessarily to | change Depends: Rcpp to Imports: Rcpp, the requirement is to put | imports(Rcpp) in the NAMESPACE file. I think this is so that the package | continues to work even if the user does something with the search path. | The decision to change Depends: Rcpp to Imports: Rcpp really depends on | whether the package author wants Rcpp functions to be available directly Rcpp is a bit of an odd-ball as you mostly need it at compile-time, and you require very few R-level functions (but there is package initialization etc pp). 
We also only use about two handfuls of functions, and those are for functionality not all 135 packages use (e.g. Modules etc.). But the focus here should not be on my hobby package. The focus needs to be on how four CRAN maintainers (who do a boatload of amazing work which is _truly_ appreciated in its thoroughness and reach) could make the life of authors of 4800+ packages easier by communicating and planning a tad more. Let me paraphrase that: The CRAN maintainers do a lot of work, and it helps me a lot, but if they only did a little bit more work it would help me even more. I suspect they'd be more receptive to suggestions that had them doing less work, not more. Actually, this is one of the parts that I do not understand. It seems to me that it would be a lot less work for CRAN maintainers if the implications and necessary changes to packages were explained a bit more clearly in a forum like R-devel that many package developers actually read regularly. Then why don't you explain them? They aren't secret. Well, I have been trying to do that on this and related threads over the past few weeks. But there is a large credibility difference between my explanation of something I am just learning about myself and an explanation by a core member or CRAN maintainer of something they have implemented. (At least, I hope most readers of this list know there is a difference.) I may not fully understand how much of the response to package submission gets done automatically, but I do get the sense that there is a fairly large amount of actual human time spent dealing with just my submissions alone. If that is representative of all developers, then CRAN maintainers don't have time to do much else. (The fact that they do much more suggests I may not be representative.) Two specific points have already been mentioned implicitly. CRAN submission testing is often done at a higher/newer level using the latest devel version. 
This results in lots of rejections for things that I would fix before submission, if I knew about them. Then why don't you test against R-devel before submitting? I have been relying on R-forge to provide that testing. One practical suggestion in this thread (Matthew Dowle) was to test with win-builder R-devel. This needs to be amplified. I had thought of win-builder as a mechanism to test on Windows, since I rarely work on that platform. Following the CRAN submission guidelines I test on win-builder if I am not doing the Windows testing on my own machine and the R-forge results are not available. (I think for a single package they are equivalent when R-forge is working.) But on win-builder I have usually used the R-release directory. Using the R-devel directory has the advantage that it gives an as-cran test that is almost up-to-date with the one against which the package is tested when it is submitted. Another feature of win-builder that I had not recognized is that submitted packages are available in its library for a short time, so packages with version dependencies can
Re: [Rd] declaring package dependencies
On 13-09-14 09:04 AM, Duncan Murdoch wrote: On 13-09-13 12:00 PM, Dirk Eddelbuettel wrote: On 13 September 2013 at 11:42, Paul Gilbert wrote: | On 13-09-13 11:02 AM, Dirk Eddelbuettel wrote: | It's not so much Rcpp itself or my 20-ish packages but the fact that we (as | in the Rcpp authors) now stand behind an API that also has to accomodate | changes in R CMD check. Case in point is current (unannounced) change that | makes all Depends: Rcpp become Imports: Rcpp because of the NAMESPACE checks. | | I am a bit confused by this Dirk, so maybe I am missing something. I | think this is still a Note in R-devel so you do have some time to make | the change, at least several months, maybe more. It is not quite what I | think of as an announcement, more like a shot across the bow, but it | is also not unannounced. One package author [as in user of Rcpp and not an author of it] was told by CRAN this week to change his package and came to me for help -- so in that small way the CRAN non-communication policy is already creating more work for me, and makes me look silly as I don't document what Rcpp-using packages need as I sadly still lack the time machine or psychic powers to infer what may get changed this weekend. | More importantly, I don't think that the requirement is necessarily to | change Depends: Rcpp to Imports: Rcpp, the requirement is to put | imports(Rcpp) in the NAMESPACE file. I think this is so that the package | continues to work even if the user does something with the search path. | The decision to change Depends: Rcpp to Imports: Rcpp really depends on | whether the package author wants Rcpp functions to be available directly Rcpp is a bit of an odd-ball as you mostly need it at compile-time, and you require very few R-level functions (but there is package initialization etc pp). We also only about two handful of functions, and those are for functionality not all 135 packages use (eg Modules etc). But the focus here should not be on my hobby package. 
The focus needs to be on how four CRAN maintainers (who do a boatload of amazing work which is _truly_ appreciated in its thoroughness and reach) could make the life of authors of 4800+ packages easier by communicating and planning a tad more. Let me paraphrase that: The CRAN maintainers do a lot of work, and it helps me a lot, but if they only did a little bit more work it would help me even more. I suspect they'd be more receptive to suggestions that had them doing less work, not more. Actually, this is one of the parts that I do not understand. It seems to me that it would be a lot less work for CRAN maintainers if the implications and necessary changes to packages were explained a bit more clearly in a forum like R-devel that many package developers actually read regularly. I may not fully understand how much of the response to package submission gets done automatically, but I do get the sense that there is a fairly large amount of actual human time spent dealing with just my submissions alone. If that is representative of all developers, then CRAN maintainers don't have time to do much else. (The fact that they do much more suggests I may not be representative.) Two specific points have already been mentioned implicitly. CRAN submission testing is often done at a higher/newer level using the latest devel version. This results in lots of rejections for things that I would fix before submission, if I knew about them. If the tests were rolled out with R, and only later incorporated into CRAN submission testing, I think there would be a lot less work for the CRAN maintainers. (This is ignoring the possibility that CRAN submission is really the testing ground for the tests, and to prove the tests requires a fair amount of manual involvement. I'm happy to continue contributing to this -- I've often felt my main contribution is an endless supply of bugs for the checkers to catch.) 
The second point is that a facility like R-forge that runs the latest checks, on many platforms, is really useful in order to reduce work for both package developers and CRAN maintainers. With R-forge broken, the implication for additional work for CRAN maintainers seems enormous. But even with it working, not all packages are kept on R-forge, and with package version dependencies R-forge does not really work. (i.e. I have to get new versions of some packages onto CRAN before the new versions of other packages will build on R-forge.) Perhaps the package checking part of R-forge should be separated into a pre-submission clearing house to which packages are submitted. If they pass checks there then the package developer could click on a submit button to do the actual submission to CRAN. (Of course there needs to be a mechanism to plead for the fact that the test systems do not have needed resources.) Something like the daily, but with new pre-release versions of packages might actually be better than the R-forge
Re: [Rd] declaring package dependencies
On 13-09-13 11:02 AM, Dirk Eddelbuettel wrote: On 13 September 2013 at 10:38, Duncan Murdoch wrote: | On 13/09/2013 10:18 AM, Dirk Eddelbuettel wrote: | On 13 September 2013 at 09:51, Duncan Murdoch wrote: | | Changes are generally announced in the NEWS.Rd file long before release, | | but R-devel is an unreleased version, so you won't see the news until it | | is there. Announcing things that nobody can try leads to fewer useful | | comments than putting them into R-devel where at least people can see | | what is really happening. | | That comment makes sense _in theory_. | | Yet _in practice_ it does not as many of us have been shot down by tests in | R-devel which had been implemented within a 48 hour window of the package | submission. | | It sounds as though you are talking about CRAN here, not R. I can't | speak for CRAN. Hah :) -- in practice you actually do as the service you built to create RSS summaries of R NEWS changes (ie R Core) is one good way to learn about CRAN changes as the CRAN folks use the R Core access to R itself (via R CMD check) to effect change. And yes: we all want change for the better. But we also want a more grown-up process. | Absent a time machine or psychic powers, I do not see how package developers | can reasonably be expected to cope with this. | | I'm a CRAN user as a package developer, and I do get emails about | changes, but I don't find them overwhelming, and I don't recall | receiving any that were irrational. Generally the package is improved | when I follow their advice. It has happened that I have been slower | than they liked in responding, but the world didn't end. Of course they improve. The long arc of history points to progress. Packages are better than they used to be (cf NAMESPACE discussion). Nobody disputes that. But what we take exception with is the _process_ and the manner in which changes are (NOT REALLY) communicated, or even announced within a window. 
| I imagine Rcpp pushes the limits more than my packages do, but I think | most developers can cope. After all, the number of packages on CRAN is | increasing, not decreasing. It's not so much Rcpp itself or my 20-ish packages but the fact that we (as in the Rcpp authors) now stand behind an API that also has to accommodate changes in R CMD check. Case in point is the current (unannounced) change that makes all Depends: Rcpp become Imports: Rcpp because of the NAMESPACE checks. I am a bit confused by this Dirk, so maybe I am missing something. I think this is still a Note in R-devel so you do have some time to make the change, at least several months, maybe more. It is not quite what I think of as an announcement, more like a shot across the bow, but it is also not unannounced. More importantly, I don't think that the requirement is necessarily to change Depends: Rcpp to Imports: Rcpp; the requirement is to put import(Rcpp) in the NAMESPACE file. I think this is so that the package continues to work even if the user does something with the search path. The decision to change Depends: Rcpp to Imports: Rcpp really depends on whether the package author wants Rcpp functions to be available directly to users without them needing to specifically attach Rcpp. They are available with Depends but with Imports they are just used internally in the package. So, one of us is confused. Usually it is me. Paul Yet I cannot really talk to 135 packages using Rcpp as I have CRAN Policy document to point to. Dirk
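The distinction drawn above can be made concrete with the two files side by side; a sketch for a hypothetical package that uses Rcpp internally only:

```
## DESCRIPTION (sketch): Rcpp is a dependency but is not
## attached to the user's search path
Imports: Rcpp

## NAMESPACE (sketch): Rcpp sits on the package's own import
## path, unaffected by what the user attaches or detaches
import(Rcpp)
```

Switching the DESCRIPTION line from Depends: to Imports: on top of the same NAMESPACE changes only whether end users see Rcpp functions directly; the package's own access is identical either way.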
Re: [Rd] declaring package dependencies
Michael (Several of us are struggling with these changes, so my comments are from the newly initiated point of view, rather than the fully knowledgeable.) On 13-09-12 09:38 AM, Michael Friendly wrote: I received the following email note re: the vcdExtra package A vcd update has shown that packages TIMP and vcdExtra are not declaring their dependence on colorspace/MASS: see http://cran.r-project.org/web/checks/check_results_vcdExtra.html But, I can't see what to do to avoid this, nor understand what has changed in R devel. Lots in this respect. Sure enough, CRAN now reports errors in examples using MASS::loglm(), using R Under development (unstable) (2013-09-11 r63906) Caesar.mod0 <- loglm(~Infection + (Risk*Antibiotics*Planned), data=Caesar) Error: could not find function loglm In DESCRIPTION I have Depends: R (>= 2.10), vcd, gnm (>= 1.0.3) The modern way of thinking about this is that the Depends line should not have much in it, only things from other packages that you want directly available to the user. (There are a few other exceptions necessary for packages that have not themselves embraced the modern way.) Since you may want users of vcdExtra to automatically have access to functions in vcd, without needing to execute library(vcd), this classifies as one of the official exceptions and you probably want vcd in the Depends line. However, chances are that gnm should be in Imports:. If vcd is in the Depends line then it is automatically attached and your examples do not need library(vcd) or require(vcd). The Note Unexported object imported by a ‘:::’ call: ‘vcd:::rootogram.default’ is harder to decide how to deal with. (This is still just a note, but it looks to me like a note that will soon become a warning or error.) The simple solution is to export rootogram.default from vcd, but that exposes it to all users, and really you may just want to expose it to packages like vcdExtra. There was some recent discussion about this on R-devel. 
I suggested one possibility would be some sort of limited export. Since that was a suggestion that required work by someone else, it probably went the same place as most of those suggestions do. The solution I have adopted for the main case where this causes me problems is to split the classes, generics, and methods into one package, and the user functions into another. For example, if you had rootogram.default in a package called vcdClasses and exported it, then both vcd and vcdExtra could import it, but if it is not in their Depends line then it will not be visible to a user that executes library(vcd) or library(vcdExtra). Beware that there is currently a small gotcha if the generics are S3, which was discussed recently and a patch submitted by Henrik Bengtsson (See Re: [Rd] False warning on replacing previous import when re-exporting identical object .) Although there has been much moaning about these changes, including my own, I think the general logic is a real improvement. The way I think of it, the namespace imports for a package provide the equivalent of a search path for functions in the package, which is not changed by what packages a user or other packages attach or import. Thus a package developer has much more certain control over where the functions used by the package will come from. This is a trade-off for safety rather than convenience, thus the moaning. I am a complete newbie on this, but there seems to be a pretty good unofficial description at http://obeautifulcode.com/R/How-R-Searches-And-Finds-Stuff/. Suggests: ca,gmodels,Fahrmeir,effects,VGAM,plyr,rgl,lmtest,MASS,nnet,ggplot2,Sleuth2,car If it is only in Suggests you can refer to it in the example by MASS::loglm(), or require(MASS)/library(MASS). (I might have that wrong, at least one works but I'm not certain of both.) 
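The Suggests usage mentioned above can be sketched in R (a sketch only, not Michael's actual example code; it assumes the Caesar data from vcdExtra is available):

```r
# Sketch: using a Suggests: package in an example.
# Either attach it conditionally ...
if (require(MASS)) {
  fit <- loglm(~ Infection + (Risk * Antibiotics * Planned), data = Caesar)
}
# ... or call it without attaching: :: loads the MASS
# namespace but does not put it on the search path
fit <- MASS::loglm(~ Infection + (Risk * Antibiotics * Planned), data = Caesar)
```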
and the vcd DESCRIPTION has Depends: R (>= 2.4.0), grid, stats Suggests: KernSmooth, mvtnorm, kernlab, HSAUR, coin Imports: utils, MASS, grDevices, colorspace Probably grid and stats should be in Imports. so, in an R 3.0.0 console, library(vcdExtra) loads vcd and its dependencies: library(vcdExtra) Loading required package: vcd Loading required package: MASS Loading required package: grid Loading required package: colorspace Loading required package: gnm Warning messages: 1: package ‘vcd’ was built under R version 3.0.1 2: package ‘MASS’ was built under R version 3.0.1 Note: these CRAN errors do not occur on R-Forge, using R version 3.0.1 Are you actually getting anything to build on R-forge? All my packages have been stuck for a couple of weeks, as have many others. Paul Patched (2013-08-21 r63645) and the latest devel version (0.5-11) of vcdExtra. -Michael
Re: [Rd] False warning on replacing previous import when re-exporting identical object
This is related to the recent thread on correct NAMESPACE approach when writing S3 methods. If your methods are S4 I think pkgB does not need to export the generic. Just export the method and everything works magically and your problem disappears. For S3 methods there seems to be the difficulty you describe. Of course, the difference between S3 and S4 on this appears somewhat bug-like. (I have not tested all this very carefully so I may have something wrong.) Paul Henrik Bengtsson h...@biostat.ucsf.edu wrote: Hi, SETUP: Consider three packages PkgA, PkgB and PkgC. PkgA defines a generic function foo() and exports it; export(foo) PkgB imports PkgA::foo() and re-exports it; importFrom(PkgA, foo) export(foo) PkgC imports everything from PkgA and PkgB: import(PkgA, PkgB) PROBLEM: Loading or attaching the namespace of PkgC will generate a warning: replacing previous import by 'PkgA::foo' when loading 'PkgC' This in turn causes 'R CMD check' on PkgC to generate a WARNING (no-go at CRAN): * checking whether package 'PkgC' can be installed ... WARNING Found the following significant warnings: Warning: replacing previous import by 'PkgA::foo' when loading 'CellularAutomaton' FALSE? Isn't it valid to argue that this is a false warning, because identical(PkgB::foo, PkgA::foo) is TRUE and therefore has no effect? /Henrik PS. The above can be avoided by using explicit importFrom() on PkgA and PkgB, but that's really tedious. In my case this is out of my reach, because I'm the author of PkgA and PkgB but not many of the PkgC packages.
[Rd] ‘:::’ call
I have a package (TSdbi) which provides end user functions that I export, and several utilities for plugin packages (e.g. TSMySQL) that I do not export because I do not intend them to be exposed to end users. I call these from the plugin packages using TSdbi::: but that now produces a note in the checks: * checking dependencies in R code ... NOTE Namespace imported from by a ‘:::’ call: ‘TSdbi’ See the note in ?`:::` about the use of this operator. :: should be used rather than ::: if the function is exported, and a package almost never needs to use ::: for its own functions. Is there a preferred method to accomplish this in a way that does not produce a note? Thanks, Paul
Re: [Rd] ‘:::’ call
On 13-08-28 12:29 PM, Marc Schwartz wrote: On Aug 28, 2013, at 11:15 AM, Paul Gilbert pgilbert...@gmail.com wrote: I have a package (TSdbi) which provides end user functions that I export, and several utilities for plugin packages (e.g. TSMySQL) that I do not export because I do not intend them to be exposed to end users. I call these from the plugin packages using TSdbi::: but that now produces a note in the checks: * checking dependencies in R code ... NOTE Namespace imported from by a ‘:::’ call: ‘TSdbi’ See the note in ?`:::` about the use of this operator. :: should be used rather than ::: if the function is exported, and a package almost never needs to use ::: for its own functions. Is there a preferred method to accomplish this in a way that does not produce a note? Thanks, Paul Paul, See this rather lengthy discussion that occurred within the past week: https://stat.ethz.ch/pipermail/r-devel/2013-August/067180.html Regards, Marc Schwartz I did follow the recent discussion, but no one answered the question "Is there a preferred method to accomplish this?" (I suppose the answer is that there is no other way, given that no one actually suggested anything else.) Most of the on-topic discussion in that thread was about how to subvert the CRAN checks, which is not what I am trying to do and was also pointed out as a bad idea by Duncan. The substantive response was r63654 has fixed this particular issue, and R-devel will no longer warn against the use of ::: on packages of the same maintainer. Regards, Yihui but that strikes me as a temporary workaround rather than a real solution: suppose plugins are provided by a package from another maintainer. Since CRAN notes have a habit of becoming warnings and then errors, it seems useful to identify the preferred legitimate approach while this is still a note. That would save work for both package developers and CRAN maintainers. 
My thinking is that there is a need for a NAMESPACE directive something like limitedExport() that allows ::: for identified functions without provoking a CRAN complaint when packages use those functions. But there may already be a better way I don't know about. Or perhaps the solution is to split the end user functions and the utilities for plugin packages into two separate packages? Paul
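Absent a limitedExport()-style directive, the closest existing pattern may be to export the plug-in hooks under dot-prefixed names and have plug-ins call them with ::. A sketch only; the function names below are hypothetical, not TSdbi's actual API:

```r
## Sketch (hypothetical names). In TSdbi's NAMESPACE, export the end-user
## functions and, separately, dot-prefixed hooks meant only for plug-ins:
export(TSconnect, TSget)   # end-user API
export(.TSwriteHelper)     # plug-in hook, "hidden" by the leading dot

## In a plug-in such as TSMySQL the hook is then formally exported, so it
## can be reached with :: and no ':::' NOTE is produced:
## NAMESPACE:  importFrom(TSdbi, .TSwriteHelper)
## R code:     val <- TSdbi::.TSwriteHelper(con, x)
```

The dot prefix only hides the function from casual discovery; it is still part of the exported namespace, which is exactly the two-level compromise under discussion.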
Re: [Rd] ‘:::’ call
I may have confused things by referring to ':::' which everyone reads as not exported, not documented, not part of the API, constantly changing, ... In my mind, the real question is about two levels of exporting, one to other package developers, and another to end users. In both cases they are part of the API, relatively constant, and documented. (I try to document even internal functions, otherwise I can't remember what they do.) So far, I see three possible solutions:

1/ R adds another namespace directive allowing certain functions to be exported differently, possibly just by causing the checks to be silent about ::: when those functions are used in that way by other packages.

2/ The package gets split in two, one for use by other packages and one for use by end users.

3/ Some functions are exported normally but hidden by using . in the beginning of their names. Other package maintainers would know they exist, but end users would not so easily find them.

(Duncan's other suggestion of using \keyword{internal} in the .Rd file strikes me as problematic. I'm surprised CRAN checks do not already object to functions exported and documented with \keyword{internal}.)

Paul

On 13-08-28 03:44 PM, Yihui Xie wrote:
If this issue is going to be solved at all, it might end up as yet another hack like utils::globalVariables, just to fix R CMD check, which was trying to fix things that were not necessarily broken. To be clear, I was not suggesting subverting this check. What I was hoping for is a way to tell CRAN that "Yes, I have read the documentation; I understand the risk, and I want to take it like a moth flying into the flames." Many people have been talking about this risk, but how about some evidence? Who was bitten by :::? How many real cases in which a package was broken by :::? Yes, unexported functions may change, but so may exported functions (they may change API, be deprecated, add new arguments, change defaults, and so on).
Almost everything in a package is constantly evolving, and I believe the correct way (and the only way) to stop things from being broken is to write enough test cases. When something is broken, we will be able to know that. Yes, we may not have control over other people's packages, but we always have control over our own test cases. IMHO, testing is the justification of CRAN's reputation and quality, and that is a part of what CRAN does. In God we trust, and everyone else should bring tests. Regards, Yihui -- Yihui Xie xieyi...@gmail.com Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA On Wed, Aug 28, 2013 at 1:50 PM, Paul Gilbert pgilbert...@gmail.com wrote: On 13-08-28 12:29 PM, Marc Schwartz wrote: On Aug 28, 2013, at 11:15 AM, Paul Gilbert pgilbert...@gmail.com wrote: I have a package (TSdbi) which provides end user functions that I export, and several utilities for plugin packages (e.g. TSMySQL) that I do not export because I do not intend them to be exposed to end users. I call these from the plugin packages using TSdbi::: but that now produces a note in the checks: * checking dependencies in R code ... NOTE Namespace imported from by a ‘:::’ call: ‘TSdbi’ See the note in ?`:::` about the use of this operator. :: should be used rather than ::: if the function is exported, and a package almost never needs to use ::: for its own functions. Is there a preferred method to accomplish this in a way that does not produce a note? Thanks, Paul Paul, See this rather lengthy discussion that occurred within the past week: https://stat.ethz.ch/pipermail/r-devel/2013-August/067180.html Regards, Marc Schwartz I did follow the recent discussion, but no one answered the question Is there a preferred method to accomplish this? (I suppose the answer is that there is no other way, given that no one actually suggested anything else.) 
Most of the on topic discussion in that thread was about how to subvert the CRAN checks, which is not what I am trying to do and was also pointed out as a bad idea by Duncan. The substantive response was r63654 has fixed this particular issue, and R-devel will no longer warn against the use of ::: on packages of the same maintainer. Regards, Yihui but that strikes me as a temporary work around rather than a real solution: suppose plugins are provided by a package from another maintainer. Since CRAN notes have a habit of becoming warnings and then errors, it seems useful to identify the preferred legitimate approach while this is still a note. That would save work for both package developers and CRAN maintainers. My thinking is that there is a need for a NAMESPACE directive something like limitedExport() that allows ::: for identified functions without provoking a CRAN complaint when packages use those functions. But there may already be a better way I don't know about. Or perhaps the solution is to split the end user functions and the utilities for plugin
Re: [Rd] ‘:::’ call
On 13-08-28 05:13 PM, Hadley Wickham wrote:
3/ Some functions are exported normally but hidden by using . in the beginning of their names. Other package maintainers would know they exist, but end users would not so easily find them. (Duncan's other suggestion of using \keyword{internal} in the .Rd file strikes me as problematic. I'm surprised CRAN checks do not already object to functions exported and documented with \keyword{internal}.)

Why? I think this is exactly the use case of \keyword{internal}. From Writing R Extensions: "The special keyword ‘internal’ marks a page of internal objects that are not part of the package’s API"

which suggests to me that a function with \keyword{internal} should not be exported, since that makes it part of the API. And, if it is really for internal use in a package, why would you export it? I think you are interpreting internal to mean internal to a group of packages, not internal to a package. But that is just the complement of what I am saying: there may be a need for two levels of export. (Also, if you export it then you should document it, but for many maintainers \keyword{internal} is shorthand for "I don't need to document this properly because no one is supposed to use it outside the package.")

Paul
Re: [Rd] Correct NAMESPACE approach when writing an S3 method for a generic in another package
On 13-08-26 12:04 PM, Gavin Simpson wrote: Right Henrik, but then you have to document it or R CMD check raises a Warning, which is less likely to pass muster when submitting to CRAN. So you document that method on your existing method's Rd page (just via an \alias{}), which is fine until the user does end up attaching the original source of the method, and then you get the annoying warnings about masking and `?plot3d` will bring up a dialogue asking which version of the help you want to read. Part of me thinks it would be better if there was a mechanism whereby a generic will just work if package foo imports that generic and exports a method for it. Either I am messing up something again (reasonably likely) or it does just work with S4 methods. I can import the namespace that has the generic and the methods work, I do not seem to need to export the generic. Is S3 working differently? I do have the documentation problem when I try to export other imported functions that I would like available to users. Paul Cheers, G On 26 August 2013 09:42, Henrik Bengtsson h...@biostat.ucsf.edu wrote: On Mon, Aug 26, 2013 at 1:28 AM, Martyn Plummer plumm...@iarc.fr wrote: I think rgl should be in Depends. You are providing a method for a generic function from another package. In order to use your method, you want the user to be able to call the generic function without scoping (i.e. without calling rgl::plot3d), so the generic should be on the search path, so the package that provides it should be listed in Depends in the NAMESPACE file. You can re-export an imported object, but it has to be done via an explicit export(), cf. It is possible to export variables from a namespace which it has imported from other namespaces: this has to be done explicitly and not via exportPattern [Writing R Extensions]. /H Martyn On Fri, 2013-08-23 at 22:01 -0600, Gavin Simpson wrote: Dear List, In one of my packages I have an S3 method for the plot3d generic function from package rgl. 
I am trying to streamline my Depends entries but don't know how to have plot3d(foo) in the examples section for the plot3d method in my package, without rgl being in Depends. Note that I importFrom(rgl, plot3d) and register my S3 method via S3method() in the NAMESPACE. If rgl is not in Depends but in Imports, I see this when checking the package:

## 3D plot of data with curve superimposed
plot3d(aber.pc, abernethy2)
Error: could not find function "plot3d"

I presume this is because rgl's namespace is only loaded but the package is not attached to the search path. Writing R Extensions indicates that one can export from a namespace something that was imported from another package namespace. I thought that might help the situation, and now the code doesn't raise an error, but I get

* checking for missing documentation entries ... WARNING
Undocumented code objects: ‘plot3d’
All user-level objects in a package should have documentation entries.
See the chapter ‘Writing R documentation files’ in the ‘Writing R Extensions’ manual.

as I don't document plot3d() itself. What is the recommended combination of Depends and Imports plus NAMESPACE directives etc. that one should use in this situation? Or am I missing something else? I have a similar issue with my package including an S3 method for a generic in the lattice package, so if possible I could get rid of both of these from Depends if I can solve the above issue. Thanks in advance.

Gavin
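Putting the pieces of this thread together, one combination that avoids both the missing-function error and the undocumented-object warning is roughly the following sketch. The class name is invented (Gavin's post doesn't show it); the re-export plus an \alias{plot3d} on an existing help page is what quiets the WARNING:

```r
## NAMESPACE sketch for providing an S3 method for rgl's plot3d generic
importFrom(rgl, plot3d)      # import the generic from rgl
S3method(plot3d, prcurve)    # register the method; "prcurve" is a made-up class
export(plot3d)               # re-export the imported generic so users can call
                             # plot3d(foo) without attaching rgl

## In the method's Rd file, also add:  \alias{plot3d}
## so the re-exported generic has a documentation entry.
```

With this, rgl can stay in Imports rather than Depends, at the cost of the masking message and the help-page disambiguation dialogue Gavin mentions when rgl is later attached.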
[Rd] Depends vs Imports
I am being asked to modernize the Depends line in the DESCRIPTION file of some packages. Writing R Extensions says: "The general rules are: Packages whose namespace only is needed to load the package using library(pkgname) must be listed in the ‘Imports’ field and not in the ‘Depends’ field. Packages listed in imports or importFrom directives in the NAMESPACE file should almost always be in ‘Imports’ and not ‘Depends’. Packages that need to be attached to successfully load the package using library(pkgname) must be listed in the ‘Depends’ field, only."

Could someone please explain a few points I thought I understood but obviously do not, or point to where these are explained:

- What does it mean for the namespace only to be needed? I thought the namespace was needed if the package or some of its functions were mentioned in the NAMESPACE file, and that only the namespace was needed if only the generics were called, and not other functions. The above suggests that I may be wrong about this. If so, that is, Imports will usually suffice, then when would Depends ever be needed when a package is mentioned in the NAMESPACE file?

- Should the package DESCRIPTION make any accommodation for the situation where users will probably need to directly call functions in the imported package, even though the package itself does not?

- What does "need to be attached" mean? Is there a distinction between a package being attached and a namespace being attached?

- Does "successfully load" mean something different from actually using the package? That is, can we assume that if the package loads then all the functions to run things will actually be found?

- If pkg1 uses a function foo in pkg3 indirectly, by a call to a function in pkg2 which then uses foo, how should pkg1 indicate the relationship with foo's pkg3, or is there no need to indicate any relationship with pkg3 because that is all looked after by pkg2?
Thanks, Paul
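As a concrete (hypothetical) reading of the rules quoted above: a package that only calls DBI functions from its own code lists DBI in Imports, while a package whose users must be able to call rgl functions unqualified at the prompt lists rgl in Depends:

```
Package: myPkg
Version: 0.1-0
Depends: R (>= 2.15.0), rgl
Imports: DBI
```

together with a matching importFrom(DBI, dbDriver) (and any needed rgl imports) in the NAMESPACE file. Package and version numbers here are stand-ins for illustration.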
Re: [Rd] Depends vs Imports
Simon Thanks, that helps a lot, but see below .. On 13-07-31 08:35 PM, Simon Urbanek wrote: On Jul 31, 2013, at 7:14 PM, Paul Gilbert wrote: I am being asked to modernize the Depends line in the DESCRIPTION file of some packages. Writing R Extensions says: The general rules are Packages whose namespace only is needed to load the package using library(pkgname) must be listed in the ‘Imports’ field and not in the ‘Depends’ field. Packages listed in imports or importFrom directives in the NAMESPACE file should almost always be in ‘Imports’ and not ‘Depends’. Packages that need to be attached to successfully load the package using library(pkgname) must be listed in the ‘Depends’ field, only. Could someone please explain a few points I thought I understood but obviously do not, or point to where these are explained: -What does it mean for the namespace only to be needed? I thought the namespace was needed if the package or some of its functions were mentioned in the NAMESPACE file, and that only the namespace was needed if only the generics were called, and not other functions. The above suggests that I may be wrong about this. If so, that is, Imports will usually suffice, then when would Depends ever be needed when a package is mentioned in the NAMESPACE file? In the namespace era Depends is never really needed. All modern packages have no technical need for Depends anymore. Loosely speaking the only purpose of Depends today is to expose other package's functions to the user without re-exporting them. This seems to mostly work, except in the situation where a package is used that enhances an imported package. For example, I Import DBI but the call dbDriver(MySQL) fails looking for MySQL in package RMySQL if I only import that and do not list it in Depends. Am I missing something? Similarly, I have a package tframePlus that provides extra methods (for zoo and xts) for my package tframe. 
Since tframe does not depend or import tframePlus (in fact, the reverse), I seem to need tframePlus in Depends not Imports of another package that Imports tframe. Does this sound right or am I missing something else? Also, I have a package TSMySQL which enhances my package TSdbi. When a user uses TSMySQL they will want to use many functions in TSdbi. Here again, I seem to need TSMySQL to Depend on TSdbi, for the reason you mention, exposing all the functions to the user. (I'm glad this is simple, I have trouble when things are difficult.) Thanks again, Paul -Should the package DESCRIPTION make any accommodation for the situation where users will probably need to directly call functions in the imported package, even though the package itself does not? -What does need to be attached mean? Is there a distinction between a package being attached and a namespace being attached. No, the distinction is between loaded and attached (namespace/package is synonymous here). -Does successfully load mean something different from actually using the package? That is, can we assume that if the package loads then all the functions to run things will actually be found? Define found - they will not be attached to the search path, so they will be found if you address them fully via myPackage::myFn but not just via myFn (except for another package that imports myPackage). -If pkg1 uses a function foo in pkg3 indirectly, by a call to a function in pkg2 which then uses foo, how should pkg1 indicate the relationship with foo's pkg3, or is there no need to indicate any relationship with pkg3 because that is all looked after by pkg2? There is no need - how would you imagine being responsible for code that you did not write? pkg2 will import function from pkg1, but you're not importing them in pkg3, you don't even care about them so you have no direct relationship with pkg1 (imagine pkg2 switched to use pkg4 instead of pkg1). 
IMHO it's all really simple:

load = functions exported in myPkg are available to interested parties as myPkg::foo or via direct imports; essentially this means the package can now be used.

attach = the namespace (and thus all exported functions) is attached to the search path; the only effect is that you have now added the exported functions to the global pool of functions, sort of like dumping them in the workspace (for all practical purposes, not technically).

import a function into a package = make sure that this function works in my package regardless of the search path (so I can write fn1 instead of pkg1::fn1 and still know it will come from pkg1 and not someone's workspace or another package that chose the same name).

Cheers, Simon
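Simon's load/attach distinction can be seen interactively; a sketch using the base distribution's stats4 package as a stand-in (any installed package with exports would do):

```r
## load only: the namespace is loaded, nothing goes on the search path
loadNamespace("stats4")
exists("mle")        # FALSE in a fresh session: exports are not attached
stats4::mle          # but fully qualified access works

## load AND attach: library() also puts the exports on the search path
library(stats4)
exists("mle")        # TRUE now: mle is found without qualification
```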
Re: [Rd] R-3.0.1 - transient make check failure in splines-EX.r
Avraham,

I resolved this only by switching to a different BLAS on the 32-bit machine. Since no one else seemed to be having problems, I considered it possible that there was a hardware issue on my old 32-bit machine. The R check test failed somewhat randomly, but often. Most disconcertingly, it failed because it gives different answers. If you source the code in an R session a few times you have no trouble reproducing this. It gives the impression of an improperly zeroed matrix. (All this from memory, I'm on the road.)

Paul

On 13-05-28 06:36 PM, Adler, Avraham wrote:
Hello. I seem to be having the same problem that Paul had in the thread titled "[Rd] R 2.15.2 make check failure on 32-bit --with-blas=-lgoto2" from October of last year: https://stat.ethz.ch/pipermail/r-devel/2012-October/065103.html Unfortunately, that thread ended without an answer to his last question.

Briefly, I am trying to compile an Rblas for Windows NT 32-bit using OpenBLAS (successor to GotoBlas) (Nehalem - corei7), and the compiled version passes all tests except for the splines-Ex test, in the exact same place that Paul had issues:

stopifnot(identical(ns(x), ns(x, df = 1)),
          identical(ns(x, df = 2), ns(x, df = 2, knots = NULL)), # not true till 2.15.2
          !is.null(kk <- attr(ns(x), "knots")), # not true till 1.5.1
          length(kk) == 0)
Error: identical(ns(x, df = 2), ns(x, df = 2, knots = NULL)) is not TRUE

Yet, opening up R and running the actual code shows that the error is transient; ten repetitions of

identical(ns(x, df = 2), ns(x, df = 2, knots = NULL))

gave [1] TRUE for most runs but [1] FALSE on the fourth and tenth.

This is the only error I have on the 32-bit version, I believe (trying to build a blas for 64-bit on SandyBridge is a completely different kettle of fish that is causing me to pull out what little hair I have left), and if it can be solved, that would be great.

Thank you, Avraham
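One way to reproduce the transience Avraham describes without rerunning the whole check is to repeat the failing comparison many times in one session; a sketch, where x is a stand-in since the real splines-Ex code defines its own:

```r
library(splines)
x <- c(1:3, 5:6)   # hypothetical data; splines-Ex uses its own x
res <- replicate(500, identical(ns(x, df = 2), ns(x, df = 2, knots = NULL)))
table(res)   # any FALSE counts here reproduce the transient failure
```

If FALSE ever appears, the two calls are returning bit-different results from identical inputs, which points at the BLAS (or hardware) rather than the splines code.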
Re: [Rd] R 3.0, Rtools3.0,l Windows7 64-bit, and permission agony
Being generally uninformed about Windows, I have to admit to almost total confusion trying to follow this thread. However, since I have recently been trying to do something in Windows, I would appreciate a newbie-friendly explanation of a few points:

- Rtools is used to build R and to build (some?) R packages. If you make Rtools an R package, how do you bootstrap the R build process?

- In unix-like OSes, configure is used before make to set things similar to the question of where to find Rtools, and what version of various tools are available, and give warnings and errors if these are not adequate. Is there a reason configure cannot be used in Windows, or is there not something similar?

- Or am I really confused and should not consider the possibility that people actually build R, so the discussion is just about packages?

Thanks, Paul

On 13-04-22 11:16 AM, Gabor Grothendieck wrote:
On Mon, Apr 22, 2013 at 10:27 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:
On 21/04/2013 6:57 PM, Hadley Wickham wrote:
PS. Hadley, is this what you meant when you wrote "Better solutions (e.g. Rstudio and devtools) temporarily set the path on when you're calling R CMD *", or those approaches are only when you call 'R CMD' from the R prompt? I believe the latter, but I just want to make sure I didn't miss something.

Well, both devtools and RStudio allow you to do package development without leaving R, so neither do anything to your path when you're not using them. In teaching Windows users to develop R packages, I found the use of the command line to be a substantial road-block, and if you can develop packages without leaving R, why not?

The idea of temporary additions to the path during the INSTALL/build/check code sounds reasonable. R could probably do it more accurately than devtools or RStudio can (since we know the requirements, and you have to guess at them), but could hopefully do it in a way that isn't incompatible with those.
The code called by install.packages() and related functions within R is essentially the same code as called by R CMD INSTALL etc. from the command line, so this would help both cases.

I would like to comment on this as I have had to implement similar facilities myself as part of R.bat in the batchfiles. There is an issue of keeping R and Rtools in sync. Currently different Rtools versions will work with the same R version. For example, I have used both Rtools 1927 and 1930 with the current version of R. It's necessary to determine the relative paths that the version of Rtools in use requires, since in principle the relative Rtools paths can vary from one version of Rtools to the next if the gcc version changes, say. Ideally the system would be able to figure this out, even if registry entries and environment variables are not set, by looking in standard locations for the Rtools root and finding the relative paths by querying some file in the Rtools installation itself. devtools does this by querying the Rtools version and uses an internal database of relative paths keyed by version. R.bat in batchfiles does it by scanning the Rtools unins000.dat file and extracting the relative paths directly from it. This has the advantage that no database need be maintained, and it also automatically adapts to new versions of Rtools without any foreknowledge of them. Of course, since you have control of both ends, you could alternatively add the relative paths to an expanded version of the VERSION file, or add some additional text file into Rtools for the purpose of identifying the relative paths. Another possibility, if significant changes were to be considered, would be to make Rtools into an R package, thereby leveraging existing facilities and much simplifying any synchronization.

-- Statistics Software Consulting GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
Re: [Rd] package file permissions problem R 3.0.0/Windows OS
On 13-04-15 03:19 PM, Prof Brian Ripley wrote: On 15/04/2013 14:11, John Fox wrote: Dear Brian, On Mon, 15 Apr 2013 06:56:26 +0100 Prof Brian Ripley rip...@stats.ox.ac.uk wrote: POSIX-style execute permission isn't a Windows concept, so it was fortuitous this ever worked. One possibility is that Cygwin was involved, and a Cygwin emulation got set when tar unpacked the file and converted back to the tar representation when Cygwin tar produced the tarball. (The tar in Rtools is a fixed version of Cygwin tar, fixed to use Windows file paths.) Recall that the problem was first detected when I submitted to CRAN a new version of the sem package that I built on one of my Windows systems. I'm guessing that you unpacked that on a Linux system. Perhaps I misunderstand the point, but if the problem is in unpacking, then shouldn't I see it when the package is built on R 2.15.2 (not 2.5.2 -- sorry, my typo)? The puzzle is how you got execute permissions recorded for files on your Windows system. They are not part of the Windows file system: Cygwin uses ACLs to emulate them. Once the ACLs are there, a Cygwin-based tar will put them as permissions into the tarball. But a native Windows tool would not (it might or might not capture the ACLs using a tar extension, but those would be ignored by most unpacking tools on a Unix-alike). The issue is not really Windows: if you use a FAT file system on a Unix-alike you have the same problem -- this is why SMB mounts at least did not work on OS X for building R (and much else), and you need to be careful transferring directories via USB sticks (which are usually FAT-formatted). That route usually makes the opposite compromise: to assume everything is executable. What are those screen shots of? 7zip, which I use on Windows to manage file archives. Ah, so that's a listing of the .tar.gz, a graphical form of tar -tvf. R 2.5.2 was a very long time ago. A recent change is Indeed. Again, that is my unfortunate typo -- I used 2.15.2. 
I wanted to confirm that I can build packages with the correct permissions on my Windows systems using an older (but recent) version of R.

• R CMD build by default uses the internal method of tar() to prepare the tarball. This is more likely to produce a tarball compatible with R CMD INSTALL and R CMD check: an external tar program, including options, can be specified _via_ the environment variable R_BUILD_TAR.

I saw that but didn't understand its import. That makes sense of a difference between R 2.15.2 and 3.0.0, though I'm not sure why this change would introduce a problem with the permissions. Can you try using an external tar? (Using the internal tar on Windows was first trialled in 2.15.3.) Yes, when I set R_BUILD_TAR=tar on my Windows 8 system, the tarball for the package is built with the correct permissions under R 3.0.0. The tar should be found in the Rtools\bin directory, which is first on my path. I don't have Cygwin installed on this machine independently of Rtools. What's curious to me is that I'm seeing the problem on two different Windows systems but, AFAIK, no one else has experienced a similar problem. Very few Windows users will ever get a file that appears to 'tar' to have execute permissions. For example, svn checkouts on Windows lose execute permissions, something which has caught me from time to time over the years.

I am just having the opposite problem: sliksvn is adding x permission on checkout, to some but not all files. Not sure why, and I don't want it to, so I would be happy to hear suggestions.

Paul

Thanks for your help, John

On 14/04/2013 22:17, John Fox wrote:
Dear list members, I'm experiencing a file permissions problem with a package built under Windows with R 3.0.0. I've encountered the problem on two Windows computers, one running Windows 7 and the other Windows 8, and both when I build the package under RStudio or directly in a Windows console via R CMD build.
In particular, the cleanup file for the package, which as I understand it should have permissions set at rwxr--r--, instead has permissions rw-rw-rw-. I've attached two .png screen shots showing how the permissions are set when the package is built under R 2.15.2 and R 3.0.0. I think that my two Windows systems are reasonably vanilla. Here are the system and session info from R 3.0.0 run from a Windows console:

> Sys.info()
  sysname: Windows
  release: 7 x64
  version: build 7601, Service Pack 1
  nodename: JOHN-DELL-XPS
  machine: x86
  login/user/effective_user: User

> sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: i386-w64-mingw32/i386 (32-bit)
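The workaround John confirmed above can be captured in one environment variable; a sketch (R_BUILD_TAR is documented in Writing R Extensions; the actual build line is commented out since it needs an R installation, and "sem" here stands in for any package directory):

```shell
# Tell R CMD build to use an external tar, e.g. the fixed Cygwin tar that
# Rtools\bin puts first on PATH, instead of R's internal tar() method.
export R_BUILD_TAR=tar
echo "R_BUILD_TAR is set to: $R_BUILD_TAR"
# R CMD build sem    # then build as usual
```

On Windows cmd.exe the equivalent would be `set R_BUILD_TAR=tar` before running `R CMD build`.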
[Rd] R-3.0.0 reg-tests-3.R / survival
make check is failing on reg-tests-3.R with a message that survival was built with an older version of R (on my Ubuntu 32-bit and Ubuntu 64-bit machines). Why would make check be looking anywhere that it would find something built with an older version of R?

~/RoboAdmin/R-3.0.0/tests$ tail reg-tests-3.Rout.fail
> print(1.001, digits=16)
[1] 1.001
## 2.4.1 gave 1.001
## 2.5.0 errs on the side of caution.
## as.matrix.data.frame with coercion
> library(survival)
Error: package 'survival' was built before R 3.0.0: please re-install it
Execution halted
Re: [Rd] R-3.0.0 reg-tests-3.R / survival
make check seems to be picking up the setting of my R_LIBS_SITE, which is still set for 2.15.3 and had survival in it. At least, when I set that to empty then the check passes. I'm not sure if checking worked like that before; I don't think so. I have not usually had base packages in my site-library. In any case, it seems like a bad idea for make check to use an existing setting of R_LIBS_SITE. At least, I think the idea is that it should be checking the just-built library.

Paul

On 13-04-03 11:36 AM, peter dalgaard wrote:
Any chance that you might have a personal library, which isn't versioned? If you do and you for some reason installed survival into it, it would explain it. E.g., I have, with the system-wide R:

> .libPaths()
[1] "/Users/pd/Library/R/2.15/library"
[2] "/opt/local/Library/Frameworks/R.framework/Versions/2.15/Resources/library"
> lapply(.libPaths(), list.files)
[[1]]
 [1] abind      aplpack    car        colorspace e1071
 [6] effects    ellipse    Hmisc      ISwR       leaps
[11] lmtest     matrixcalc mclust     multcomp   mvtnorm
[16] pcaPP      Rcmdr      relimp     represent  rgl
[21] robustbase rrcov      sem        xtable     zoo

[[2]]
 [1] base       boot       class      cluster    codetools
 [6] compiler   datasets   foreign    graphics   grDevices
[11] grid       KernSmooth lattice    MASS       Matrix
[16] methods    mgcv       nlme       nnet       parallel
[21] rpart      spatial    splines    stats      stats4
[26] survival   tcltk      tools      utils

but the one in my development build tree of 3.0.0 has

> .libPaths()
[1] "/Users/pd/r-release-branch/BUILD-dist/library"

If I explicitly set R_LIBS, I can easily reproduce your error.

On Apr 3, 2013, at 17:00 , Paul Gilbert wrote:
make check is failing on reg-tests-3.R with a message that survival was built with an older version of R (on my Ubuntu 32-bit and Ubuntu 64-bit machines). Why would make check be looking anywhere that it would find something built with an older version of R?

~/RoboAdmin/R-3.0.0/tests$ tail reg-tests-3.Rout.fail
> print(1.001, digits=16)
[1] 1.001
## 2.4.1 gave 1.001
## 2.5.0 errs on the side of caution.
## as.matrix.data.frame with coercion
> library(survival)
Error: package 'survival' was built before R 3.0.0: please re-install it
Execution halted
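Given Paul's diagnosis, a defensive way to run the tests is to clear all library-path overrides for just that one command, so only the freshly built library is searched (variable names as documented in R Installation and Administration):

```shell
# Run with user/site library overrides emptied; `env` scopes the change to
# this single invocation and leaves the login environment untouched.
env R_LIBS= R_LIBS_USER= R_LIBS_SITE= sh -c 'echo "R_LIBS_SITE=<$R_LIBS_SITE>"'
# the real invocation would be:
# env R_LIBS= R_LIBS_USER= R_LIBS_SITE= make check
```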
Re: [Rd] [BioC] enabling reproducible research R package management install.package.version BiocLite
(More on the original question further below.) On 13-03-05 09:48 AM, Cook, Malcolm wrote: All, What got me started on this line of inquiry was my attempt at balancing the advantages of performing a periodic (daily or weekly) update to the 'release' version of locally installed R/Bioconductor packages on our institute-wide installation of R with the disadvantages of potentially changing the result of an analyst's workflow in mid-project. I have implemented a strategy to try to address this as follows: 1/ Install a new version of R when it is released, and packages in the R version's site-library with package versions as available at the time the R version is installed. Only upgrade these package versions in the case they are severely broken. 2/ Install the same packages in site-library-fresh and upgrade these package versions on a regular basis (e.g. daily). 3/ When a new version of R is released, freeze but do not remove the old R version, at least not for a fairly long time, and freeze site-library-fresh for the old version. Begin with the new version as in 1/ and 2/. The old version remains available, so reverting is trivial. The analysts are then responsible for choosing the R version they use, and the library they use. This means they do not have to change R and package version mid-project, but they can if they wish. I think the above two libraries will cover most cases, but it is possible that a few projects will need their own special library with a combination of package versions. In this case the user could create their own library, or you might prefer some more official mechanism. The idea of the above strategy is to provide the stability one might want for an ongoing project, and the possibility of an upgraded package if necessary, but not encourage analysts to remain indefinitely with old versions (by say, putting new packages in an old R version library). 
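The frozen/fresh layout in steps 1/ to 3/ above can be sketched as a directory scheme; all paths here are hypothetical stand-ins, not a claim about how any particular site lays things out:

```shell
# Per-R-version pair of site libraries: one frozen at install time, one
# refreshed on a regular schedule (e.g. nightly by cron). Paths are stand-ins.
ROOT=$(mktemp -d)                         # e.g. /usr/local/lib/R in practice
for v in 2.15.3 3.0.0; do
  mkdir -p "$ROOT/$v/site-library"        # frozen: only fix severe breakage
  mkdir -p "$ROOT/$v/site-library-fresh"  # upgraded regularly
done
ls "$ROOT/3.0.0"
```

An analyst then opts in per project by pointing R at the chosen library, e.g. `R_LIBS_SITE=$ROOT/3.0.0/site-library-fresh` before starting R.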
This strategy has been implemented in a set of make files in the project RoboAdmin, available at http://automater.r-forge.r-project.org/. It can be run entirely automatically with a cron job. Constructive comments are always appreciated. (IT departments sometimes think that there should be only one version of everything available, which they test and approve, so the initial reaction to this approach could be negative. I think they have not really thought about the advantages: they usually cannot test/approve an upgrade without user input, and timing is often extremely complicated because of ongoing user needs. This strategy simply shifts responsibility and timing to the users, or user departments, that can actually do the testing and approving.)

Regarding NFS mounts, it is relatively robust. There can be occasional problems, especially for users who have a habit of keeping an R session open for days at a time while using site-library-fresh packages. In my experience this did not happen often enough to warrant a blackout period.

Regarding the original question, I would like to think it could be possible to keep enough information to reproduce the exact environment, but for potentially sensitive numerical problems I think that is optimistic. As others have pointed out, results can depend not only on R and package versions, configuration, OS versions, and library and compiler versions, but also on the underlying hardware. You might have some hope using something like an Amazon core instance. (BTW, this problem is not specific to R.) It is true that restricting to a fixed computing environment at your institution may ease things somewhat, but if you occasionally upgrade hardware or the OS then you will probably lose reproducibility. An alternative that I recommend is to produce a set of tests that confirm the results of any important project. 
These can be conveniently put in the tests/ directory of an R package, which is then maintained locally, not on CRAN, and built/tested whenever a new R and packages are installed. (Tools for this are also available at the above-indicated web site.) This approach means that you continue to reproduce the old results, or, if not, discover differences/problems in the old or new version of R and/or packages that may be important to you. I have been successfully using a variant of this since about 1993, using R and package tests/ since they became available. Paul

I just got the green light to institute such periodic updates, which I have been arguing is in our collective best interest. In return, I promised my best effort to provide a means for preserving or reverting to a working R library configuration. Please note that the reproducibility I am most eager to provide is limited to reproducibility within the computing environment of our institute, which perhaps takes away some of the dragon's nests, though certainly not all. There are technical issues of updating package installations on an NFS mount that might have
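At the shell level, the frozen/fresh library split described above can be sketched roughly as follows. The directory layout and the Renviron line are illustrative assumptions, not RoboAdmin's exact scheme:

```shell
#!/bin/sh
# Sketch of the frozen / fresh library split (illustrative paths only,
# not RoboAdmin's actual layout).
R_VERSION=2.15.2
BASE="$PWD/opt/R/$R_VERSION"

# 1/ frozen library: populated once, when this R version is installed
mkdir -p "$BASE/site-library"
# 2/ fresh library: upgraded on a regular basis (e.g. from a cron job)
mkdir -p "$BASE/site-library-fresh"

# Analysts choose a library by pointing R_LIBS_SITE at one of the two,
# e.g. via a line like this in Renviron.site or their own ~/.Renviron:
echo "R_LIBS_SITE=$BASE/site-library-fresh" > Renviron.example
cat Renviron.example
```

Freezing the old version then amounts to simply no longer running the upgrade job against its site-library-fresh directory.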
Re: [Rd] maintaining multiple R versions
Aaron For the problem I had in mind, changing a couple of environment variables does not seem like more work than this, but it may solve a bigger problem than the one I was thinking about. If I understand correctly, you can use this to switch among versions of R, similar to what I am doing and still with versions in different directories found by a PATH setting. But, in addition, it is also possible that the R versions were compiled with different gcc and other tools, as long as those are still installed on the system. Does it also work if you upgrade the OS and have newer versions of system libraries, etc, or do you then need to recompile the R versions? Thanks, Paul

On 13-01-18 02:58 PM, Aaron A. King wrote: Have you looked at Environment Modules (http://modules.sourceforge.net/)? I use it to maintain multiple versions of R. Users can choose their default and switch among them at the command line. Aaron

On Fri, Jan 18, 2013 at 02:04:13PM -0500, Paul Gilbert wrote: (somewhat related to thread [Rd] R CMD check not reading R_LIBS ) For many years I have maintained R versions by building R (./configure ; make) in a directory indicating the version number, putting the directory/bin on my path, and setting R_LIBS_SITE. It seems only one version can easily be installed in /usr/bin, and in any case that requires root, so I do not do that. There may be an advantage to installing somewhere in a directory with the version number, but that does not remove the need to set my path. (If there is an advantage to installing I would appreciate someone explaining briefly what it is.) My main question is whether there is a better way of maintaining multiple versions, in some way that lets users choose which one they are using? (The only problem I am aware of with my current way of doing this is: if the system has some R in /usr/bin then I have to set my preferred version first, which means shell commands like man find R's pager first and do not work.) 
Thanks, Paul __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] maintaining multiple R versions
(somewhat related to thread [Rd] R CMD check not reading R_LIBS ) For many years I have maintained R versions by building R (./configure ; make) in a directory indicating the version number, putting the directory/bin on my path, and setting R_LIBS_SITE. It seems only one version can easily be installed in /usr/bin, and in any case that requires root, so I do not do that. There may be an advantage to installing somewhere in a directory with the version number, but that does not remove the need to set my path. (If there is an advantage to installing I would appreciate someone explaining briefly what it is.) My main question is whether there is a better way of maintaining multiple versions, in some way that lets users choose which one they are using? (The only problem I am aware of with my current way of doing this is: if the system has some R in /usr/bin then I have to set my preferred version first, which means shell commands like man find R's pager first and do not work.) Thanks, Paul
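The PATH-based scheme described here can be sketched as a small shell helper. The install prefix below is an assumed example (e.g. each version built with ./configure --prefix=$HOME/R/R-2.15.2), not the poster's actual layout:

```shell
#!/bin/sh
# Minimal sketch of selecting among side-by-side R builds, each installed
# under an assumed prefix like $HOME/R/R-<version>.
use_R() {
  dir="$HOME/R/R-$1"
  PATH="$dir/bin:$PATH"            # this version's R and Rscript are found first
  R_LIBS_SITE="$dir/site-library"  # and its own package library is used
  export PATH R_LIBS_SITE
}

use_R 2.15.2
echo "$R_LIBS_SITE"
```

This is essentially what Environment Modules automates: a `module load R/2.15.2` performs the same kind of PATH and environment manipulation, with the added ability to unload cleanly.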
Re: [Rd] R-forge, package dependencies
I'm surprised this works on Windows and Mac, since RMonetDB does not seem to be on CRAN. I thought it was still a requirement that dependencies need to be on CRAN (which makes development difficult for related packages like this). A related long-standing request is that R-forge checking look for required newer versions on R-forge rather than just on CRAN. Does anyone know if that works on Windows and/or Mac? Paul

On 13-01-15 03:09 PM, Uwe Ligges wrote: On 15.01.2013 20:47, Thomas Lumley wrote: I have a project on R-forge (sqlsurvey.r-forge.r-project.org) with two packages, RMonetDB and sqlsurvey. At the moment, sqlsurvey is listed as failing to build. The error is on the Linux package check, which says that RMonetDB is not available: * checking package dependencies ... ERROR Package required but not available: ‘RMonetDB’ RMonetDB has built successfully: r-forge lists its status as 'current', with Linux, Windows, and Mac packages available for download. The package check for sqlsurvey on Windows and Mac finds RMonetDB without any problems, it's just on Linux that it appears to be unavailable. Any suggestions for how to fix this? I've tried uploading a new version of RMonetDB, but the situation didn't change: it built successfully, but the Linux check of sqlsurvey still couldn't find it. I think you have to ask Stefan to check the details. Best, Uwe -thomas
[Rd] SystemRequirements’ field
Am I correct in thinking that the ‘SystemRequirements’ field in a package DESCRIPTION file is purely descriptive, that there are no standard elements that can be extracted by parsing it and used automatically? This field does not seem to be widely used, even for some obvious cases like backend database driver requirements, perl, perl modules, etc. It might help to have a list of possibilities. Some I think of immediately are SQLite, MySQL, PostgreSQL, ODBC, Perl, Perl_CSVXS, MPI, rpcgen, Oracle-license, Bloomberg-license and Fame-license. Maybe there could be a generic OTHER_* for things not in a standard list? Paul
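Since the field is free text, any automatic use today means ad-hoc text extraction. A minimal sketch, with a made-up package and requirements list:

```shell
#!/bin/sh
# SystemRequirements is a free-text field in the DCF-format DESCRIPTION
# file, so "parsing" it currently amounts to plain text extraction.
# The example file below is made up.
cat > DESCRIPTION <<'EOF'
Package: example
Version: 0.1
SystemRequirements: MySQL, Perl, MPI
EOF

grep '^SystemRequirements:' DESCRIPTION | sed 's/^SystemRequirements: *//'
```

A real parser would also have to handle DCF continuation lines (fields may wrap onto indented lines) and completely free-form wording, which is exactly why a standard vocabulary would help.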
Re: [Rd] SystemRequirements’ field
On 12-12-12 02:19 PM, Prof Brian Ripley wrote: On 12/12/2012 18:33, Paul Gilbert wrote: Am I correct in thinking that the ‘SystemRequirements’ field in a package DESCRIPTION file is purely descriptive, there are no standard elements that can be extracted by parsing it and used automatically? No. Where can I find more details? The section The DESCRIPTION file in Writing R Extensions says only: Other dependencies (external to the R system) should be listed in the ‘SystemRequirements’ field, possibly amplified in a separate README file. Thanks, Paul
Re: [Rd] An idea: Extend mclapply's mc.set.seed with an initial seed value?
I appreciate your problem, and getting reproducible random generator results on a parallel system is something to be careful about. However, I would avoid making it too easy to use a fixed seed. In earlier days, mistakes were too often made by users inadvertently using the same seed over and over again (on simple single-processor systems), for example by reloading a session with the seed set. Paul

On 12-11-01 08:46 PM, Ivan Popivanov wrote: Hello, I have been thinking that sometimes users may want each process to initialize its random seed with a specific value rather than the current seed. This could be keyed off depending on whether mc.set.seed is logical, preserving the current behaviour, or numerical, using the value in a call to set.seed. Does this make sense? If you wonder how I came up with the idea: I spent a couple of hours debugging unstable results from parallel tuning of svms, which was caused by the parallel execution. In my case I can simply do the set.seed in the FUN argument function, but that may not always be the case. Ivan
Re: [Rd] Retrieving data from aspx pages
I must be really dense. I know RCurl provides a POST capability, but I don't see how this allows interaction. Suppose the example actually worked, which it does not. (Unfortunately many of the examples in RCurl seem to be marked \dontrun{} or disabled with some if() condition.) When you post to a page like this you will often get something back that has a dynamically generated URI, and you will need to post more information to that page. But how do you find out the URI of that next dynamically generated page? Even when you know what you will need to post, you need the URI to do it. If RCurl provided interaction you would be able to get the URI so you could post to the next page. Maybe you can do that, but I have not discovered how. If you know how, I would appreciate a real working example. Paul

On 12-10-31 12:14 PM, jose ramon mazaira wrote: I'd like to make you note that I've discovered that package RCurl already provides a utility that allows interaction via POST requests with servers. In fact, the FAQ for RCurl contains specifically an example with an aspx page:

x = postForm("http://www.fas.usda.gov/psdonline/psdResult.aspx", style = "post",
    .params = list(visited = "1", lstGroup = "all", lstCommodity = "2631000",
                   lstAttribute = "88", lstCountry = "**", lstDate = "2011",
                   lstColumn = "Year", lstOrder = "Commodity%2FAttribute%2FCountry"))

Check this link: http://www.omegahat.org/RCurl However, I think that it would be more useful to automate the interaction with servers, retrieving automatically the name-value pairs required by the server (parsing the page source code) instead of examining the appropriate fields in each web page. 2012/10/30, Paul Gilbert pgilbert...@gmail.com: Jose As far as getting to the data, I think the best way to do this sort of thing would be if the site supports a SOAP or REST interface. When they don't (yet), then one is faced with clicking through some pages. Python or Java is one way to automate the process of clicking through the pages. 
I don't know how to do that in R, but would like to know if it is possible. But I guess I was confused about the part you want to improve. What I have works fairly smoothly, parsing and passing back JSON data, converted from a csv file, into R. The downside is that this approach requires more than R to be installed on the client machine. But if the object you get back is ASPX, then you either need to parse it directly, or convert it to JSON, or something else you can deal with. I suspect that will be fairly specific to a particular web site, but I don't really know enough about ASPX to be sure. Paul On 12-10-30 01:12 PM, jose ramon mazaira wrote: Thanks for your interest, Paul. I've checked the source code of TSjson and I've seen that what it does is call a Python script to retrieve the data. In fact, I've already done this with Java using the URLConnection class and sending the requested values to fill the form. However, I think it would be more useful to open a connection with R and to send the requested values within R, and not through an external program. The application I've designed, like yours, is also page-specific (i.e., designed for http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx), but I think that our applications would be more powerful if they were able to parse the name-value pairs generated from ASPX (or any other dynamically generated web page) and ask the user to select the appropriate values. 2012/10/30, Paul Gilbert pgilbert...@gmail.com: I think RHTMLForms works if you have a single form, but I have not been able to see how to use it when you need to go through a sequence of dynamically generated forms (like you can do with Python mechanize). Paul On 12-10-30 09:08 AM, Gabriel Becker wrote: I haven't used it extensively myself, and can't speak to its current state, but on quick inspection RHTMLForms seems worth a look for what you want. 
http://www.omegahat.org/RHTMLForms/ ~G On Tue, Oct 30, 2012 at 5:38 AM, Paul Gilbert pgilbert...@gmail.com wrote: I don't know of an easy way to do this in R. I've been doing something similar with python scripts called from R. If anyone knows how to do this with just R, I would appreciate hearing too. Paul On 12-10-29 04:11 PM, jose ramon mazaira wrote: Hi. I'm trying to write an application to retrieve financial data (specially bonds data) from FINRA. The web page is served dynamically from an asp.net application: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx I'd like to know if it's possible to fill
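For reference, the kind of POST that the postForm() example quoted in this thread issues can be reproduced with command-line curl. The form fields below come from that RCurl FAQ example; the request is only echoed, not sent, and the remark about hidden fields is a general ASP.NET detail, not something confirmed for this particular site:

```shell
#!/bin/sh
# Sketch of the curl equivalent of the RCurl postForm() example.
# ASP.NET pages typically also require hidden fields (__VIEWSTATE,
# __EVENTVALIDATION) scraped from the form's HTML first -- which is exactly
# the "find the next dynamically generated URI" problem discussed here.
FORM_DATA='visited=1&lstGroup=all&lstCommodity=2631000&lstAttribute=88&lstCountry=**&lstDate=2011&lstColumn=Year'

# Echoed rather than executed, since running it needs network access:
echo "curl -s -d '$FORM_DATA' 'http://www.fas.usda.gov/psdonline/psdResult.aspx'"
```

The follow-up problem raised above remains: the response to such a POST often embeds the dynamically generated URI for the next step, which has to be parsed out of the returned HTML before a second request can be made.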
Re: [Rd] Retrieving data from aspx pages
I don't know of an easy way to do this in R. I've been doing something similar with python scripts called from R. If anyone knows how to do this with just R, I would appreciate hearing too. Paul

On 12-10-29 04:11 PM, jose ramon mazaira wrote: Hi. I'm trying to write an application to retrieve financial data (specially bonds data) from FINRA. The web page is served dynamically from an asp.net application: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx I'd like to know if it's possible to fill dynamically the web page form from R and, after filling it (with the issuer name), retrieve the web page, parse the data, and convert it to appropriate R objects. For example, suppose I want to search data for ATT bonds. I'd like to know if it's possible, within R, to fill the page served from: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx select the corporate option and fill with ATT the field for Issuer name, ask the page to display the results, and retrieve the results for each of the bonds issued by ATT (for example: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/BondDetail.aspx?ID=MDAxOTU3Qko3) and parse the data from the web page. Thanks in advance.
Re: [Rd] Retrieving data from aspx pages
I think RHTMLForms works if you have a single form, but I have not been able to see how to use it when you need to go through a sequence of dynamically generated forms (like you can do with Python mechanize). Paul

On 12-10-30 09:08 AM, Gabriel Becker wrote: I haven't used it extensively myself, and can't speak to its current state, but on quick inspection RHTMLForms seems worth a look for what you want. http://www.omegahat.org/RHTMLForms/ ~G On Tue, Oct 30, 2012 at 5:38 AM, Paul Gilbert pgilbert...@gmail.com wrote: I don't know of an easy way to do this in R. I've been doing something similar with python scripts called from R. If anyone knows how to do this with just R, I would appreciate hearing too. Paul On 12-10-29 04:11 PM, jose ramon mazaira wrote: Hi. I'm trying to write an application to retrieve financial data (specially bonds data) from FINRA. The web page is served dynamically from an asp.net application: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx I'd like to know if it's possible to fill dynamically the web page form from R and, after filling it (with the issuer name), retrieve the web page, parse the data, and convert it to appropriate R objects. For example, suppose I want to search data for ATT bonds. 
I'd like to know if it's possible, within R, to fill the page served from: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx select the corporate option and fill with ATT the field for Issuer name, ask the page to display the results, and retrieve the results for each of the bonds issued by ATT (for example: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/BondDetail.aspx?ID=MDAxOTU3Qko3) and parsing the data from the web page. Thanks in advance.

-- Gabriel Becker Graduate Student Statistics Department University of California, Davis
Re: [Rd] Retrieving data from aspx pages
Jose As far as getting to the data, I think the best way to do this sort of thing would be if the site supports a SOAP or REST interface. When they don't (yet), then one is faced with clicking through some pages. Python or Java is one way to automate the process of clicking through the pages. I don't know how to do that in R, but would like to know if it is possible. But I guess I was confused about the part you want to improve. What I have works fairly smoothly, parsing and passing back JSON data, converted from a csv file, into R. The downside is that this approach requires more than R to be installed on the client machine. But if the object you get back is ASPX, then you either need to parse it directly, or convert it to JSON, or something else you can deal with. I suspect that will be fairly specific to a particular web site, but I don't really know enough about ASPX to be sure. Paul

On 12-10-30 01:12 PM, jose ramon mazaira wrote: Thanks for your interest, Paul. I've checked the source code of TSjson and I've seen that what it does is call a Python script to retrieve the data. In fact, I've already done this with Java using the URLConnection class and sending the requested values to fill the form. However, I think it would be more useful to open a connection with R and to send the requested values within R, and not through an external program. The application I've designed, like yours, is also page-specific (i.e., designed for http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx), but I think that our applications would be more powerful if they were able to parse the name-value pairs generated from ASPX (or any other dynamically generated web page) and ask the user to select the appropriate values. 
2012/10/30, Paul Gilbert pgilbert...@gmail.com: I think RHTMLForms works if you have a single form, but I have not been able to see how to use it when you need to go through a sequence of dynamically generated forms (like you can do with Python mechanize). Paul On 12-10-30 09:08 AM, Gabriel Becker wrote: I haven't used it extensively myself, and can't speak to its current state, but on quick inspection RHTMLForms seems worth a look for what you want. http://www.omegahat.org/RHTMLForms/ ~G On Tue, Oct 30, 2012 at 5:38 AM, Paul Gilbert pgilbert...@gmail.com wrote: I don't know of an easy way to do this in R. I've been doing something similar with python scripts called from R. If anyone knows how to do this with just R, I would appreciate hearing too. Paul On 12-10-29 04:11 PM, jose ramon mazaira wrote: Hi. I'm trying to write an application to retrieve financial data (specially bonds data) from FINRA. The web page is served dynamically from an asp.net application: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx I'd like to know if it's possible to fill dynamically the web page form from R and, after filling it (with the issuer name), retrieve the web page, parse the data, and convert it to appropriate R objects. For example, suppose I want to search data for ATT bonds. 
I'd like to know if it's possible, within R, to fill the page served from: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/AdvancedScreener.aspx select the corporate option and fill with ATT the field for Issuer name, ask the page to display the results, and retrieve the results for each of the bonds issued by ATT (for example: http://cxa.gtm.idmanagedsolutions.com/finra/BondCenter/BondDetail.aspx?ID=MDAxOTU3Qko3) and parsing the data from the web page. Thanks in advance.

-- Gabriel Becker Graduate Student Statistics Department University of California, Davis
[Rd] R 2.15.2 make check failure on 32-bit --with-blas=-lgoto2
Is --with-blas=-lgoto2 a known problem (other than possibly not being the preferred choice)? I thought I had been testing RC with the same setup I regularly use, but I now see there was a slight difference. I am now getting the following failure in make check on 32-bit Ubuntu 12.04, configuring with --with-blas=-lgoto2. (These may not be surprising statistically or numerically, but it is a bit disconcerting when make check fails.)

...
Testing examples for package ‘stats’
comparing ‘stats-Ex.Rout’ to ‘stats-Ex.Rout.save’ ...
2959c2959
N:K1 33.13 33.13 2.146 0.16865
---
N:K1 33.14 33.14 2.146 0.16865
12782c12782
Murder -0.536 0.418 0.341 0.649
---
Murder -0.536 0.418 -0.341 0.649
12783c12783
Assault -0.583 0.188 0.268 -0.743
---
Assault -0.583 0.188 -0.268 -0.743
12784c12784
UrbanPop -0.278 -0.873 0.378 0.134
---
UrbanPop -0.278 -0.873 -0.378 0.134
12785c12785
Rape -0.543 -0.167 -0.818
---
Rape -0.543 -0.167 0.818
12943c12943
6 -0.5412 20.482886 -0.845157
---
6 -0.5412 20.482887 -0.845157
14481c14481
Sum of Squares 780.1250 276.1250 2556.1250 112.5000 774.0937
---
Sum of Squares 780.1250 276.1250 2556.1250 112.5000 774.0938
15571c15571
Murder -0.54 0.42 0.34 0.65
---
Murder -0.54 0.42 -0.34 0.65
15572c15572
Assault -0.58 0.27 -0.74
---
Assault -0.58 -0.27 -0.74
15573c15573
UrbanPop -0.28 -0.87 0.38
---
UrbanPop -0.28 -0.87 -0.38
15574c15574
Rape -0.54 -0.82
---
Rape -0.54 0.82
Testing examples for package ‘datasets’
comparing ‘datasets-Ex.Rout’ to ‘datasets-Ex.Rout.save’ ... OK
...

I inadvertently seem to have set things slightly differently while testing RC. While testing the RC, I was using

./configure --prefix=/home/paul/RoboRC/R-test/ --enable-R-shlib

and configure gave

...
External libraries: readline
Additional capabilities: PNG, NLS
Options enabled: shared R library, shared BLAS, R profiling, Java

whereas with the release I used

./configure --prefix=/home/paul/RoboAdmin/R-2.15.2 --enable-R-shlib --with-blas=-lgoto2

and configure gave

... 
External libraries: readline, BLAS(generic)
Additional capabilities: PNG, NLS
Options enabled: shared R library, R profiling, Java

Thanks, Paul
Re: [Rd] R 2.15.2 make check failure on 32-bit --with-blas=-lgoto2
On 12-10-26 12:15 PM, Prof Brian Ripley wrote: On 26/10/2012 16:37, Paul Gilbert wrote: Is --with-blas=-lgoto2 a known problem (other than possibly not being the preferred choice)? And what precisely is it? And what chipset are you using? I thought I had been testing RC with the same setup I regularly use, but I now see there was a slight difference. I am now getting the following failure in make check on 32-bit Ubuntu 12.04, configuring with --with-blas=-lgoto2. (These may not be surprising statistically or numerically, but it is a bit disconcerting when make check fails.) No failure shown here ... surely you know that the signs of principal components are not determined?

I apologize, I missed the real error. (But yes, I am aware of this, and also that I should expect precision differences with different libraries and different architectures.) I thought for a moment that make was throwing an error because of the differences, but in fact it was later:

Testing examples for package ‘grid’
comparing ‘grid-Ex.Rout’ to ‘grid-Ex.Rout.save’ ... OK
Testing examples for package ‘splines’
Error: testing 'splines' failed
Execution halted
make[3]: *** [test-Examples-Base] Error 1
make[3]: Leaving directory `/home/paul/RoboAdmin/R-2.15.2/tests/Examples'
make[2]: *** [test-Examples] Error 2
make[2]: Leaving directory `/home/paul/RoboAdmin/R-2.15.2/tests'
make[1]: *** [test-all-basics] Error 1
make[1]: Leaving directory `/home/paul/RoboAdmin/R-2.15.2/tests'
make: *** [check] Error 2
paul@toaster:~/RoboAdmin/R-2.15.2$

The problem seems to be here:

source("~/RoboAdmin/R-2.15.2/tests/Examples/splines-Ex.R")
List of 2
 $ x: num [1:51] 58 58.3 58.6 58.8 59.1 ...
 $ y: num [1:51] 115 115 116 117 117 ... 
 - attr(*, "class")= chr "xyVector"
Warning in bs(height, degree = 3L, knots = c(62.7, 67.3 :
  some 'x' values beyond boundary knots may cause ill-conditioned bases
Error: identical(ns(x, df = 2), ns(x, df = 2, knots = NULL)) is not TRUE

traceback()
6: stop(paste0(ch, " is not ", if (length(r) > 1L) "all ", "TRUE"), call. = FALSE)
5: stopifnot(identical(ns(x), ns(x, df = 1)), identical(ns(x, df = 2),
   ns(x, df = 2, knots = NULL)), !is.null(kk <- attr(ns(x), "knots")),
   length(kk) == 0) at splines-Ex.R#130
4: eval(expr, envir, enclos)
3: eval(ei, envir)
2: withVisible(eval(ei, envir))
1: source("~/RoboAdmin/R-2.15.2/tests/Examples/splines-Ex.R")

It also seems that this error is transient. If I rerun several times, it does not always happen. Is anyone aware of other cases of transient problems with 32-bit goto2? Here is the cpu info:

paul@toaster:~$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T7300 @ 2.00GHz
stepping : 10
microcode : 0x92
cpu MHz : 800.000
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm ida dtherm tpr_shadow vnmi flexpriority
bogomips : 3989.99
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 15
model name : Intel(R) Core(TM)2 Duo CPU T7300 @ 2.00GHz
stepping : 10
microcode : 0x92
cpu MHz : 800.000
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 1
cpu cores : 2
apicid : 1
initial apicid : 1
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm lahf_lm ida dtherm tpr_shadow vnmi flexpriority
bogomips : 3989.97
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

Paul

And 32-bit platforms seem prone to round in different ways (to each other and to 64-bit platforms): the differences in the last digit are typical of 32-bit platforms.

...
Testing examples for package ‘stats’
comparing ‘stats-Ex.Rout’ to ‘stats-Ex.Rout.save’ ...
2959c2959
N:K1 33.13 33.13 2.146 0.16865
---
N:K1 33.14 33.14 2.146 0.16865
12782c12782
Murder
Re: [Rd] Is there an automatic method for updating an existing CRAN package from R-forge?
Yes, the button is still there if you are logged in and unhide it. (You probably also need the appropriate developer permission.) Beware that R-forge needs to indicate the package build status is current, but also, if you have just made svn updates it may falsely indicate current. Check that the revision number is accurate, or wait a couple of days, or submit by hand. If you do submit by ftp to incoming, beware that the CRAN policy now requires that your email state that you have read and agree with the CRAN policies. (Probably everyone else noticed that, but there is no need for others to generate extra work for CRAN maintainers, like I have just done.) Paul

On 12-10-01 11:09 AM, Spencer Graves wrote: On 10/1/2012 4:47 AM, Duncan Murdoch wrote: On 12-10-01 7:38 AM, S Ellison wrote: I have a package on CRAN and now have a modest update that's passing build checks on R-forge. Is there a mechanism on R-forge for updating an existing CRAN package, analogous to the 'submit to cran' link on the R-forge package page, or should I just follow the instructions at http://cran.r-project.org/web/packages/policies.html for FTP upload? If there were a Submit to CRAN button, that would be the method. But I think that button has gone away, so the description on that page is the way to go. The Submit to CRAN button on that page is hidden by default but is exposed by clicking on the Show/Hide extra info button right below where it gives Build Status and R install command. (I had trouble finding the Submit to CRAN button for a while after it became hidden. You need Build status: Current. Also, at least for the last submissions I made, the CRAN maintainers did NOT accept updates if there were Warnings or Notes in the Package Checks; these warnings are also hidden until you click Show/Hide extra info.) Spencer Duncan Murdoch
[Rd] CRAN test / avoidance
( subject changed from Re: [Rd] R-devel Digest, Vol 115, Issue 18 )

I have the impression from this, and previous discussions on the subject, that package developers and CRAN maintainers are talking at cross-purposes. Many package maintainers are thinking that they should be responsible for choosing which tests are run and which are not run by CRAN, whereas CRAN maintainers may want to run all possible tests sometimes, or a trimmed-down set when time constraints demand this. With good reason, CRAN may want to run all possible tests sometimes. There are too many packages on CRAN that remain there because they don't have any testing or vignettes, and very few examples. Encouraging more of that is a bad thing.

If I understand correctly, the --as-cran option was introduced to help developers specify options that CRAN uses, so they would find problems that CRAN would notice, and correct them before submitting. The R-devel discussions of this have morphed into a discussion of how package developers can use --as-cran to control which tests are run by CRAN.

I tend to be more sympathetic with what I call the CRAN maintainer view above, even though I am a package developer. I think packages should have extensive testing and that all the tests should go in the source package on CRAN, so the testing is available for CRAN and everyone else. (Although, it is sometimes not clear if CRAN maintainers like me doing this, because they are torn between time demands and maintaining quality - that is part of the confusion.)

The question becomes: how does information get passed along to indicate things that may take a long time to run? The discussion so far has focused on developers setting, or using, some flags to indicate tests and examples that take a long time. Another option would be to have the check/build process generate a file with information about the time it took to run tests, vignettes, and examples, probably with some information about the speed of the machine it was run on.
Then CRAN and anyone else that wants to run tests can take this information into consideration. Paul

On 12-09-19 10:08 AM, Terry Therneau wrote: In general, as a package user, I don't want people to be able to suppress checks on CRAN. I want things fixed. So I am pretty sure there won't ever be a reliable CRAN-detector put into R. It would devalue the brand. Duncan Murdoch

My problem is that CRAN demands that I suppress a large fraction of my checks, in order to fit within time constraints. This leaves me with 3 choices.

1. Add lines to my code that try to guess if CRAN is the invoker. A cat-and-mouse game, per your desire above.

2. Remove large portions of my test suite. I consider the survival package to be one of the pre-eminent current code sets in the world precisely because of its extensive validations; this action would change it to a second-class citizen.

3. Add a magic environment variable to my local world, only do the full tests if it is present, and make the dumbed-down version the default. Others who want to run the full set are then SOL, which I very much don't like.

I agree that CRAN avoidance, other than for the time constraint, should be verboten. But I don't think that security through obscurity is the answer. And note that under scenario 3, which is essentially what is currently being forced on us, I can do such mischief as easily as under number 1. Terry Therneau
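Terry's third option can be sketched in a few lines of R in a tests/ file. This is only an illustration; the environment variable name here is hypothetical, not anything the survival package actually uses.

```r
# Sketch of option 3: run the long validation suite only when a "magic"
# environment variable is set (variable name is made up for illustration).
run_full <- nzchar(Sys.getenv("MYPKG_FULL_TESTS"))

if (run_full) {
  # ... the full, slow validation suite would go here ...
  cat("running full test suite\n")
} else {
  # default: the trimmed-down checks that fit within CRAN's time budget
  cat("quick tests only; set MYPKG_FULL_TESTS=1 for the full suite\n")
}
```

As Terry notes, the drawback is that anyone wanting the full suite must know about the variable, since the dumbed-down version becomes the default.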
Re: [Rd] problem with vignettes when S4 classes in packages overlap
On 12-09-18 07:23 PM, Duncan Murdoch wrote: On 12-09-18 5:40 PM, Paul Gilbert wrote: ( A similar problem is also reported by Sebastian P. Luque with library(maptools) library(trip) in the vignette as below ). I am writing a vignette which loads RMySQL and RPostgreSQL. This produces the warning:

Loading required package: DBI
Warning in .simpleDuplicateClass(def, prev) :
  A specification for class “dbObjectId” in package ‘RPostgreSQL’ seems equivalent to one from package ‘RMySQL’ and is not turning on duplicate class definitions for this class

This can be reproduced by running R CMD Sweave --pdf Atest.Stex where the file Atest.Stex has the lines

\documentclass{article}
\usepackage{Sweave}
\begin{document}
\begin{Scode}
library(RMySQL)
library(RPostgreSQL)
\end{Scode}
\end{document}

These warnings only happen in a vignette. They are not produced if the lines are entered in an R session. (Using R version 2.15.1 (2012-06-22) -- Roasted Marshmallows on Ubuntu.)

You'll get the warning in a regular session if you set options(warn=1). I think Sweave is probably doing this so that warnings show up around the time of the chunk they correspond to. It does it in the command-line version, but not in the Sweave() function (which would save them up to the end). I don't know if the warning is something you should worry about or not.

It doesn't interfere with producing the vignette, but for submitting to CRAN it is better not to have warnings coming from my package, even though they are caused by a problem with other packages. Now that I know why it only happens in the vignette, I guess I can suppress it (but it would be nice to see the other packages fixed). Thanks, Paul

Duncan Murdoch
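The difference Duncan describes can be demonstrated without the database packages. A small sketch (not from the thread) showing how options(warn = 1) changes where a warning appears:

```r
# With the default warn = 0, warnings are collected and reported after the
# top-level call returns; with warn = 1 (which command-line Sweave
# effectively uses here), they print immediately where they occur.
f <- function() { warning("duplicate class definition"); invisible(NULL) }

options(warn = 0); f()  # reported afterwards as "Warning message:"
options(warn = 1); f()  # printed at once as "Warning in f() : ..."
```

This is why the duplicate-class warning surfaced next to the library() chunk in the vignette but not in an ordinary interactive session.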
Re: [Rd] Requests for vignette clarification (re: Writing R Extensions)
I'll make a guess at some parts of this.

On 12-06-01 02:53 PM, Paul Johnson wrote: I apologize that these questions are stupid and literal. I write to ask for clarification of comments in the R extensions manual about vignettes. I'm not great at LaTeX, but I'm not a complete novice either, and some of the comments are puzzling to me.

1. I'm stumbling over this line: Make sure all files needed to run the R code in the vignette (data sets, ...) are accessible by either placing them in the inst/doc hierarchy of the source package or by using calls to system.file(). Where it says inst/doc, can I interpret it to mean vignettes? The vignette files are under vignettes. Why wouldn't those other files be in there? Or does that mean I'm supposed to copy the style and bib files from the vignettes folder to the inst/doc folder? Or none of the above :)

I think the idea is that a user looking at an installed version of the package will be able to see things that are in the doc/ directory of the installed package. This automatically includes the source files (eg *.Stex) from vignettes/ and also the generated *.pdf and the *.R files stripped from the *.Stex files. If you want users to have access to other files then you should put those somewhere so they get installed, such as in the source package inst/doc directory, so they get put in the doc/ directory of the installed package. That should probably include anything else that is important to reproduce the results in the vignette, but I do not count the .bib file in that list (so I have it in vignettes/ and users would need to look at the package source to find it).

2. I'm also curious about the implications of the parenthesized section of this comment: By default R CMD build will run Sweave on all files in Sweave format in vignettes, or if that does not exist, inst/doc (but not in sub-directories).
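The system.file() route mentioned in the manual can be sketched like this; the package and file names are hypothetical, but the "doc" component reflects the installed location of inst/doc files described above.

```r
# Locate a data file that the source package shipped in inst/doc, which is
# installed under the package's doc/ directory ("mypkg" and
# "example-data.csv" are made-up names for illustration).
path <- system.file("doc", "example-data.csv", package = "mypkg")
if (nzchar(path)) {
  dat <- read.csv(path)
} else {
  cat("file not found; is mypkg installed with its doc/ files?\n")
}
```

system.file() returns "" when the file is absent, so vignette code can check for that rather than hard-coding an install path.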
At first I thought that meant it will search vignettes and subdirectories under vignettes, or it will look under inst/doc, but no subdirectories under inst/doc. So I created vignettes in subdirectories under vignettes and they are ignored by the build process, so that was obviously wrong. For clarification, it would help me if the manual said: By default R CMD build will run Sweave on all files in Sweave format in vignettes (but not in sub-directories), or if that does not exist, inst/doc.

In this list I've read several questions/complaints from people who don't want their vignettes rebuilt during the package check or build process, and I wondered if there is a benefit to having vignettes in subdirectories. Could inclusion of troublesome vignettes in subdirectories be a way that people can circumvent the rebuilding and re-checking of vignettes during build, check, or install? If I build my vignettes manually and copy the pdf output over to inst/doc, will those pdf files be legitimate vignette files as far as CRAN is concerned? The writeup in R Extensions is a little bit confusing on that point: By including the PDF version in the package sources it is not necessary that the vignette PDFs can be re-built at install time, i.e., the package author can use private R packages, screen snapshots and LaTeX extensions which are only available on his machine. It's just confusing, that's all I can say about it.

There was at least one earlier R-devel discussion of this, in which I contributed an incorrect understanding, but was generally straightened out by Uwe. I hope I have a correct understanding now. You can put a pdf file in inst/doc and specify BuildVignettes: false in the DESCRIPTION file, in which case the already-constructed pdf from inst/doc will be used. The purpose of this is to allow vignettes which cannot be completely constructed from sources, for example, because certain data or other resources may not be generally available.
However, R CMD check will still try to parse the Sweave file and run the R code, and fail if it does not run. So, when the resources to build the vignette are not generally available, this does require some special attention, often with try(), in the code for your vignette.

It is possible to claim a special exemption for a vignette. If the reasons seem valid then that package will be put on a special list which allows skipping the vignette when the package is tested for CRAN. The reason for somewhat tight control on this by the CRAN maintainers is that the vignettes have proven to be a good check on problems with packages, so skipping them will reduce quality, and so the CRAN maintainers do not want to provide an easy option to skip this check.

There have been a variety of mechanisms suggested on R-devel for subverting the CRAN checks of the vignette code. My interpretation is that these should generally be considered contrary to the spirit of what the CRAN maintainers are attempting to do, and package maintainers should expect continuing problems as the loopholes are removed. Paul Gilbert
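A hedged sketch of the try() pattern mentioned above, for a vignette chunk whose external resource may be absent on check machines. The use of a MySQL database here, and the connection arguments, are assumptions for illustration, not Paul's actual code.

```r
# Attempt the resource-dependent step; fall back gracefully if it fails,
# so that the code R CMD check extracts from the vignette still runs.
con <- try(DBI::dbConnect(RMySQL::MySQL(), dbname = "test"), silent = TRUE)
if (inherits(con, "try-error")) {
  cat("Database not available; skipping the examples in this section.\n")
} else {
  # ... queries that produce the vignette's results would go here ...
  DBI::dbDisconnect(con)
}
```

The chunk then degrades to a printed note on machines without the database, instead of an error that fails the whole check.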
Re: [Rd] equivalent to source() inside a package
Is there a reason for not using a vignette or putting a file in the demo/ directory? This seems like the sort of thing for which they are intended. Paul

On 12-05-25 03:33 PM, Wei Hao wrote: Hi all: I'm working on a project that I have packaged for ease of distribution. The different simulations in the package share code, so obviously I have those parts organized as functions. Now, I want to show people my code, but the structure with the internal functions might be a little confusing to follow. One thing I tried was to have the code of the functions as their own R files in the R/ folder, and then using source() instead of calling the functions (with consistent variable names and such), but this didn't work.

The goal is for the user to be able to see the entirety of the code in the interactive R session, i.e. with a standard package implementation:

library(wei.simulations)
sim1
function (seed=)
{
  [stuff]
  a = internal_function1(data)
  [stuff]
}

I would like the user to see:

sim1
function (seed=)
{
  [stuff]
  tmp = apply(data,1,mean)
  a = sum(tmp)  # or whatever, this is just an example
  [stuff]
}

where I can change those two lines in their own file, and have the changes apply for all the simulation functions. I know this seems like a weird question to ask, but it would be useful for me to make it as foolproof as possible for the user to see all the simulation code (I'm presuming the user is a casual R user and not familiar with looking through package sources). Thanks Wei
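Paul's demo/ suggestion fits Wei's goal because demo() sources a script with echoing turned on, so a casual user sees every line of simulation code as it runs, rather than function bodies. A minimal sketch (the package and demo names are Wei's hypothetical ones):

```r
# demo() sources an R script from the package's demo/ directory with
# echo = TRUE, displaying each line of code as it executes.
demo(package = "wei.simulations")          # list the demos available
demo("sim1", package = "wei.simulations")  # run one, echoing its code
```

The shared lines could live at the top of the demo script (or in a small helper file it source()s), so changing them in one place changes every simulation the user watches run.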
Re: [Rd] Vignette questions
On 12-04-12 03:15 AM, Uwe Ligges wrote: On 12.04.2012 01:16, Paul Gilbert wrote: On 12-04-11 04:41 PM, Terry Therneau wrote: Context: R 2.15-0 on Ubuntu.

1. I get a WARNING from CMD check for Package vignette(s) without corresponding PDF: In this case the vignettes directory had both the pdf and Rnw; do I need to move the pdf to inst/doc?

Yes, you need to put the pdf in the inst/doc directory if it cannot be built by the R-forge and CRAN check machines, but leave the Rnw in the vignettes directory.

No, this is all done automatically by R CMD build, hence you do not need to worry.

Now I am not sure if I am confused or if you missed the if it cannot be built by R-forge and CRAN part of my sentence. I understand that this is done automatically by R CMD build for vignettes that can be built on all, or most, R platforms. In the situation where R CMD build on R-forge will fail, or not result in a complete vignette pdf, I think it is necessary to put a good pdf in inst/doc in order to get a build on R-forge that can be submitted to CRAN. That is, in situations like:
- the vignette requires databases or drivers not generally available
- the vignette (legitimately) takes forever to run
- the vignette requires a cluster

I am now wondering what the recommended practice is. What I have been doing, which I thought was the recommended practice, is to put the vignette Rnw (Stex) file in vignettes/ and put a pdf, constructed on a machine that has appropriate resources, into inst/doc. Is that the recommended way to proceed? Related, some have commented that they put a pdf in inst/doc and then leave out the vignette Rnw file to avoid error messages. Is that discouraged or encouraged? Paul

I'm reluctant to add the pdf to the svn source on R-forge, per the usual rule that a code management system should not have both a primary source and an object derived from it under version control. However, if this is the suggested norm I could do so.
Yes, I think this is the norm if the vignette cannot be built on CRAN and R-forge.

Well, yours are that specific that they rely on third-party software. Vignettes only depending on R and installed packages that are declared as dependencies can be built by CRAN.

Even though it does seem a bit strange. However, you do not necessarily need to update the vignette pdf in inst/doc every time you make a change to the package, even though, in my opinion, the correct logic is to test remaking the vignette when you make a change to the package. You should do this testing, of course; you just do not need to put the new pdf in inst/doc and commit it to svn each time. (But you should probably do that before you build the final package to put on CRAN.)

R CMD build will rebuild vignettes unless you ask R not to do so. Uwe

2. Close reading of the paragraph about vignette sources shows the following -- I think? If I have a vignette that should not be rebuilt by check or BUILD I should put the .Rnw source and pdf in /inst/doc, and have the others that should be rebuilt in /vignettes. This would include any that use private R packages, screen snapshots, ..., or in my case one that takes just a little short of forever to run.

I don't think it is intended to say that, and I didn't read it that way. I think putting the Rnw in inst/doc is supported (temporarily?) for historical reasons only. If it is not in vignettes/ and is found in inst/doc/, it is treated the same way as if it were in vignettes/. You can include screen snapshots, etc, in either case. For your situation, what you probably do need to do is specify BuildVignettes: false in the DESCRIPTION file. This prevents the pdf for inst/doc from being generated from the Rnw. However, it does not prevent R CMD check from checking that the R code extracted from the Rnw actually runs, and generating an error if it does not.
To prevent testing of the R code, you have to appeal directly to the CRAN and R-forge maintainers, and they will put the package on a special list. You do need to give them a good reason why the code should not be tested. I think they are sympathetic with takes forever to run and not very sympathetic with does not work anymore. Generally, I think they want to consider doing this only in exceptional cases, so they do not get into a situation of having lots of broken vignettes. (One should stick with journal articles for recording broken code.)

3. Do these unprocessed packages also contribute to the index via \VignetteIndexEntry lines, or will I need to create a custom index?

I'm not sure of the answer to this, but would be curious to know. You may need to rely on voodoo. Paul

Terry Therneau
Re: [Rd] Vignette questions
On 12-04-11 04:41 PM, Terry Therneau wrote: Context: R 2.15-0 on Ubuntu.

1. I get a WARNING from CMD check for Package vignette(s) without corresponding PDF: In this case the vignettes directory had both the pdf and Rnw; do I need to move the pdf to inst/doc?

Yes, you need to put the pdf in the inst/doc directory if it cannot be built by the R-forge and CRAN check machines, but leave the Rnw in the vignettes directory.

I'm reluctant to add the pdf to the svn source on R-forge, per the usual rule that a code management system should not have both a primary source and an object derived from it under version control. However, if this is the suggested norm I could do so.

Yes, I think this is the norm if the vignette cannot be built on CRAN and R-forge, even though it does seem a bit strange. However, you do not necessarily need to update the vignette pdf in inst/doc every time you make a change to the package, even though, in my opinion, the correct logic is to test remaking the vignette when you make a change to the package. You should do this testing, of course; you just do not need to put the new pdf in inst/doc and commit it to svn each time. (But you should probably do that before you build the final package to put on CRAN.)

2. Close reading of the paragraph about vignette sources shows the following -- I think? If I have a vignette that should not be rebuilt by check or BUILD I should put the .Rnw source and pdf in /inst/doc, and have the others that should be rebuilt in /vignettes. This would include any that use private R packages, screen snapshots, ..., or in my case one that takes just a little short of forever to run.

I don't think it is intended to say that, and I didn't read it that way. I think putting the Rnw in inst/doc is supported (temporarily?) for historical reasons only. If it is not in vignettes/ and is found in inst/doc/, it is treated the same way as if it were in vignettes/. You can include screen snapshots, etc, in either case.
For your situation, what you probably do need to do is specify BuildVignettes: false in the DESCRIPTION file. This prevents the pdf for inst/doc from being generated from the Rnw. However, it does not prevent R CMD check from checking that the R code extracted from the Rnw actually runs, and generating an error if it does not. To prevent testing of the R code, you have to appeal directly to the CRAN and R-forge maintainers, and they will put the package on a special list. You do need to give them a good reason why the code should not be tested. I think they are sympathetic with takes forever to run and not very sympathetic with does not work anymore. Generally, I think they want to consider doing this only in exceptional cases, so they do not get into a situation of having lots of broken vignettes. (One should stick with journal articles for recording broken code.)

3. Do these unprocessed packages also contribute to the index via \VignetteIndexEntry lines, or will I need to create a custom index?

I'm not sure of the answer to this, but would be curious to know. You may need to rely on voodoo. Paul

Terry Therneau
Re: [Rd] CRAN policies
Mark, I would like to clarify two specific points.

On 12-03-31 04:41 AM, mark.braving...@csiro.au wrote: ... Someone has subsequently decided that code should look a certain way, and has added a check that isn't in the language itself-- but they haven't thought of everything, and of course they never could.

There is a large overlap between the people writing the checks and the people writing the interpreter. Even though your code may have been working, if your understanding of the language definition is not consistent with that of the people writing the interpreter, there is no guarantee that it will continue to work, and in some cases the way in which it fails could be that it produces spurious results. I am inclined to think of code checks as an additional way to be sure my understanding of the R language is close to that of the people writing the interpreter.

It depends on how Notes are being interpreted, which from this thread is no longer clear. The R-core line used to be Notes are just notes but now we seem to have significant Notes and ...

My understanding, and I think that of a few other people, was incorrect, in that I thought some notes were intended always to remain as notes, and others were more serious in that they would eventually become warnings or errors. I think Uwe addressed this misunderstanding by saying that all notes are intended to become warnings or errors. In several cases the reason they are not yet warnings or errors is that the checks are not yet good enough; they produce too many false positives. So, this means that it is very important for us to look at the notes and to point out the reasons for the false positives, otherwise they may become warnings or errors without being recognised as such. ... Paul
[Rd] R-forge --as-cran
(Renamed from Re: [Rd] CRAN policies because of the multi-threading of that subject.)

Claudia, Actually, my version numbers are year-month dates, e.g. 2012.3-1, although I don't set them automatically. I have had some additional off-line discussion on this. The problem is this: Now when I submit version 2012.3-1 to CRAN, any checks of that package on R-forge will fail, until I change the version number. This is by specific request of the CRAN maintainers to the R-forge maintainers, the reason being, understandably, that the CRAN maintainers do not like getting submissions without the version number changed. One implication of this is that I should change the R-forge version number as soon as I make any changes to the package, even if I am going to change it again before I actually release to CRAN. This seems like a reasonable practice, even if I have not always done that.

The case where the code on R-forge remains unchanged for some time after it is released to CRAN is more subtle. If R-forge does not re-run the checks until I make a change, as is the current situation, then the package will still be indicated as ok on the R-forge package page. However, when R is upgraded, I would like the checks to be re-run on all platforms, not just on my own testing platform. But when that is done, the R-forge indication is going to be that the package failed, because the version number is the same as on CRAN. The information I want is actually available on the CRAN daily check. I just need to know that when my package is unchanged from the version on CRAN, I should look at CRAN daily rather than at the R-forge result. Paul

On 12-03-30 10:38 AM, Claudia Beleites wrote: Paul, One of the things I have noticed with the R 2.15.0 RC and --as-cran is that I have to bump the version number of the working copy of my [snip] I am curious how other developers approach this. Regardless of --as-cran I find it very useful to use the date as minor part of the version number (e.g.
hyperSpec 0.98-20120320), which I set automatically. Claudia
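Claudia's date-as-minor-version scheme can be generated in one line of R; the "0.98" major part here is just her example, and the exact build script she uses is not shown in the thread.

```r
# Build a version string like "0.98-20120320" from today's date,
# suitable for pasting into the Version: field of DESCRIPTION.
ver <- sprintf("0.98-%s", format(Sys.Date(), "%Y%m%d"))
ver
```

Because the date always increases, every rebuild automatically carries a version number newer than the one on CRAN, which sidesteps the bump-immediately problem Paul describes.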
Re: [Rd] CRAN policies
On 12-03-29 09:29 PM, mark.braving...@csiro.au wrote: I'm concerned this thread is heading the wrong way, towards techno-fixes for imaginary problems. R package-building is already encumbered with a huge set of complicated rules, and more instructions/rules, e.g. for metadata, would make things worse, not better. R CMD check on the 'mvbutils' package generates over 300 Notes about no visible binding..., which inevitably I just ignore. They arise because R CMD check is too stupid to understand one of my preferred coding idioms (I'm not going to explain what-- that's beside the point).

Actually, I think that is the point. If your code is generating that many notes then I think you should explain your idiom, so the checks can be made to accommodate it if it really is good. Otherwise, I'd be worried about the quality of your code.

And R CMD check always will be too stupid to understand everything that a rich language like R might quite reasonably cause experienced coders to do.

Possibly the interpreter is too stupid to understand it too?

It should not be CRAN's business how I write my code, or even whether my code does what it is supposed to. It might be CRAN's business to try to work out whether my code breaks CRAN's policies, e.g. by causing R to crash horribly-- that's presumably what Warnings are for (but see below). And maybe there could be circumstances where an automatic check might be worried enough to alert the CRANia and require manual explanation and emails etc from a developer, but even that seems doomed given the growing deluge of packages. R CMD check currently functions both as a sanitizer for CRAN and as a developer tool. But the fact that the one program does both things seems accidental to me, and I think this dual use is muddying the discussion.
There's a big distinction between (i) code-checks that developers themselves might or might not find useful-- which should be left to the developer, and will vary from person to person--

I think this is a case of two heads are better than one. I did lots of checks before the CRAN checks existed, but the CRAN checks still found bugs in code that I considered very mature, including bugs in code that had been running without noticeable problems for over 15 years. Despite all the noise today, most of us are only talking about a small inconvenience around the intended meaning of note, not about whether quality control is a bad thing. I've found the errors and warnings are always valid, even though I do not always like having to fix the bugs, and the notes are most often valid too. But there are a few false positives, so the checks that give notes are not yet reliable enough to give warnings or errors. But they should be sometime, so one should usually consider fixing the package code.

and (ii) code-checks that CRAN enforces for its own peace-of-mind.

I think of this as being for the peace-of-mind of your package users.

Maybe it's convenient to have both functions in the same place, and it'd be fine to use Notes for one and Warnings for the other, but the different purposes should surely be kept clear. Personally, in building over 10 packages (only 2 on CRAN), I haven't found R CMD check to be of any use, except for the code-documentation and example-running bits. I know other people have different opinions, but that's the point: one-size-does-not-fit-all when it comes to coding tools. And with regard to the Warnings themselves: I feel compelled to point out that it's logically impossible to fully check whether R code will do bad things. One has to wonder at what point adding new checks becomes futile or counterproductive.
There must be over 2000 people who have written CRAN packages by now; every extra check and non-back-compatible additional requirement runs the risk of generating false negatives and incurring many extra person-hours to fix non-problems. Plus someone needs to document and explain the check (adding to the rule mountain), plus there is the time spent in discussions like this..!

Bugs in your packages will require users to waste a lot of time too, and possibly reach faulty results with much more serious consequences. Just because perfection may never be attained, this does not mean that progress should not be attempted, in small steps. Compared to Statlib, which basically followed your recommended approach, CRAN is a vast improvement. Paul

Mark Bravington CSIRO CMIS Marine Lab Hobart Australia

From: r-devel-boun...@r-project.org [r-devel-boun...@r-project.org] On Behalf Of Hadley Wickham [had...@rice.edu] Sent: 30 March 2012 07:42 To: William Dunlap Cc: r-de...@stat.math.ethz.ch; Spencer Graves Subject: Re: [Rd] CRAN policies

Most of that stuff is already in codetools, at least when it is checking functions with checkUsage(). E.g., arguments of ~ are not checked. The expr argument to with() will not be checked if
[Rd] --as-cran / BuildVignettes: false
I have packages where I know CRAN and other test platforms do not have all the resources to build the vignettes, for example, access to databases. Previously I think putting BuildVignettes: false in the DESCRIPTION file resolved this, by preventing CRAN checks from attempting to run the vignette code. (If it was not this, then there was some other magic I don't understand.) Now, when I specify --as-cran, the checks fail when attempting to check R code from vignettes, even though I have BuildVignettes: false in the DESCRIPTION file. What is the mechanism for indicating that CRAN should not attempt to check this code? Perhaps it is intentionally difficult - I can see an argument for that. (For running tests there are environment variables, e.g. _R_CHECK_HAVE_MYSQL_, but using these really clutters up a vignette, and it did not seem necessary to use them before.) (The difficulty also occurs on R-forge, possibly because it is using --as-cran like settings.) Paul
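The environment-variable approach Paul mentions (and finds cluttered) would look roughly like this inside a vignette chunk. _R_CHECK_HAVE_MYSQL_ is the variable named in his message; the expected value "true" and the surrounding structure are assumptions in this sketch.

```r
# Only run the database-dependent code when the check environment says a
# MySQL server is available; otherwise leave a note in the vignette output.
if (identical(Sys.getenv("_R_CHECK_HAVE_MYSQL_"), "true")) {
  library(RMySQL)
  # ... database-backed examples would go here ...
} else {
  cat("MySQL not available on this machine; results shown are pre-computed.\n")
}
```

Repeating this guard around every database chunk is exactly the clutter Paul objects to, which is why he preferred the single BuildVignettes: false switch in DESCRIPTION.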
Re: [Rd] CRAN policies
One of the things I have noticed with the R 2.15.0 RC and --as-cran is that I have to bump the version number of the working copy of my packages immediately after putting a version on CRAN, or I get a message about version suitability. This is probably a good thing for packages that I have changed, compared with my old habit of bumping the version number at arbitrary times, although the mechanics are a nuisance because I do not actually want to commit to the next version number at that point. For packages that I have not changed it is a bit worse, because I have to change the version number even though I have not yet made any changes to the package. This will mean, for example, that on R-forge it will look like there is a slightly newer version, even though there is not really. I am curious how other developers approach this. Is it better to not specify --as-cran most of the time? My feeling is that it is better to specify it all of the time so that I catch errors sooner rather than later, but maybe there is a better solution? Paul

On 12-03-27 07:52 AM, Prof Brian Ripley wrote: CRAN has for some time had a policies page at http://cran.r-project.org/web/packages/policies.html and we would like to draw this to the attention of package maintainers. In particular, please
- always send a submission email to c...@r-project.org with the package name and version on the subject line. Emails sent to individual members of the team will result in delays at best.
- run R CMD check --as-cran on the tarball before you submit it. Do this with the latest version of R possible: definitely R 2.14.2, preferably R 2.15.0 RC or a recent R-devel. (Later versions of R are able to give better diagnostics, e.g. for compiled code and especially on Windows. They may also have extra checks for recently uncovered problems.)
Also, please note that CRAN has a very heavy workload (186 packages were published last week) and to remain viable needs package maintainers to make its life as easy as possible. Kurt Hornik Uwe Ligges Brian Ripley
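For what it's worth, the version bump itself can be scripted. Here is a small illustrative helper (the name bumpVersion is mine, not an R utility) that increments the last component of a Version string in the formats DESCRIPTION files use:

```r
## Sketch: increment the last component of a package Version string,
## e.g. "1.2-3" -> "1.2-4", preserving whatever "." / "-" separators
## the original used.
bumpVersion <- function(v) {
    parts <- strsplit(v, "[.-]")[[1]]                  # numeric components
    seps  <- regmatches(v, gregexpr("[.-]", v))[[1]]   # separators, in order
    parts[length(parts)] <- as.integer(parts[length(parts)]) + 1L
    paste0(parts, c(seps, ""), collapse = "")
}

bumpVersion("1.2-3")   # "1.2-4"
bumpVersion("0.9.1")   # "0.9.2"
```

A script like this could rewrite the Version field right after a CRAN submission, so the working copy always differs from the released version.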
Re: [Rd] CRAN policies
On 12-03-27 10:59 AM, Uwe Ligges wrote: On 27.03.2012 16:17, Paul Gilbert wrote: One of the things I have noticed with the R 2.15.0 RC and --as-cran is that I have to bump the version number of the working copy of my packages immediately after putting a version on CRAN, or I get a message about version suitability. This is probably a good thing for packages that I have changed, compared with my old habit of bumping the version number at arbitrary times, although the mechanics are a nuisance because I do not actually want to commit to the next version number at that point. For packages that I have not changed it is a bit worse, because I have to change the version number even though I have not yet made any changes to the package. This will mean, for example, that on R-forge it will look like there is a slightly newer version, even though there is not really. I am curious how other developers approach this. Is it better to not specify --as-cran most of the time? My feeling is that it is better to specify it all of the time so that I catch errors sooner rather than later, but maybe there is a better solution? --as-cran is modelled rather closely after the CRAN incoming checks. CRAN checks if a new version has a new version number. Of course, you can ignore its result if you do not want to submit. The idea of using --as-cran is to apply it before you actually submit. Some parts require network connection etc. Uwe Yes but, for example, will R-forge run checks with --as-cran, and thus give warnings for any package unchanged from the one on CRAN, or run without --as-cran, and thus not give a true indication of whether the package is good to submit? (No doubt R-forge will customise more, but I am trying to work out a strategy for my own automatic testing.)
Paul On 12-03-27 07:52 AM, Prof Brian Ripley wrote: CRAN has for some time had a policies page at http://cran.r-project.org/web/packages/policies.html and we would like to draw this to the attention of package maintainers. In particular, please - always send a submission email to c...@r-project.org with the package name and version on the subject line. Emails sent to individual members of the team will result in delays at best. - run R CMD check --as-cran on the tarball before you submit it. Do this with the latest version of R possible: definitely R 2.14.2, preferably R 2.15.0 RC or a recent R-devel. (Later versions of R are able to give better diagnostics, e.g. for compiled code and especially on Windows. They may also have extra checks for recently uncovered problems.) Also, please note that CRAN has a very heavy workload (186 packages were published last week) and to remain viable needs package maintainers to make its life as easy as possible. Kurt Hornik Uwe Ligges Brian Ripley
Re: [Rd] CRAN policies
An associated problem, for the wish list, is that it would be nice for package developers to have a way to automatically distinguish between NOTEs that can usually be ignored (e.g. a package suggests a package that is not available for cross reference checks - I have several cases where the suggested package depends on the package being built, so this NOTE occurs all the time), and NOTEs that are really pre-WARNINGs, so that one can flag these and spend time fixing them before they become a WARNING or ERROR. Perhaps two different kinds of notes? (And, BTW, having been responsible for a certain amount of the motivation for this [*], I think --as-cran is great. [*] Since answering several emails a day about why their results were different was taking up far too much time.) Paul On 12-03-27 02:19 PM, Uwe Ligges wrote: On 27.03.2012 19:10, Jeffrey Ryan wrote: Is there a distinction as to NOTE vs. WARNING that is documented? I've always assumed (wrongly?) that NOTEs weren't an issue with publishing on CRAN, but that they may change to WARNINGs at some point. We won't kick packages off CRAN for Notes (but we will if Warnings are not fixed), but we may not accept new submissions with significant Notes. Best, Uwe Ligges Is the process by which this happens documented somewhere? Jeff On 3/27/12 11:09 AM, Gabor Grothendieck <ggrothendi...@gmail.com> wrote: 2012/3/27 Uwe Ligges <lig...@statistik.tu-dortmund.de>: On 27.03.2012 17:09, Gabor Grothendieck wrote: On Tue, Mar 27, 2012 at 7:52 AM, Prof Brian Ripley <rip...@stats.ox.ac.uk> wrote: CRAN has for some time had a policies page at http://cran.r-project.org/web/packages/policies.html and we would like to draw this to the attention of package maintainers. In particular, please - always send a submission email to c...@r-project.org with the package name and version on the subject line. Emails sent to individual members of the team will result in delays at best. - run R CMD check --as-cran on the tarball before you submit it.
Do this with the latest version of R possible: definitely R 2.14.2, preferably R 2.15.0 RC or a recent R-devel. (Later versions of R are able to give better diagnostics, e.g. for compiled code and especially on Windows. They may also have extra checks for recently uncovered problems.) Also, please note that CRAN has a very heavy workload (186 packages were published last week) and to remain viable needs package maintainers to make its life as easy as possible. Regarding the part about warnings or significant notes in that page, it's impossible to know which notes are significant and which ones are not significant except by trial and error. Right, it needs human inspection to identify false positives. We believe most package maintainers are able to see whether he or she is hit by such a false positive. The problem is that a note is generated and the note is correct. It's not a false positive. But that does not tell you whether it's significant or not. There is no way to know. One can either try to remove all notes (which may not be feasible) or just upload it and by trial and error find out if it's accepted or not. -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
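Pending any official distinction between kinds of NOTEs, one stopgap is to post-process the check output oneself. A sketch, using an inline sample of illustrative check-log text (the log lines and the "benign" pattern are mine, not real output from any particular package):

```r
## Sketch: classify NOTE headings in "R CMD check"-style output against a
## personal whitelist of patterns known to be benign for this package.
log <- c("* checking package dependencies ... NOTE",
         "Package suggested but not available for checking: 'tframePlus'",
         "* checking R code for possible problems ... NOTE",
         "possible error in somefun(): unused argument")

noteHeads <- grep("\\.\\.\\. NOTE$", log)               # lines ending "... NOTE"
benign    <- grepl("dependencies", log[noteHeads])      # whitelist: my own choice

data.frame(check = log[noteHeads], benign = benign)
```

This does not remove the need for human judgement, which is Uwe's point above, but it lets the recurring, already-inspected NOTEs be flagged automatically in one's own test runs.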
Re: [Rd] RC / methods package
On 12-03-25 05:29 PM, Paul Gilbert wrote: John, Here is the definition of the TSMySQLConnection class, and a few other things. This is a simplified example that produces the message, but unfortunately will not work unless you have a MySQL database to connect to. (I do get the same problem with PostgreSQL, and may with SQLite, but I have not tested the last yet.)

require("methods")
require("DBI")
require("RMySQL")
setClassUnion("OptionalPOSIXct", c("POSIXct", "logical"))
setClass("conType", representation(drv = "character", "VIRTUAL"))
setClass("TSdb", representation(dbname = "character",
    hasVintages = "logical", hasPanels = "logical", "VIRTUAL"))
setClass("TSMySQLConnection",
    contains = c("MySQLConnection", "conType", "TSdb"))
setGeneric("TSconnect",
    def = function(drv, dbname, ...) standardGeneric("TSconnect"))
setMethod("TSconnect", signature(drv = "MySQLDriver", dbname = "character"),
    definition = function(drv, dbname, ...) {
        con <- dbConnect(drv, dbname = dbname, ...)
        new("TSMySQLConnection", con, drv = "MySQL", dbname = dbname,
            hasVintages = dbExistsTable(con, "vintageAlias"),
            hasPanels   = dbExistsTable(con, "panels"))
    })

con <- TSconnect(dbDriver("MySQL"), "test")
dbGetQuery(con, "show tables")

Note: Method with signature "MySQLConnection#integer" chosen for function "coerce", target signature "TSMySQLConnection#integer". "dbObjectId#integer" would also be valid
  Tables_in_test
1              A
2              B

The message also seems to go away, even quitting R and restarting to clear the cache, if I change the TSconnect method as follows:

setMethod("TSconnect", signature(drv = "MySQLDriver", dbname = "character"),
    definition = function(drv, dbname, ...) {
        con <- dbConnect(drv, dbname = dbname, ...)
        new("TSMySQLConnection", con, drv = "MySQL", dbname = dbname,
            hasVintages = FALSE, hasPanels = FALSE)
    })

Why this would happen makes absolutely no sense to me. In the first version, is dbExistsTable(con, "vintageAlias") left unevaluated in the result from new()? This is very strange. With

setMethod("TSconnect", signature(drv = "MySQLDriver", dbname = "character"),
    definition = function(drv, dbname, ...)
    {
        con <- dbConnect(drv, dbname = dbname, ...)
        hasVintages <- as.logical(dbExistsTable(con, "vintageAlias"))
        hasPanels   <- as.logical(dbExistsTable(con, "panels"))
        new("TSMySQLConnection", con, drv = "MySQL", dbname = dbname,
            hasVintages = FALSE, hasPanels = FALSE)
    })

I get the note, but if I remove the two lines that appear to do nothing:

setMethod("TSconnect", signature(drv = "MySQLDriver", dbname = "character"),
    definition = function(drv, dbname, ...) {
        con <- dbConnect(drv, dbname = dbname, ...)
        new("TSMySQLConnection", con, drv = "MySQL", dbname = dbname,
            hasVintages = FALSE, hasPanels = FALSE)
    })

I no longer get the note. I am restarting R each time to be sure nothing is cached. [R version 2.15.0 RC (2012-03-25 r58832)] Paul As you can tell, I'm struggling a bit with interpreting the information from the note. Also, if it were a warning I could set it to stop, and then traceback to what was causing the problem. As it is, it took me a fairly long time just to get to the fact that the call to dbGetQuery() was generating the message. And caching the methods may be good for performance, but when things change the second time you call them it sure makes debugging difficult. Best, Paul On 12-03-25 03:24 PM, John Chambers wrote: On 3/24/12 5:43 PM, Paul Gilbert wrote: On 12-03-24 08:11 PM, John Chambers wrote: On 3/24/12 1:29 PM, Paul Gilbert wrote: (I think this is being caused by the new methods package in RC.) Possibly, but the methods package isn't particularly new in its method selection. We need to see the definition of the class. Is there a way to know which class it is that we need to see the definition for? It's in the note: 'target signature "TSMySQLConnection#integer"'. In functional OOP with multiple dispatch, it's all the classes that matter in general, but in this and most cases, one class is likely the relevant one: TSMySQLConnection. That was why I said what I did before.
(We could go to a bit more effort and back-translate the dispatch string "TSMySQLConnection#integer" into the corresponding formal arguments. Would be more natural with the INSTALL-time tool I mentioned before. That's the real challenge here -- to give information about this to the package developer, not the poor user.) John Paul The note implies that it inherits from both MySQLConnection and dbObjectId, both of which have methods for coercing to integer. Hence the ambiguity. In the RC (March 24) some of my packages are generating a Note:

Note: Method with signature "MySQLConnection#integer" chosen for function "coerce", target signature "TSMySQLConnection#integer". "dbObjectId#integer" would also be valid

This is coming from a call to dbGetQuery() in package DBI. The method with the signature "TSMySQLConnection#integer" is generated automatically because TSMySQLConnection inherits from MySQLConnection. (More details below.) Is there a way
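The ambiguity the note reports can be reproduced without any database. A self-contained sketch with the DBI classes replaced by toy classes (the names A, B, and C are mine; whether a selection note is actually printed depends on the R version):

```r
library(methods)

## Two unrelated parent classes, each with its own coerce method to integer.
setClass("A", representation(x = "numeric"))
setClass("B", representation(y = "numeric"))
setAs("A", "integer", function(from) 1L)
setAs("B", "integer", function(from) 2L)

## A class inheriting from both: the coerce target "C#integer" now has two
## valid inherited candidates ("A#integer" and "B#integer"), which is the
## situation the note describes for TSMySQLConnection.
setClass("C", contains = c("A", "B"))

as(new("C"), "integer")  # dispatch must pick one of the two parent methods
```

This mirrors the TSMySQLConnection case: the class inherits coerce-to-integer methods from both MySQLConnection and (via it) dbObjectId, and method selection has to choose between them.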