Re: [R-pkg-devel] Problem with loading package "devtools" from CRAN.

2024-04-29 Thread Berwin A Turlach
G'day Rolf,

On Tue, 30 Apr 2024 01:21:15 +
Rolf Turner  wrote:

> Previously I got an error message from
> install.packages("devtools",lib="/home/rolf/Rlib")  
> but now of course I cannot reproduce it.

Presumably the install.packages() invocation did not produce an error
message but a subsequent "library(devtools)" did?

I am not sure how remotes::install_github() behaves.  Are you using R
directly or via RStudio?  RStudio redefines the behaviour of
install.packages() so I am not sure what would happen if you type that
command into an R session running in RStudio.

As far as I remember, R's install.packages(), as you would invoke it if
you used R directly, installs the requested packages and any of its
(Depends/Imports) dependencies if these dependencies do not exist in
your libraries.  As devtools's dependencies must have existed on your
system, your command only re-installed devtools but none of its
dependencies.

And it must be one of the dependencies that uses compiled code and
created the problem, as a "packageDescription("devtools")" actually
shows "NeedsCompilation: no".
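
To narrow down which dependency is the likely culprit, one could list
the installed dependencies of devtools that contain compiled code.  The
following is only a sketch: compiled_deps() is a hypothetical helper
name (not part of any package), and it assumes that the
NeedsCompilation field is recorded for the installed packages.

```r
# Hypothetical helper (not part of any package): names of the installed
# recursive Depends/Imports/LinkingTo dependencies of 'pkg' that
# declare "NeedsCompilation: yes".
compiled_deps <- function(pkg, ip = installed.packages()) {
  deps <- tools::package_dependencies(
    pkg, db = ip,
    which = c("Depends", "Imports", "LinkingTo"),
    recursive = TRUE
  )[[pkg]]
  hit <- ip[, "Package"] %in% deps & ip[, "NeedsCompilation"] %in% "yes"
  unique(unname(ip[hit, "Package"]))
}

compiled_deps("devtools")
```

Any package in that list that was built under the previous R version
would be a candidate for re-installation.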

You should really execute an
update.packages(lib="/home/rolf/Rlib", checkBuilt=TRUE)
whenever you upgrade your R version, definitely when the upgrade
involves changing the major/minor version. 



Re: [R-pkg-devel] Problem with loading package "devtools" from CRAN.

2024-04-29 Thread Berwin A Turlach
G'day Rolf,

hope all is well.

On Mon, 29 Apr 2024 01:19:50 +
Rolf Turner  wrote:

> Executive summary:
> > The devtools package on CRAN appears to be broken.
> > Installing devtools from github (using remotes::install_github())
> > seems to give satisfactory results.  

I somehow have not shared this experience.  But then I compile on my
machines (Ubuntu 22.04.4) R and packages from source.

For me the devtools package from CRAN seems to work fine with R 4.4.0.

> [...] I thereby obtained what I believe is the latest version of R
> (4.4.0 (2024-04-24)).

Well, what happens if you start R and enter "R.version"?  That should
confirm whether you run R 4.4.0. :)

If you run R 4.4.0 but previously ran R 4.3.x then you are running now
a version with a new minor version number.  AFAIK, there is no
guarantee that at the interface level to compiled code R versions
remain compatible when the minor version number changes.

So on machines where I do not compile from source, I usually run
"update.packages(checkBuilt=TRUE)" whenever I upgrade R on those
machines, definitely if the upgrade involves a change in the major or
minor version number. 
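
As a rough illustration of what checkBuilt=TRUE looks for, the
following sketch lists installed packages whose "Built" field belongs
to a different major.minor series than the running R.  The helper name
built_under_other_series() is made up for this example, and the
comparison on the first two version components is an assumption about
what counts as "stale", not a description of how update.packages() is
implemented.

```r
# Hypothetical helper: TRUE where the "Built" version belongs to a
# different major.minor series than 'current'.
built_under_other_series <- function(built, current = getRversion()) {
  series <- function(v) {
    parts <- strsplit(as.character(v), ".", fixed = TRUE)
    vapply(parts, function(p) paste(p[1:2], collapse = "."), character(1))
  }
  unname(series(built) != series(current))
}

ip <- installed.packages()
stale <- ip[built_under_other_series(ip[, "Built"]),
            c("Package", "LibPath", "Built"), drop = FALSE]
print(stale, quote = FALSE)
```

An empty result would suggest that all installed packages were built
under the current R series.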

Did you try updating your packages?

Well, you probably first want to find out where exactly the Debian R
distribution installs additional packages (Dirk might help there), and
it might be that you have to run either (or all) of:

update.packages(lib="/usr/local/lib/R/site-library", checkBuilt=TRUE)
update.packages(lib="/usr/lib/R/site-library", checkBuilt=TRUE)
update.packages(lib="/usr/lib/R/library", checkBuilt=TRUE)

though, from a quick look at my Ubuntu machine, it may be that the last
two locations are not writeable for you as user, and the first one only
if you belong to the group "staff" as normal user on your machine.
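
Which libraries an R session actually searches, and which of them are
writeable for the current user, can be checked from within R.  A
minimal sketch (the variable names are only illustrative):

```r
libs <- .libPaths()
# file.access() returns 0 on success and -1 on failure;
# mode = 2 tests for write permission
writable <- file.access(libs, mode = 2) == 0
print(data.frame(library = libs, writable = unname(writable)))
```

Only the libraries reported as writeable can be updated via the lib
argument of update.packages() without elevated privileges.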

> A bit of web-searching got me to a post on github by Henrik Bengtsson,
> which referred to the devtools problem. 

Could you provide a link?  I could not find anything relevant. :)

> However there seems to be a problem with the devtools package on
> CRAN, which ought to be fixed.

Or, perhaps, just that the interface to compiled code changed from R
4.3.x to R 4.4.0 and, hence, all packages that rely on compiled code
must be reinstalled.

Stay safe.



Re: [R-pkg-devel] Native pipe in package examples

2024-01-25 Thread Berwin A Turlach
On Thu, 25 Jan 2024 09:44:26 -0800
Henrik Bengtsson  wrote:

> On Thu, Jan 25, 2024 at 9:23 AM Berwin A Turlach
>  wrote:
> >
> > G'day Duncon,

Uups, apologies for the misspelling of your name Duncan.  Fingers were
too fast. :)

> > But you could always code your example (not tested :-) ) along lines
> > similar to:
> >
> > if( with(version, all(as.numeric(c(major, minor)) >= c(4, 1))) ){
> >   ## code that uses native pipe
> > }else{
> >   cat("You have to upgrade to R >= 4.1.0 to run this example\n")
> > }  
> That will unfortunately not work in this case, because |> is part of
> the new *syntax* that was introduced in R 4.1.0.  Older versions of R
> simply doesn't understand how to *parse* those two symbols next to
> each other, e.g.
> {R 4.1.0}> parse(text = "1:3 |> sum()")
> expression(1:3 |> sum())
> {R 4.0.5}> parse(text = "1:3 |> sum()")
> Error in parse(text = "1:3 |> sum()") : <text>:1:6: unexpected '>'
> 1: 1:3 |>
>          ^
> In order for R to execute some code, it needs to be able to parse it
> first. Only then, it can execute it.  So, here, we're not even getting
> past the parsing phase.

Well, notwithstanding 'fortune(181)', you could code it as:

if( with(version, all(as.numeric(c(major, minor)) >= c(4, 1))) ){
  cat(eval(parse(text="1:3 |> sum()")), "\n")
}else{
  cat("You have to upgrade to R >= 4.1.0 to run this example\n")
}

Admittedly, it would be easier to say "Depends: R (>= 4.1.0)" in the
DESCRIPTION file.
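
A variant of the same idea, sketched with getRversion(), which compares
complete version numbers.  Comparing major and minor separately, as in
the test on 'version', would come out FALSE for a hypothetical R 5.0.0
(its minor component is 0).  The function name run_pipe_example() is
made up for illustration.

```r
run_pipe_example <- function() {
  if (getRversion() >= "4.1.0") {
    # The pipe stays inside a string, so older parsers never see it
    eval(parse(text = "1:3 |> sum()"))
  } else {
    message("You have to upgrade to R >= 4.1.0 to run this example")
    invisible(NULL)
  }
}

run_pipe_example()
```

Keeping the pipe inside a string is what lets older versions of R parse
the file at all; the price is that the example code is no longer
checked by the parser.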



Re: [R-pkg-devel] Native pipe in package examples

2024-01-25 Thread Berwin A Turlach
G'day Duncon,

On Thu, 25 Jan 2024 11:27:50 -0500
Duncan Murdoch  wrote:

> On 25/01/2024 11:18 a.m., Henrik Bengtsson wrote:
> I think you're right that syntax errors in help page examples will be 
> installable, but I don't think there's a way to make them pass "R CMD 
> check" other than wrapping them in \dontrun{}, and I don't know a way
> to do that conditional on the R version.

I remember vaguely that 'S Programming' was discussing some nifty
tricks to deal with differences between S and R, and how to write code
that would work with either.  If memory serves correctly, those
tricks depended on whether a macro called using_S (using_R?) was
defined. Not sure if the same tricks could be used to distinguish
between different versions of R.

But you could always code your example (not tested :-) ) along lines
similar to:

if( with(version, all(as.numeric(c(major, minor)) >= c(4, 1))) ){
  ## code that uses native pipe
}else{
  cat("You have to upgrade to R >= 4.1.0 to run this example\n")
}

> I would say that a package that doesn't pass "R CMD check" without 
> errors shouldn't be trusted.

Given the number of packages on CRAN and Murphy's law (or equivalents),
I would say that there are packages that do pass "R CMD check" without
errors but shouldn't be trusted, own packages not excluded. :)



Re: [Rd] A potential POSIXlt->Date bug introduced in r-devel

2022-10-06 Thread Berwin A Turlach
G'day all,

On Thu, 6 Oct 2022 10:15:29 +0200
Martin Maechler  wrote:

> > Davis Vaughan 
> > on Wed, 5 Oct 2022 17:04:11 -0400 writes:  

> > # Weird, where is the `NA`?
> > as.Date(x)  
> > #> [1] "2013-01-31" "1970-01-01" "2013-03-31"  
> > ```  
> I agree that the above is wrong, i.e., a bug in current  R-devel.

I have no intention of hijacking this thread, but I wonder whether this
is a good opportunity to mention that the 32 bit build of R-devel falls
over on my machine since 25 September.  It fails one of the regression
tests in reg-tests-1d.R.  The final lines of the output are:

> tools::Rd2txt(rd, out <- textConnection(NULL, "w"), fragment = TRUE)
> stopifnot(any(as.character(rd) != "\n"),
+   identical(textConnectionValue(out)[2L], "LaTeX"));
> ## empty output in R <= 4.2.x
> ## as.POSIXlt()  gave integer overflow
> stopifnot(as.POSIXlt(.Date(2^31 + 10))$year == 5879680L)
Error: as.POSIXlt(.Date(2^31 + 10))$year == 5879680L is not TRUE
Execution halted

I should have reported this earlier, but somehow did not find the time
to do so.  So I thought I would mention it here. :)



Re: [Rd] Trying to compile R 4.2.x on Linux as 32bit sub-architecture

2022-06-18 Thread Berwin A Turlach
G'day all,

On Sat, 18 Jun 2022 22:58:19 +0800
Berwin A Turlach  wrote:

> [...] I attach the relevant file from trying to compile R-patched
> during last night's run. 

Mmh, on the web-interface to the mailing list I see that the attachment
might have been deleted.  Perhaps because it was too large?

So below are the start and the final part of the output, which show
where the error occurs.



R version 4.2.1 RC (2022-06-17 r82501) -- "Funny-Looking Kid"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu/32 (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> ## Regression tests for R >= 3.4.0
[...very large snip...]
> ## pretty(x) when range(x) is finite but diff(range(x)) is +/- Inf:
> B <- 1e308; 2*B; (s <- seq(-B,B,length.out = 3))
[1] Inf
[1] -1e+308   0e+00  1e+308
> options(warn=1) # => warnings *as they happen*
> (ps <- pretty(c(-B,B)))
[1] -1e+308 -5e+307   0e+00  5e+307  1e+308
> ## Warning in pretty.default(c(-B, B)) :
> ##   Internal(pretty()): very large range 4e+307, corrected to 2.24712e+307
> nps <- length(ps)
> dd <- sum((dps <- diff(ps))/length(dps)) # mean w/o overflow
> epsC <- .Machine$double.eps
> relD <- (dps/dd - 1)/epsC
> relEr <- function(f, y) abs((f-y)/(f+y)*2) # cheap relative error, |f| > 0 !
> stopifnot(is.finite(mean(ps)), ## these all failed without "long-double"
+   is.finite(mdp <- mean(dps)),
+   all.equal(dd, mdp, tolerance=1e-15))
> stopifnot(relEr(c(-B,B), ps[c(1L,nps)]) <= 4*epsC,
+   -8 <= relD, relD <= 8) # seen [-1.5,.., 3.0]; w/o long-double: [-5, .., 4]
> ## ps was   0 Inf Inf Inf Inf Inf Inf Inf Inf Inf  0 , in R <= 4.1.0
> f. <- c(-1.797, -1.79, -1.75, seq(-1.7, -1, by=.1))
> stopifnot(!is.unsorted(f.)) ; f.nm <- setNames(, f.)
> fmtRng <- function(x) paste(format(range(x)), collapse=", ")
> ns <- c(2:12, 15, 20, 30, 51, 100, 2001, 1e5)
> nms.n <- formatC(ns, digits=0, format="f")
> nmsRng <- c(t(outer(paste0("r",1:2), c("lo","hi"), paste, sep=".")))
> rr <- matrix(NA, length(ns), 4, dimnames=list(nms.n, nmsRng))
> for(i.n in seq_along(ns)) {
+ n <- ns[i.n]
+ cat("n = ", n,":\n\n")
+ pBL <- lapply(f., function(f) structure(pretty(c(f*1e308, 2^1023.9), n), 
+ ## -> a warning per f
+ n.s <- lengths(pBL) # how close to target 'n' ??
+ cat("lengths(.) in [", fmtRng(n.s), "]\n")
+ if(n <= 15) stopifnot(n.s <= 20)# seen {14,..,17}
+ else stopifnot(abs(n.s/n - 1) <= 1/2)
+ if(n) cat("length(.) <> n relative err in [", fmtRng(n.s/n - 1), "]\n")
+ ## .pretty(*, bounds=FALSE) :
+ prM <- t(sapply(f.nm, function(f)
+ unlist( .pretty(c(f*1e308, 2^1023.9), n, bounds=FALSE) )))
+ print(prM)
+ luM <- prM[,c("ns","nu")] * prM[,"unit"] # the pretty-scaled unit
+ r1 <- luM[,"ns"] / (f.nm*1e308)
+ rr[i.n, 1:2] <- r1 <- range(r1)
+ cat(sprintf("range(r1): [%g, %g]\n", r1[1], r1[2]))
+ r2 <- luM[,"nu"] / 2^1023.9
+ rr[i.n, 3:4] <- r2 <- range(r2)
+ cat(sprintf("range(r2): [%g, %g]\n", r2[1], r2[2]))
+ stopifnot(exprs = { is.matrix(prM)
+ prM[,"nu"] - prM[,"ns"] == prM[,"n"] # could differ, but not for this data
+ identical(colnames(prM), c("ns", "nu", "n", "unit"))
+ ## These bounds depend on 'n' :
+ r1 >= if(n <= 12) 0.55 else 0.89
+ r1 <= if(n <= 15) 1.4  else 1.1
+ r2 >= if(n <= 12) 0.58 else 0.95
+ r2 <= if(n <= 15) 1    else 1.025
+ })
+ invisible(lapply(pBL, function(ps) {
+ mdB <- sum((dB <- diff(ps))/length(dB))
+ rd <- dB/mdB - 1 # relative differences
+ ## print(range(rd))
+ x <- c(attr(ps,"f")*1e308, 2^1023.9)
+ stopifnot(if(n >= 1) abs(rd) <= n * 3e-15 else TRUE,
+   ps[1] <= x[1] , x[2] <= ps[length(ps)])
+ }))
+ }
n =  2 :

Warning in pretty.default(c(f * 1e+308, 2^1023.9), n) :
  R_pretty(): very large range 'cell

Re: [R-pkg-devel] Question about preventing CRAN package archival

2021-06-03 Thread Berwin A Turlach
G'day Jeff,

On Wed, 02 Jun 2021 11:34:21 -0700
Jeff Newmiller  wrote:

Not that I want to get involved in old discussions :), but...

> MIT is more permissive than GPL2,

... this statement depends on how one defines "permissive".

MIT requires that you fulfil: "The above copyright notice and this
permission notice shall be included in all copies or substantial
portions of the Software."

Whereas GPL 2 merely requires that you "[...] conspicuously and
appropriately publish on each copy an appropriate copyright notice and
disclaimer of warranty [...]".

Thus, arguably, GPL 2 is more permissive.

>  so there is a collision there. 

Well, luckily the FSF does not think that the MIT license is
incompatible with the GPL license, though it finds the term "MIT
license" misleading and discourages its use, see



Re: [Rd] Development version of R fails tests and is not installed

2020-02-16 Thread Berwin A Turlach
G'day Jeroen,

On Sun, 9 Feb 2020 01:04:24 +0100
Jeroen Ooms  wrote:

> I think the intention was to add something similar in R's autoconf
> script to enable sse on 32-bit unix systems, but seemingly this hasn't
> happened. For now I think you should be able to make your 32-bit
> checks succeed if you build R with CFLAGS=-mfpmath=sse -msse2.

Just for the record, adding

  CFLAGS="-mfpmath=sse -msse2"

to the file used to compile the 32bit version of R's
development version fixed the problem indeed.  The installation script
ran from the command line without error to the end, and has done so
every day since then at its crontab'd time.

Looks as if it would be good indeed if R's autoconf script would enable
sse on 32-bit unix systems. :)

Thank you for the solution.



[Rd] Development version of R fails tests and is not installed

2020-02-08 Thread Berwin A Turlach
G'day all,

I have daily scripts running to install the patched version of the
current R version and the development version of R on my linux box
(Ubuntu 18.04.4 LTS).

The last development version that was successfully compiled and
installed was "R Under development (unstable) (2020-01-25 r77715)" on
27 January.  Since then the script always fails as a regression test
seems to fail.  Specifically, in the tests/ subdirectory of my build
directory I have a file which ends with:

> ## more than half of the above were rounded *down* in R <= 3.6.x
> ## Some "wrong" test cases from CRAN packages (partly relying on wrong R <= 3.6.x behavior)
> stopifnot(exprs = {
+ all.equal(round(10.7775, digits=3), 10.778, tolerance = 1e-12) # even tol=0, was 10.777
+ all.equal(round(12345 / 1000,   2), 12.35 , tolerance = 1e-12) # even tol=0, was 12.34 in Rd
+ all.equal(round(9.18665, 4),9.1866, tolerance = 1e-12) # even tol=0, was  9.1867
+ })
Error: round(10.7775, digits = 3) and 10.778 are not equal:
  Mean relative difference: 9.27902e-05
Execution halted

This happens while the 32bit architecture is installed,  which is a bit
surprising as I get the following results for the last installed
version of R's development version:

R Under development (unstable) (2020-01-25 r77715) -- "Unsuffered Consequences" 
Copyright (C) 2020 The R Foundation for Statistical Computing 
Platform: x86_64-pc-linux-gnu/32 (32-bit)
> round(10.7775, digits=3)
[1] 10.778


R Under development (unstable) (2020-01-25 r77715) -- "Unsuffered Consequences" 
Copyright (C) 2020 The R Foundation for Statistical Computing 
Platform: x86_64-pc-linux-gnu/64 (64-bit) 
> round(10.7775, digits=3)
[1] 10.778

On the other hand, the R 3.6.2 version, that I mainly use at the moment,
gives the following results:

R version 3.6.2 (2019-12-12) -- "Dark and Stormy Night"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu/32 (32-bit)
> round(10.7775, digits=3)
[1] 10.777


R version 3.6.2 (2019-12-12) -- "Dark and Stormy Night"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu/64 (64-bit)
> round(10.7775, digits=3)
[1] 10.777

So it seems as if the behaviour of round() has changed between R 3.6.2
and the development version.  But I do not understand why this test all
of a sudden failed if the results from the last successfully installed
development version of R suggest that the test should be passed.
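
When comparing builds it can help to look at what is actually stored
for 10.7775 and at the floating-point characteristics of each build.
The lines below are only a diagnostic sketch; they print values and do
not assert which result is "right", and the "multiply, round, divide"
description is an illustrative assumption, not a statement about R's
implementation.

```r
x <- 10.7775
stored <- sprintf("%.20f", x)         # decimal expansion of the double nearest to 10.7775
scaled <- sprintf("%.20f", x * 1000)  # value a "multiply, round, divide" scheme would see
cat(stored, scaled, sep = "\n")

.Machine$sizeof.pointer     # 4 on a 32-bit build, 8 on a 64-bit build
.Machine$sizeof.longdouble  # 0 if R was built without long double support
```

Running this under both sub-architectures would show whether the two
builds agree on the intermediate values.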

Thanks in advance for any insight and tips.



[Rd] Development version of R fails tests and is not installed

2019-03-05 Thread Berwin A Turlach
G'day all,

I have daily scripts running to install the patched version of the
current R version and the development version of R on my linux box
(Ubuntu 18.04.2 LTS).

The last development version that was successfully compiled and
installed was "R Under development (unstable) (2019-02-25 r76159)" on
26 February.  Since then the script always fails as a regression test
seems to fail.  Specifically, in the tests/ subdirectory of my build
directory I have a file which ends with:

> ## checking ar.yw.default() multivariate case
> estd <- ar(unclass(y) , aic = FALSE, order.max = 2) ## Estimate VAR(2)
> es.d <- ar(unclass(y.), aic = FALSE, order.max = 2, na.action=na.pass)
> stopifnot(exprs = {
+ all.equal(est$ar[1,,], diag(0.8, 2), tol = 0.08)# seen 0.0038
+ all.equal(est[1:6], es.[1:6], tol = 5e-3)
+ all.equal(estd$x.mean, es.d$x.mean, tol = 0.01) # seen 0.0023
+ all.equal(estd[c(1:3,5:6)],
+   es.d[c(1:3,5:6)], tol = 1e-3)## seen {1,3,8}e-4
+ all.equal(lapply(estd[1:6],unname),
+   lapply(est [1:6],unname), tol = 2e-12)# almost identical
+ all.equal(lapply(es.d[1:6],unname),
+   lapply(es. [1:6],unname), tol = 2e-12)
+ })
Error: lapply(es.d[1:6], unname) and lapply(es.[1:6], unname) are not equal:
  Component "aic": Mean relative difference: 3.297178e-12
Execution halted

Would it be possible to make this tolerance more lenient?  In case it
matters, I am configuring R to be compiled using Openblas and this test
fails for the 64 bit installation; the 32 bit installation seems to
pass all tests.
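
To illustrate how close the failure is to the limit: a relative
difference of the size reported in the log fails with tol = 2e-12 but
would pass with a slightly larger tolerance.  The value 1e-11 below is
merely a hypothetical choice for illustration, not a proposal for the
actual test.

```r
target  <- 1
current <- 1 + 3.297178e-12   # relative difference of the size reported in the log

strict  <- isTRUE(all.equal(target, current, tolerance = 2e-12))
lenient <- isTRUE(all.equal(target, current, tolerance = 1e-11))
c(strict = strict, lenient = lenient)   # strict is FALSE, lenient is TRUE
```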

Happy to provide any more information/context that might be needed.



Re: [R-pkg-devel] Cannot submit package due to misspell note

2018-04-14 Thread Berwin A Turlach
G'day all,

On Sat, 7 Apr 2018 10:00:36 +0100
David Sterratt  wrote:

> On the subject of spell-checking, to avoid false positives when I'm 
> checking the package, in the directory above the package directory I 
> create a file called .spell_ignore with one word per line, [...]
> All the false positives still come through in the CRAN check, but it 
> makes checking easier for me.

I mentioned to David that 
could prove enlightening regarding false positives when spell checking
R packages, but apparently forgot to CC that e-mail to the list.

Best wishes,


[Rd] Problem compiling R patched and R devel on Ubuntu

2017-08-03 Thread Berwin A Turlach
G'day all,

for about a week now my daily re-compilations of R patched and R devel
have been falling over, i.e. they stop with an error during "make
check" (while building the 32 bit architecture) on my Ubuntu 16.04.3
LTS machine.  Specifically, a test in graphics-Ex.R seems to fail and
the last lines of the output are:

  > ## Extreme outliers; the "FD" rule would take very large number of
  > XXL <- c(1:9, c(-1,1)*1e300)
  > hh <- hist(XXL, "FD") # did not work in R <= 3.4.1; now gives
  Warning in hist.default(XXL, "FD") :
'breaks = 4.44796e+299' is too large and set to 1e9
  Error in pretty.default(range(x), n = breaks, min.n = 1) : 
cannot allocate vector of length 11
  Calls: hist -> hist.default -> pretty -> pretty.default
  Execution halted

My R 3.4.1 installation, the last R patched version that I could
compile (R version 3.4.1 Patched (2017-07-26 r72974)) and the last R
devel version that I could compile (R Under development (unstable)
(2017-07-26 r72974)) give the following results (under the 32bit
architecture and the 64bit architecture):

  > XXL <- c(1:9, c(-1,1)*1e300)
  > hh <- hist(XXL, "FD")
  Error in pretty.default(range(x), n = breaks, min.n = 1) : 
invalid 'n' argument
  In addition: Warning message:
  In pretty.default(range(x), n = breaks, min.n = 1) :
NAs introduced by coercion to integer range

Not sure if this is a general problem, or only a problem on my machine.
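
The size of the 'breaks' value in the warning can be reproduced by hand
from the Freedman-Diaconis rule (bin width 2 * IQR * n^(-1/3)).  This
is only a sketch of the arithmetic; grDevices::nclass.FD() is the
function that hist() actually consults for breaks = "FD".

```r
XXL <- c(1:9, c(-1, 1) * 1e300)

# Freedman-Diaconis bin width: the IQR ignores the two extreme outliers
h <- 2 * IQR(XXL) * length(XXL)^(-1/3)

# Number of bins needed to cover the full range with that width
n_bins <- diff(range(XXL)) / h
n_bins   # about 4.448e+299, the value quoted in the warning
```

With the interquartile range unaffected by the outliers but the range
spanning 2e300, the requested number of bins becomes astronomically
large, which is what pretty() then has to cope with.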



Re: [Rd] [patch] Error in reg-tests-1c.R (R-devel)

2016-05-18 Thread Berwin A Turlach
G'day Martin,

On Wed, 18 May 2016 12:50:21 +0200
Martin Maechler  wrote:

> > Mikko Korpela 
> > on Wed, 18 May 2016 13:05:24 +0300 writes:
> > I get an error when running "make check" after building
> > R-devel r70629 on Ubuntu 14.04. 
> > Here are the relevant
> > lines in the file "":
> This is ..hmm.. "interesting".  We have a few other non-ASCII
> characters in a few of the tests/*.R  files  and they don't seem to
> harm your checks; even  reg-tests-1c.R  contains some.
> Also, the "Installation and Administration" R Manual mentions
> that some of the tests only run flawlessly if you are not using
> "unusual" locales.  So I am a bit puzzled that exactly this
> (new) test fails in your locale, but the others did not.

Well, my nightly script had also failed to complete due to the same
problem.  But I usually wait a day or two before reporting such a problem, 
in the hope that the problem sorts itself out. :)

But to confirm this issue:

* My (bash) script sets:
export LANG=en_AU.UTF-8
* The crontab entry that runs it is:
44 5 * * * cd /opt/src ; /usr/bin/xvfb-run ./R-aop-Doit
* The relevant part of the output says:

> ## m1z uses match(x, *) with length(x) == 1 and failed in R 3.3.0 
> ## PR#16909 - a consequence of the match() bug; check here too:
> dv <- data.frame(varé1 = 1:3, varé2 = 3); dv[,"varé2"] <- 2
Error: unexpected input in "dv <- data.frame(var�"
Execution halted
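
Since a cron job does not necessarily inherit the locale of an
interactive login, it may be worth checking what R itself sees.  A
small diagnostic sketch (the test string and variable names are made up
for illustration):

```r
Sys.getlocale("LC_CTYPE")        # character-type locale of this session
utf8 <- l10n_info()[["UTF-8"]]   # TRUE if the session runs in a UTF-8 locale
utf8

# Try to parse a non-ASCII identifier from a string with a declared encoding
src <- "dv <- data.frame(var\u00e91 = 1:3)"
ok <- !inherits(try(parse(text = src, encoding = "UTF-8"), silent = TRUE),
                "try-error")
ok
```

Running these lines via the same crontab entry would show whether the
exported LANG actually reaches the R process.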



[Rd] Building R-patched and R-devel fails

2016-04-17 Thread Berwin A Turlach
G'day all,

probably you have noticed this by now, but I thought I ought to report
it. :)

My scripts that update the SVN sources for R-patched and R-devel, run
`tools/rsync-recommended' (for both) and then install both these
versions from scratch failed this morning.  Apparently the new version
of the recommended package `survival' depends on the recommended
package `Matrix', but the makefile does not ensure that Matrix is built
before survival.  So my log files had entries:

  ERROR: dependency ‘Matrix’ is not available for package ‘survival’
  * removing ‘/opt/src/R-devel-build/library/survival’
  Makefile:51: recipe for target 'survival.ts' failed
  make[2]: *** [survival.ts] Error 1
  make[2]: *** Waiting for unfinished jobs

Presumably, in both branches, the makefiles in
src/library/Recommended have to be adapted to contain the following
line at the end among the "Hardcoded dependencies":

survival.ts: Matrix.ts



Re: [Rd] Small inaccuracy in the Writing R Extensions manual

2016-01-12 Thread Berwin A Turlach
G'day Duncan,

On Tue, 12 Jan 2016 07:32:05 -0500
Duncan Murdoch wrote:

> On 11/01/2016 11:59 PM, Berwin A Turlach wrote:
> > G'day all,
> >
> > In Chapter 1.4 (Writing package vignettes) the Writing R Extensions
> > manual states:
> >
> > By default @code{R CMD build} will run @code{Sweave} on all
> > Sweave vignette source files in @file{vignettes}.  If
> > @file{Makefile} is found in the vignette source directory,
> > then @code{R CMD build} will try to run @command{make} after the
> > @code{Sweave} runs, otherwise @code{texi2pdf} is run on each
> > @file{.tex} file produced.
> >
> > This does not seem to be quite correct as stated.  'R CMD build'
> > seems to run make only if there was a file in the directory
> > vignettes that Sweave successfully processed.   If the directory
> > vignettes contains a Makefile and subdirectories in which the
> > actual vignettes are, 'R CMD build' does not run make.
> >
> I think it is behaving as documented:  it says it will run make after 
> Sweave, so if Sweave never ran, neither would make.  

Sorry, I disagree.  It says that  "R CMD build will try to run make
after the Sweave runs".  For me (and probably others) "after the Sweave
runs" (note the plural) includes the cases of "no Sweave runs" and "one
Sweave run".  Otherwise the case of one vignette in the vignettes
directory would produce undocumented (albeit expected) behaviour. :-)



[Rd] On 'R CMD INSTALL' with multiple architectures

2016-01-11 Thread Berwin A Turlach
G'day all,

I guess it is still early enough in the year to wish everybody a happy
and successful new year.

I thought I should report that the installation of the CRAN package
rstan regularly fails on my machine (a 64 bit linux box running Xubuntu
15.10).  The reason being that I have the 32-bit and the 64-bit
architecture of R installed, and my /tmp directory is on a partition
with about 1 GB space.

During the installation of rstan the compilation of the source
directory regularly fills up my /tmp directory such that at some point
the compilation for the 64-bit architecture (my main architecture)
fails as /tmp runs out of space.  At that time, the src-32 subdirectory
has a size of about 460 MB (~ half of the partition!).

The most recent version of rstan I finally managed to install by
issuing the commands:

R CMD INSTALL --no-multiarch rstan_2.9.0.tar.gz
followed by
R --arch=32 CMD INSTALL --libs-only rstan_2.9.0.tar.gz

The last command, to my surprise, actually tried to compile both
libraries (32-bit and 64-bit) again.  So the installation kept failing
until I deleted the src-32 directory while the 64-bit libraries were built.

By now I realise that the R-admin manual only suggests/documents that
R --arch=name CMD INSTALL --libs-only pkg1 pkg2
only installs the library for the specified architecture if the package
has "an executable configure script or a src/Makefile file", and is
quiet about its behaviour otherwise.  But I wonder whether it would be
reasonable for users to expect that 'R --arch=32 CMD INSTALL
--libs-only' installs only the library for the specified architecture
in all circumstances.

Playing around with a package that also has compiled code, but is
faster to install than rstan, I realise now that my second command
should have been:
  R --arch=32 CMD INSTALL --libs-only --no-multiarch rstan_2.9.0.tar.gz

In summary, I would like to suggest that 'R CMD INSTALL' deletes
architecture specific 'src' subdirectories as soon as they are no
longer needed and/or that 'R --arch=name CMD INSTALL --libs-only'
installs only libraries for the specified architecture (as an unwary
user might expect).



[Rd] Small inaccuracy in the Writing R Extensions manual

2016-01-11 Thread Berwin A Turlach
G'day all,

In Chapter 1.4 (Writing package vignettes) the Writing R Extensions
manual states:

By default @code{R CMD build} will run @code{Sweave} on all
Sweave vignette source files in @file{vignettes}.  If
@file{Makefile} is found in the vignette source directory, then
@code{R CMD build} will try to run @command{make} after the
@code{Sweave} runs, otherwise @code{texi2pdf} is run on each
@file{.tex} file produced.

This does not seem to be quite correct as stated.  'R CMD build' seems
to run make only if there was a file in the directory vignettes that
Sweave successfully processed.   If the directory vignettes contains a
Makefile and subdirectories in which the actual vignettes are, 'R CMD
build' does not run make.



Re: [Rd] Fortran BLAS giving bad results

2014-01-11 Thread Berwin A Turlach
G'day Emmanuel,

On Sat, 11 Jan 2014 14:08:41 -0800
Emmanuel Sharef wrote:

> Example: this program creates two double vectors, takes the dot
> product with ddot, and prints the result:
>    subroutine testddot(n,x,y)
>    integer i,n
>    double precision x,y,d1(n),d2(n)
>    do 100 i=1,n
>   100  continue
>    print *,ddot(n,d1,1,d2,1)
> R CMD SHLIB test.f

Mmh, FORTRAN77 does not have dynamic memory allocation (though I have
been told that many compilers implement it as an extension to the
standard).  So I am surprised that this code works.  It is certainly
not portable (i.e. a compiler that implements the FORTRAN77 standard
strictly would barf on this code).

Also, output to * in FORTRAN code dynamically linked to R is
discouraged, see chapter 5.7 (Fortran I/O) in the Writing R Extensions
manual.  From memory, many years ago one could do so (on linux and
SUN machines at least), then it stopped working.  I am surprised that it
works again.

But neither of these two issues is the reason for your strange
results.  You would have pinpointed your actual problem easily if you
had used my favourite non-FORTRAN77 feature (which is
implemented by most compilers as an extension), namely having implicit
none as the first line in each function/subroutine. (The FORTRAN77
compliant version, AFAIK, was to write IMPLICIT CHARACTER (A-Z) into
the first line of all functions/subroutines.)

Essentially, your routine testddot does not know what kind of result
ddot returns and by the implicit type definition rules of FORTRAN77 it
expects an INTEGER back.  The fact that a DOUBLE PRECISION value is
returned when an INTEGER is expected creates some confusion.  Somewhere
at the start of your routine you have to add a DOUBLE PRECISION ddot
declaration.




Re: [Rd] R CMD config for R >= 3.0.1

2013-05-19 Thread Berwin A Turlach
G'day Brian,

On Sat, 18 May 2013 10:28:43 +0100
Prof Brian Ripley wrote:

> > Is it necessary for R >= 3.0.1 to have one build as
> > main-architecture and the other one as sub-architecture?  I could
> > not find anything in the NEWS file or the Admin manual that
> > indicated that this would now be necessary.
> Not exclusively, but as the Mac build no longer uses this, we do need
> people who do to test pre-releases.   One of us needs to build such a
> setup and test it

Either that, or amend the administration and installation manual to
state that one installation should not be a sub-architecture
installation.  :)

But the latter solution also seems to have some problems.  My usual
install script did the following:

1) Run ./configure with 'r_arch=32' (and a few other options) using a
   configuration file set up for a 32bit build; followed by make
2) make check ; make install
3) `make distclean'; run ./configure with 'r_arch=64' (and a few other
   options) using a configuration file set up for a 64 bit build;
   followed by make
4) make check ; make install
5) make pdf info; make install-pdf install-info

When trying to install Rgraphviz afterwards (mmh, this is a
BioConductor package and not a CRAN package, so perhaps I should ask on
their lists?), Rgraphviz couldn't find the correct compiler settings as
R CMD config ... did not work (as reported originally).

So I changed my install script to remove the 'r_arch=64' in step 3.
(The modified script is attached.)  But now, trying to install
Rgraphviz falls over earlier when a dependency of Rgraphviz,
BiocGenerics, is installed.  Trying to install BiocGenerics fails with:

** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
*** arch - 32
*** arch - R
ERROR: sub-architecture 'R' is not installed
ERROR: loading failed for 'R'
* removing '/opt/R/R-3.0.1/lib/R/library/BiocGenerics'

I have no idea why the process tries to check for an 'arch - R'.  But
this seems to be happening for packages that do not contain code that
needs to be compiled, another example is 'car' which is needed by 'AER'.

So I am a bit puzzled how I should change my scripts.  Does step 3 need
something stronger than 'make distclean'?  Or should the 'r_arch=32' be
dropped in step 1 but step 3 should use 'r_arch=64'?

Essentially, I would like to install 32bit and 64bit builds on my
machines, with one or both as sub-architectures (to save space) and
with the 64bit the 'default', i.e. the one that is run by


#!/bin/bash -e


part1a ()
{
cd /opt/src/R-$VERSION
echo "**"
echo "*** configure (32bit)"
echo "**"
cp ../
./configure --prefix=/opt/R/R-$VERSION --with-blas --with-lapack \
  --enable-R-shlib r_arch=32 | tee CONFIGURE_32
echo "*"
echo "*** make (32 bit)"
echo "*"
make -j6
}

part2 ()
{
echo "*"
echo "*** make check"
echo "*"
make check
make install
}

part1b ()
{
echo "**"
echo "*** configure (64bit)"
echo "**"
make distclean
cp ../
./configure --prefix=/opt/R/R-$VERSION --with-blas --with-lapack \
  --enable-R-shlib | tee CONFIGURE_64
echo "*"
echo "*** make (64 bit)"
echo "*"
make -j6
}

part3 ()
{
echo "**"
echo "*** make documentation"
echo "**"
make pdf info
make install-pdf install-info
}

export http_proxy=
export LANG=en_AU.UTF-8
echo Installing R-$VERSION
part1a >  /opt/src/R-$VERSION-LOG 2>&1
part2  >  /opt/src/R-$VERSION-CheckLOG 2>&1
part1b >> /opt/src/R-$VERSION-LOG 2>&1
part2  >> /opt/src/R-$VERSION-CheckLOG 2>&1
part3  >> /opt/src/R-$VERSION-LOG 2>&1
echo Done
Re: [Rd] R CMD config for R = 3.0.1

2013-05-19 Thread Berwin A Turlach
G'day Brian,

On Sun, 19 May 2013 08:40:04 +0100
Prof Brian Ripley wrote:

> Could you try current R-patched or R-devel?  Works in my tests at

Tried R-patched (2013-05-18 r62762) and R-devel (2013-05-18 r62762),
installed with my original script.  Things seem fine when I try to
install my usual selection of packages in either of those versions.
Loading under correct (and existing) architectures is tested and
packages that query the configuration get the correct answers from `R
CMD config'.

Thank you very much for the help.



[Rd] R CMD config for R >= 3.0.1

2013-05-18 Thread Berwin A Turlach
Dear all,

When installing the usual packages that I use, after installing R
3.0.1, I noticed that the installation of some packages that query R about
its configuration did not succeed.  The problem is exemplified by:

berwin@bossiaea:~$ R-3.0.1 CMD config CC
/opt/R/R-3.0.1/lib/R/bin/config: 222: .: Can't open 

Prior to R 3.0.1 such commands worked fine:

berwin@bossiaea:~$ R-3.0.0 CMD config CC
gcc -std=gnu99

I noticed now that my installations of the development and
patched version of R have the same problem (since I usually do not install
packages in those versions that query the configuration, I hadn't
noticed the issue earlier).  

The problem seems to be line 222 of `R RHOME`/bin/config (when R is R
3.0.1) which reads:

. ${R_HOME}/etc/Renviron

The file ${R_HOME}/etc/Renviron does not necessarily exist if one has
opted for 32/64-bit builds and installed both as sub-architectures,
which I have.  In my installation ${R_HOME}/etc/32/Renviron and
${R_HOME}/etc/64/Renviron exist, but no ${R_HOME}/etc/Renviron.
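
The situation described above can be inspected from within R. The following is a minimal sketch, not part of the original report: it lists the candidate Renviron locations for the running installation and reports which of them exist. The variable names (`etc`, `arch`, `renviron_status`) are illustrative only.

```r
# Sketch: which Renviron files does this R installation provide?
etc  <- R.home("etc")        # <R_HOME>/etc
arch <- .Platform$r_arch     # "" when R runs without a sub-architecture

# The top-level file is always a candidate; the per-architecture file only
# when a sub-architecture is in use.
candidates <- file.path(etc, "Renviron")
if (nzchar(arch)) {
  candidates <- c(candidates, file.path(etc, arch, "Renviron"))
}

renviron_status <- data.frame(path = candidates,
                              exists = file.exists(candidates),
                              stringsAsFactors = FALSE)
print(renviron_status)
```

On an installation with only sub-architectures, one would expect the first row to report FALSE, which is the condition that trips up the config script.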

Is it necessary for R >= 3.0.1 to have one build as
main-architecture and the other one as sub-architecture?  I could not
find anything in the NEWS file or the Admin manual that indicated that
this would now be necessary.



For completeness:  

I am running an Ubuntu 12.04 system and:

berwin@bossiaea:~$ echo "sessionInfo()" | R-3.0.1 --slave
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu/64 (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Re: [Rd] Recommended way to call/import functions from a Suggested package

2013-02-22 Thread Berwin A Turlach
G'day David,

On Fri, 22 Feb 2013 18:50:07 -0800
David Winsemius wrote:

> On Feb 22, 2013, at 6:39 PM, David Winsemius wrote:
> > I've always wondered: How does lattice manage to use grid functions
> > without putting them on the search path?

Because lattice imports the grid package and has a NAMESPACE (as have
all packages nowadays):

R> packageDescription("lattice")
Package: lattice
Version: 0.20-10
Date: 2012/08/21
Suggests: grid, KernSmooth, MASS
Imports: grid, grDevices, graphics, stats, utils, methods

And the relevant information is not in the Writing R Extensions
manual but in section 3.5.4 of the R Language Definition manual:

Packages which have a @emph{namespace} have a different search
path. When a search for an @R{} object is started from an
object in such a package, the package itself is searched first,
then its imports, then the base namespace and finally the
global environment and the rest of the regular search path.
The effect is that references to other objects in the same
package will be resolved to the package, and objects cannot be
masked by objects of the same name in the global environment or
in other packages.

Thus, as grid is imported by lattice, it is loaded but not attached
(i.e. does not appear in the search path).  However, functions in the
lattice package will find functions in the grid package as the imports
are searched.
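
The distinction between "loaded" and "attached" can be checked directly. This is a small sketch that uses the grid package itself rather than lattice, on the assumption that loading a namespace by hand mirrors what an import does; the variable names are illustrative.

```r
# Load the grid namespace without attaching the package.
loadNamespace("grid")

grid_loaded   <- "grid" %in% loadedNamespaces()   # namespace is in memory
grid_attached <- "package:grid" %in% search()     # on the search path?

# From the global environment, objects in a loaded-but-not-attached
# namespace are reachable through the `::` operator.
u <- grid::unit(1, "npc")

cat("loaded:", grid_loaded, " attached:", grid_attached, "\n")
```

In a fresh session one would expect `grid_loaded` to be TRUE while `grid_attached` stays FALSE, unless grid had been attached earlier via library().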

> Neither can the R interpreter find it. But it's clearly available if
> you ask nicely:

This will always work, whether the grid package is loaded/attached or
not:

R> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-unknown-linux-gnu/64 (64-bit)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.15.2
R> grid::grid.text
function (label, x = unit(0.5, "npc"), y = unit(0.5, "npc"), 
    just = "centre", hjust = NULL, vjust = NULL, rot = 0, check.overlap = FALSE, 
    default.units = "npc", name = NULL, gp = gpar(), draw = TRUE, 
    vp = NULL) 
{
    tg <- textGrob(label = label, x = x, y = y, just = just, 
        hjust = hjust, vjust = vjust, rot = rot, check.overlap = check.overlap, 
        default.units = default.units, name = name, gp = gp, 
        vp = vp)
    if (draw) 
        grid.draw(tg)
    invisible(tg)
}
<bytecode: 0x2507c80>
<environment: namespace:grid>

You specifically asked R to get object grid.text from the grid
package, so R obliges to do so.  For the help system to find the help
pages on an object, the package that contains the help pages has to be
on the search path AFAIK.



Re: [Rd] Calling FORTRAN function from R issue?

2012-03-06 Thread Berwin A Turlach
G'day Berend,

On Tue, 6 Mar 2012 11:19:07 +0100
Berend Hasselman wrote:

> On 06-03-2012, at 01:21, Dominick Samperi wrote:
> zx[0].r = 1.0; zx[0].i = 0.0;
> zx[1].r = 2.0; zx[0].i = 0.0;
> zx[2].r = 3.0; zx[0].i = 0.0;

Just noticing that it is always zx[0].i, same with the imaginary part
of zy.  But this is probably not of importance. :)

> I tried calling zdotc  through an intermediate Fortran routine hoping
> it would solve your problem.
> Above C routine changed to
> The fortran subroutine is
>   subroutine callzdotc(retval,n, zx, incx, zy, incy)
>   integer n, incx, incy
>   double complex retval, zx(*), zy(*)
>   external double complex zdotc
>   retval = zdotc(n, zx, incx, zy, incy)
> Made a shared object with
> R CMD SHLIB callzdotc.f czdot.c 
> and ran
> with the result 0.0, 0.0

Same here.

Once I change the line

external double complex zdotc

in your fortran subroutine to

double complex zdotc

everything works fine and I get as result 14.0, 0.0.
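
As a cross-check of the value 14.0, 0.0, the same conjugated dot product can be computed at the R level. This is only a sketch: the contents of zy are not shown in the thread, so `zy <- zx` is an assumption chosen because it reproduces the reported result.

```r
# zdotc computes sum(Conj(zx) * zy); mimic that directly in R.
zx <- complex(real = c(1, 2, 3), imaginary = 0)
zy <- zx   # assumption: zy holds the same values as zx

res <- sum(Conj(zx) * zy)
print(res)
```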

It is a long time ago that I was taught (and studied) the FORTRAN 77
standard.  But flipping through some books from that time I think I
have some inkling on what is going on.  The external statement is not
needed here (seems to be used in the sense that C is using the
external statement).



[Rd] nobs() and logLik()

2012-01-19 Thread Berwin A Turlach
Dear all,

I am studying a bit the various support functions that exist for
extracting information from fitted model objects.

From the help files it is not completely clear to me whether the number 
returned by nobs() should be the same as the nobs attribute of the
object returned by logLik().  

If so, then there is a slight inconsistency in the methods for 'nls'
objects with logLik.nls() taking zero weights into account while
nobs.nls() does not.  Admittedly, the help page of nobs() states that:

For 'lm' and 'glm' fits, observations with zero weight are not

i.e. does not comment on what nls does.  

But I wonder whether the following behaviour is desirable:

R> DNase1 <- subset(DNase, Run == 1)
R> fm3DNase2 <- nls(density ~ Asym/(1 + exp((xmid - log(conc))/scal)), 
+ data = DNase1, weights=c(0,rep(1,14),0), 
+ start = list(Asym = 3, xmid = 0, scal = 1))
R> nobs(fm3DNase2)
[1] 16
'log Lik.' 42.62777 (df=4)
[1] 14
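
For comparison, the behaviour that the help page documents for 'lm' fits can be checked with a small sketch. The data set (the first ten rows of `cars`) and the weight vector are arbitrary choices made for illustration; nothing here is taken from the original post.

```r
# Zero weights on the first and last observation of a 10-row data set.
d <- cars[1:10, ]
w <- c(0, rep(1, 8), 0)

fit <- lm(dist ~ speed, data = d, weights = w)

n_nobs   <- nobs(fit)                    # count reported by nobs()
n_loglik <- attr(logLik(fit), "nobs")    # count stored by logLik()

cat("nobs():", n_nobs, " logLik nobs attribute:", n_loglik, "\n")
```

For 'lm' the two counts agree because both methods drop zero-weight observations, which is the consistency the post asks about for 'nls'.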


Re: [Rd] nobs() and logLik()

2012-01-19 Thread Berwin A Turlach
G'day Brian,

On Fri, 20 Jan 2012 06:20:30 +
Prof Brian Ripley wrote:

> I do wonder why people use zero weights rather than 'subset', and I 
> don't particularly like the discontinuity as a weight goes to zero.

I completely agree, and for developers it is a bit of a pain to make
sure that all possible combinations of 'subset' and 'weights' play
nicely together.

One reason that I can see for people to use zero weights rather than
'subset' is that fitted() and predict() in the former case readily
produce fitted values for the observations that received a zero weight.
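
A short sketch of that difference, using an arbitrary subset of the `cars` data (an illustrative choice, not from the thread): both fits give the same coefficients, but only the zero-weight fit returns fitted values for the excluded rows.

```r
d <- cars[1:10, ]
w <- c(0, rep(1, 8), 0)

fit_w <- lm(dist ~ speed, data = d, weights = w)      # zero weights
fit_s <- lm(dist ~ speed, data = d, subset = w > 0)   # explicit subset

len_w <- length(fitted(fit_w))   # one fitted value per row of d
len_s <- length(fitted(fit_s))   # only the rows kept by 'subset'
same_coef <- isTRUE(all.equal(coef(fit_w), coef(fit_s)))

cat("fitted values:", len_w, "vs", len_s, " same coefficients:", same_coef, "\n")
```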



Re: [Rd] CRAN: How to list a non-Sweave doc under Vignettes: on package page?

2011-11-07 Thread Berwin A Turlach
G'day Henrik,

On Sun, 6 Nov 2011 16:41:22 -0800
Henrik Bengtsson wrote:

> is it possible to have non-Sweave vignettes(*) in inst/doc/ be listed
> under 'Downloads' on CRAN package pages?  

As far as I know, only by a little trick.  Create an Sweave based
vignette that uses the pdfpages package to include the .pdf file that
you want to have listed.  This dummy vignette should then be listed on

See the lasso2 package for an example.

The vignette in inst/doc/ in that package is actually a bit more
complicated than necessary.  As I think there is no point in having two
nearly identical copies of PDF files in a package, I use .Rbuildignore
to have the original PDF file not included in the source package.  This
started to create a problem when R decided to rebuild vignettes during
the checking process and pdfpages decided to hang if the PDF file to be
included was missing.  




[Rd] Clean up after R CMD INSTALL and/or R CMD check

2010-12-03 Thread Berwin A Turlach
G'day all,

I noticed the following (new) behaviour of R 2.12.0, running on Kubuntu
10.10, when installed with sub-architectures:

When I run R CMD INSTALL or R CMD check on the source directory of a
package that contains C or FORTRAN code, R creates sub-directories
src-32/ and src-64/ that seem to be copies of the src/ subdirectory
plus the compiled objects.   

These directories are not deleted at the end of a successful
INSTALL/check and I wonder if there is any particular reason for this?
Would it be possible to delete these sub-directories during clean-up
at the end of a successful INSTALL/check?



Re: [Rd] Clean up after R CMD INSTALL and/or R CMD check

2010-12-03 Thread Berwin A Turlach
G'day Brian,

On Fri, 3 Dec 2010 13:14:44 + (GMT)
Prof Brian Ripley wrote:

> > I noticed the following (new) behaviour of R 2.12.0, running on
> > Kubuntu 10.10, when installed with sub-architectures:
> Yes, there are new features when there are multiple sub-architectures.

Indeed.  One new feature seems to be that if the installation of a
package via
R arch=XXX CMD INSTALL --libs-only
fails then the package is not completely removed but rather the
previously installed version is re-installed.  IIRC, I had requested this
behaviour some years ago and it is nice to see it now implemented. :)

> > These directories are not deleted at the end of a successful
> > INSTALL/check and I wonder if there is any particular reason for
> > this?
> Because it might be partially successful and you want to look at the 
> generated objects?  

I agree that it would be helpful to look at the generated objects if
the INSTALL/check is only partially successful, that's why I asked
about a successful INSTALL/check.  However, it looks...

> In particular 'success' means that the primary sub-architecture is
> installed: others might fail.

... as if we have different definitions of what constitutes 'success';
I take 'success' as meaning successful installation for all
architectures, but accept that you are using the official definition. :)

> > Would it be possible to delete these sub-directories during clean-up
> > at the end of a successful INSTALL/check?
> Try INSTALL --clean, etc.  

This does not seem to help, the directories in question are not removed.

> But I normally do this from a tarball to keep the sources clean and
> to test the reference sources.

I used to do this too but changed my habits when it was once pointed out
to me that the section "Checking and building packages" in the "Writing
R Extensions" manual starts with:

Before using these tools, please check that your package can be
installed and loaded.  @code{R CMD check} will @emph{inter
alia} do this, but you may get more detailed error messages
doing the checks directly.

IIRC, the context was that it took me some time to track down a problem
via "R CMD check foo.tar.gz" as the error messages were not as helpful
in locating the problem as the error messages of "R CMD INSTALL" would
have been.  But if "R CMD INSTALL" is to be run before "R CMD check"
and/or "R CMD build" it has to be run on the source directory, hasn't
it?  This looks like a chicken-and-egg problem. :)

Or are you now saying that it is o.k. to first run "R CMD build" and
then "R CMD INSTALL" on the tarball?

> There are a few improvements to R-patched in the detection of 
> sub-architectures, so you might like to see if you prefer what it 

I tried with:
   R version 2.13.0 Under development (unstable) (2010-12-02 r53747)
   R version 2.12.0 Patched (2010-12-02 r53747)
and I did not see any different behaviour.  The subdirectories src-32/
and src-64/ are created and not deleted.

Thank you very much for your comments/insights. 



Re: [Rd] Trying to understand the search path and namespaces

2010-11-16 Thread Berwin A Turlach
G'day Hadley,

On Mon, 15 Nov 2010 19:45:30 -0600
Hadley Wickham wrote:

> > 1.6 of Writing R Extensions says
> > Note that adding a name space to a package changes the search
> > strategy. The package name space comes first in the search, then
> > the imports, then the base name space and then the normal search
> > path.
> > I'm not sure of the details, but I think
> Ah, my mistake was assuming that the package namespace and environment
> were the same thing.
> Interestingly the namespace is dynamic:

Not sure what you mean with this.  Section 1.6 of Writing R
Extensions explicitly states:

Name spaces are @emph{sealed} once they are loaded.  Sealing
means that imports and exports cannot be changed and that
internal variable bindings cannot be changed.

> [31] "base"
> [31] "Autoloads" "base"

Well, as the part of Writing R Extensions that Martin quoted states,
the normal search path is part of the search path used by packages with
name spaces.  So if you attach another package via library(), the
normal search path changes and, hence,
`parents(getNamespace("devtools"))' has one more location to report.



Re: [Rd] Trying to understand the search path and namespaces

2010-11-16 Thread Berwin A Turlach
G'day Hadley,

On Tue, 16 Nov 2010 07:35:09 -0600
Hadley Wickham wrote:

> > Well, as the part of Writing R Extensions that Martin quoted
> > states, the normal search path is part of the search path used by
> > packages with name spaces.  So if you attach another package via
> > library(), the normal search path changes and, hence,
> > `parents(getNamespace("devtools"))' has one more location to report.
> It's still not at all obvious how this happens - when does variable
> look up use the stack of environments given by the package environment
> and when does it use the stack of environments given by the namespace?

Well, if we are going to nitpick, the former presumably never
happens. :)  Section 3.5.4 "Search path" of the "R Language Definition"
might be more illuminating than quotes from "Writing R Extensions".

The first two paragraphs of that section are:

In addition to the evaluation environment structure, @R{}
has a search path of environments which are searched for
variables not found elsewhere.  This is used for two things:
packages of functions and attached user data.

The first element of the search path is the global environment
and the last is the base package.  An @code{Autoloads}
environment is used for holding proxy objects that may be
loaded on demand.  Other environments are inserted in the path
using @code{attach} or @code{library}.

I guess this is what you refer to by "stack of environments given by the
package environment" (though it clearly isn't, if the object/variable
is not found in the evaluation environment structure [I guess Section
3.5.2 "Lexical environment" explains what this is], the search starts in
the global environment, not the environment of the package).  The third
and last paragraph of Section 3.5.4 states:

Packages which have a @emph{namespace} have a different search
path. When a search for an @R{} object is started from an
object in such a package, the package itself is searched first,
then its imports, then the base namespace and finally the
global environment and the rest of the regular search path.
The effect is that references to other objects in the same
package will be resolved to the package, and objects cannot be
masked by objects of the same name in the global environment or
in other packages.

I hope this clarifies your remaining doubts about the process.  
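
The lookup order quoted above can be traced by walking the enclosing environments of a namespace. The sketch below uses the stats package as an arbitrary example (devtools itself is not needed); the variable names are illustrative.

```r
# Walk the enclosures of a package namespace, innermost first.
ns   <- asNamespace("stats")    # the package's own namespace
imp  <- parent.env(ns)          # its imports environment
base <- parent.env(imp)         # the base namespace
glob <- parent.env(base)        # the global environment, i.e. the start
                                # of the regular search path

chain <- vapply(list(ns, imp, base, glob), environmentName, character(1))
print(chain)
```

Because the chain ends in the global environment, attaching a further package with library() lengthens the tail of what a namespace can see without changing the namespace itself.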



Re: [Rd] Query about .check_packages_used_in_examples

2010-10-11 Thread Berwin A Turlach
G'day Brian,

On Mon, 11 Oct 2010 11:11:42 +0100 (BST)
Prof Brian Ripley wrote:

> Sounds reasonable to count Imports, so we'll alter this.

Thanks for that.  I noticed the changes to R-devel and to the R-2-12-branch.
Looking at the diffs (an example is appended below), it seems to me
that, except for the ordering, the variables `depends_suggests' and
`imports'  are the same.  This would suggest that only one of them is
necessary and that the remaining code could be made leaner...  but
probably that is something to do after the code freeze is over...
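
The observation that the two variables differ only in ordering can be illustrated with a toy sketch. All package names below are hypothetical placeholders, and the vectors are shortened: the lines in the actual diff continue past `pkg_name`, which this sketch does not try to reproduce.

```r
# Hypothetical DESCRIPTION contents, for illustration only.
depends  <- "stats"
imports  <- "ks"
suggests <- "lattice"
enhances <- character(0)
pkg_name <- "mypkg"

depends_suggests <- c(depends, imports, suggests, enhances, pkg_name)
imports_all      <- c(imports, depends, suggests, enhances, pkg_name)

# Same elements, different order.
same_set <- setequal(depends_suggests, imports_all)

# A package loaded in an example counts as declared if it is in the set.
undeclared <- setdiff("ks", depends_suggests)

cat("same set:", same_set, " undeclared:", length(undeclared), "\n")
```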

> Thank you for the report,  However, the pre-test period for 2.12.0 is 
> already 80% over and we are in code freeze -- reports early on in the 
> period are much more useful and at this stage anything but very
> simple localized changes need to be deferred to 2.12.1.

Point taken.  When I am back home, I will think about what versions of
R I want to have installed on my main machine...  should probably
also write some scripts that regularly check my packages...



ber...@goodenia:/opt/src/R-2-12-branch/src/library/tools/R$ svn diff -r53024 
Index: QC.R
--- QC.R(revision 53024)
+++ QC.R(working copy)
@@ -3937,7 +3937,7 @@

 ## it is OK to refer to yourself and standard packages
 standard_package_names <- .get_standard_package_names()$base
-depends_suggests <- c(depends, suggests, enhances,  pkg_name,
+depends_suggests <- c(depends, imports, suggests, enhances, pkg_name,
 imports <- c(imports, depends, suggests, enhances, pkg_name,
@@ -4038,7 +4038,7 @@

 ## it is OK to refer to yourself and standard packages
 standard_package_names <- .get_standard_package_names()$base
-depends_suggests <- c(depends, suggests, enhances,  pkg_name,
+depends_suggests <- c(depends, imports, suggests, enhances, pkg_name,
 imports <- c(imports, depends, suggests, enhances, pkg_name,

[Rd] Query about .check_packages_used_in_examples

2010-10-10 Thread Berwin A Turlach
G'day all,

looking at
I noticed that r-prerel-* and r-devel-* issue notes.  Apparently, now
examples are more thoroughly checked by R CMD check and the note
pointed out that a package used in the examples was not declared.  This
was easily fixed by adding 'Suggests: lattice' to the DESCRIPTION file
(now this change has to migrate through R-forge to CRAN).

But then I noticed that for another package I have on R-forge a similar
note is issued:
O.k., R-forge is using R-devel, so I installed R 2.12.0 RC (2010-10-10
r53273) on my machine to check, same note is issued.

The note issued by R CMD check says:
* checking for unstated dependencies in examples ... NOTE
'library' or 'require' call not declared from: ks
which is a bit surprising.  

There are three .Rd files that have a library(ks) in them and ks is
listed in the Imports: of the description file.

This seems to be completely in line with the documentation in Writing
R extensions:

Packages whose name space only is needed to load the package
using @code{library(@var{pkgname})} must be listed in the
@samp{Imports} field and not in the @samp{Depends} field.

that's why ks is listed in depends

All packages that are needed to successfully run @code{R CMD
check} on the package must be listed in one of @samp{Depends}
or @samp{Suggests} or @samp{Imports}. [...]

which seems to be adhered to, since ks is in the depends.

But upon digging into .check_packages_used_in_examples in QC.R in
package tools, I have the impression that packages loaded in examples
via library() or require() are only checked against packages listed in
Suggests, Enhances and Depends, the package's own name and the standard
packages, but not against those listed in Imports.  Thus, I believe
there is some disconnect between documentation and behaviour at the

In this particular case, possible solutions would be to list ks in
both, Imports and Suggests, or, to have only one entry, to move it to

But I believe the cleaner solution would be that
.check_packages_used_in_examples includes the packages listed in
Imports when checking whether all packages used in library() or
require() commands in examples are declared.



Re: [Rd] Query about .check_packages_used_in_examples

2010-10-10 Thread Berwin A Turlach
G'day all,

sorry, should proof-read better before hitting the send button...

On Mon, 11 Oct 2010 06:06:46 +0800
Berwin A Turlach wrote:

> But then I noticed that for another package I have on R-forge a
> similar note is issued:
> O.k., R-forge is using R-devel, so I installed R 2.12.0 RC (2010-10-10
> r53273) on my machine to check, same note is issued.
> The note issued by R CMD check says:
>   * checking for unstated dependencies in examples ... NOTE
>   'library' or 'require' call not declared from: ks
> which is a bit surprising.  
> There are three .Rd files that have a library(ks) in them and ks is
> listed in the Imports: of the description file.
> This seems to be completely in line with the documentation in Writing
> R extensions:
>   Packages whose name space only is needed to load the package
>   using @code{library(@var{pkgname})} must be listed in the
>   @samp{Imports} field and not in the @samp{Depends} field.
> that's why ks is listed in depends

ks is listed in the Imports: line in the DESCRIPTION file of the package
in question, not in the Depends: line.

>   All packages that are needed to successfully run @code{R CMD
>   check} on the package must be listed in one of @samp{Depends}
>   or @samp{Suggests} or @samp{Imports}. [...]
> which seems to be adhered to, since ks is in the depends.

Again, this "depends" should read "Imports".

Sorry if this created confusion.



Re: [Rd] checking user interrupts in C(++) code

2010-09-29 Thread Berwin A Turlach
G'day Simon,

since Karl brought up this topic, I thought I might use it to seek
clarification for something that bothered me for some time.

On Tue, 28 Sep 2010 14:55:34 -0400
Simon Urbanek wrote:

> There are several ways in which you can make your code respond to
> interrupts properly - which one is suitable depends on your
> application. Probably the most commonly used for interfacing foreign
> objects is to create an external pointer with a finalizer - that
> makes sure the object is released even if you pass it on to R later.
> For memory allocated within a call you can either use R's transient
> memory allocation (see Salloc) or use the on.exit handler to cleanup
> any objects you allocated manually and left over.

But what about objects created by allocVector() or NEW_ in C code
that is called via .Call that need to be PROTECT'd?

The Writing R Extensions manual states:

The programmer is solely responsible for housekeeping the calls
to @code{PROTECT}.  There is a corresponding macro
@code{UNPROTECT} that takes as argument an @code{int} giving
the number of objects to unprotect when they are no longer
needed.  The protection mechanism is stack-based, so
@code{UNPROTECT(@var{n})} unprotects the last @var{n} objects
which were protected.  The calls to @code{PROTECT} and
@code{UNPROTECT} must balance when the user's code returns.
@R{} will warn about @code{stack imbalance in .Call} (or
@code{.External}) if the housekeeping is wrong.

If a call to R_CheckUserInterrupt() may not return, does that mean that
you should not call this function while you have objects PROTECT'd?

Even more, the section on R_CheckUserInterrupt() states:

Note that it is possible that the code behind one of the entry
points defined here if called from your C or FORTRAN code could
be interruptible or generate an error and so not return to your

This seems to imply that, if you have objects PROTECT'd in your code,
you shouldn't use any of the R API defined in Chapter 6 of the manual,
except if you know that it doesn't call R_CheckUserInterrupt(), and
there seems to be no documentation on which functions in the API do and
which don't.

I guess my question is, essentially, do the PROTECT mechanism and
R_CheckUserInterrupt() play together nicely?  Can I call the latter
from code that has objects PROTECT'd?  If yes, and the code gets
interrupted, is the worst that happens a warning about a stack
imbalance, or will the R session become unusable/unstable?

Thanks in advance for any enlightening comments.



Re: [Rd] Matrix install fails because of defunct save in require

2010-09-17 Thread Berwin A Turlach
G'day Uwe,

On Fri, 17 Sep 2010 19:22:04 +0200
Uwe Ligges wrote:

> On 17.09.2010 16:04, Thomas Petzoldt wrote:
> > Dear R-Devel,
> > I've just tried to compile the fresh R-devel and found that the
> > install of package Matrix failed:
> > ** help
> > *** installing help indices
> > ** building package indices ...
> > Error in require(Matrix, save = FALSE) :
> > unused argument(s) (save = FALSE)
> > ERROR: installing package indices failed
> Have you got the Matrix package from the appropriate 2.12/recommended 
> repository or installed via
> make rsync-recommended
> make recommended

Are those the commands that should now be used?  My script is
essentially doing:

svn up
make check FORCE=FORCE

Running the script now, I experience the same problem as Thomas.  But I
note that Thomas did not state exactly what he is compiling.  My 'svn
up' updates the version checked out from:
which I think of as R-devel.   
Now after the svn up the file VERSION in the source directory says:
2.13.0 Under development (unstable)
The SVN-REVISION file in my build directory says:
Revision: 52938
Last Changed Date: 2010-09-17
And I have Matrix_0.999375-44.tar.gz in src/library/Recommended of my
source directory.

As you refer to 2.12/recommended, you and Thomas might talk about
different versions when talking about R-devel.



Re: [Rd] Indexing bug?

2010-05-26 Thread Berwin A Turlach
G'day Duncan,

On Wed, 26 May 2010 05:57:38 -0400
Duncan Murdoch wrote:

> Is this expected behaviour?

Yes, according to the answer that this poster


Indeed, the help page of '[' states:

The index object i can be numeric, logical, character or empty.
Indexing by factors is allowed and is equivalent to indexing by the
numeric codes (see factor) and not by the character values which are
printed (for which use [as.character(i)]). 
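
A minimal sketch of the documented behaviour, with made-up data: indexing a named vector by a factor picks elements by the factor's integer codes, whereas indexing by as.character() of the factor picks them by label.

```r
x <- c(a = 10, b = 20, c = 30)

# Levels are sorted, so they are "a" and "c"; the codes of f are 2 and 1.
f <- factor(c("c", "a"))

by_codes  <- x[f]                 # same as x[c(2, 1)]
by_labels <- x[as.character(f)]   # same as x[c("c", "a")]

print(by_codes)
print(by_labels)
```

Here the code-based lookup returns the elements named b and a, which is rarely what someone reading the printed labels "c" and "a" would expect.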



Re: [Rd] Bugs? when dealing with contrasts

2010-04-21 Thread Berwin A Turlach
> [1] 0 1 1
> [1] "contr.treatment"

No bug either, but wrong use of the optional argument contrasts for 
lm.  Please read the help page of lm, which points you to the help
page of model.matrix.default, which contains pertinent examples; in
your case:

R> model.matrix(lm(seq(15) ~ fac, contrasts = list(fac="contr.sum")))
   (Intercept) fac1 fac2
10           1    0    1
11           1   -1   -1
12           1   -1   -1
13           1   -1   -1
14           1   -1   -1
15           1   -1   -1
[1] 0 1 1
[1] "contr.sum"




[Rd] LOGICAL arguments in FORTRAN code

2010-04-08 Thread Berwin A Turlach
G'day all,

I just took over maintenance of the quadprog package from Kurt Hornik
and noticed that one of the FORTRAN routines has an argument that is
declared to be a LOGICAL.  The R code that calls this routine (via
the .Fortran interface) passes the argument down wrapped in a call to

This was fine (and as documented) under S-Plus 3.4, for which this code
was originally developed.  However, as far as I know, in R objects of
storage mode logical were always supposed to be passed to FORTRAN
arguments of type INTEGER; and that is what the current Writing R
Extensions manual states.

Thus, given that the port of quadprog existed for quite some time, I
am wondering whether it is o.k. to pass R objects with storage mode
logical into FORTRAN code to arguments declared as LOGICAL?  Or should
the FORTRAN code be corrected to declare the argument in question as
INTEGER?


Re: [Rd] LOGICAL arguments in FORTRAN code

2010-04-08 Thread Berwin A Turlach
G'day Brian, 

On Thu, 8 Apr 2010 12:40:45 +0100 (BST)
Prof Brian Ripley wrote:

> On Thu, 8 Apr 2010, Berwin A Turlach wrote:
> > Thus, given that the port of quadprog existed for quite some time, I
> > am wondering whether it is o.k. to pass R objects with storage mode
> > logical into FORTRAN code to arguments declared as LOGICAL?  Or
> > should the FORTRAN code be corrected to declare the argument in
> > question as INTEGER?
> The second to be safe.  [...]

Thanks for the quick and informative response.  I will make the
necessary changes.

BTW, can I assume that if R passes down TRUE to the FORTRAN routine the
corresponding integer argument will be set to 1, and that it will be
set to zero if FALSE is passed down?  Likewise, can I assume that if at
the end of the FORTRAN routine the integer holds a value of zero, then
FALSE is passed back to R and if the integer holds any other value then
TRUE is passed back?  I don't remember ever reading any documentation
about this; and most documentation that I would search is not at hand
but back on the bookshelves of my office.
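
The sketch below shows only R's own coercion rules between logical and integer values. It does not demonstrate what the .Fortran interface does with a logical argument; it merely illustrates the conversions one could apply explicitly on the R side, before and after the call, when the FORTRAN argument is declared INTEGER.

```r
# Explicit conversion before passing a flag to an INTEGER argument.
flags  <- c(TRUE, FALSE)
as_int <- as.integer(flags)

# Explicit conversion of integer results back to logical:
# zero becomes FALSE, any other non-missing value becomes TRUE.
back <- as.logical(c(0L, 1L, -3L))

print(as_int)
print(back)
```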



Re: [Rd] build, data and vignettes

2010-02-26 Thread Berwin A Turlach
G'day Uwe,

On Fri, 26 Feb 2010 17:03:10 +0100
Uwe Ligges wrote:

> R CMD check executes the R code in the vignettes and checks if that 
> works, and it checks if the PDFs are available. It does not check if
> it can build the vignettes, because that is only necessary on the 
> maintainer's machine since the PDFs are shipped with the (built)
> source package.

Are you implying that R CMD check has reverted to its pre-2.6.0
behaviour?  If I remember correctly, for R versions before 2.6.0  R
CMD check did not attempt to build the vignette, with version 2.6.0 it
started to do so.  The relevant NEWS entry is:

o   R CMD check now (by default) attempts to latex the
vignettes rather than just weave and tangle them: this will give a
NOTE if there are latex errors.



Re: [Rd] inconsistency in ?factor

2009-05-25 Thread Berwin A Turlach
G'day Petr,

On Mon, 25 May 2009 09:04:14 +0200
Petr Savicky wrote:

> The first formulation "Be careful to compare ... levels in the same
> order" may be just a warning against a potential problem if the
> levels have different order, however this is not clear.

Well, the first statement is a remark on comparison in general while
the second statement is specific to comparison operators and generic
methods.  There are other ways of comparing objects; note:

R> f1 <- factor(c("a", "b", "c", "c", "b", "a"), levels=c("a", "b", "c"))
R> f2 <- factor(c("a", "b", "c", "c", "b", "a"), levels=c("c", "b", "a"))
R> f1==f2
R> identical(f1,f2)
R> all.equal(f1,f2)
[1] "Attributes: < Component 2: 2 string mismatches >"

Just my 2c.



=== Full address =
Berwin A Turlach                            Tel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability        +65 6516 6650 (self)
Faculty of Science                          FAX : +65 6872 3919
National University of Singapore 
6 Science Drive 2, Blk S16, Level 7  e-mail:
Singapore 117546

[Rd] Patch proposal for logspace_sub

2009-04-27 Thread Berwin A Turlach
G'day all,

I am working on problems where I have to calculate the logarithm of a
sum or difference from the logarithms of the individual terms; so the
functions logspace_add and logspace_sub which are part of R's API come
in handy.

However, I noticed that logspace_sub can have problems if both
arguments are (very) small or the difference between the arguments is
very small.  The logic behind logspace_sub, when called with arguments
lx and ly, is:

  log( exp(lx) - exp(ly) ) = log( exp(lx) * ( 1 - exp(ly - lx) ) )
                           = lx + log( 1 - exp(ly - lx) )
                           = lx + log1p( - exp(ly - lx) )

and it is the last expression that is evaluated by logspace_sub.

However, the use of log1p for additional precision is appropriate if
exp(ly-lx) is small, i.e. when lx >> ly.  If |ly-lx| << 1, then
exp(ly-lx) is close to one; if this term becomes numerically one, then
log1p will return -Inf and large handfuls of accuracy are thrown
away.  In such circumstances, it would be better to use the following
evaluation scheme:

log( exp(lx) - exp(ly) ) = lx + log( - ( exp(ly - lx) - 1 ) )
                         = lx + log( - expm1(ly-lx) )

The following code, using the equivalent commands and evaluation schemes
at the R level, illustrates my points:

R> lx <- 2e-17
R> ly <- 1e-17
R> lx + log1p(-exp(ly-lx)) ; lx + log(-expm1(ly-lx))
[1] -Inf
[1] -39.1439465808988
R> lx <- 2e-17
R> ly <- 1e-20
R> lx + log1p(-exp(ly-lx)) ; lx + log(-expm1(ly-lx))
[1] -Inf
[1] -38.4512995253805
R> lx <- 1e-16
R> ly <- 1e-20
R> lx + log1p(-exp(ly-lx)) ; lx + log(-expm1(ly-lx))
[1] -36.7368005696771
[1] -36.8414614929051

In all these cases the output from the second evaluation scheme compares
favourably with the output of yacas or bc. 

The current version of R-devel, with this patch applied, passes make
check FORCE=FORCE on my machine.  The cut-off point for switching
between the two evaluation schemes was chosen arbitrarily.  
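
For illustration, the switching between the two evaluation schemes can
be sketched at the R level as below.  This is not the patch itself: the
function name logspace_sub_sketch is made up, and the default cut-off of
-log(2) is an assumption for the sake of the example, since the cut-off
used in the patch is not stated here.

```r
# Hypothetical R-level helper; not the C routine logspace_sub from R's API.
# Computes log(exp(lx) - exp(ly)) for scalar lx >= ly.
logspace_sub_sketch <- function(lx, ly, cutoff = -log(2)) {
  d <- ly - lx                 # d <= 0 is needed for a real-valued result
  if (d > cutoff) {
    # exp(d) is close to one: expm1 keeps the small difference accurate
    lx + log(-expm1(d))
  } else {
    # exp(d) is small: log1p keeps the small correction accurate
    lx + log1p(-exp(d))
  }
}

# Arguments that are very close together (first branch)
close_args <- logspace_sub_sketch(2e-17, 1e-17)

# Arguments that are well separated (second branch); exactly log(5 - 2)
far_args <- logspace_sub_sketch(log(5), log(2))
```

Any cut-off strictly between the two regimes would do for this
illustration; the point is only that each scheme is used where it does
not lose accuracy.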

I hope this patch will prove to be useful and will be applied. :)



PS:  If use of the equivalent commands and evaluation schemes at the R
level is not convincing enough, then the following code can be used to
verify that logspace_sub has the same behaviour.  But one would need two
versions of R to do so, one without the patch and one with it.

R> install.packages("inline") ## if necessary
R> library(inline)
R> sig <- signature(lx="double", ly="double", res="double")
R> code <- "*res = logspace_sub(*lx, *ly);"
R> incl <- "#include <Rmath.h>"
R> fn <- cfunction(sig,code,convention=".C",language="C",includes=incl)
R> lx <- 1e-16
R> ly <- 1e-20
R> fn(lx, ly, res=0)$res
[1] -36.7368005696771

Re: [Rd] surprising behaviour of names<-

2009-03-15 Thread Berwin A Turlach
G'day Wacek,

On Sun, 15 Mar 2009 21:01:33 +0100
Wacek Kusnierczyk wrote:

 Berwin A Turlach wrote:
  Obviously, assuming that R really executes 
  `*tmp*` <- x
  x <- "names<-"(`*tmp*`, value=c("a","b"))
  under the hood, in the C code, then *tmp* does not end up in the
  symbol table and does not persist beyond the execution of 
  names(x) <- c("a","b")

 to prove that i take you seriously, i have peeked into the code, and
 found that indeed there is a temporary binding for *tmp* made behind
 the scenes -- sort of. unfortunately, it is not done carefully enough
 to avoid possible interference with the user's code:
 '*tmp*' = 0
 # 0
 x = 1
 names(x) = 'foo'
 # error: object *tmp* not found

I agree, and I am a bit flabbergasted.  I had not expected that
something like this would happen and I am indeed not aware of anything
in the documentation that warns about this; but others may prove me
wrong on this.

 given that `*tmp*` is a perfectly legal (though some would say
 'non-standard') name, it would be good if somewhere here a warning
 were issued -- perhaps where i assign to `*tmp*`, because `*tmp*` is
 not just any non-standard name, but one that is 'obviously' used
 under the hood to perform black magic.

Now I wonder whether there are any other objects (with non-standard
names) that can be nuked by operations performed under the hood.  

I guess the best thing is to stay away from non-standard names, if only
to save the typing of back-ticks. :)

Thanks for letting me know, I have learned something new today.



Re: [Rd] surprising behaviour of names<-

2009-03-14 Thread Berwin A Turlach
On Sat, 14 Mar 2009 07:22:34 +0100
Wacek Kusnierczyk wrote:

  Well, I don't see any new object created in my workspace after
  x <- 4
  names(x) <- "foo"
  Do you?

 of course not.  that's why i'd say the two above are *not*
 i haven't noticed the 'in the c code';  do you mean the r interpreter
 actually generates, in the c code, such r expressions for itself to

As I said before, I have little knowledge about how the parser works and
what goes on under the hood; and I have also little time and
inclination to learn about it.  

But if you are interested in these details, then by all means invest
the time to investigate.

Alternatively, you would hope that Simon eventually finishes the book
that he is writing on programming in R; as I understand it, that book
would explain part of these issues in detail.  Hopefully, along with
the book he makes the tools that he has for introspection available.

  i guess you have looked under the hood;  point me to the relevant
  No I did not, because I am not interested in knowing such intimate
  details of R, but it seems you were interested.

 yes, but then your claim about what happens under the hood, in the c
 code, is a pure stipulation.  

I made no claim about what is going on under the hood because I have no
knowledge about these matters.  But, yes, I was speculating of what
might go on.

 and you got the example from the r language definition sec. 10.2,
 which says the forms are equivalent, with no 'under the hood, in the
 c code' comment.

Trying to figure out what a writer/painter actually means/says beyond
the explicitly stated/painted, something that is summed up in Australia
(and other places) under the term "critical thinking", was not high in
the curriculum of your school, was it? :-)

 you're just showing that your statements cannot be taken seriously.

Usually, my statements can be taken seriously, unless followed by some
indication that I said them tongue-in-cheek.  Of course, statements
that I allegedly made but were in fact put into my mouth cannot, and
should not, be taken seriously.

  yes, *if* you are able to predict the refcount of the object
  passed to 'names<-' *then* you can predict what 'names<-' will do,
  I think Simon pointed already out that you seem to have a wrong
  picture of what is going on.  [...]

 so what you quote effectively talks about a specific refcount
 mechanism.  it's not refcount that would be used by the garbage
 collector, but it's a refcount, or maybe refflag.

Fair enough, if you call this a refcount then there is no problem.
Whenever I came across the term refcount in my readings, it was
referring to different mechanisms, typically mechanisms that kept exact
track of how often an object was referred to.  So I would not call the
value of the "named" field a refcount.  But we can agree to call it from
now on a refcount as long as we realise what mechanism is really used.

 yes, that's my opinion:  the effects of implementation tricks should
 not be observable by the user, because they can lead to hard to
 explain and debug behaviour in the user's program.  you surely don't
 suggest that all users consult the source code before writing
 programs in r.

Indeed, I am not suggesting this.  Only users who use/rely on
features that are not sufficiently documented would have to study the
source code to find out what the exact behaviour is.  But, of course,
this could be fraught with danger since the behaviour could change
without warning.

 i have indeed learned what prefix 'names<-' does and now i know that
 the surprising behaviour is due to the observability of the internal
 thanks to simon, peter, and you for your answers which allowed me to
 learn this ugly detail.

You are welcome.



Re: [Rd] surprising behaviour of names<-

2009-03-13 Thread Berwin A Turlach
On Fri, 13 Mar 2009 11:43:55 +0100
Wacek Kusnierczyk wrote:

 Berwin A Turlach wrote:

  And it is documented behaviour.  

Glad to see that we agree on this.

  Read section 2.1.10 (Environments) in the R
  Language Definition, 
 haven't objected to that.  i object to your 'r uses pass by value',
 which is only partially correct.

Well, I used qualifiers and did not state it categorically. 

  and actually, in the example we discuss, 'names<-' does *not*
  return an updated *tmp*, so there's even less to entertain.  
  How do you know?  Are you sure?  Have you by now studied what goes
  on under the hood?
 yes, a bit.  but in this example, it's enough to look into *tmp* to
 see that it hasn't got the names added, and since x does have names,
 names<- must have returned a copy of *tmp* rather than *tmp* changed:

 x = 1
 tmp = x
 x = 'names<-'(tmp, 'foo')

Indeed, if you type these two commands on the command line, then it is
not surprising that a copy of tmp is returned since you create a
temporary object that ends up in the symbol table and persists after the
commands are finished.

Obviously, assuming that R really executes 
`*tmp*` <- x
x <- "names<-"(`*tmp*`, value=c("a","b"))
under the hood, in the C code, then *tmp* does not end up in the symbol
table and does not persist beyond the execution of 
names(x) <- c("a","b")

This looks to me like one of the situations where a value of 1 is used
for the "named" field of some of the objects involved so that a copy can
be avoided.  That's why I asked whether you looked under the hood.

 you suggested that One reads the manual, (...) one reflects and
 investigates, ...

Indeed, and I am not giving up hope that one day you will master this

 -- had you done it, you wouldn't have asked the  question.

Sorry, I forgot that you have a tendency to interpret statements
extremely verbatim and with little reference to the context in which
they are made.  I will try to be more explicit in future.

  for fun and more guesswork, the example could have been:
  x = x
  x = 'names<-'(x, value=c('a', 'b'))
  But it is manifestly not written that way in the manual; and for
  good reasons since 'names<-' might have side effects which invoke
  in the last line undefined behaviour.  Just as in the equivalent C
  snippet that I mentioned.
 i just can't get it why the manual does not manifestly explain what
 'names<-' does, and leaves you doing the guesswork you suggest.

As I said before, patches to documentation are also welcome.

Best wishes,


Re: [Rd] surprising behaviour of names<-

2009-03-13 Thread Berwin A Turlach
On Fri, 13 Mar 2009 19:41:42 +0100
Wacek Kusnierczyk wrote:

  Glad to see that we agree on this.

 owe you a beer.

O.k., if we ever meet it is first your shout and then mine.

  haven't objected to that.  i object to your 'r uses pass by value',
  which is only partially correct.
  Well, I used qualifiers and did not state it categorically. 

 indeed, you said R supposedly uses call-by-value (though we know how
 to circumvent that, don't we?).
 in that vein, R supposedly can be used to do valid statistical
 computations (though we know how to circumvent it) ;)

Sure, use Excel? ;-)

  Indeed, if you type these two commands on the command line, then it
  is not surprising that a copy of tmp is returned since you create a
  temporary object that ends up in the symbol table and persists after
  the commands are finished.

 what does command line have to do with it?

If you want to find out what goes on under the hood, it is not
necessarily sufficient to do the same calculations on the command line.

  Obviously, assuming that R really executes 
  `*tmp*` <- x
  x <- "names<-"(`*tmp*`, value=c("a","b"))
  under the hood, in the C code, then *tmp* does not end up in the
  symbol table 

Well, I don't see any new object created in my workspace after
x <- 4
names(x) <- "foo"
Do you?
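
A way to check this without relying on the state of one's own workspace
is sketched below.  It is only an illustration, assuming a current
version of R; the variable names env and objs are mine.  The two
assignments are evaluated in a fresh environment, whose contents are
then listed.

```r
# Evaluate the two assignments in a fresh environment so that nothing
# else can clutter the listing.
env <- new.env()
evalq({
  x <- 4
  names(x) <- "foo"
}, env)

# all.names = TRUE additionally lists names that begin with a dot, so
# nothing created in env is hidden from the listing.
objs <- ls(env, all.names = TRUE)
objs
```

This shows only what is left behind after the assignment has finished;
it says nothing about what is bound temporarily while it runs.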

 i guess you have looked under the hood;  point me to the relevant

No I did not, because I am not interested in knowing such intimate
details of R, but it seems you were interested.

 yes, *if* you are able to predict the refcount of the object passed to
 'names<-' *then* you can predict what 'names<-' will do, [...] 

I think Simon already pointed out that you seem to have a wrong
picture of what is going on.  As far as I know, there is no refcount
for objects.  

The relevant documentation would be the R Internals manual, 1.1 SEXPs:

  What R users think of as variables or objects are symbols which are
  bound to a value. The value can be thought of as either a SEXP (a
  pointer), or the structure it points to, a SEXPREC (and there are
  alternative forms used for vectors, namely VECSXP pointing to
  VECTOR_SEXPREC structures).

and 1.1.2 Rest of header:

  The named field is set and accessed by the SET_NAMED
  and NAMED macros, and take values 0, 1 and 2. R has a `call by value'
  illusion, so an assignment like

  b <- a

  appears to make a copy of a and refer to it as b. However, if neither
  a nor b are subsequently altered there is no need to copy. What really
  happens is that a new symbol b is bound to the same value as a and the
  named field on the value object is set (in this case to 2). When an
  object is about to be altered, the named field is consulted. A value
  of 2 means that the object must be duplicated before being changed.
  (Note that this does not say that it is necessary to duplicate, only
  that it should be duplicated whether necessary or not.) A value of 0
  means that it is known that no other SEXP shares data with this
  object, and so it may safely be altered. A value of 1 is used for
  situations like

  dim(a) <- c(7, 2)

  where in principle two copies of a exist for the duration of the
  computation as (in principle)

  a <- `dim<-`(a, c(7, 2))

  but for no longer, and so some primitive functions can be optimized to
  avoid a copy in this case. 

 but in general you may not have the chance. [...]


 and in general, this should not matter because it should be
 unobservable, but it isn't.

That's your opinion (to which you are entitled).  Unfortunately (for
you), the designers of R decided on a design which allows them to
reduce the number of copies that have to be made.

  you suggested that One reads the manual, (...) one reflects and
  investigates, ...
  Indeed, and I am not giving up hope that one day you will master
  this art.

 well, this time i meant you.

Rest assured I have read and reflected on that part of the manual.  

And I guess it boils down to how you interpret what is equivalent to

For me it means that those two commands are what is executed in the C
engine once the names(x) <- c("a","b") expression is parsed and the
parse list arrives at the interpreter.  To investigate whether that is
the case, one would have to look at the C code, and I have little
inclination to do so.  But that would be necessary to answer the
question whether *tmp* or a copy of *tmp* is returned, if one is really
interested in this question.  Or whether a *tmp* object is created at

You seem to take "is equivalent to" to mean that issuing
names(x) <- c("a","b") on the command line has the same effect as
issuing those two other commands on the command line and addressing
whether *tmp* or a copy of *tmp* is returned in this case.  Fair
enough, but it addresses a different question.  And, as you said
yourself in another e-mail, on the command line these two versions are
not equivalent since 

Re: [Rd] surprising behaviour of names<-

2009-03-12 Thread Berwin A Turlach
On Wed, 11 Mar 2009 20:31:18 +0100
Wacek Kusnierczyk wrote:

 Simon Urbanek wrote:
  On Mar 11, 2009, at 10:52 , Simon Urbanek wrote:
  Peter gave you a full answer explaining it very well. If you really
  want to be able to trace each instance yourself, you have to learn
  far more about R internals than you apparently know (and Peter
  hinted at that). Internally x=1 and x=c(1) are slightly different
  in that the former has NAMED(x) = 2 whereas the latter has
  NAMED(x) = 0 which is what causes the difference in behavior as
  Peter explained. The reason is that c(1) creates a copy of the 1
  (which is a constant [=unmutable] thus requiring a copy) and the
  new copy has no other references and thus can be modified and
  hence NAMED(x) = 0.
  Errata: to be precise replace NAMED(x) = 0 with NAMED(x) = 1 above
  -- since NAMED(c(1)) = 0 and once it's assigned to x it becomes
  NAMED(x) = 1 -- this is just a detail on how things work with
  assignment, the explanation above is still correct since
  duplication happens conditional on NAMED == 2.
 i guess this is what every user needs to know to understand the
 behaviour one can observe on the surface? 

Nope, only users who prefer to write '+'(1,2) instead of 1+2, or
'names<-'(x, 'foo') instead of names(x)='foo'.

Attempting to change the name attribute of x via 'names<-'(x, 'foo')
looks to me as if one relies on a side effect of the function
'names<-'; which, in my book would be a bad thing.  I.e. relying on side
effects of a function, or writing functions with side effects which are
then called for their side-effects;  this, of course, excludes
functions like plot() :)  I never had the need to call 'names<-'()
directly and cannot foresee circumstances in which I would do so.

Plenty of users, including me, are happy using the latter forms and,
hence, never have to bother with understanding these implementation
details.  

Your mileage obviously varies, but that is when you have to learn about
these internal details.  If you call functions because of their
side-effects, you better learn what the side-effects are exactly.



Re: [Rd] surprising behaviour of names<-

2009-03-12 Thread Berwin A Turlach
On Wed, 11 Mar 2009 20:29:14 +0100
Wacek Kusnierczyk wrote:

 Simon Urbanek wrote:
  Peter gave you a full answer explaining it very well. If you really
  want to be able to trace each instance yourself, you have to learn
  far more about R internals than you apparently know (and Peter
  hinted at that). Internally x=1 and x=c(1) are slightly different in
  that the former has NAMED(x) = 2 whereas the latter has NAMED(x) =
  0 which is what causes the difference in behavior as Peter
  explained. The reason is that c(1) creates a copy of the 1 (which
  is a constant [=unmutable] thus requiring a copy) and the new copy
  has no other references and thus can be modified and hence NAMED(x)
  = 0.
 simon, thanks for the explanation, it's now as clear as i might
 now i'm concerned with what you say:  that to understand something
 visible to the user one needs to learn far more about R internals
 than one apparently knows.  your response suggests that to use r
 without confusion one needs to know the internals, 

Simon can probably speak for himself, but according to my reading he
has not suggested anything similar to what you suggest he suggested. :)

 and this would be a really bad thing to say.. 

No problems, since he did not say anything vaguely similar to what you
suggest he said.



Re: [Rd] surprising behaviour of names<-

2009-03-12 Thread Berwin A Turlach
On Thu, 12 Mar 2009 10:05:36 +0100
Wacek Kusnierczyk wrote:

 well, as far as i remember, it has been said on this list that in r
 the infix syntax is equivalent to the prefix syntax, [...]

Whoever said that must have been at that moment not as precise as he or
she could have been.  Also, R does not behave according to what people
say on this list (which is good, because sometimes people say wrong
things on this list) but according to how it is documented to do; at
least that is what people on this list (and others) say. :)

And the R Language manual (ignoring for the moment that it is a draft
and all that), clearly states that 

names(x) <- c("a","b")

is equivalent to

`*tmp*` <- x
x <- "names<-"(`*tmp*`, value=c("a","b"))

 well, i can imagine a user using the prefix 'names<-' precisely under
 the assumption that it will perform functionally;  

You mean
y <- 'names<-'(x, "foo")
instead of
y <- x
names(y) <- "foo"

Fair enough.  But I would still prefer the latter version since it is
(for me) easier to read and to decipher the intention of the code.
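
As far as the value of y is concerned, the two versions can be compared
with a small sketch like the one below.  It assumes a current version of
R, the names y1, y2 and same_result are made up for the example, and it
deliberately makes no statement about whether x itself is touched, which
is the point under discussion in this thread.

```r
x <- c(1, 2)

# Prefix form, used functionally: the result is assigned to a new name.
y1 <- `names<-`(x, c("a", "b"))

# Replacement form applied to a copy of x.
y2 <- x
names(y2) <- c("a", "b")

# Both routes should yield the same named vector.
same_result <- identical(y1, y2)
```

The second version needs one more line, but it never calls the
replacement function directly.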

 i.e., 'names<-'(x, 'foo') will always produce a copy of x with the
 new names, and never change the x.  

I am not sure whether R ever behaved in that way, but as Peter pointed
out, this would be quite undesirable from a memory management and
performance point of view.  Imagine that every time you modify a (name)
component of a large object a new copy of that object is created.

 cheers, and thanks for the discussion.


Re: [Rd] surprising behaviour of names<-

2009-03-12 Thread Berwin A Turlach
On Thu, 12 Mar 2009 10:53:19 +0100
Wacek Kusnierczyk wrote:

 well, ?'names<-' says:
  For 'names<-', the updated object. 
 which is only partially correct, in that the value will sometimes be
 an updated *copy* of the object.

But since R supposedly uses call-by-value (though we know how to
circumvent that, don't we?) wouldn't you always expect that a copy of
the object is returned?

  And the R Language manual (ignoring for the moment that it is a
  draft and all that), 
 since we must...
  clearly states that 
  names(x) <- c("a","b")
  is equivalent to
  `*tmp*` <- x
  x <- "names<-"(`*tmp*`, value=c("a","b"))

 ... and?  

This seems to suggest that in this case the infix and prefix syntax
is not equivalent as it does not say that 
names(x) <- c("a","b")
is equivalent to
x <- "names<-"(x, value=c("a","b"))
and I was commenting on the claim that the infix syntax is equivalent
to the prefix syntax.

 does this say anything about what 'names<-'(...) actually
 returns?  updated *tmp*, or a copy of it?

Since R uses pass-by-value, you would expect the latter, wouldn't
you?  If you entertain the idea that 'names<-' updates *tmp* and
returns the updated *tmp*, then you believe that 'names<-' behaves in a
non-standard way and should take appropriate care.

And the fact that a variable *tmp* is used hints to the fact that
'names<-' might have side-effects.  If 'names<-' has side effects,
then it might not be well defined what value x ends up with if
one executes:
x <- 'names<-'(x, value=c("a","b"))  

This is similar to the discussion what value i should have in the
following C snippet:
i = 0;
i += i++;

  I am not sure whether R ever behaved in that way, but as Peter
  pointed out, this would be quite undesirable from a memory
  management and performance point of view.  
 why?  you can still use the infix names<- with destructive semantics
 to avoid copying. 

I guess that would require a rewrite (or extension) of the parser.  To
me, Section 10.1.2 of the Language Definition manual suggests that once
an expression is parsed, you cannot distinguish any more whether
'names<-' was called using infix syntax or prefix syntax.

Thus, I guess you want to start a discussion with R Core whether it is
worthwhile to change the parser such that it keeps track of whether a
function was used with infix notation or prefix notation and to
provide for most (all?) assignment operators implementations that use
destructive semantics if the infix version was used and always copy if
the prefix notation is used. 



Re: [Rd] surprising behaviour of names<-

2009-03-12 Thread Berwin A Turlach
On Thu, 12 Mar 2009 15:21:50 +0100
Wacek Kusnierczyk wrote:

  And the R Language manual (ignoring for the moment that it is a
  draft and all that), 

  since we must...
  clearly states that 
  names(x) <- c("a","b")
  is equivalent to

  `*tmp*` <- x
  x <- "names<-"(`*tmp*`, value=c("a","b"))

  ... and?  
  This seems to suggest 
 seems to suggest?  is not the purpose of documentation to clearly,
 ideally beyond any doubt, specify what is to be specified?

The R Language Definition manual is still a draft. :)

  that in this case the infix and prefix syntax
  is not equivalent as it does not say that 

 are you suggesting fortune telling from what the docs do *not* say?

My experience is that sometimes you have to realise what is not
stated.  I remember a discussion with somebody who asked why he could
not run, on Windows, R CMD INSTALL on a *.zip file.  I pointed out to
him that the documentation states that you can run R CMD INSTALL on
*.tar.gz or *.tgz files and, thus, there should be no expectation that
it can be run on a *.zip file.

YMMV, but when I read a passage like this in R documentation, I start
to wonder why it is stated that 
names(x) <- c("a","b")
is equivalent to 
`*tmp*` <- x
x <- "names<-"(`*tmp*`, value=c("a","b"))
and the simpler construct
x <- "names<-"(x, value=c("a", "b"))
is not used.  There must be a reason, nobody likes to type
unnecessarily long code.  And, after thinking about this for a while,
the penny might drop.

  does this say anything about what 'names<-'(...) actually
  returns?  updated *tmp*, or a copy of it?
  Since R uses pass-by-value, 
 since?  it doesn't!

For all practical purposes it does, as long as standard evaluation is
used.  One just has to be aware that some functions evaluate their
arguments in a non-standard way.  

  If you entertain the idea that 'names<-' updates *tmp* and
  returns the updated *tmp*, then you believe that 'names<-' behaves
  in a non-standard way and should take appropriate care. 
 i got lost in your argumentation.  [..]

I was commenting on "does this say anything about what 'names<-'(...)
actually returns?  updated *tmp*, or a copy of it?"

As I said, if you entertain the idea that 'names<-' returns an updated
*tmp*, then you believe that 'names<-' behaves in a non-standard way
and appropriate care has to be taken.

  And the fact that a variable *tmp* is used hints to the fact that
  'names<-' might have side-effect.  
 are you suggesting fortune telling from the fact that a variable *tmp*
 is used?

Nothing to do with fortune telling.  One reads the manual, one wonders
why is this construct used instead of an apparently much more simple
one, one reflects and investigates, one realises why the given
construct is stated as the equivalent: because names<- has

  This is similar to the discussion what value i should have in the
  following C snippet:
  i = 0;
  i += i++;

 nonsense, it's a *completely* different issue.  here you touch the
 issue of the order of evaluation, and not of whether an object is
 copied or modified;  above, the inverse is true.

Sorry, there was a typo above.  The second statement should have been
i = i++;

Then on some abstract level they are the same; an object appears on the
left hand side of an assignment but is also modified in the expression
assigned to it.  So what value should it end up with?

  why?  you can still use the infix names<- with destructive
  semantics to avoid copying. 
  I guess that would require a rewrite (or extension) of the parser.
  To me, Section 10.1.2 of the Language Definition manual suggests
  that once an expression is parsed, you cannot distinguish any more
  whether 'names<-' was called using infix syntax or prefix syntax.

 but this must be nonsense, since:
 x = 1
 'names<-'(x, 'foo')
 x = 1
 names(x) <- 'foo'
 # foo
 clearly, there is not only syntactic difference here.  but it might be
 that 10.1.2 does not suggest anything like what you say.

Please tell me how this example contradicts my reading of 10.1.2 that
the expressions 
'names<-'(x, 'foo')
names(x) <- 'foo'
once they are parsed, produce exactly the same parse tree and that it
becomes impossible to tell from the parse tree whether originally the
infix syntax or the prefix syntax was used.  In fact, the last sentence
in section 10.1.2 strongly suggests to me that the parse tree stores
all function calls as if prefix notation was used.  But it is probably
my English again.
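
What the parser stores can be inspected from within R.  The sketch below
is an illustration only, assuming a current version of R and using
variable names of my own choosing; it looks at the parsed (unevaluated)
calls and does not address what the evaluator later does with them.

```r
# An infix operator call and its prefix spelling give the same call object.
infix  <- quote(1 + 2)
prefix <- quote(`+`(1, 2))
same_call <- identical(infix, prefix)

# The name of the function being called can be read off the first
# element of a parsed call.
repl <- quote(names(x) <- "foo")
repl_fun <- as.character(repl[[1]])

direct <- quote(`names<-`(x, "foo"))
direct_fun <- as.character(direct[[1]])
```

Printing repl_fun and direct_fun shows which function each of the two
expressions from the example is recorded as calling.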

  Thus, I guess you want to start a discussion with R Core whether it
  is worthwhile to change the parser such that it keeps track of
  whether a function was used with infix notation or prefix notation
  and to provide for most (all?) assignment operators implementations

Re: [Rd] surprising behaviour of names<-

2009-03-12 Thread Berwin A Turlach
On Thu, 12 Mar 2009 21:26:15 +0100
Wacek Kusnierczyk wrote:

  YMMV, but when I read a passage like this in R documentation, I
  start to wonder why it is stated that 
  names(x) <- c("a","b")
  is equivalent to 
  `*tmp*` <- x
  x <- "names<-"(`*tmp*`, value=c("a","b"))
  and the simpler construct
  x <- "names<-"(x, value=c("a", "b"))
  is not used.  There must be a reason, 
 got an explanation:  because it probably is as drafty as the
 aforementioned document.

Your grasp of what "draft manual" means in the context of R
documentation seems to be as tenuous as the grasp of intelligent
design/creationist proponents on what it means in science to label a
body of knowledge a (scientific) theory. :)

 but it is possible to send an argument to a function that makes an
 assignment to the argument, and yet the assignment is made to the
 original, not to a copy:
 foo = function(arg) arg$foo = foo
 e = new.env()
 are you sure this is pass by value?

But that is what environments are for, aren't they?  And it is
documented behaviour.  Read section 2.1.10 (Environments) in the R
Language Definition, in particular the last paragraph:

  Unlike most other R objects, environments are not copied when 
  passed to functions or used in assignments.  Thus, if you assign the
  same environment to several symbols and change one, the others will
  change too.  In particular, assigning attributes to an environment can
  lead to surprises.

 and actually, in the example we discuss, 'names<-' does *not* return
 an updated *tmp*, so there's even less to entertain.  

How do you know?  Are you sure?  Have you by now studied what goes on
under the hood?

 for fun and more guesswork, the example could have been:
 x = x
 x = 'names<-'(x, value=c('a', 'b'))

But it is manifestly not written that way in the manual; and for good
reasons since 'names<-' might have side effects which invoke in the
last line undefined behaviour.  Just as in the equivalent C snippet
that I mentioned.

 for your interest in well written documentation, ?names says that the
 argument x is 'an r object', and nowhere does it say that environment
 is not an r object.  it also says what the value of 'names<-' applied
 to pairlists is.  the following error message is doubly surprising:
 e = new.env()
 'names<-'(e, 'foo')
 # Error: names() applied to a non-vector

But names are implemented by assigning a "name" attribute to the
object; as you should know.  And the above documentation suggests that
it is not a good idea to assign attributes to environments.  So why
would you expect this to work?

 firstly, because it would seem that there's nothing wrong in applying
 names to an environment;  from ?'$':
 name: A literal character string or a name (possibly backtick
   quoted).  For extraction, this is normally (see under
   'Environments') partially matched to the 'names' of the

I fail to see the relevance of this.

 secondly, because, as ?names says, names can be applied to pairlists,

Yes, but it does not say that names can be applied to environments.
And it explicitly says that the default methods "get and set the
'name' attribute of..." and (other) documentation warns you about
setting attributes on environments.

 which are not vectors, and the following does not give an error as
 p = pairlist()
 # names successfully applied to a non-vector

 assure me this is not a mess, but a well-documented design feature.

It is documented; whether it is well-documented depends on your
definition of well-documented. :)

 ... and one wonders why r man pages have to be read in O(e^n) time.

I believe patches to documentation are also welcome; and perhaps more

  I guess that would require a rewrite (or extension) of the parser.
  To me, Section 10.1.2 of the Language Definition manual suggests
  that once an expression is parsed, you cannot distinguish any more
  whether 'names<-' was called using infix syntax or prefix syntax.

  but this must be nonsense, since:
  x = 1
  'names<-'(x, 'foo')
  # NULL
  x = 1
  names(x) <- 'foo'
  # foo
  clearly, there is not only syntactic difference here.  but it
  might be that 10.1.2 does not suggest anything like what you say.
  Please tell me how this example contradicts my reading of 10.1.2
  that the expressions 
  'names<-'(x, 'foo')
  names(x) <- 'foo'
  once they are parsed, produce exactly the same parse tree and that
  it becomes impossible to tell from the parse tree whether
  originally the infix syntax or the prefix syntax was used.  
 because if they produced the same parse tree, you would either have to
 have the same result in both cases (because the same parse tree 

Re: [Rd] bug (PR#13570)

2009-03-05 Thread Berwin A Turlach
G'day Peter,

On Thu, 05 Mar 2009 09:09:27 +0100
Peter Dalgaard wrote:
  insert bug report here
  This is a CRITICAL bug!!!  I have verified it in R 2.8.1 for mac
  and for windows.  The problem is with loess degree=0 smoothing.
  For example, try the following:
  x <- 1:100
  y <- rnorm(100)
  plot(x, y)
  lines(predict(loess(y ~ x, degree=0, span=0.5)))
  This is obviously wrong.
 Obvious? How? I don't see anything particularly odd (on Linux).

Neither did I on linux; but the OP mentioned mac and windows. 

On windows, on running that code, the lines() command added a lot of
vertical lines; most spanning the complete window but some only part.  

Executing the code a second time (or in steps) gave sensible

My guess would be that some memory is not correctly allocated or
initialised.  Or is it something like an object with storage mode
integer being passed to a double?  But then, why doesn't it show on

Happy bug hunting.  If my guess is correct, then I have no idea how to
track down such things under windows.
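
One way to narrow this down would be to look at the numbers rather than the plot.  The following is only a sketch, with an arbitrary seed added so that runs are comparable: it fits the same model twice and checks whether the predictions are finite and whether the two runs agree.  If the fitted values themselves were wild or differed between runs, that would point at the fitting code rather than at lines().

```r
set.seed(1)            # arbitrary seed, only so that runs are comparable
x <- 1:100
y <- rnorm(100)

# Fit the same degree-0 loess model twice
fit1 <- predict(loess(y ~ x, degree = 0, span = 0.5))
fit2 <- predict(loess(y ~ x, degree = 0, span = 0.5))

all(is.finite(fit1))            # FALSE would implicate the fit, not the plot
isTRUE(all.equal(fit1, fit2))   # FALSE would hint at uninitialised memory
range(fit1)                     # should be on the scale of y
```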



Re: [Rd] [R] Semantics of sequences in R

2009-02-24 Thread Berwin A Turlach
On Mon, 23 Feb 2009 20:27:23 +0100
Wacek Kusnierczyk wrote:

 Berwin A Turlach wrote:

  judging from your question, you couldn't possibly see sorting
  routines in other languages.
  Quite likely, or the other languages that I regularly use (C,
  Fortran) have even more primitive sorting facilities. 

 i apologize, [...]

Apology accepted.

  No, if that is what you want.  And I guess it is one way of
  sorting a list.  The question is what should be the default
  one possible answer is: none.  (i have already given this answer
  previously, if you read carefully it's still there).  sort.list
  *should* demand an additional comparator argument.  at least, it
  should demand it if the argument to be sorted is a list, rather
  than a non-list vector (if you still need to use sort.list on
  So when are you sending your patch to implement this facility?
 as i said, you seem to have a severe bug in thinking about
 collaborative development.
 i am sending *no* patch for this.  the issue has to be first discussed
 on the design level, and only then, if accepted, should anyone -- me,
 for example -- make an attempt to implement it.  tell me you want to
 listen to what i have to say, and we can discuss.  

I could tell you that I will listen and we can discuss this until the
cows come home.  This will not change one iota since neither of us has
the power to implement changes in R.  You keep barking up the wrong

So far I have seen only one posting of an R-core member in this thread
(but perhaps he has started further discussion on the R-core mailing
list to which we are not privy), but if you want to have discussion and
acceptance before you do anything, you have to get R core involved.

Since for the kind of work for which I am using R a facility for
sorting lists is not crucial, I am rather indifferent about whether
such a facility should exist and, if so, how it should be designed.

 telling me i have a chip on my shoulder is rather unhelpful.

Well, then stop acting as if you are running around with chips on your
shoulders.  Behaving in such a manner is rather counterproductive in
the R community (at least from my experience/observation).  The same
goes for publicly stating that you do certain things with the purpose of
annoying certain persons.  I am just pointing out that your behaviour
is counterproductive, but it is really up to you whether you change
or want to continue in your ways.

There is a nice German proverb that sums all this up: "Wie man in den
Wald reinruft, so schallt es heraus" (roughly, "as you shout into the
forest, so it echoes back").  Unfortunately, an equivalent English
proverb does not exist, but perhaps the Norwegians have something
similar.


  R is open source.  Check out the svn version, fix what you
  consider needs fixing, submit a patch, convince R core that the
  patch fixes a real problem/is an improvement/does not break too
  much.  Then you have a better chance in achieving something.  

  no, berwin.  this is a serious bug in thinking.  people should be
  allowed -- *encouraged* -- to discuss the design *before* they even
  attempt to write patches. 
  And what makes you believe this is not the case?   I have seen over
  the years e-mails to R-devel along the lines "I am thinking of a
  change along [lots of details and reasoning for the change]; would
  patches that implement this be accepted?" and these e-mails were
  discussed more often than not.  However, in the end, the only
  people who can commit changes to the R code are the members of
  R-core, thus they will have the final word on design issues (and,
  as I assume, they discuss, among other things, design issues on the
  private mailing list of R-core members).  But you can discuss these
  issues before writing a patch. 
 how many such mails have you seen?  

A few over the years, but the more R progresses/matures the fewer
such e-mails there are.

 i've been on r-devel for six months, and haven't seen many.  

Well, six months is a rather narrow sampling window.

 on the other hand, i have seen quite a few responses that were
 bashing a user for reporting a non-existent bug or submitting an
 annoying patch.

In didactic terms those are negative motivations/reinforcements;
opinions differ on how effective they are to reach certain learning

  And I am sure that if you had sent an e-mail to r-devel pointing out
  that the binary operator <, when called in the non-standard way 
  '<'(1,2,3), does not check the number of arguments while other
  binary operators (e.g. '+'(1,2,3) or '*'(1,2,3)) do such checks,
  and provided a patch that implemented such a check for '<' (and
  presumably other comparison operators), then that patch would have
  been acknowledged and applied.

 it has been fixed immediately by martin. 

Yes, and, again, you could not help yourself telling the developers
what you think they should do, could you?  As I

Re: [Rd] [R] Semantics of sequences in R

2009-02-24 Thread Berwin A Turlach
G'day Dimitris,

On Tue, 24 Feb 2009 11:19:15 +0100
Dimitris Rizopoulos wrote:

 in my opinion the point of the whole discussion could be summarized
 by the question, what is a design flaw? This is totally subjective,
 and it happens almost everywhere in life. [...]

Beautifully summarised and I completely agree.  Not surprisingly,
others don't.

 To close I'd like to share with you a Greek saying (maybe also a
 saying in other parts of the world) that goes, for every rule there
 is an exception. [...]

As far as I know, the same saying exists in English.  It definitely
exists in German.  Actually, in German it is "every rule has its
exception, including this rule".  In German there is one grammar rule
that does not have an exception.  At least there used to be one; I am
not really sure whether that rule survived the recent reform of the
German grammar rules.



__ mailing list

Re: [Rd] [R] Semantics of sequences in R

2009-02-23 Thread Berwin A Turlach
On Mon, 23 Feb 2009 11:31:16 +0100
Wacek Kusnierczyk wrote:

 Berwin A Turlach wrote:
  On Mon, 23 Feb 2009 08:52:05 +0100
  Wacek Kusnierczyk wrote:
  and you mean that sort.list not being applicable to lists is a)
  good design, and b) something that by no means should be fixed,
  I neither said nor meant this and I do not see how what I said
  could be interpreted in such a way.  I was just commenting to
  Stavros that the example he picked, hoping that it would not break
  existing code, was actually a bad one which potentially will break
  a lot (?) of existing code. 
 would it, really?  if sort.list were, in addition to sorting atomic
 vectors (can-be-considered-lists), able to sort lists, how likely
 would this be to break old code?  

Presumably not.

 can you give one concrete example, and suggest how to estimate how
 much old code would involve the same issue?

Check out the svn source of R, run configure, do whatever change you
want to sort.list, make, make check FORCE=FORCE.  That should give
you an idea how much would break.  

Additionally, you could try to install all CRAN packages with your
modified version and see how many of them break when their
examples/demos/etc. are run.  

AFAIK, Brian is doing something like this on his machine.  I am sure
that if you ask nicely he will share his scripts with you.

If this sounds too time consuming, you might just want to unpack the
sources and grep for sort.list on all .R files;  I am sure you know
how to use find and grep to do this.
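
The same search can be done from within R itself.  The following is only a sketch: find_uses() is a made-up helper, not something that ships with R, and it looks for the fixed string "sort.list" in every .R file below a directory of your choosing.

```r
# Hypothetical helper: list every line in the .R files below `root`
# that mentions `pattern` (treated as a fixed string, not a regex).
find_uses <- function(root, pattern = "sort.list") {
  files <- list.files(root, pattern = "\\.[Rr]$", recursive = TRUE,
                      full.names = TRUE)
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    n <- grep(pattern, lines, fixed = TRUE)
    if (length(n) == 0L) return(NULL)
    data.frame(file = f, line = n, text = lines[n],
               stringsAsFactors = FALSE)
  })
  # Drop files without hits before combining; NULL if nothing was found
  do.call(rbind, Filter(Negate(is.null), hits))
}
```

Pointed at the unpacked sources, the number of rows returned gives a first, rough idea of how much code mentions sort.list at all; it says nothing about whether that code would actually break.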

  Also, until reading Patrick Burns' The R Inferno I was not aware
  of sort.list.  That function had not registered with me since I
  hardly used it.  
 which hints that potentially will break a lot (?) of existing code
 is a rather unlikely event.

Only for code that I wrote; other people's need and knowledge of R may
  And I also have no need of calling sort() on lists.  For me a
  list is a flexible enough data structure such that defining a
  sort() command for them makes no sense; it could only work in very
  specific circumstances.

 i don't understand the first part:  flexible enough data structure
 such that defining a sort() command for them makes no sense makes no

Lists are very flexible structures whose components need not be of the
same type.  So how do you want to compare components?  How do you compare
a vector of numbers to a vector of character strings?  Or a list of

Or should the sorting be on the length of the components?  Or their
names?  Or should sort(myList) sort each component of myList?  But for
that case we have already lapply(myList, sort).
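
To make the ambiguity concrete, here is a small sketch with a made-up list (myList and its contents are purely illustrative).  Sorting within each component and reordering the components by name are both reasonable readings of "sort this list", and they give different answers.

```r
# Made-up list with components of different types
myList <- list(b = c(3, 1, 2), a = c("z", "y"), c = 10)

# Sort within each component; the order of the components is unchanged
within_sorted <- lapply(myList, sort)

# Reorder the components by name; their contents are untouched
by_name <- myList[order(names(myList))]

str(within_sorted)
str(by_name)
```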

 as to it could only work in very specific circumstances -- no, it
 would work for any list whatsoever, provided the user has a correctly
 implemented comparator.  for example, i'd like to sort a list of
 vectors by the vectors' length -- is this a very exotic idea?

No, if that is what you want.  And I guess it is one way of sorting a
list.  The question is what should be the default way?  
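
For what it is worth, sorting a list with a user-supplied comparator can be sketched in a few lines of plain R.  sort_with() below is a hypothetical helper, not an existing function and not a proposal for how sort.list should behave; it uses a simple insertion sort and assumes that cmp(a, b) returns TRUE when a should come before b.

```r
# Hypothetical helper, not part of base R: insertion sort of a list
# using a user-supplied comparator cmp(a, b) that returns TRUE when
# a should come before b.
sort_with <- function(x, cmp) {
  n <- length(x)
  if (n < 2L) return(x)
  idx <- seq_len(n)
  for (i in 2:n) {
    j <- i
    # Move element i towards the front while it compares as smaller
    while (j > 1L && cmp(x[[idx[j]]], x[[idx[j - 1L]]])) {
      tmp <- idx[j]
      idx[j] <- idx[j - 1L]
      idx[j - 1L] <- tmp
      j <- j - 1L
    }
  }
  x[idx]
}

# Sort a list of vectors by the vectors' lengths
vecs <- list(1:5, 1:2, 1:3)
by_length <- sort_with(vecs, function(a, b) length(a) < length(b))
lengths(by_length)
```

With a comparator the caller has to state the ordering explicitly, which sidesteps the question of a default but does not answer it.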

  BTW, as I mentioned once before, you might want to consider to lose
  these chips on your shoulders.

 berwin, it's been a tradition on this list to discourage people from
 commenting on the design and implementation of r whenever they think
 it's wrong.  

I am not aware of any such tradition and I subscribed to R-help on 15
April 1998.  

The point is rather that by commenting only one will not achieve much,
in particular if the comments look more like complaints and the same
comments are done again and again (along with dragging up previous
comments or comments received on previous comments).

R is open source.  Check out the svn version, fix what you consider
needs fixing, submit a patch, convince R core that the patch fixes a
real problem/is an improvement/does not break too much.  Then you have
a better chance in achieving something.  

Alternatively, if it turns out that something that bugs you cannot be
changed without breaking too much existing code, start from scratch
that with a better design.  Apparently the GAP project
( is doing something like this, as
someone closely associated with that project once told me.  While
developing a version of GAP they collect information on how to improve
the design, data structures c; then, at some point, they start to
write the next version from scratch.
  scary!  it's much preferred to confuse new users.
  I usually learn a lot when I get confused about some issues/concept.
  Confusion forces one to sit down, think deeply and, thus, gain some
  understanding.  So I am not so much concerned with new users being
  confused.  It is, of course, a problem if the new user never comes
  out of his or her confusion.
 the problem, is, r users have to learn lots [...]

Indeed, and I guess in this age of instant gratification that that is a
real bummer for new

Re: [Rd] [R] Semantics of sequences in R

2009-02-23 Thread Berwin A Turlach
On Mon, 23 Feb 2009 13:27:08 +0100
Wacek Kusnierczyk wrote:

 Berwin A Turlach wrote:
  can you give one concrete example, and suggest how to estimate how
  much old code would involve the same issue?
  Check out the svn source of R, run configure, do whatever change you
  want to sort.list, make, make check FORCE=FORCE.  That should
  give you an idea how much would break.  

 it's not just making changes to sort.list, berwin.  sort.list calls
 .Internal order, and this one would have to be modified in order to
 accommodate for the additional comparator argument. [...]

Well, you could start off with an R-only implementation and then start
to move things to compiled code as needed for efficiency.

  Additionally, you could try to install all CRAN packages with your
  modified version and see how many of them break when their
  examples/demos/etc. are run.  

 that's not a good benchmark;  this are third-party stuff, and where
 people are willing to rely on poor design they should be prepared to
 suffer.  [...]

I do not believe that those developers are relying on poor design.
Rather, they rely on things working as documented (and as they are used
to them working) and on the behaviour not being gratuitously changed
just because it is considered bad design by some. 

 judging from your question, you couldn't possibly see sorting routines
 in other languages.

Quite likely, or the other languages that I regularly use (C, Fortran)
have even more primitive sorting facilities. 

  No, if that is what you want.  And I guess it is one way of sorting
  a list.  The question is what should be the default way?   
 one possible answer is: none.  (i have already given this answer
 previously, if you read carefully it's still there).  sort.list
 *should* demand an additional comparator argument.  at least, it
 should demand it if the argument to be sorted is a list, rather than
 a non-list vector (if you still need to use sort.list on non-lists).

So when are you sending your patch to implement this facility?

  The point is rather that by commenting only one will not achieve
  much, in particular if the comments look more like complaints and
  the same comments are done again and again (along with dragging up
  previous comments or comments received on previous comments).

 again and again because you seem to be immune to critique.  

You obviously do not know me.

 open your mind, and it will suffice to complain just once.  besides, i am
 certainly *not* just complaining.  i am providing concrete arguments,
 examples, and suggestions.  you're being unreasonably unfair.

I gladly admit that I am not reading every thread in which you are
active, so these comments might have been based on a biased sample.

  R is open source.  Check out the svn version, fix what you consider
  needs fixing, submit a patch, convince R core that the patch fixes a
  real problem/is an improvement/does not break too much.  Then you
  have a better chance in achieving something.  
 no, berwin.  this is a serious bug in thinking.  people should be
 allowed -- *encouraged* -- to discuss the design *before* they even
 attempt to write patches. 

And what makes you believe this is not the case?   I have seen over the
years e-mails to R-devel along the lines "I am thinking of a change
along [lots of details and reasoning for the change]; would patches
that implement this be accepted?" and these e-mails were discussed more
often than not.  However, in the end, the only people who can commit
changes to the R code are the members of R-core, thus they will have
the final word on design issues (and, as I assume, they discuss, among
other things, design issues on the private mailing list of R-core
members).  But you can discuss these issues before writing a patch.  

 writing one patch which will never be considered -- well, never
 responded to -- is about enough to stop people from sending patches.

While it is unfortunate if this happens, and such persons might just be
too thin-skinned, worse can happen; e.g. being flamed for sending in a
patch that is considered not to address any problems and with a sloppy
description of what it tries to address (happened to me).  

Yes, patches are ignored; patches are gratefully acknowledged and
applied; patches are completely re-written and still attributed to the
provider of the patch...   That does not mean that I stop sending in a
patch if I feel it is warranted...

And I am sure that if you had sent an e-mail to r-devel pointing out
that the binary operator <, when called in the non-standard way 
'<'(1,2,3), does not check the number of arguments while other binary
operators (e.g. '+'(1,2,3) or '*'(1,2,3)) do such checks, and provided
a patch that implemented such a check for '<' (and presumably other
comparison operators), then that patch would have been acknowledged and
applied.
 maybe that's what you want

Re: [Rd] unloadNamespace (Was: How to unload a dll loaded via library.dynam()?)

2009-02-20 Thread Berwin A Turlach
G'day all,

On Fri, 20 Feb 2009 04:01:07 + (GMT)
Prof Brian Ripley wrote:

 library.dynam.unload() does work if the OS is cooperative.  And if
 you have your package set up correctly and unload the namespace (and
 not just detach the package if there is a namespace) then the shared 
 object/DLL will be unloaded. [...]

I guess I have a similar code-install-test development cycle as Alex;
and I seem to work on a cooperative OS (Kubuntu 8.04).

My setup is that I install packages on which I work into a separate
library.  To test changes to such packages, I start R in a directory
which contains a .Rprofile file which, via .libPaths(), adds the above
library to the library path.  In this R session I then test the changes.

I also used to quit and restart R whenever I re-installed a package with
namespace to test the changes made.  Somehow I got the impression that
this was the way to proceed when namespaces were introduced; and I did
not realise until recently that better ways (unloading the namespace)

However, I noticed the following behaviour under R 2.8.1 and R version
2.9.0 Under development (unstable) (2009-02-19 r47958) which I found

1) In the running R session, issue the command unloadNamespace("XXX")
2) Make changes to the code of the package; e.g. add a print("Hello
   world") statement to one of the R functions.
3) Install the new package
4) In the running R session, issue the command library(XXX) and call
   the R function that was changed.  
Result: "Hello world" is not printed, somehow the old R function is
still used.  If I issue the commands unloadNamespace("XXX") and
library(XXX) once more then a call to the R function that was changed
will print "Hello world"; i.e. the new code is used.

If the above sequence is changed to 2), 3) and then 1), then 4) behaves
as expected and the new R code is used immediately.
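
The order that worked for me (2, 3, 1, 4) can be wrapped into a small function.  This is only a sketch: reinstall_and_reload() is a made-up helper, `src` is assumed to be the path to the package sources, and the behaviour described is what I observed with the R versions mentioned above, nothing more.

```r
# Hypothetical helper; pkg is the package name (a character string),
# src the path to its source directory or tarball.
reinstall_and_reload <- function(pkg, src, lib = .libPaths()[1]) {
  # Steps 2 and 3: the code has been edited, now install it
  install.packages(src, lib = lib, repos = NULL, type = "source")
  # Step 1, done late: drop the old namespace only after installing
  if (pkg %in% loadedNamespaces()) unloadNamespace(pkg)
  # Step 4: attach the freshly installed version
  library(pkg, character.only = TRUE, lib.loc = lib)
}
```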

As far as I can tell, if via the .onUnload() hook the shared object is
unloaded via library.dynam.unload(), changes in the C code take effect
no matter whether I perform the above steps in the sequence 1-2-3-4 or

My preference is to use the sequence 1-2-3-4 since it seems to be the
more logical and cleaner sequence; and I have vague memories that I
managed to crash R in the past after using 2-3 and then trying to quit

I am wondering why does it make a difference with respect to R code in
which order these steps are done but not with respect to compiled
code.  Well, I guess I understand why the order does not matter for
compiled code, but I do not understand why the order matters for R
code.  I could not find anything in the documentation that would
explain this behaviour, or indicate that this is the intended

Enlightening comments and/or pointers to where this behaviour is
documented would be welcome.
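
For reference, the kind of .onUnload() hook referred to above looks roughly as follows.  "XXX" again stands for the package name, and the sketch assumes that the shared object carries the same name as the package.

```r
# In the package's R code (e.g. R/zzz.R); "XXX" stands for the package
# name and, by assumption, also for the name of its shared object.
.onUnload <- function(libpath) {
  library.dynam.unload("XXX", libpath)
}
```

The hook is run when the namespace is unloaded, e.g. via unloadNamespace("XXX"); it is not run by a plain detach() without unload = TRUE.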



=== Full address =
Berwin A Turlach                            Tel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability        +65 6516 6650 (self)
Faculty of Science                          FAX : +65 6872 3919
National University of Singapore
6 Science Drive 2, Blk S16, Level 7
Singapore 117546


Re: [Rd] unloadNamespace (Was: How to unload a dll loaded via library.dynam()?)

2009-02-20 Thread Berwin A Turlach
G'day Brian,

On Fri, 20 Feb 2009 11:37:18 + (GMT)
Prof Brian Ripley wrote:

 This was rather a large shift of subject, [...] 

Well, yes, from the clean unloading of compiled code to the clean
unloading of R code.  :-)  
Though, I also confirmed that the former is possible on a cooperative
OS when library.dynam.unload() is correctly used via an .onUnload()

 Is lazy loading involved? 

Yes, the DESCRIPTION file has the default LazyLoad: yes entry. 

If I set LazyLoad: no, then both sequences use the new version of the
R code immediately.

 If so I have an idea that may or may not be relevant.  We do cache in
 memory the lazy-loading database for speed on slow (network-mounted
 or USB drive) file systems.  Now the cache is flushed at least if you
 do detach(foo, unload = TRUE) or but I can envisage a set of
 circumstances in which it might not be.

As far as I can tell, detach(foo, unload=TRUE) and
unloadNamespace(foo) behave identically on my machines (while the
DESCRIPTION file has LazyLoad: yes); the modified R code is only used
if either of these commands is given (followed by library(foo)) after
the new version of the package was installed. 

 So perhaps try detach(foo, unload = TRUE) or not using lazy-loading 
 when developing the package?

Unfortunately, the former does not work and although the latter works I
am hesitant to use it since:

a) as I understand it, most packages that use S4 methods need
lazy-loading (though the particular package with which I noticed the
behaviour does not have S4 methods); and

b) it seems that these days the DESCRIPTION file is the only way of
switching lazy loading on and off and that there is no way of
overriding that value.  Knowing myself, I would forget changing the
DESCRIPTION file back to LazyLoad: yes before building the .tar.gz
file for distribution (once the package is ready).  As it is, I
already have to remember to take -Wall -pedantic out of the Makevars file
in the src/ directory; but I am reminded of that by R CMD check.

Well, thinking a bit more about b), I could probably complicate my
Makefile a bit more such that a make install first modifies the
DESCRIPTION file to LazyLoad: no before installing the package to the
local library and that make build first modifies the DESCRIPTION in
the opposite way.  But this would still leave concern a).
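
The DESCRIPTION rewriting could also be done from R rather than in the Makefile.  A minimal sketch, assuming a made-up helper set_lazyload() that the make targets would call via Rscript; note that read.dcf()/write.dcf() may re-wrap long fields, so the file can change cosmetically.

```r
# Hypothetical helper: set the LazyLoad field of a DESCRIPTION file.
set_lazyload <- function(desc_file, value = c("yes", "no")) {
  value <- match.arg(value)
  desc <- read.dcf(desc_file)        # one-row character matrix
  if ("LazyLoad" %in% colnames(desc)) {
    desc[1, "LazyLoad"] <- value     # overwrite the existing entry
  } else {
    desc <- cbind(desc, LazyLoad = value)  # add the field if missing
  }
  write.dcf(desc, file = desc_file)
  invisible(desc)
}
```

A "make install" target would call set_lazyload("DESCRIPTION", "no") before installing, and "make build" would call it with "yes" before R CMD build.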




Re: [Rd] vignette compilation times

2009-02-20 Thread Berwin A Turlach
G'day Robin,

On Thu, 19 Feb 2009 11:10:45 +
Robin Hankin wrote:

 I am preparing a number of vignettes that require a very  long time to
 process with Sweave.  The longest one takes 10 hours.  

Is the sum of all chunks taking this time?  Or is it mostly the code in
only a few chunks?  And if so, are there chunks following that depend
on the result of these time-intensive chunks?

I wonder if it is feasible to construct your vignette according to the
1) have a file, say, that contains:

you may want to try the commands
but be aware that this might take a long time.
Now we run the commands

2) Now construct a Makefile that, using a preprocessor like cpp,
produces vignette1.Rnw from using the first version
before an R CMD build but otherwise (for your own testing) the
second version.  Using .Rbuildignore, you can ensure that would not be distributed. 
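
A different and simpler technique than the preprocessor approach above is to cache the result of a slow chunk in a file by hand.  The sketch below is just that, a sketch: cached() is a made-up helper and the file name in the usage note is hypothetical.  The expression is only evaluated when no cached result exists.

```r
# Hypothetical helper: evaluate `expr` only if no cached result exists.
cached <- function(file, expr) {
  if (file.exists(file)) return(readRDS(file))
  value <- expr          # lazy evaluation: expr is only run at this point
  saveRDS(value, file)
  value
}
```

Inside a vignette chunk one would then write something like fit <- cached("fit.rds", slowFit(data)), where slowFit() stands for whatever takes the ten hours.  Chunks that depend on the result can then be re-run without repeating the computation, as long as the cache file is kept.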

 I love the weaver package!

Thanks for pointing this package out.  I was aware of cacheSweave, but
that package seems to require that each chunk has a label which I find
kind of inconvenient.  weaver does not seem to have such a requirement.




Re: [Rd] vignette compilation times

2009-02-20 Thread Berwin A Turlach
G'day Gabor,

On Thu, 19 Feb 2009 17:47:53 -0500
Gabor Grothendieck wrote:

 Unless this has changed recently, I've tried including a PDF but it
 does not appear in library(help = myPackage) nor on the CRAN site on
 while Sweave'd PDFs do.

If you want a PDF file to appear in library(help=myPackage), then you
can write a vignette that just includes that PDF file via \includepdf
from the LaTeX package(?) pdfpages.

You will, of course, end up with two PDF files that are practically
identical.  So you might want to exclude the original PDF file from the
build package via .Rbuildignore.

If you do so, the next problem is that since R 2.6.0 R CMD check
tries to latex the vignette and does not just check the code in the
vignette.  And in current TeX systems latex will hang if \includepdf
does not find the specified PDF file; latex does not stop with an
error, it hangs.  

So the vignette has to be written smart enough to try to include the
PDF file via \includepdf only if the file really exists, but that can
easily be done.  See the package lasso2 for an example.  

If you follow this setup, your PDF file will show up in
library(help=myPackage) and your package will pass R CMD check on





Re: [Rd] vignette compilation times

2009-02-20 Thread Berwin A Turlach
G'day Fritz,

On Fri, 20 Feb 2009 12:46:49 +1100
Friedrich Leisch wrote:

 It is also unclear to me whether including a PDF without sources in a
 GPLed package isn't a violation of the GPL (I know people who very
 strongly think so). And source according to the GPL means the
 preferred form of the work for making modifications to it. So for a
 PDF showing R output that would mean the text plus R code plus
 data ... which boils down to XXXweave anyway.

Well, GPL-2 says "This License applies to any program or other work
which contains a notice placed by the copyright holder saying it may be
distributed under the terms of this General Public License."  I am
somehow unable to locate the equivalent statement in GPL-3.  

Thus, under GPL-2, if the source that produces the PDF file does not
contain a statement that it may be distributed under the terms of the
GPL, then, in my understanding, you do not have to distribute the
source.  On occasions I wondered whether stating in the DESCRIPTION
file that your package is GPL-2 extends this license to all other
files and to those in inst/doc in particular.  Or whether one should
better slap a GPL-2 notice (or a GNU Free Documentation License)
explicitly on the documentation.

Actually, the fact that the GNU Free Documentation License exists makes
me wonder whether it is tenable to apply the GPL to documentation such as
PDF files.  But the phrase "or other work" in the above cited part
of GPL-2 and the explicit `The Program refers to any copyrightable
work' in GPL-3 seem to indicate that it is possible.  Though, I guess
you would still have to state *within* the (source of) vignette that it
is under the GPL.

But then IANAL.




Re: [Rd] Warning: missing text for item ... in \describe?

2008-12-05 Thread Berwin A Turlach
G'day Brian,

On Fri, 5 Dec 2008 15:35:00 + (GMT)
Prof Brian Ripley wrote:

 On Mon, 1 Dec 2008, Berwin A Turlach wrote:
  G'day Spencer,
  On Sun, 30 Nov 2008 22:31:54 -0800
  Spencer Graves wrote:
What might be the problem generating messages like Warning:
  missing text for item ... in \describe with R CMD check and R
  CMD install?
With the current version of fda on R-Forge, I get the
  Warning: missing text for item 'fd' in \describe
  Warning: missing text for item 'fdPar' in \describe
  fRegress.Rd, which contains
  \item{fd} { ...
  \item{fdPar} {...
 Well, the warning/errors always come out just before the file with
 the problem (unless you have stdout buffered and stderr not).  So
 finding which file does not seem so hard.

Which platform are we talking about here?  I was using linux and "R CMD
check fda", using R 2.8.0, on the command line said:

* checking for executable files ... OK
* checking whether package 'fda' can be installed ... WARNING
Found the following significant warnings:
  Warning: missing text for item 'fd' in \describe
  Warning: missing text for item 'fd' in \describe
  Warning: missing text for item 'fdPar' in \describe
See '/home/berwin/lang/R/Develop/Others/fda.Rcheck/00install.out' for
* checking package directory ... OK

And 00install.out said:

Attaching package: 'zoo'

The following object(s) are masked from package:base :


** help
Warning: missing text for item 'fd' in \describe
Warning: missing text for item 'fd' in \describe
Warning: missing text for item 'fdPar' in \describe
  Building/Updating help pages for package 'fda'
     Formats: text html latex example
  CSTR                              text    html    latex   example
  CanadianWeather                   text    html    latex   example

I am not aware that either stdout or stderr are buffered on my linux

 This *is* an error: nothing in the description allows whitespace
 between arguments to \item (nor \section). It seems that only a few
 people misread the documentation (sometimes even after their error is
 pointed out to them). 

But there is also nothing that explicitly forbids such whitespace, is
there?  I guess this comes down to the question whether everything is
allowed that is not expressly forbidden or everything is forbidden
unless it is expressly allowed.  Strangely enough, though I am
German, I don't tend to subscribe to the latter philosophy.  

The language of Rd files, and the notation used, seems to have some
clear roots in (La)TeX;  and in (La)TeX whitespace between arguments to
macros is ignored.  So one may argue that it is a bit of a surprise
that whitespace between arguments to \item matters here.

 What we can do is detect the error, and I am about to commit code in
 R-devel to do so. 

Thanks.  I am sure Spencer will be happy about this. :)




Re: [Rd] Warning: missing text for item ... in \describe?

2008-12-05 Thread Berwin A Turlach
G'day Brian,

On Fri, 5 Dec 2008 17:32:58 + (GMT)
Prof Brian Ripley wrote:

  Which platform are we talking here?  I was using linux and R CMD
  check fda, using R 2.8.0, on the command line said:
 That writes to a file, and writes to a file are buffered.  Try R CMD 
 INSTALL, where they are not.  We do recommend getting a clean install 
 before R CMD check.

Indeed, forgot about that.  Thanks for reminding me.  With R CMD
INSTALL the messages are right next to the help files with the problems.

  This *is* an error: nothing in the description allows whitespace
  between arguments to \item (nor \section). It seems that only a few
  people misread the documentation (sometimes even after their error
  is pointed out to them).
  But there is also nothing that explicitly forbids such whitespace, is
  there?  I guess this comes down to the question whether everything is
  allowed that is not expressly forbidden or everything is
  forbidden unless it is expressly allowed.  Strangely enough,
  though I am German, I don't tend to subscribe to the latter
 It really doesn't matter: the author of the converter (not me)
 decided to silently ignore arguments after whitespace so you get an
 incorrect conversion. I also added sentences to the documentation
 that say that explicitly.  But if I see documentation that says
 I don't see why anyone would be surprised that
 \item {foo} {bar}
 goes haywire.

Well, I was. :)  And I guess anybody who knows that the TeX parser does
not care about whitespace between arguments to macros but forgets
that .Rd files are not directly parsed by the TeX parser would have
been surprised too.

Use of whitespace and indentation is pretty much a matter of taste and
personal style to improve readability; and while there are languages
where they matter (Python, makefiles, etc.), and some projects (including
R) have style-guides, usually a developer is left with a lot of
flexibility to suit her or his style.  But I guess only Spencer knows
why a whitespace at this place was desirable and/or improved



Re: [Rd] Warning: missing text for item ... in \describe?

2008-11-30 Thread Berwin A Turlach
G'day Spencer,

On Sun, 30 Nov 2008 22:31:54 -0800
Spencer Graves wrote:

   What might be the problem generating messages like Warning: 
 missing text for item ... in \describe with R CMD check and R CMD 
   With the current version of fda on R-Forge, I get the
 Warning: missing text for item 'fd' in \describe
 Warning: missing text for item 'fdPar' in \describe

fRegress.Rd, which contains

\item{fd} { ...
\item{fdPar} {...

Apparently the space between the closing and the opening bracket leads
to some confusion; remove the space and the warning goes away.

I am not sure why an extra space here leads to problems; it would be nicer
if it did not.  If it has to lead to a problem, then it would be
nice if the name of the .Rd file that produces the problem were actually
mentioned. :)
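
For readers who do not use emacs, the same search can be sketched in R. This is only an illustration: the function name find_item_whitespace and the default directory "man" are assumptions, not part of R or of the fda package. It reports every line on which whitespace separates the two arguments of \item.

```r
# Hypothetical helper (not part of R): list the lines in .Rd files where
# whitespace separates the two arguments of \item, e.g. "\item{fd} {...".
find_item_whitespace <- function(man_dir = "man") {
  rd_files <- list.files(man_dir, pattern = "\\.Rd$", full.names = TRUE)
  # regex: literal "\item{...}", then one or more whitespace characters, then "{"
  pattern <- "\\\\item\\{[^}]*\\}[[:space:]]+\\{"
  hits <- lapply(rd_files, function(f) {
    lines <- readLines(f, warn = FALSE)
    idx <- grep(pattern, lines)
    if (length(idx))
      data.frame(file = f, line = idx, text = lines[idx],
                 stringsAsFactors = FALSE)
  })
  do.call(rbind, hits)  # NULL when nothing is found
}
```

Calling find_item_whitespace("man") from the top-level directory of a package source tree would then name the file and line number of each suspicious \item.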

Don't ask me how I found this, let us just say that long live
find-grep-dired in emacs and perseverance (or should that be




=== Full address =
Berwin A TurlachTel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability+65 6516 6650 (self)
Faculty of Science  FAX : +65 6872 3919   
National University of Singapore 
6 Science Drive 2, Blk S16, Level 7  e-mail: [EMAIL PROTECTED]
Singapore 117546


Re: [Rd] chisq.test with simulate.p.value=TRUE (PR#13292)

2008-11-17 Thread Berwin A Turlach
G'day Reginaldo, 

On Sun, 16 Nov 2008 01:00:09 +0100 (CET)

 Full_Name: Reginaldo Constantino
 Version: 2.8.0
 OS: Ubuntu Hardy (32 bit, kernel 2.6.24)
 Submission from: (NULL) (
 For many tables, chisq.test with simulate.p.value=TRUE gives a p
 value that is obviously incorrect and inversely proportional to the
 number of replicates:

Why is this Monte-Carlo p-value obviously incorrect?  

In your example, did you look at the observed ChiSquare statistics?
Any idea how likely it is to observe a value that is at least as
extreme as the one observed?

Essentially, you are doing B Monte-Carlo simulations and in none of
these simulations do you obtain a statistic that is at least as extreme
as the one that you have observed.  So your Monte-Carlo p-value ends up
being 1/(B+1).  

I do not see any problem or bug here.
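
A minimal sketch of this point, using the table from the report.  The seed and the value of B are arbitrary choices for illustration; the comparison relies on the observed statistic (about 138.3) being so extreme that no simulated table matches it.

```r
# Table from the bug report: hair colour by eye colour, summed over sex
x <- margin.table(HairEyeColor, c(1, 2))

set.seed(1)   # arbitrary seed, only for reproducibility
B <- 2000
res <- chisq.test(x, simulate.p.value = TRUE, B = B)

# No simulated statistic reaches the observed one, so the Monte-Carlo
# p-value is (0 + 1) / (B + 1)
res$p.value
all.equal(res$p.value, 1 / (B + 1))
```

Increasing B therefore shrinks the reported p-value, exactly as in the output quoted below.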

  x <- margin.table(HairEyeColor, c(1, 2))
 Pearson's Chi-squared test with simulated p-value (based on
 2000 replicates)
 data:  x
 X-squared = 138.2898, df = NA, p-value = 0.0004998
 X-squared = 138.2898, df = NA, p-value = 1e-04
 X-squared = 138.2898, df = NA, p-value = 1e-05
 X-squared = 138.2898, df = NA, p-value = 1e-06
 Also tested the same R version under Windows XP and got the same



=== Full address =
Berwin A TurlachTel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability+65 6516 6650 (self)
Faculty of Science  FAX : +65 6872 3919   
National University of Singapore 
6 Science Drive 2, Blk S16, Level 7  e-mail: [EMAIL PROTECTED]
Singapore 117546


Re: [Rd] typo in ?pie

2008-11-11 Thread Berwin A Turlach
G'day Brian,

On Mon, 10 Nov 2008 21:07:44 + (GMT)
Prof Brian Ripley [EMAIL PROTECTED] wrote:

 'British' spelling is in the majority amongst R-core, and preferred
 for R documentation (that is in the guidelines somewhere).

I have a vague memory of a discussion that ended with the conclusion
that words like "colour" should be spelt "color", at least in function
names and function arguments.  IIRC, this was for reasons of
compatibility with S/S-Plus.

Does your comment mean that we can send in patches for help pages in
which "colour" is spelt "color" (e.g. lines in package:graphics) and
such patches would be applied?

Sorry, couldn't resist. :)




Re: [Rd] (PR#13132) splinefun gives incorrect derivs when extrapolating to the left

2008-10-12 Thread Berwin A Turlach
Dear Brian,

On Thu, 9 Oct 2008 07:27:24 +0100 (BST)
Prof Brian Ripley [EMAIL PROTECTED] wrote:

 Thanks, I've merged this (second version) patch into 2.8.0 beta.

My pleasure.  I thought it would be only fair that I fix the infelicity
in the code after having provided the initial patch that introduced
it. :)

Apologies also for not realising that the bug repository truncates
overly long subject lines and for having thus filed two additional reports.
I will remember that in future.

Best wishes,



Re: [Rd] why is \alias{anRpackage} not mandatory?

2008-10-07 Thread Berwin A Turlach
G'day Hadley,

On Mon, 6 Oct 2008 08:55:14 -0500
hadley wickham [EMAIL PROTECTED] wrote:

  The main problem with vignettes at the moment is that
  they must be sweave, a format which I don't really like.  I wish I
  could supply my own pdf + R code file produced using whatever
  tools I choose.
  I like Sweave, and it is also possible to include your own PDFs and
  R files and then to reference them in anRpackage.Rd.
 Yes, but they're not vignettes - which means they're not listed under
 vignette() and it's yet another place for people to look for

Well, there is a somewhat subversive way to use the facilities
provided for vignettes with PDFs that were created by some other
mechanism.  That was discussed some time ago on this list.  The idea was
to create an empty vignette that uses the LaTeX package pdfpages to
include the existing PDF in a vignette.

Since you ended up with essentially two copies of the same PDF file,
you can use .Rbuildignore to exclude the original PDF from the build
and only distribute the PDF created from the vignette.

This trick worked really well as long as R CMD check was not trying
to latex the .tex file produced from the vignette (apparently since R
2.6.0 the behaviour of R CMD check in this respect has changed,
though the Writing R Extension manual was not updated to reflect
this change; which reminds me that I promised Kurt Hornik to file a
bug report about this).  What makes things worse is that with the
current TeX installation on Debian-based operating systems, latex hangs
if a file, from which pdfpages wants to include some (or all) pages,
does not exist.  That is, R CMD check on such a tar.gz file hangs and
doesn't stop with an error message.

The solution that I am using at the moment is shown in the attached
file which resides in inst/doc of the lasso2 package on my machine.  On
my machine, R CMD build and R CMD check will, of course, work.
Essentially, this vignette only creates the information needed to index
it and then includes the file Manual-wo-help.pdf (which is the old help
for the S-Plus version of that package; I should really update all
this). Manual-wo-help.pdf is excluded from lasso2_x.y-z.tar.gz via an
entry in .Rbuildignore but the vignette is distributed and listed under
vignette().  And R CMD check works on the .tar.gz file too.



=== Full address =
Berwin A TurlachTel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability+65 6516 6650 (self)
Faculty of Science  FAX : +65 6872 3919   
National University of Singapore 
6 Science Drive 2, Blk S16, Level 7  e-mail: [EMAIL PROTECTED]
Singapore 117546
%\VignetteIndexEntry{Manual of lasso2 package}



\typeout{No file Manual-wo-help.pdf}


Re: [Rd] splinefun gives incorrect derivs when extrapolating to the left (PR#13132)

2008-10-07 Thread Berwin A Turlach
On Tue, 7 Oct 2008 19:31:03 +0800
Berwin A Turlach [EMAIL PROTECTED] wrote:

 The attached patch (against the current SVN version of R) implements
 the latter strategy.  With this patch applied, make check
 FORCE=FORCE passes on my machine.  The version of R that is built
 seems to give the correct answer in your example:

Perhaps I should have thought a bit more about this.  For a natural
spline c[1] is zero, and d[1] is typically not but for evaluations
left of the first knot it should be taken as zero.  So the attached
patch solves the problem in what some might consider a more elegant
manner. :)

With the patch make check FORCE=FORCE works on my machine and it
also solves your example:

R> x <- 1:10
R> y <- sin(x)
R> splfun <- splinefun(x,y, method='natural')
R> # these should be linear (and are)
R> splfun( seq(0,1, 0.1) )
 [1] 0.5682923 0.5956102 0.6229280 0.6502459 0.6775638 0.7048816
 [7] 0.7321995 0.7595174 0.7868352 0.8141531 0.8414710
R> # these should all be the same
R> splfun( seq(0,1, 0.1), deriv=1 )
 [1] 0.2731787 0.2731787 0.2731787 0.2731787 0.2731787 0.2731787
 [7] 0.2731787 0.2731787 0.2731787 0.2731787 0.2731787
R> # these should all be 0
R> splfun( seq(0,1, 0.1), deriv=2 )
 [1] 0 0 0 0 0 0 0 0 0 0 0
R> splfun( seq(0,1, 0.1), deriv=3 )
 [1] 0 0 0 0 0 0 0 0 0 0 0
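
The same checks can be written as a small script rather than read off the printed output.  This is a sketch that assumes a version of R in which the fix for PR#13132 has been applied; the variable names u, d1 and d2 are only illustrative.

```r
x <- 1:10
y <- sin(x)
splfun <- splinefun(x, y, method = "natural")

# evaluation points at and to the left of the first knot (x = 1)
u <- seq(0, 1, by = 0.1)

d1 <- splfun(u, deriv = 1)   # slope of the linear extrapolation
d2 <- splfun(u, deriv = 2)   # should vanish for a natural spline

# a natural spline is linear to the left of the first knot, so the first
# derivative is constant there and the second derivative is zero
isTRUE(all.equal(d1, rep(d1[1], length(d1))))
all(d2 == 0)
```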




=== Full address =
Berwin A TurlachTel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability+65 6516 6650 (self)
Faculty of Science  FAX : +65 6872 3919   
National University of Singapore 
6 Science Drive 2, Blk S16, Level 7  e-mail: [EMAIL PROTECTED]
Singapore 117546
Index: src/library/stats/R/splinefun.R
--- src/library/stats/R/splinefun.R (revision 46635)
+++ src/library/stats/R/splinefun.R (working copy)
@@ -38,7 +38,6 @@
y <- as.vector(tapply(y,x,ties))# as.v: drop dim & dimn.
x <- sort(ux)
nx <- length(x)
-   rm(ux)
} else {
o <- order(x)
x <- x[o]
@@ -93,7 +92,7 @@
e=double(if(iMeth == 1) nx else 0),
 z$e - NULL
 function(x, deriv = 0) {
deriv <- as.integer(deriv)
@@ -114,18 +113,25 @@
 ##   where dx := (u[j]-x[i]); i such that x[i] = u[j] = 
 ##u[j]:= xout[j] (unless sometimes for periodic spl.)
 ##   and  d_i := d[i] unless for natural splines at left
-   .C(spline_eval,
-  z$method,
-  as.integer(length(x)),
-  x=as.double(x),
-  y=double(length(x)),
-  z$n,
-  z$x,
-  z$y,
-  z$b,
-  z$c,
-  z$d,
-  PACKAGE=stats)$y
+   res <- .C(spline_eval,
+  z$method,
+  as.integer(length(x)),
+  x=as.double(x),
+  y=double(length(x)),
+  z$n,
+  z$x,
+  z$y,
+  z$b,
+  z$c,
+  z$d,
+  PACKAGE=stats)$y
+## deal with points to the left of first knot if natural
+## splines are used  (Bug PR#13132)
+if( deriv > 0 && z$method==2 && any(ind <- x<=z$x[1]) )
+  res[ind] <- ifelse(deriv == 1, z$y[1], 0)
Index: src/library/stats/man/splinefun.Rd
--- src/library/stats/man/splinefun.Rd  (revision 46635)
+++ src/library/stats/man/splinefun.Rd  (working copy)
@@ -131,7 +131,7 @@
 ## Manual spline evaluation --- demo the coefficients :
-.x <- get("ux", envir = environment(f))
+.x <- splinecoef$x
 u <- seq(3,6, by = 0.25)
 (ii <- findInterval(u, .x))
 dx <- u - .x[ii]

Re: [Rd] literate programming

2008-08-05 Thread Berwin A Turlach
G'day Terry,

On Tue, 5 Aug 2008 09:38:23 -0500 (CDT)
Terry Therneau [EMAIL PROTECTED] wrote:

 I'm working on the next iteration of coxme.  (Rather slowly during
 the summer). 
   This is the most subtle code I've done in S, both mathematically
 and technically, and seems a perfect vehicle for the literate
 programming paradigm of Knuth.  The Sweave project is pointed at S
 output however, not source code. I would appreciate any pointers to
 a noweb-type client that was R-aware. 

I would suggest you look at relax:



=== Full address =
Berwin A TurlachTel.: +65 6515 4416 (secr)
Dept of Statistics and Applied Probability+65 6515 6650 (self)
Faculty of Science  FAX : +65 6872 3919   
National University of Singapore
6 Science Drive 2, Blk S16, Level 7  e-mail: [EMAIL PROTECTED]
Singapore 117546


Re: [Rd] prod(0, 1:1000) ; 0 * Inf etc

2008-04-21 Thread Berwin A Turlach
G'day Martin,

On Mon, 21 Apr 2008 18:40:43 +0200
Martin Maechler [EMAIL PROTECTED] wrote:

 I think most of us would expect  prod(0:1000)  to return 0, and ...
 ... it does.
 However, many of us also expect  
   prod(x1, x2)to be equivalent to
 the same as we can expect that for min(), max(), sum() and such
 members of the Summary group.

Many may also expect that prod(x) and prod(rev(x)) are equivalent.
Unfortunately, this does not hold in finite precision arithmetic:

[1] 0
[1] NA
[1] NA

It might be better to educate useRs on finite precision arithmetic
than trying to catch such situations.  Note, I am saying better, not
easier. :-)  
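
A small illustration of order dependence, offered as a sketch rather than as the example from the original message.  It uses Reduce() with `*` so that every intermediate product is an ordinary double; prod() itself may accumulate in extended precision on some platforms and could then give a different answer.  The vector x is an arbitrary choice.

```r
x <- c(1e200, 1e200, 1e-200, 1e-200)

# left to right: 1e200 * 1e200 overflows to Inf and stays Inf
fwd <- Reduce(`*`, x)

# right to left: 1e-200 * 1e-200 underflows to 0 and stays 0
bwd <- Reduce(`*`, rev(x))

c(forward = fwd, backward = bwd)
```

Mathematically both products equal 1, yet neither ordering returns it.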




Re: [Rd] NA warnings for rdistr() {aka patch for random.c}

2008-03-11 Thread Berwin A Turlach
G'day Martin,

On Tue, 11 Mar 2008 18:07:35 +0100
Martin Maechler [EMAIL PROTECTED] wrote:

  BAT == Berwin A Turlach [EMAIL PROTECTED]
  on Tue, 11 Mar 2008 13:19:46 +0800 writes:

 BAT The first two lines give identical results, as one could
 BAT reasonably expect.  
 Yes, but I don't think a user should *rely* on the way R
 generates these random numbers; though I agree for the very
 specific case of rlnorm(..).
 BAT But the other two do not and I would argue that a user could
 BAT reasonably expect that the commands in these two lines
 BAT should lead to identical results.
 They now do.

Well, actually I forgot to mention one could also put another argument
forward.  As the log-mean parameter of the lognormal distribution
goes to -Inf, the distribution degenerates to something that has mean 0
and variance 0, i.e. could be taken as the constant zero and, hence,
one might expect that rlnorm(1, -Inf) returns 0.

But as the log-mean parameter goes to Inf, the distribution degenerates
to something with infinite mean and infinite variance.  Thus, perhaps
it is more sensible for rlnorm(1, Inf) to return NaN instead of Inf.
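
The limiting argument can be checked numerically from the standard formulas for the lognormal distribution, mean = exp(mu + sigma^2/2) and variance = (exp(sigma^2) - 1) * exp(2*mu + sigma^2).  The helper name lnorm_moments is hypothetical and only serves this illustration; it says nothing about what rlnorm() itself returns.

```r
# Hypothetical helper: mean and variance of a lognormal distribution
# with log-mean `meanlog` and log-standard-deviation `sdlog`
lnorm_moments <- function(meanlog, sdlog = 1) {
  c(mean = exp(meanlog + sdlog^2 / 2),
    var  = (exp(sdlog^2) - 1) * exp(2 * meanlog + sdlog^2))
}

lnorm_moments(-Inf)   # both moments collapse to 0
lnorm_moments(Inf)    # both moments are infinite
```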

 I don't think your change to  .../R/distn.R  was good,

I didn't like it either, but it was the simplest way I could think of
that would allow the C rexp() routine to realise that a scale parameter
of 0 actually came from a rate parameter of -Inf in the R code.  

 but the others I have more or less committed together with a few
 more similar ones.


 BAT BTW, I was surprised to realise that the *exp() functions in
 BAT the underlying C code use the opposite parameterisation from
 BAT the corresponding functions at R level.  Perhaps it would be
 BAT worthwhile to point this out in section 6.7.1 of the Writing
 BAT R extension manual?  In particular since the manual states:
 BAT Note that these argument sequences are (apart from the names
 BAT and that @code{rnorm} has no @var{n}) exactly the same as
 BAT the corresponding @R{} functions of the same name, so the
 BAT documentation of the @R{} functions can be used.
 BAT Well, as I noticed the hard way, for *exp() the
 BAT documentation of the corresponding R functions cannot be
 BAT used. ;-)
 We often also gratefully accept patches for the documentation 

I know, and I am always amazed that despite this policy (or perhaps
because of it?) the documentation of R is not patchy ;-)




Re: [Rd] patch for random.c

2008-03-05 Thread Berwin A Turlach
G'day Martin,

On Mon, 3 Mar 2008 10:16:45 +0100
Martin Maechler [EMAIL PROTECTED] wrote:

  BAT == Berwin A Turlach [EMAIL PROTECTED]
  on Fri, 29 Feb 2008 18:19:40 +0800 writes:

 BAT while looking for some inspiration of how to organise some
 BAT code, I studied the code of random.c and noticed that for
 BAT distributions with 2 or 3 parameters the user is not warned
 BAT if NAs are created while such a warning is issued for
 BAT distributions with 1 parameter. [...] The attached patch
 BAT rectifies this.  [...]

 I cannot imagine a design reason for that.  If there was, it
 should have been mentioned as a comment in the C code.
 I'll commit your patch (if it passes the checks).

Sorry, I was a bit in a hurry when writing the e-mail, so I forgot to
mention that the source code modified by this patch compiles and passes
make check FORCE=FORCE on my machine.

And in my hurry, I also posted from my NUS account, without realising
it, which forced you to intervene as moderator and to approve the
posting.  My apologies for the extra work.  But this gave me the idea
to also subscribe to r-devel with my NUS account and configure the
subscriptions so that I only receive e-mail at my UWA account.  Thus,
hopefully, you will not have to intervene again.  (Which this e-mail
should test.)
 BAT BTW, there are other places in the code were NAs can be
 BAT created but no warning is issued.  E.g:
  rexp(2, rate=numeric())
 BAT [1] NA NA
  rnorm(2, mean=numeric())
 BAT [1] NA NA
 BAT I wonder whether a warning should be issued in this case
 BAT too.  
 Yes, should in principle.
 If you feel like finding another elegant patch...

Well, elegance is in the eye of the beholder. :-)  I attach two
patches.  One that adds warning messages at the other places where NAs
can be generated.

The second one in addition rearranges the code a bit so that, in the
case when all the vectors that contain the parameter values of the
distribution from which one wants to simulate are of length one, some
unnecessary calculations are taken out of the for loop.  I am not sure
how much time is actually saved in this situation, but I belong to the
school that thinks this kind of optimisation should be done. :)  If you
think it bloats the code too much (or duplicates things too much
leading to hard to maintain code), then feel free to ignore this second

 Thank you Berwin, for your contribution!

My pleasure.


Index: src/main/random.c
--- src/main/random.c   (revision 44677)
+++ src/main/random.c   (working copy)
@@ -81,6 +81,7 @@
 if (na < 1) {
for (i = 0; i < n; i++)
REAL(x)[i] = NA_REAL;
+warning(_("NAs produced"));
 else {
PROTECT(a = coerceVector(CADR(args), REALSXP));
@@ -123,7 +124,7 @@
 #define RAND2(num,name) \
case num: \
-   random2(name, REAL(a), na, REAL(b), nb, REAL(x), n); \
+   naflag = random2(name, REAL(a), na, REAL(b), nb, REAL(x), n); \
 /* do_random2 - random sampling from 2 parameter families. */
@@ -155,6 +156,7 @@
 if (na < 1 || nb < 1) {
for (i = 0; i < n; i++)
REAL(x)[i] = NA_REAL;
+warning(_("NAs produced"));
 else {
PROTECT(a = coerceVector(CADR(args), REALSXP));
@@ -207,7 +209,7 @@
 #define RAND3(num,name) \
case num: \
-   random3(name, REAL(a), na, REAL(b), nb, REAL(c), nc, REAL(x), 
n); \
+   naflag = random3(name, REAL(a), na, REAL(b), nb, REAL(c), nc, 
REAL(x), n); \
@@ -244,6 +246,7 @@
 if (na < 1 || nb < 1 || nc < 1) {
for (i = 0; i < n; i++)
REAL(x)[i] = NA_REAL;
+warning(_("NAs produced"));
 else {
PROTECT(a = coerceVector(a, REALSXP));
Index: src/main/random.c
--- src/main/random.c   (revision 44677)
+++ src/main/random.c   (working copy)
@@ -41,10 +41,19 @@
 double ai;
 int i;
 errno = 0;
-for (i = 0; i < n; i++) {
+if( na == 1 ){
+  ai = a[0];
+  for (i = 0; i < n; i++) {
+   x[i] = f(ai);
+   if (!R_FINITE(x[i])) naflag = 1;
+  }
+  for (i = 0; i < n; i++) {
ai = a[i % na];
x[i] = f(ai);
if (!R_FINITE(x[i])) naflag = 1;
+  }
@@ -81,6 +90,7 @@
 if (na < 1) {
for (i = 0; i < n; i++)
REAL(x)[i] = NA_REAL;
+warning(_("NAs produced"));
 else {
PROTECT(a = coerceVector(CADR(args), REALSXP));
@@ -112,18 +122,28 @@
 double ai, bi; int i;
 Rboolean naflag = FALSE;
 errno = 0;
-for (i = 0; i < n; i++) {
+if( na == 1 && na == 1 ){
+  ai = a[0];
+  bi = b[0];
+  for (i = 0; i < n; i++) {
+   x[i] = f(ai, bi);
+   if (!R_FINITE(x[i

[Rd] A patch

2008-03-05 Thread Berwin A Turlach
Dear all,

for the last day or two, make dvi and make pdf have failed on my machine
when I try to install the latest version of R from scratch.  The attached
patch seems to solve this problem.



=== Full address =
Berwin A TurlachTel.: +65 6515 4416 (secr)
Dept of Statistics and Applied Probability+65 6515 6650 (self)
Faculty of Science  FAX : +65 6872 3919   
National University of Singapore 
6 Science Drive 2, Blk S16, Level 7  e-mail: [EMAIL PROTECTED]
Singapore 117546

Re: [Rd] A patch

2008-03-05 Thread Berwin A Turlach
On Wed, 5 Mar 2008 21:02:34 +0800
Berwin A Turlach [EMAIL PROTECTED] wrote:

 since a day or two make dvi and make pdf fails on my machine when
 I try to install the latest version of R from scratch.  The attached
 patch seems to solve this problem.

Sorry, forgot to change the attachment from application/octet-stream to
text/plain and it got scrubbed.  Patch is attached now.


Index: src/library/grDevices/man/unix/x11.Rd
--- src/library/grDevices/man/unix/x11.Rd   (revision 44677)
+++ src/library/grDevices/man/unix/x11.Rd   (working copy)
@@ -136,7 +136,7 @@
   \code{cairo} (on Mac OS X and perhaps elsewhere).  Both make use of
   \code{fontconfig} (\url{}) to select fonts
   and so the results depend on the fonts installed on the system running
-  \R -- setting the environmnent variable \env{FC_DEBUG} to 1 allows
+  \R -- setting the environmnent variable \env{FC\_DEBUG} to 1 allows
   some tracing of the selection process.
   This works best when high-quality scalable fonts are installed,

[Rd] patch for random.c

2008-02-29 Thread Berwin A Turlach
Dear all,

while looking for some inspiration of how to organise some code, I
studied the code of random.c and noticed that for distributions with
2 or 3 parameters the user is not warned if NAs are created while such
a warning is issued for distributions with 1 parameter.  E.g:

R version 2.7.0 Under development (unstable) (2008-02-29 r44639)


> rexp(2, rate=Inf)
[1] NaN NaN
Warning message:
In rexp(2, rate = Inf) : NAs produced
> rnorm(2, mean=Inf)
[1] NaN NaN

Surprisingly, the code for issuing warnings for distributions with 2 or
3 parameters is essentially there, but does not seem to be used.  The
attached patch rectifies this.  With the patch the above commands produce
the following output:

> rexp(2, rate=Inf)
[1] NaN NaN
Warning message:
In rexp(2, rate = Inf) : NAs produced
> rnorm(2, mean=Inf)
[1] NaN NaN
Warning message:
In rnorm(2, mean = Inf) : NAs produced

Please ignore the patch if the code that was designed to produce the
warning had been removed on purpose.  

BTW, there are other places in the code where NAs can be created but no
warning is issued.  E.g:

> rexp(2, rate=numeric())
[1] NA NA
> rnorm(2, mean=numeric())
[1] NA NA

I wonder whether a warning should be issued in this case too.  
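
The behaviour being proposed can be sketched at R level.  The function random2_sketch below is a hypothetical re-expression of the C-level logic for two-parameter families, not the code in random.c: it recycles the parameter vectors, flags non-finite results, and warns in both situations discussed above, including zero-length parameter vectors.

```r
# Hypothetical sketch; `f` stands in for a two-parameter generator.
random2_sketch <- function(f, a, b, n) {
  x <- rep(NA_real_, n)
  if (length(a) < 1 || length(b) < 1) {
    # zero-length parameter vector: every result is NA, so warn as well
    warning("NAs produced")
    return(x)
  }
  naflag <- FALSE
  for (i in seq_len(n)) {
    ai <- a[(i - 1) %% length(a) + 1]   # recycle, as a[i % na] does in C
    bi <- b[(i - 1) %% length(b) + 1]
    x[i] <- f(ai, bi)
    if (!is.finite(x[i])) naflag <- TRUE
  }
  if (naflag) warning("NAs produced")
  x
}

# deterministic stand-in for a generator, to show the recycling
random2_sketch(function(m, s) m + s, c(1, 2), 10, 4)
```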

Best wishes,


Index: src/main/random.c
--- src/main/random.c   (revision 44639)
+++ src/main/random.c   (working copy)
@@ -123,7 +123,7 @@
 #define RAND2(num,name) \
case num: \
-   random2(name, REAL(a), na, REAL(b), nb, REAL(x), n); \
+   naflag = random2(name, REAL(a), na, REAL(b), nb, REAL(x), n); \
 /* do_random2 - random sampling from 2 parameter families. */
@@ -207,7 +207,7 @@
 #define RAND3(num,name) \
case num: \
-   random3(name, REAL(a), na, REAL(b), nb, REAL(c), nc, REAL(x), 
n); \
+   naflag = random3(name, REAL(a), na, REAL(b), nb, REAL(c), nc, 
REAL(x), n); \

[Rd] On Section 2.5:Sub-architectures in the R Installation and Administration manual

2008-02-13 Thread Berwin A Turlach
Dear all,

there are a few issues regarding Section 2.5: Sub-architectures in the
R Installation and Administration manual (referring to Version 2.6.2)
that I would like to raise:

1.)  The manual states:

@code{R CMD INSTALL} will detect if more that one build is installed
and try to install packages with the appropriate library objects
for each. This will not be done if the package has an executable
@code{configure} script or a @file{src/Makefile} file.  In such
cases you can install for extra builds by

R [EMAIL PROTECTED] CMD INSTALL --libs-only @var{pkg}(s)
@end example

My experience with the --libs-only flag is that if the installation of
the package fails, then the package is removed.  Is this really desired
behaviour?  If yes, then the manual should perhaps warn that this might
happen.  However, in my opinion, this behaviour is undesirable.  As I
understand it, `--libs-only' would only be used if the package was
already successfully installed for (at least) one sub-architecture.
Why should the failure to install it for another sub-architecture
remove the complete package, including the successful build for another

2.) On my machines (running Kubuntu Linux (gutsy)), I usually install a
32-bit version, by specifying r_arch=32 on the configure line, and a 64-bit
version, by specifying r_arch=64 on the configure line. This seems to
be in line with the following recommendation in the manual:

   If you want to mix sub-architectures compiled on different
   platforms (for example @cputype{x86_64} Linux and @cputype{i686}
   Linux), it is wise to use explicit names for each, and you may
   also need to set @option{libdir} to ensure that they
   install into the same place.

However, after installing R 2.6.2 in such manner, trying to start R
resulted in the following error message:

Error: R executable not found

As far as I can tell, this is due to the code that was added at the
beginning of the bash script R (to support parallel 32/64-bit
installations using multilib on Linux?) which tries to locate the R
executable but does not search deep enough into the directory structure
if multiple architectures are installed using the r_arch option to the
configure line (for both/all architectures).

3) The manual also states in that section:

On Linux, there is an alternative mechanism for mixing 32-bit and
64-bit libraries known as @emph{multilib}. If a Linux distribution
supports multilib, then parallel builds of @R{} may be installed in
the sub-directories @file{lib} (32-bit) and @file{lib64} (64-bit).

As far as I can tell, Kubuntu Linux distributions support multilib.
However on these distributions /lib64 and /usr/lib64 are links to /lib
and /usr/lib, respectively, where the 64-bit libraries reside and the
32-bit libraries reside in /lib32 and /usr/lib32.  Thus, the above
paragraph is somewhat confusing to somebody who is running a Kubuntu
Linux distribution.  (Presumably the same holds for Debian and all
other distributions derived from Debian.)



=== Full address =
Berwin A TurlachTel.: +65 6515 4416 (secr)
Dept of Statistics and Applied Probability+65 6515 6650 (self)
Faculty of Science  FAX : +65 6872 3919   
National University of Singapore 
6 Science Drive 2, Blk S16, Level 7  e-mail: [EMAIL PROTECTED]
Singapore 117546


Re: [Rd] 0.450.45 = TRUE (PR#10744)

2008-02-12 Thread Berwin A Turlach
On Tue, 12 Feb 2008 15:47:56 +0100
Gabor Csardi [EMAIL PROTECTED] wrote:

 OMG, not again please!
 FAQ 7.31.

Yeah, there seems to be a cluster of that type of questions at the

Perhaps it is time to introduce a global option HaveReadFAQ7.31 whose
default is FALSE but can be changed via the normal mechanism to

Any comparison of numeric values/vectors should print a warning message
You are comparing numbers calculated using finite precision
arithmetic, have you read FAQ 7.31.  while the value of this option is

Perhaps that will help. :)
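
For the record, the kind of comparison that FAQ 7.31 is about can be reproduced in a few lines.  The numbers below are the usual textbook illustration, not the ones from the bug report.

```r
a <- 0.1 + 0.2
b <- 0.3

a == b                      # FALSE: the two doubles differ in the last bits
isTRUE(all.equal(a, b))     # TRUE: equal up to a small tolerance
sprintf("%.20g", c(a, b))   # prints enough digits to see the difference
```

Comparisons of computed numeric values are therefore usually better written with all.equal() or an explicit tolerance.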



=== Full address =
Berwin A TurlachTel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability+65 6516 6650 (self)
Faculty of Science  FAX : +65 6872 3919   
National University of Singapore 
6 Science Drive 2, Blk S16, Level 7  e-mail: [EMAIL PROTECTED]
Singapore 117546


Re: [Rd] Solve.QP

2007-12-06 Thread Berwin A Turlach
G'day Serge,

On Thu, 6 Dec 2007 13:34:58 +0100
de Gosson de Varennes Serge (4100)

 I have a major problem (major for me that is) with solve.QP and I'm
 new at this. You see, to solve my quadratic program I need to have
 the lagrange multipliers after each iteration. Solve.QP gives me the
 solution, the unconstrained solution as well as the optimal value.
 Does anybody have an idea for how I could extract the multipliers?

You could calculate them.  For the quadratic program 

min  1/2 x'Dx - d'x   such that A'x >= b

the KKT conditions are:

( D -A) (x)  = (d)
( A' 0) (l) >= (b)

plus the complementary conditions.  solve.QP tells you the solution x*
and which constraints are active.  If A* is the submatrix of A that
contains only those columns of A that correspond to active constraints
at the solution, then the first line in the above set of equations
imply that the corresponding Lagrange multiplier l* fulfill the

D x* - A* l* = d  --  l* = (A*' A*)^{-1} A*'(D x* - d)

all other Lagrange multipliers would be zero.  

Thus, using the example from the help page and expanding it, the
following code should calculate the Lagrange multipliers:

library(quadprog)
Dmat   <- matrix(0,3,3)
diag(Dmat) <- 1
dvec   <- c(0,5,0)
Amat   <- matrix(c(-4,-3,0,2,1,0,0,-2,1),3,3)
bvec   <- c(-8,2,0)
res <- solve.QP(Dmat,dvec,Amat,bvec=bvec)
xst <- res$solution
tmp <- Dmat %*% xst -dvec
## keep Ast a matrix even if only one constraint is active
Ast <- Amat[,res$iact, drop=FALSE]
ll <- solve(crossprod(Ast,Ast), crossprod(Ast, tmp))
## a small check
cbind(tmp, Ast %*% ll)
lagrange <- rep(0, ncol(Amat))
lagrange[res$iact] <- ll
lagrange
[1] 0.0000000 0.2380952 2.0952381
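
The full set of KKT conditions, including the complementary conditions, can be verified for this example without calling solve.QP at all.  In the sketch below the solution and multipliers are entered as fractions; these are my assumption for the exact values behind the decimals printed above (for instance 5/21 = 0.2380952 and 44/21 = 2.0952381), and the code checks that they do satisfy every condition.

```r
Dmat <- diag(3)
dvec <- c(0, 5, 0)
Amat <- matrix(c(-4, -3, 0, 2, 1, 0, 0, -2, 1), 3, 3)
bvec <- c(-8, 2, 0)

# assumed exact solution and multipliers (see the lead-in)
xst      <- c(10, 22, 44) / 21
lagrange <- c(0, 5, 44) / 21

# stationarity: D x* - d - A l* = 0
stationarity <- drop(Dmat %*% xst - dvec - Amat %*% lagrange)

# primal feasibility: A'x* - b >= 0
slack <- drop(crossprod(Amat, xst)) - bvec

# complementarity: each multiplier times its slack is zero
complementarity <- slack * lagrange

tol <- 1e-12
c(stationary    = max(abs(stationarity)) < tol,
  feasible      = all(slack >= -tol),
  nonnegative   = all(lagrange >= 0),
  complementary = max(abs(complementarity)) < tol)
```

Since Dmat is positive definite, a point passing all four checks is the unique minimiser.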

Alternatively, somewhere down in the FORTRAN code the Lagrange
multipliers are actually calculated.  About 4 years ago somebody asked
me about this and he could locate where they are calculated.  He
modified the FORTRAN code and the R code such that the Lagrange
multipliers would be returned too.  Curiously, he sent the modified
code to me and not to the package maintainer, but I had no time at that
moment to check his modification, so I never passed anything on to the
package maintainer either.  But if you are interested in modifying the
FORTRAN and R code and recompiling the package yourself, I can see if I
can find that code.

I still think that this package should be worked over by someone with a
better understanding of the kind of fudges that do not come back to
bite and of finite precision arithmetic than the original author's
appreciation of such issues when the code was written. ;-))  Given
Bill's recent comments on r-help, I wonder whether this package is one
of those on his list of downright dangerous packages.  LOL.

Hope this helps.



=== Full address =
Berwin A TurlachTel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability+65 6516 6650 (self)
Faculty of Science  FAX : +65 6872 3919   
National University of Singapore 
6 Science Drive 2, Blk S16, Level 7  e-mail: [EMAIL PROTECTED]
Singapore 117546


Re: [Rd] ?mean

2007-01-26 Thread Berwin A Turlach
G'day Gabor,

On Thu, 25 Jan 2007 09:53:49 -0500
Gabor Grothendieck [EMAIL PROTECTED] wrote:

 The help page for mean does not say what happens when one
 applies mean to a matrix.

Well, not directly.  :-)

But the help page of mean says that one of the arguments is:

   x: An R object.  Currently there are methods for numeric data
  frames, numeric vectors and dates.  A complex vector is
  allowed for 'trim = 0', only.

And the `Value' section states:
 For a data frame, a named vector with the appropriate method being
 applied column by column.

 If 'trim' is zero (the default), the arithmetic mean of the values
 in 'x' is computed, as a numeric or complex vector of length one. 
 If any argument is not logical (coerced to numeric), integer,
 numeric or complex, 'NA' is returned, with a warning.

Since a matrix is a vector with a dimension attribute, and not a data
frame, one can deduce that the second paragraph describes the return
value for `mean(x)' when x is a matrix.

As I always tell my students, reading R help pages is a bit of an
art. :)

 mean and sd work in an inconsistent way on a matrix so that should at
 least be documented. 

Agreed.  But it is documented in the help page of sd, which clearly

 [] If 'x' is a matrix or a data frame, a vector
 of the standard deviation of the columns is returned.

I guess you also want to have it documented in the mean help page?  

But then, should `var' also be mentioned in the mean help page?  This
command also works in a different and inconsistent manner to mean on

And, of course, there are other subtle inconsistencies in the language
used in these help pages.  Note that the mean help page talks about
numeric data frames while the help pages of `var' and `sd' talk about
data frames only, though all components of the data frame have to be
numeric, of course.

 Also there should be a See Also to colMeans since that provides the
 missing column-wise analog to sd.

That's probably a good idea.  What would you suggest should be
mentioned to provide the column-wise analog of `var'?
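
To make the different behaviours concrete, here is a small sketch with an arbitrary 3 x 2 matrix.  Using apply() for a column-wise variance is one possible answer to the question above, not something the help pages prescribe.

```r
m <- matrix(1:6, ncol = 2)   # columns (1, 2, 3) and (4, 5, 6)

mean(m)            # one number: the matrix is treated as a plain vector
colMeans(m)        # one mean per column
var(m)             # a 2 x 2 covariance matrix, not one value per column
apply(m, 2, var)   # column-wise variances
```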




Re: [Rd] stem does not give a correct answer (PR#9359)

2006-11-12 Thread Berwin A Turlach
G'day Myung,

 MGK == mgkim  [EMAIL PROTECTED] writes:

MGK For the data c1 of size 14, stem provides the following result.

MGK **
MGK [1] 14 39 70 11 38 20 37 15 41 74 74 34 48 51
ZZangi stem(c1)

MGK The decimal point is 1 digit(s) to the right of the |

MGK 0 | 145
MGK 2 | 04789
MGK 4 | 181
MGK 6 | 044

MGK **
MGK However, stem does not give a correct answer, for example 34, 74, etc.

No bug, it is just the scale to which stem decides to plot by
default.  The line 

4 | 181

should give a clue that this summarises 41, 48 and 51.  Since the data
is so sparse only even leading numbers are used.  Try:

> dat <- c(14, 39, 70, 11, 38, 20, 37, 15, 41, 74, 74, 34, 48, 51)
> stem(dat)

  The decimal point is 1 digit(s) to the right of the |

  0 | 145
  2 | 04789
  4 | 181
  6 | 044

> stem(dat, scale=2)

  The decimal point is 1 digit(s) to the right of the |

  1 | 145
  2 | 0
  3 | 4789
  4 | 18
  5 | 1
  6 | 
  7 | 044
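
The default display can also be rebuilt by hand, which may make the pairing of stems easier to see.  This is only a sketch: it assumes, as the output above suggests, that stem() merged the stems 0-1, 2-3, 4-5 and 6-7 for these data, and the variable names are made up for illustration.

```r
dat <- c(14, 39, 70, 11, 38, 20, 37, 15, 41, 74, 74, 34, 48, 51)

sorted <- sort(dat)
stems  <- 2 * (sorted %/% 20)   # even leading digit shared by two decades
leaves <- sorted %% 10          # last digit of each observation

# One string of leaves per merged stem, in increasing order of the stem
rows <- tapply(leaves, stems, paste, collapse = "")
print(rows)
```

The row for stem 4 then collects the leaves of 41, 48 and 51, which is why it reads 181.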



== Full address 
Berwin A Turlach  Tel.: +61 (8) 6488 3338 (secr)   
School of Mathematics and Statistics+61 (8) 6488 3383 (self)  
The University of Western Australia   FAX : +61 (8) 6488 1028
35 Stirling Highway   
commands and reflect on their output.

> sprintf('%1.20g\n', 3.75)
[1] "3.75\n"
> sprintf('%1.20g\n', 3.15)
[1] "3.1499999999999999112\n"
> sprintf('%1.20g\n', 3.7500)
[1] "3.75\n"
> sprintf('%1.20g\n', 3.1500)
[1] "3.1499999999999999112\n"
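
What these commands illustrate can be condensed into a few lines.  The sketch assumes IEEE double precision arithmetic: 3.75 is a finite sum of powers of two and is stored exactly, while 3.15 is not and is stored as a slightly smaller number.

```r
# 3.75 = 2 + 1 + 1/2 + 1/4 has an exact binary representation
exact <- sprintf("%.20g", 3.75)

# 3.15 has no finite binary expansion, so the nearest double is printed
inexact <- sprintf("%.20g", 3.15)

print(c(exact, inexact))
```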

I know, I should probably do the same thing as I did yesterday and hit
the delete button and go home instead of hitting the send button.  But
this time I won't.  If anybody feels offended, my apologies.  



__ mailing list

[Rd] Patch proposal for

2006-05-27 Thread Berwin A Turlach
G'day all,

some time ago I sent an email regarding the following behaviour of

> x <- matrix(rnorm(200), ncol=2)
> var <- "fred"
> apply(x, 2, var)
Error in get(x, envir, mode, inherits) : variable "fred" of mode "function" was 
not found

and asked whether it would be desirable to change this behaviour such
that the function var would be found and no error would be produced.
I also asked whether there are arguments against such a change.

As I did not receive any arguments against such a change, I looked
into changing such that the example above would work.  The
result is the patch attached below.  On my machine, r-devel passes
make check FORCE=FORCE with this patch applied.  Thus, hopefully
this change to does not break anything.

I realise that there is now a lot of code replication in the function,
but was not sure whether it would be worthwhile to write a small helper
function to reduce this replication.  (In particular, I was not sure
whether that would then involve using substitute thrice.  Since I
rarely use this command I am not so sure about its proper use.)

Also, I presume it is debatable whether a warning should be issued if
the lookup using get() fails if the argument was a character string of
length one.  Personally, I like it because the user still gets some
> x <- matrix(rnorm(200), ncol=2)
> var <- "foo"
> apply(x, 2, var)
   [1] 1.055595 1.098397
   Warning message:
   Error in get(x, envir, mode, inherits) : variable "foo" of mode "function" 
was not found

and the combination of warning + error should make it really easy to
track down situations like this:

> foo <- "bar"
> apply(x, 2, foo)
   Error in get(x, envir, mode, inherits) : variable "foo" of mode "function" 
was not found
   In addition: Warning message:
   Error in get(x, envir, mode, inherits) : variable "bar" of mode "function" 
was not found

Of course, other people might have other tastes. :)

Please consider applying this patch (or a variation) to r-devel.




Index: src/library/base/R/
--- src/library/base/R/  (revision 38204)
+++ src/library/base/R/  (working copy)
@@ -5,7 +5,7 @@
 if ( is.function(FUN) )
+if (!( (CL1 <- is.character(FUN) && length(FUN) == 1) || is.symbol(FUN))) {
 ## Substitute in parent
 FUN <- eval.parent(substitute(substitute(FUN)))
 if (!is.symbol(FUN))
@@ -13,12 +13,43 @@
   deparse(FUN)), domain = NA)
 envir <- parent.frame(2)
-if( descend )
+if( CL1 ) {
+  if( descend ){
+TMPFUN <- try(get(FUN, mode = "function", env=envir), silent=TRUE)
+if( inherits(TMPFUN, "try-error") ){
+  warning(TMPFUN)
+  FUN <- eval.parent(substitute(substitute(FUN)))
+  if (!is.symbol(FUN)){
+stop(gettextf("'%s' is not a function, character or symbol",
+  deparse(FUN)), domain = NA)
+  }
+  TMPFUN <- get(as.character(FUN), mode = "function", env=envir)
+  }
+  else {
+TMPFUN <- try(get(FUN, mode = "any", env=envir), silent=TRUE)
+if( inherits(TMPFUN, "try-error") ){
+  warning(TMPFUN)
I was recently contacted by a user about an alleged problem/bug in
the latest version of lasso2.  After some investigation, we found out
that it was a user error which boils down to the following:

> x <- matrix(rnorm(200), ncol=2)
> var <- "fred"
> apply(x, 2, var)
Error in get(x, envir, mode, inherits) : variable "fred" of mode "function" was 
not found

only that the offending apply() command happened inside the gl1ce()
function of lasso2.

I was under the impression that R can now distinguish between
variables and functions with the same name and, indeed, the following

> var <- 2
> apply(x, 2, var)
[1] 1.053002 1.250875
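
The two cases can be put side by side in a small sketch.  It assumes a current R session in which apply() still resolves its FUN argument as described in this thread, and it uses a fixed toy matrix rather than random data so that the result is predictable.

```r
x <- matrix(c(1, 2, 3, 4, 6, 8), ncol = 2)

# Case 1: 'var' holds a number.  The lookup is then done by the name 'var'
# and only accepts functions, so stats::var is found despite the variable.
var <- 2
ok <- apply(x, 2, var)

# Case 2: 'var' holds a character string.  The string itself is taken as the
# name of the function to use, and no function called "fred" exists.
var <- "fred"
err <- tryCatch(apply(x, 2, var), error = function(e) e)

print(ok)
print(conditionMessage(err))
```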

Poking a bit around, I guess that the ability to distinguish between
variables and functions with the same name comes from the introduction
of the function and, after reading its help page, the
reason why an error is triggered the first time but not the second
time is perfectly clear to me.

I wonder whether it would make sense to change such that
the first case does not result in an error?  I was thinking along the
line that if the argument to is a variable that contains a
character vector of length one then, using get(), attempts
to find a function with that name.  If the get() command does not
succeed, then a second try is made using the name of the variable
passed by the caller to

So before trying to modify and submitting a patch, I
wanted to ask whether such a change would be accepted?  Is there an
argument that I don't see that the first case should always result in
an error and not be silently resolved?
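
The two-stage lookup described above can be sketched as a stand-alone function.  Note the assumptions: the name of the function under discussion is missing from the text above, and the sketch assumes it is match.fun(); `match_fun_fallback` is a hypothetical helper written for illustration and is not the patch itself.

```r
# Hypothetical helper illustrating the proposed two-stage lookup.
match_fun_fallback <- function(FUN, envir = parent.frame()) {
  force(envir)                      # fix the caller's environment early
  arg_expr <- substitute(FUN)       # what the caller typed, e.g. `var`
  if (is.function(FUN)) return(FUN)
  if (is.character(FUN) && length(FUN) == 1L) {
    # First try: treat the string as the name of a function.
    found <- tryCatch(get(FUN, mode = "function", envir = envir),
                      error = function(e) e)
    if (!inherits(found, "error")) return(found)
    warning(conditionMessage(found), call. = FALSE)
  }
  # Second try: use the name of the variable that the caller passed.
  if (!is.symbol(arg_expr))
    stop("argument is not a function, character string or symbol")
  get(as.character(arg_expr), mode = "function", envir = envir)
}

var <- "fred"
# The first lookup fails with a warning (suppressed here), the second
# lookup by the name 'var' finds the function stats::var.
f <- suppressWarnings(match_fun_fallback(var))
print(identical(f, stats::var))
```

Whether the first failure should produce a warning or be resolved silently is exactly the design question raised in this message; the sketch warns.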



__ mailing list

Re: [Rd] bounding box in PostScript

2006-04-17 Thread Berwin A Turlach
G'day David,

 DA == David Allen [EMAIL PROTECTED] writes:

DA When a graph is saved as PostScript, the bounding box is often
DA too big.  A consequence is that when the graph is included in
DA a LaTeX document, the spacing does not look good.  Is this a
DA recognized problem? Is someone working on it? Could I help?
How exactly are you saving the graph as PostScript?

I guess the problem is due to the fact that you save the PostScript
picture on a paper according to your default paper size (which is a4
for me).  To create PostScript pictures with reasonably tight bounding
boxes that are suitable to be included in a LaTeX document you should
specify `paper="special"' and then define your plotting area via the
'height' and 'width' arguments.
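
A minimal sketch of such a call follows.  The file name, the dimensions (in inches) and the plot are arbitrary choices for illustration; `horizontal = FALSE` and `onefile = FALSE` are additions that are commonly used when producing a single figure for inclusion in another document.

```r
eps_file <- tempfile(fileext = ".eps")

# paper = "special" makes the page size equal to width x height
postscript(eps_file, paper = "special", width = 5, height = 4,
           horizontal = FALSE, onefile = FALSE)
plot(1:10, main = "tight bounding box")
invisible(dev.off())

# Show the bounding box that was written to the file header
bbox <- grep("^%%BoundingBox", readLines(eps_file), value = TRUE)
print(bbox)
```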

Hope this helps.



== Full address 
Berwin A Turlach  Tel.: +61 (8) 6488 3338 (secr)   
School of Mathematics and Statistics+61 (8) 6488 3383 (self)  
The University of Western Australia   FAX : +61 (8) 6488 1028
is not documented to work, but then the documentation just says:

formula: a formula with no response variable.

Thus, to avoid a lot of typing, it would be nice if one could use '.'
and '-' in the formula, e.g.

> res <- prcomp(~ . - case - site - Pop - sex, possum)
Error in prcomp.formula(~. - case - site - Pop - sex, possum) : 
PCA applies only to numerical variables
> res <- princomp(~ . - case - site - Pop - sex, possum)
Error in princomp.formula(~. - case - site - Pop - sex, possum) : 
PCA applies only to numerical variables

Unfortunately, as the examples above show, this is currently not
possible, since both functions test whether any term mentioned in the
formula is non numeric or a factor, instead of just testing those that
enter the analysis.

The attached patch should allow the use of '.' and '-', while still
producing an error when a factor or a non-numeric variable is
specified to enter the analysis:

> res <- prcomp(~ . - case - site - Pop - sex, possum)
> res <- princomp(~ . - case - site - Pop - sex, possum)
> res <- prcomp(~ . - case - site - Pop, possum)
Error in prcomp.formula(~. - case - site - Pop, possum) : 
PCA applies only to numerical variables
> res <- princomp(~ . - case - site - Pop, possum)
Error in princomp.formula(~. - case - site - Pop, possum) : 
PCA applies only to numerical variables
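
The idea of testing only the variables that remain in the formula can be sketched without touching prcomp() itself.  The data frame below is made up (the possum data are not assumed to be available), and the check is a simplified stand-in for the test in the patch, not a copy of it.

```r
toy <- data.frame(
  case    = factor(1:6),
  sex     = factor(c("m", "f", "m", "f", "m", "f")),
  hdlngth = c(94.1, 92.5, 94.0, 93.2, 91.5, 93.1),
  skullw  = c(60.4, 57.6, 60.0, 57.1, 56.3, 54.8)
)

# Expand '.' against the data and drop the excluded terms
mt <- terms(~ . - case - sex, data = toy)
entering <- attr(mt, "term.labels")

# Only the variables that enter the analysis have to be numeric
numeric_only <- all(vapply(toy[entering], is.numeric, logical(1)))
if (numeric_only) pc <- prcomp(toy[entering])

print(entering)
```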

On my machine, `make check FORCE=FORCE' succeeds with this patch and,
as far as I can tell, no modification of the help pages would be



Index: src/library/stats/R/princomp.R
--- src/library/stats/R/princomp.R  (revision 37571)
+++ src/library/stats/R/princomp.R  (working copy)
@@ -10,13 +10,14 @@
 mf$... <- NULL
 mf[[1]] <-
 mf <- eval.parent(mf)
+na.act <- attr(mf, "na.action")
+mt <- attr(mf, "terms") # allow model.frame to update it
+attr(mt, "intercept") <- 0
+mterms <- attr(mt, "term.labels")
 ## this is not a `standard' model-fitting function,
 ## so no need to consider contrasts or levels
-if(any(sapply(mf, function(x) is.factor(x) || !is.numeric(x
+if(any(sapply(mterms, function(x) is.factor(mf[,x]) || 
 stop("PCA applies only to numerical variables")
-na.act <- attr(mf, "na.action")
-mt <- attr(mf, "terms") # allow model.frame to update it
-attr(mt, "intercept") <- 0
 x <- model.matrix(mt, mf)
 res <- princomp.default(x, ...)
 ## fix up call to refer to the generic, but leave arg name as `formula'
Index: src/library/stats/R/prcomp.R
--- src/library/stats/R/prcomp.R(revision 37571)
+++ src/library/stats/R/prcomp.R(working copy)
@@ -37,13 +37,14 @@
 mf$... <- NULL
 mf[[1]] <-
 mf <- eval.parent(mf)
+na.act <- attr(mf, "na.action")
+mt <- attr(mf, "terms")
+attr(mt, "intercept") <- 0
+mterms <- attr(mt, "term.labels")
 ## this is not a `standard' model-fitting function,
 x <- model.matrix(mt, mf)
 res <- prcomp.default(x, ...)
 ## fix up call to refer to the generic, but leave arg name as `formula'
__ mailing list

Re: [Rd] Internal codes of the factor

2006-03-14 Thread Berwin A Turlach
G'day Gregor,

 GG == Gregor Gorjanc [EMAIL PROTECTED] writes:

GG I am writing some functions and I repeatedly access internal
GG factor codes. I figured out that internal codes are 1:n where
GG 1 represents 1st level, 2 2nd level etc. This is not
GG documented [...]
The help page for factor states in the 'Details' section:

 The encoding of the vector happens as follows. First all the
 values in 'exclude' are removed from 'levels'. If 'x[i]' equals
 'levels[j]', then the 'i'-th element of the result is 'j'.  If no
 match is found for 'x[i]' in 'levels', then the 'i'-th element of
 the result is set to 'NA'.

Note in particular the part on "then the 'i'-th element of the result
is 'j'".  This pretty much documents that the internal codes are 1:n
and 'NA', as documented in the following sentence.
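
A short example of the encoding described in the quoted paragraph; the data and level names are invented for illustration.

```r
x <- c("low", "high", "mid", "high", "unknown")
f <- factor(x, levels = c("low", "mid", "high"))

# x[i] equal to levels[j] gives code j; values without a match give NA
codes <- as.integer(f)
print(codes)
```

Here "unknown" does not appear in the levels, so its code is NA, while the other codes are the positions 1, 2 or 3 in the levels vector.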



== Full address 
Berwin A Turlach  Tel.: +61 (8) 6488 3338 (secr)   
School of Mathematics and Statistics+61 (8) 6488 3383 (self)  
The University of Western Australia   FAX : +61 (8) 6488 1028
35 Stirling Highway   
Crawley WA 6009e-mail: [EMAIL PROTECTED]

__ mailing list

Re: [Rd] Links to non-vignette documentation

2006-02-24 Thread Berwin A Turlach
DM That's a nice hack.  You probably want the fitpaper option on the 
DM \includepdf command, so that you don't get an extra border around the 
DM page.  For example, this file test.Rnw [...]

DM \includepdf[fitpaper=true]{response.pdf}
Additionally, if response.pdf has several pages and you want to
include them all, you should also include a pages option, such as:


More details can be found in the pdfpages documentation, but by
default only the first page is included.

DM produces an output that looks pretty much exactly like the 
DM response.pdf file I used as test input in a viewer.
Perhaps interface96.pdf was created too long ago (it says PDF 1.2 at
the top of that file), but the result looks strange in xpdf (the
included pages are quite small and in the upper left corner, selecting
fit to page creates an acceptable viewing result); no problem with
acroread.  This is on a linux box.

DM The only disadvantages I see are that both the test.pdf and
DM test.pdf shows up in the index),
That is a potential disadvantage as it duplicates material.  But I
guess .Rbuildignore in the main directory of the package can help in
this case.  I have put the line inst/doc/interface96.pdf into the
.Rbuildignore file of that package.

DM and that test.pdf is a lot larger than response.pdf.  (This
DM may be because response.pdf was small; I haven't checked if
DM the increase is additive or multiplicative).
I didn't check this either, but here are some results on including a 6
page pdf file (extracts from looking at the .tar.gz file produced by
the build process).  First, the old solution with a separate PDF file
and a dummy vignette:

 drwxr-xr-x  berwin/berwin       0 clps/inst/
 drwxr-xr-x  berwin/berwin       0 clps/inst/doc/
 -rw-r--r--  berwin/berwin     649 clps/inst/doc/clps.bib
 -rw-r--r--  berwin/berwin     670 clps/inst/doc/interface96-vignette.Rnw
 -rw-r--r--  berwin/berwin  105035 clps/inst/doc/interface96.pdf
 -rw-r--r--  berwin/berwin   49242 clps/inst/doc/interface96-vignette.pdf
Second, with \includepdf and .Rbuildignore:

 drwxr-xr-x  berwin/berwin       0 clps/inst/
 drwxr-xr-x  berwin/berwin       0 clps/inst/doc/
 -rw-r--r--  berwin/berwin     649 clps/inst/doc/clps.bib
 -rw-r--r--  berwin/berwin     440 clps/inst/doc/interface96-vignette.Rnw
 -rw-r--r--  berwin/berwin  191589 clps/inst/doc/interface96-vignette.pdf

Looks like an increase of about 40 kB to me which I would find acceptable.
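
For the record, the arithmetic behind that estimate, using the PDF sizes (in bytes) from the two listings; the variable names are made up.

```r
old_pdfs <- c(interface96 = 105035, vignette = 49242)  # separate PDF + dummy vignette
new_pdf  <- 191589                                     # single vignette with \includepdf

increase <- new_pdf - sum(old_pdfs)
print(increase)           # difference in bytes
print(increase / 1024)    # difference in kB
```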

DM A change to the R package build process would be to add support for a 
DM command like

DM %\VignetteExists

DM to the test.Rnw file, telling R not to bother trying to build the pdf, 
DM because it had already been built by other means.  Then I'd just have 
DM test.Rnw containing
Searching the Writing R Extensions manual for vignette, I noticed
the following:

  Unless @kbd{R CMD build} is invoked with the
  @option{--no-vignettes} option, it will attempt to rebuild the
  vignettes (@pxref{Writing package vignettes}) in the package.
  To do so it installs the current package/bundle into a temporary
  library tree, but any dependent packages need to be installed in
  an available library tree (see the Note: below).

Thus there is already a mechanism to avoid (automatic) rebuilding of
vignettes.  But it seems to be an all-or-nothing solution and I could
imagine that some packages might have real vignettes that the
maintainer would like to have rebuilt automatically and dummy
vignettes that should not be rebuilt.  So a fine-grained control,
along the lines that you suggest, would be nice.

 HT == Hin-Tak Leung [EMAIL PROTECTED] writes:

HT I like pdfpages and do use it from time to time [...]  so such
HT constructions would break on sites which hasn't upgraded their
HT LaTeX installation in the last 3 years.
The Writing R Extensions manual states on page 15:

  @code{R CMD build} will automatically create PDF versions of the
  vignettes for distribution with the package sources.  By
  including the PDF version in the package sources it is not
  necessary that the vignettes can be compiled at install time,
 BK == Bernd Kriegstein [EMAIL PROTECTED] writes:

BK void pico ( double *y, int n, int m )
Everything is passed from R to C as a pointer, so these should be

Hope this helps.



__ mailing list

Re: [Rd] R CMD check, NAMESPACE, import: bad error?

2006-01-19 Thread Berwin A Turlach
G'day Seth,

 SF == Seth Falcon [EMAIL PROTECTED] writes:

SF I'm seeing errors with R CMD check that I don't understand
SF when checking a package that uses a NAMESPACE file with an
SF import directive.
Some time ago I came across a similar problem and it took me some time
to figure it out.  In my case it was that a .Fortran() call didn't
have a PACKAGE= argument.  My advice would be to check all .C() and
.Fortran() calls in your package and add the PACKAGE= argument if it
is missing.

I also guess that if you temporarily remove the NAMESPACE file, the
following step in the checking process:

  * checking foreign function calls ... WARNING
  Error: package/namespace load failed for 'DNAhelperseth'
  Call sequence:
  2: stop(gettextf(package/namespace load failed for '%s', 
libraryPkgName(package)),call. = FALSE, domain = NA)
  1: library(package, lib.loc = lib.loc, character.only = TRUE, verbose = 
  Execution halted
  See section 'System and foreign language interfaces' of the 'Writing R
  Extensions' manual.

will tell you which call the culprit is.  



__ mailing list

[Rd] Enlightenment sought and a possible buglet in vector.Rd

2005-12-02 Thread Berwin A Turlach
Dear all,

First, I recently had reasons to read the help page of as.vector() and
noticed in the example section the following example:

 x <- c(a = 1, b = 2)
 all.equal(x, as.vector(x)) ## FALSE

However, in all versions of R in which I executed this example, the
all.equal command returned TRUE, which suggests that either the comment
in the help file is wrong or the all.equal/as.vector combination does
below a patch which would fix vector.Rd.
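
The underlying difference between the two objects is easy to inspect directly.  In the sketch below only the parts that do not depend on the R version are relied upon; whether all.equal() reports the missing names has varied between versions, so its result is printed rather than assumed.

```r
x <- c(a = 1, b = 2)
v <- as.vector(x)

print(names(v))          # as.vector() drops the names of an atomic vector
print(identical(x, v))   # so the two objects are not identical

# Version dependent: TRUE or a description of the difference in names
cmp <- all.equal(x, v)
print(cmp)
```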

Secondly, I stumbled across two behaviours of R that I cannot explain
but would like to know why R behaves as it does.  But since I expect
the explanations to be quite technical, I thought that r-devel is the
more appropriate list to ask on than r-help.

The first example is the following:

   par.def <- par(no.readonly=TRUE)
   tt <- sys.on.exit()
language par(par.def)
   par.def <- par(no.readonly=TRUE)
   print(tt <- sys.on.exit())

I found in the R language definition manual the passage that
discourages users from assigning objects within function calls since it
is not guaranteed that the assignment is ever made because of R's lazy
evaluation model.  But this does not seem to explain the above
behaviour since the argument to print is evaluated.  If I replace
sys.on.exit() with, say, ls() in both functions, then they produce the
same output (and the output that I expect).  Why does f2() not work
with sys.on.exit()?

The second behaviour that I cannot explain was produced by code
written by somebody else, namely: 

> foo <- function(x){
+   z <- x/4
+   while( abs(z*z*z-x) > 1e-10 ){
+     z <- (2*z+x/z^2)/3
+   }
+ }

The documentation of function() says that if the end of a function is
reached without calling 'return', the value of the last evaluated
expression is returned.  And this seems to happen in this case:

> z <- foo(3)
> z
  [1] 1.442250

However, my understanding was always that the return value of a
function issued on the command line will be printed; except, of
course, if invisible() is used to return the value.  This is not the
case for the above function:

> foo(3)

produces no output.  And this had us stunned for some time.  On the
other hand:

> ( foo(3) )
  [1] 1.442250

So my question is why does R, when foo(3) is issued on the command
line, not print the value returned by the function?
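
One way to look into this is withVisible(), which reports whether a value would be auto-printed.  The sketch below assumes a current R version, where a loop returns its value invisibly (and, unlike in the session shown above, that value is NULL); `foo2` is a hypothetical variant that names the result explicitly.

```r
foo <- function(x) {
  z <- x / 4
  while (abs(z * z * z - x) > 1e-10) {
    z <- (2 * z + x / z^2) / 3
  }
}                                   # last evaluated expression is the loop

foo2 <- function(x) {
  z <- x / 4
  while (abs(z * z * z - x) > 1e-10) {
    z <- (2 * z + x / z^2) / 3
  }
  z                                 # last evaluated expression is a plain value
}

vis_foo  <- withVisible(foo(3))$visible
vis_foo2 <- withVisible(foo2(3))$visible
print(c(foo = vis_foo, foo2 = vis_foo2))
```

Wrapping the call in parentheses, as in ( foo(3) ), makes the result visible again, which matches the behaviour reported above.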

Any enlightening comments are highly appreciated.



 giving the parameters.



__ mailing list

[Rd] R-exts.texi in SVN version 36380

2005-11-17 Thread Berwin A Turlach
G'day all,

after issuing `svn up' on my machine this morning, I noticed that
`make info' choked on R-exts.texi.  Below is a patch that seems to
solve the problem.  BTW, while `make info' runs now, I still get the
following warning:

/usr/bin/makeinfo --enable-encoding -D UseExternalXrefs 
-I/opt/src/R-devel-src/doc/manual /opt/src/R-devel-src/doc/manual/R-exts.texi
/opt/src/R-devel-src/doc/manual/R-exts.texi:1219: warning: @strong{Note...} 
produces a spurious cross-reference in Info; reword to avoid that.

No idea how to fix that, my texinfo knowledge is not good enough. :)

Actually, I am not clear on the following two questions:
1) Should such patches be sent to r-devel, r-bugs or both?
2) Should such patches be sent at all, or should users just wait till
   R-core fixes it itself?



[Rd] Suggested changes to R-lang.texi and R-exts.texi

2005-11-12 Thread Berwin A Turlach
Dear all,

I would like to suggest the following changes to the R documentation:

1) R-exts.texi:
   Having had my first experience with uploading a package to, I think it would be nice if the
   documentation pointed out that one should use ftp and not sftp (at
   least on my machine sftp failed to make a connection) and that one
   should log in as user 'anonymous' and not 'guest'.  As it is, I had
   to figure this out by trial and error.  It would also be nice, if
   in the phrase "sent a message to [EMAIL PROTECTED] about it" the
   e-mail address would be a mailto: URL.

   The patch file attached below would modify R-exts.texi to
   incorporate all these changes.

2) R-lang.texi:
   There was recently a short discussion on one of the R mailing lists
   by someone who got bitten by partial matching.  Looking at
   R-lang.texi and the section that explains how function arguments
   are matched, I notice that the second step is explained thus:
   "Each named supplied argument is compared to the remaining formal
   arguments using partial matching."
   It might be just me, but when reading a sentence like this I start
   to wonder why the qualifier "remaining" is used for formal
   arguments but not for named supplied arguments and I am left
   momentarily confused.  I would like to suggest that the start of
   the sentence is changed to Each remaining named supplied

   The patch file attached below would modify R-exts.texi to
   incorporate all these changes.
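
A small sketch of why the word "remaining" matters on both sides of the matching step.  The functions `h` and `g` are invented for illustration: exact matching is done first, and partial matching is then applied only to the supplied and formal arguments that are still unmatched.

```r
h <- function(fun, f) list(fun = fun, f = f)

# 'f' matches the formal 'f' exactly; 'fu' is then partially matched
# against the remaining formal 'fun'
res <- h(f = 1, fu = 2)
str(res)

g <- function(fun, fan) c(fun, fan)

# Here 'f' matches no formal exactly and partially matches both of them,
# which is an error
amb <- tryCatch(g(f = 1, 2), error = function(e) conditionMessage(e))
print(amb)
```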

The patch file attached below was produced by running svn diff on my
machine in the directory that contains the trunk of the R-devel
version of R.  So the patch file also includes the patch corresponding
to my bugreport #8218



2005-08-29 Thread Berwin A Turlach
G'day Brian,

 BDR == Prof Brian Ripley [EMAIL PROTECTED] writes:

BDR We've never encountered this lying mirror problem.
Indeed, that mirror is a worry, I guess that is the reason why it is
not on the official mirror list.

We had the problem with install.packages/update.packages under linux
too: source packages that supposedly should have been on the mirror
were not and the commands just terminated with an error message instead
of installing all those packages that could be found.  

Unfortunately, the way we are charged for internet traffic, it is much
cheaper for us to use than any other mirror.  I
remember that last year around this time (give or take some months)
when a new R version was released (2.0.0?) it took actually quite some
time before the sources appeared on the mirror, the directory
structure was mirrored, but not the files.  That was when I decided to
(temporarily) change mirrors.

BDR Perhaps you (or another user of the unreliable mirror) could
BDR contribute suitable fixes.
I will look into this when I find some time.



__ mailing list

Re: [Rd] robustness of install.packages/update.packages (was Re: bug in L-BFGS-B? (PR#8099))

2005-08-29 Thread Berwin A Turlach
G'day Brian,

 BDR == Prof Brian Ripley [EMAIL PROTECTED] writes:

 However, update.packages() wanted to update quite a few
 packages besides MASS (the other packages in the VR bundle,
 nlme, lattice c).  Once it failed on MASS, it terminated with
 an error and did not update any of the other packages.  Would
 it be possible to robustify update.packages behaviour such
 that it would continue in such situations with updating the
 remaining packages?
  Not a good idea. Better to follow the FAQ.  At that point the
 dependencies have been worked out and will not be re-computed
 if a package installation fails.
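
For illustration, the kind of robustness asked for can be sketched outside of update.packages().  Everything here is hypothetical: `install_each` and `fake_installer` are made-up names, the fake installer only simulates a package that is missing from the mirror, and the sketch ignores the dependency ordering that the reply above warns about.

```r
# Try each package separately and record which ones succeeded.
install_each <- function(pkgs, installer = utils::install.packages, ...) {
  vapply(pkgs, function(p) {
    tryCatch({
      installer(p, ...)
      TRUE
    }, error = function(e) {
      message("skipping ", p, ": ", conditionMessage(e))
      FALSE
    })
  }, logical(1))
}

# Stand-in for a real installer so that the sketch needs no network access
fake_installer <- function(pkg, ...) {
  if (pkg == "MASS") stop("file not found on mirror")
  invisible(NULL)
}

status <- install_each(c("MASS", "nlme", "lattice"), installer = fake_installer)
print(status)
```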

BDR I checked, and I am unable to reproduce this.  I get
O.k., I tried to reproduce the behaviour.  So I installed the binary
distribution of R 2.1.1 in another directory on my laptop once more
(and then deleted the copy that I actually wanted to keep; shouldn't
do such things at this time of the day).

So below I attach what is, I believe a faithful reproduction of what I
tried to do yesterday.  Except that I did not run the file that
installs all the contributed packages that I like to have installed.

But it seems that now has vanished from and you can see how on the first occasion the
download stops with an error.  I believe that in this case it could
[Rd] Coding standards (was Re: bug in L-BFGS-B? (PR#8099))

2005-08-28 Thread Berwin A Turlach
G'day Brian,

 BDR == Prof Brian Ripley [EMAIL PROTECTED] writes:

BDR As for the problem, yes it probably is a bug in L-BFGS-B.
BDR Fancy debugging the code?
I was afraid that somebody would ask this. ;-)

I looked a bit at the code and it seems to be non-trivial.  Moreover,
it seems to be code translated from FORTRAN to C via f2c and I am not
a fan of this kind of code.  I know that "Writing R Extensions" lists
in Appendix B (R coding standards) f2c as a tool that `can safely
be assumed for R extensions'.  However, f2c, despite its `-A' option,
does not produce ANSI C compliant code but rather C code that provokes
undefined behaviour.

The problem is, that the code produced by f2c is decrementing pointers
to simulate 1-based vectors and the C FAQ is pretty unambiguous about
the fact that this provokes undefined behaviour, see

As far as I understand, this translated code mostly stems from the
time when some platforms did not have ready access to a fortran
compiler and, hence, f2c was used (extensively?).  But now, with g77
this does not seem to be an issue anymore.  So I wonder whether there
are any plans of returning to the original fortran code?  Or are there
plans to clean up these f2c'd code snippet to make them ANSI C

I noticed such f2c'd code already in the splines.c file when I studied
how splinefun was implemented (which lead to bug report #8030).  In
that case, I am fairly familiar with the algorithms used in splines.c
since I programmed them myself on other occasions and I probably
could rewrite the algorithms in proper ANSI C (it would still take
some time).  But it would be nice to know what the official stance of
the R Core Team is.

When I spoke with one member of the R Core Team about this at a
conference in 2004, the answer was, IIRC, yes, we know that this code
invokes undefined behaviour, but there are bigger problems to fix
first and this pointer manipulation seems to work on all platforms
these days.  Another member of the R Core Team whom I recently asked:

  I guess all platforms on which R is running at the moment do not
  have a problem with this trick, but are there any plans to
  change such kind of code to valid C?  Would patches that do
  that be accepted?


  Hmm, I think we'd tend to disagree here. But in any case that
  would be a wide issue.  Can you address this question to R-core,
  please?  (or I forward?)



__ mailing list

[Rd] Writing R-extensions

2005-08-27 Thread Berwin A Turlach
G'day all,

After reading through Writing R Extensions, Version 2.1.1
(2005-06-20), I thought that the following points might need
clarifications or corrections.  (I checked that these comments also
hold for Writing R Extensions, Version 2.2.0.)

1) When I ran package.skeleton recently, I noticed that the
   DESCRIPTION file had an entry `Type'.  This surprised me a bit
   since I did not remember reading about it in the description of the
   DESCRIPTION file.  I realised that package types are described in
   the last section of Chapter 1 of Writing R Extensions, but would
   it be possible to add at the end of Section 1.1.1 (The
   `DESCRIPTION' file) something like:

The optional @samp{Type} field specifies the type of the package:
@pxref{Package types}.

2) The description of the `inst' subdirectory states:

Subdirectories of @file{inst} should not interfere with those
used by R (currently, @file{R}, @file{data}, @file{demo},
@file{exec}, @file{libs}, @file{man}, @file{help},
@file{html}, @file{latex}, @file{R-ex}, and @file{Meta}).

   And I wonder whether this list is incomplete.  Should not, with the
   introduction of localisation, at least @file{po} be listed too?

3) The final sentence in the section on `Registering S3 methods' is:

Any methods for a generic defined in a package that does not
use a name space should be exported, and the package defining
and exporting the method should be attached to the search path
if the methods are to be found.

   I wonder whether this should actually be:

Any methods for a generic defined in a package that does not
use a name space should be exported, and the package defining
and exporting the generic should be attached to the search path
if the methods are to be found.

   Or is the implication of that sentence that if I have a package
   with a name space which defines a method for a generic defined in
   another package that does not use a name space, then this method
   is only found if my package is attached to the search path and
   mere loading of the namespace is not sufficient?

4) This could be nit-picking (or just the fact that English is not my
   native language), but the section on `Load hooks' states
  Packages with name spaces do not use the @code{.First.lib}
  function.  Since loading and attaching are distinct operations
  when a name space is used, separate hooks are provided for each.
  These hook functions are called @code{.onLoad} and
   I interpreted this as o.k., loading and attaching are distinct
   operations, if I load a package .onLoad (and only .onLoad) is run,
   if I attach a package then .onAttach (and only .onAttach) is run.
   But the manual continues a bit further down with

  Packages are not likely to need @code{.onAttach} (except perhaps
  for a start-up banner); code to set options and load shared
  objects should be placed in a @code{.onLoad} function, or use
  made of the @code{useDynLib} directive described next.

   This seems to imply to me that the .onLoad is executed also if I
   attach a package.  So should my understanding rather be "attaching
   a package implies loading it and, hence, if I attach a package,
   then .onLoad and .onAttach are both run (with .onLoad presumably
   run first?)"?
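
The distinction between the two operations can at least be observed from the outside.  The sketch below uses the splines package only because it ships with R; it shows that a namespace can be loaded without being attached, and that attaching leaves the namespace loaded.  As far as I understand, library() loads the namespace first (which is when .onLoad runs) and then attaches it (which is when .onAttach runs), but the sketch itself does not demonstrate the hooks.

```r
# Load the namespace only; nothing is added to the search path by this step
invisible(loadNamespace("splines"))
loaded <- isNamespaceLoaded("splines")
on_search_path <- "package:splines" %in% search()
print(c(loaded = loaded, on_search_path = on_search_path))

# Attaching puts the package on the search path; the namespace stays loaded
library(splines)
attached     <- "package:splines" %in% search()
still_loaded <- isNamespaceLoaded("splines")
print(c(attached = attached, still_loaded = still_loaded))
```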



__ mailing list