Re: [Rd] Is it possible to increase MAX_NUM_DLLS in future R releases?

2016-05-10 Thread Henrik Bengtsson
Isn't the problem in Qin's example that unloadNamespace("scde") only
unloads 'scde' but none of its package dependencies that were loaded
when 'scde' was loaded?  For example:

$ R --vanilla
> ns0 <- loadedNamespaces()
> dlls0 <- getLoadedDLLs()

> packageDescription("scde")[c("Depends", "Imports")]
$Depends
[1] "R (>= 3.0.0), flexmix"

$Imports
[1] "Rcpp (>= 0.10.4), RcppArmadillo (>= 0.5.400.2.0), mgcv, Rook, rjson, MASS,
Cairo, RColorBrewer, edgeR, quantreg, methods, nnet, RMTstat, extRemes,
pcaMethods, BiocParallel, parallel"

> loadNamespace("scde")
> ns1 <- loadedNamespaces()
> dlls1 <- getLoadedDLLs()

> nsAdded <- setdiff(ns1, ns0)
> nsAdded
 [1] "flexmix"       "Rcpp"          "edgeR"         "splines"
 [5] "BiocGenerics"  "MASS"          "BiocParallel"  "scde"
 [9] "lattice"       "rjson"         "brew"          "RcppArmadillo"
[13] "minqa"         "distillery"    "car"           "tools"
[17] "Rook"          "Lmoments"      "nnet"          "parallel"
[21] "pbkrtest"      "RMTstat"       "grid"          "Biobase"
[25] "nlme"          "mgcv"          "quantreg"      "modeltools"
[29] "MatrixModels"  "lme4"          "Matrix"        "nloptr"
[33] "RColorBrewer"  "extRemes"      "limma"         "pcaMethods"
[37] "stats4"        "SparseM"       "Cairo"

> dllsAdded <- setdiff(names(dlls1), names(dlls0))
> dllsAdded
 [1] "Cairo"         "parallel"      "limma"         "edgeR"
 [5] "MASS"          "rjson"         "Rcpp"          "grid"
 [9] "lattice"       "Matrix"        "SparseM"       "quantreg"
[13] "nnet"          "nlme"          "mgcv"          "Biobase"
[17] "pcaMethods"    "splines"       "minqa"         "nloptr"
[21] "lme4"          "extRemes"      "RcppArmadillo" "tools"
[25] "Rook"          "scde"


If you unload these namespaces, I think the DLLs will also be
detached; or at least they should if packages implement an .onUnload()
with a dyn.unload().  More on this below.
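
For reference, a minimal sketch of such an .onUnload() hook, along the
lines of what "Writing R Extensions" describes.  The package name "mypkg"
is a placeholder (an assumption, not one of the packages above), and the
code would live in a file in the package's R/ directory:

```r
## R/zzz.R of a hypothetical package "mypkg" that ships a compiled DLL
## loaded via useDynLib(mypkg) in its NAMESPACE file.
.onUnload <- function(libpath) {
  ## Unload the package's DLL when its namespace is unloaded
  library.dynam.unload("mypkg", libpath)
}
```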


To unload these added namespaces (with DLLs), they have to be
unloaded in an order that does not break the dependency graph of the
currently loaded packages, because otherwise you'll get errors such
as:

> unloadNamespace("quantreg")
Error in unloadNamespace("quantreg") :
  namespace 'quantreg' is imported by 'car', 'scde' so cannot be unloaded

I don't know if there exists a function that unloads the namespaces in
the proper order, but here is a brute-force version:

unloadNamespaces <- function(ns, ...) {
  while (length(ns) > 0) {
    ns0 <- loadedNamespaces()
    for (name in ns) {
      try(unloadNamespace(name), silent=TRUE)
    }
    ns1 <- loadedNamespaces()
    ## No namespace was unloaded?
    if (identical(ns1, ns0)) break
    ns <- intersect(ns, ns1)
  }
  if (length(ns) > 0) stop("Failed to unload namespace: ",
                           paste(sQuote(ns), collapse=", "))
} # unloadNamespaces()


When I run the above on R 3.3.0 patched on Windows, I get:

> unloadNamespaces(nsAdded)
now dyn.unload("C:/Users/hb/R/win-library/3.3/scde/libs/x64/scde.dll") ...
> ns2 <- loadedNamespaces()
> dlls2 <- getLoadedDLLs()
> ns2
[1] "grDevices" "utils" "stats" "datasets"  "base"  "graphics"
[7] "methods"
> identical(sort(ns2), sort(ns0))
[1] TRUE


However, there are some namespaces for which the DLLs are still loaded:

> sort(setdiff(names(dlls2), names(dlls0)))
 [1] "Cairo"         "edgeR"         "extRemes"      "minqa"
 [5] "nloptr"        "pcaMethods"    "quantreg"      "Rcpp"
 [9] "RcppArmadillo" "rjson"         "Rook"          "SparseM"


If we look for .onUnload() in packages that load DLLs, we find that
the following do not have an .onUnload() and therefore probably do
not call dyn.unload() either when the package is unloaded:

> sort(dllsAdded[!sapply(dllsAdded, FUN=function(pkg) {
+   ns <- getNamespace(pkg)
+   exists(".onUnload", envir=ns, inherits=FALSE)
+ })])
 [1] "Cairo"         "edgeR"         "extRemes"      "minqa"
 [5] "nloptr"        "pcaMethods"    "quantreg"      "Rcpp"
 [9] "RcppArmadillo" "rjson"         "Rook"          "SparseM"


That doesn't look like a coincidence to me.  Maybe `R CMD check` should
in addition to checking that the namespace of a package can be
unloaded also assert that it unloads whatever DLL a package loads.
Something like:

* checking whether the namespace can be unloaded cleanly ... WARNING
  Unloading the namespace does not unload DLL

At least I don't think this is tested for, e.g.
https://cran.r-project.org/web/checks/check_results_Cairo.html and
https://cran.r-project.org/web/checks/check_results_Rcpp.html.
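
As a rough illustration of what such a check could look like from the R
side, here is a small sketch.  The helper name leavesDLLBehind() is made
up for this example (it is not part of R or of `R CMD check`), and it
assumes that the DLL has the same name as the package and that it is run
in a fresh session where neither the namespace nor its DLL is loaded yet:

```r
## Hypothetical helper, for illustration only.
## Assumes the DLL is named after the package and that the namespace
## (and its DLL) are not already loaded in the current session.
leavesDLLBehind <- function(pkg) {
  if (pkg %in% loadedNamespaces())
    stop("namespace ", sQuote(pkg), " is already loaded")
  before <- names(getLoadedDLLs())
  loadNamespace(pkg)
  unloadNamespace(pkg)
  after <- names(getLoadedDLLs())
  ## TRUE if the package's own DLL is still registered after unloading
  pkg %in% setdiff(after, before)
}
```

Calling, e.g., leavesDLLBehind("Cairo") in a fresh `R --vanilla` session
would then report whether the DLL survives the unload.  Note that
dependencies pulled in by loadNamespace() stay loaded; the helper only
looks at the package's own DLL.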

/Henrik


On Mon, May 9, 2016 at 11:57 PM, Martin Maechler
 wrote:
>> Qin Zhu 
>> on Fri, 6 May 2016 11:33:37 -0400 writes:
>
> > Thanks for all your great answers.
> > The app I’m working on is indeed an exploratory data analysis tool for
> > gene expression, which requires a bunch of bioconductor packages.
>
> > I guess for now, my best solution is to divide my app into modules and
> > load/unload packages as the user switch from one module to another.
>
>

Re: [Rd] Storage of byte code-compiled functions in sysdata.rda

2016-05-10 Thread Peter Ruckdeschel
Dear Luke,

thanks for taking a look at our problem and for checking it out, in particular
for sending us the tool function getbc.

This really sounds like we have somehow messed up our chain with different
versions of the byte-code compiler. We will try and fix this and will let you
know if we succeed.

Thanks again, Peter


On 05.05.2016 at 22:07, luke-tier...@uiowa.edu wrote:
> I can't reproduce the more complex version. But the package on CRAN
> fails in the same way on 3.2.3 and 3.3.0.
>
> The problem is that your sysdata.rda includes a function that is
> generating this error. If you do
>
> f <- getFromNamespace(".RMXE", ns ="RobAStRDA")[["GEVFamily"]][["fun.N"]][[1]]
> g <- get("fct", environment(f))
>
> and look at the byte code for g with compiler::disassemble or the
> utility I'll paste in below you get
>
>> getbc(g)
> list(8L, BCMISMATCH.OP)
>
> The only way you can get a file like this is to byte compile and save
> in a version of R with a newer byte code version (the 8L) and then
> load and resave in an older version of R. If you load and run this
> code in an older (or newer) version of R it will revert to the
> standard interpreter using eval but will issue a warning once per
> session. If you try to run it in an R with byte code version 8L you
> get this error.
>
> I can make a small change so that this becomes a once-per-session
> warning, but even then you won't actually be running compiled code.
>
> So I think your task is to figure out how you ended up with a
> sysdata.rda file created in this incompatible way. Something to look
> for might be whether a call from within your R-devel somehow manages
> to run an R process with an older R version.
>
> Let me know what you find out.
>
> luke
>
> Here is the little utility, adapted from compiler::disassemble:
>
> getbc <- function (code)
> {
>     .CodeSym <- as.name(".Code")
>     disasm.const <- function(x) if (typeof(x) == "list" && length(x) >
>         0 && identical(x[[1]], .CodeSym))
>         disasm(x)
>     else x
>     disasm <- function(code) {
>         code[[2]] <- compiler:::bcDecode(code[[2]])
>         code[[3]] <- lapply(code[[3]], disasm.const)
>         code
>     }
>     if (typeof(code) == "closure") {
>         code <- .Internal(bodyCode(code))
>         if (typeof(code) != "bytecode")
>             stop("function is not compiled")
>     }
>     invisible(dput(disasm(.Internal(disassemble(code)))[[2]]))
> }
>
> On Sun, 1 May 2016, Peter Ruckdeschel wrote:
>
>> Thanks, Luke, for having a look at it.
>>
>> Sure, I can give you some reproducible example -- even in two degrees of
>> completeness ;-): see below.
>>
>> Thanks again, Peter
>>
>> %---
>> (I) first example
>> %---
>> Just to reproduce the error, on r-devel, try:
>>
>> install.packages("RobAStRDA")
>> require(RobAStRDA)
>> getFromNamespace(".RMXE", ns = "RobAStRDA")[["GEVFamily"]][["fun.N"]][[1]](1.3)
>>
>> %---
>> (II) an example also giving the context
>> %---
>> For the "complete" story, not only the R-code needs to be given, but also the
>> preparation steps to produce the packages on the right R version;
>>
>> so please follow steps (1)--(6) below; I am not 100% sure whether this
>> already gives you all information needed for this, but if not so please
>> let me know.
>>
>> (1) create a minimal R-package "InterpolTry"
>>  with byte-compilation on in the DESCRIPTION file
>>  and with stats::approxfun imported in the NAMESPACE file
>>
>> (2) in an R session on R-devel do
>>
>> require(InterpolTry)
>> x <- 1:100
>> y <- 1:100
>> fun <- approxfun(x,y)
>> ## revise the next line accordingly to your local settings
>> SrcRPathInterpolTry <- 
>> RdaFile <- file.path(SrcRPathInterpolTry, "sysdata.rda")
>> save(fun, file = RdaFile)
>> tools::resaveRdaFiles(RdaFile)
>>
>> (3) re-build package InterpolTry and re-install it
>>
>> (4) create a minimal R package "UseInterpolTry", again
>>  with byte-compilation on in the DESCRIPTION file
>>  and with stats::approxfun and package "InterpolTry"
>>  imported in the NAMESPACE file
>>
>> (5) in the R folder of R package "UseInterpolTry" write a function
>>  useInterpolFct()  which goes like this
>>
>>  useInterpolFct <- function(x){
>>   fun <- getFromNamespace("fun", ns = "InterpolTry")
>>   fun(x)
>>  }
>>
>> export this function in the namespace and create an .Rd file to it
>>
>> (6) (re-)build package "UseInterpolTry" and (re-)install it and try
>>
>> require(UseInterpolTry)
>> useInterpolFct(5)
>>
>> Steps (1)--(6) work with R-3.1.3, but no longer with R-devel.
>>
>>
>>
>> On 01.05.2016 at 14:12, Tierney, Luke wrote:
>>> Can you provide a complete reproducible example?
>>>
>>> Sent from my iPhone
>>>
 On May 1, 2016, at 6:51 AM, Peter Ruckdeschel  
 wrote:

 Hi r-devels,

 we are seei

Re: [Rd] unloadNamespace problem in 3.3

2016-05-10 Thread Martin Maechler
> Jeroen Ooms 
> on Tue, 10 May 2016 16:39:08 +0200 writes:

> The following used to work in R 3.2.5 but not in 3.3.0:
> library(MASS)
> ns <-.getNamespace("MASS")
> unloadNamespace(ns)

or simply
 
ns <- getNamespace("MASS") ; unloadNamespace(ns)

yes, indeed.   That's a bug.

> Calling unloadNamespace("MASS") directly still works.

which seems to have been the main use, and the only one  used
in any R package on CRAN or Bioconductor, or else we would have
heard about it, long before release, .., right?

This is from the change,

  r70262 | maechler | 2016-03-02 12:36:19
  --
  unloadNamespace() no longer loads & unloads an unloaded namespace - fixing
  PR#16731

and so I will fix it  ASAP.

Thank you, Jeroen, for the report!
Martin

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] unloadNamespace problem in 3.3

2016-05-10 Thread Jeroen Ooms
The following used to work in R 3.2.5 but not in 3.3.0:

  library(MASS)
  ns <- .getNamespace("MASS")
  unloadNamespace(ns)

Calling unloadNamespace("MASS") directly still works.
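
Until this is fixed, one possible workaround (only a sketch, not something
proposed in this thread) is to pass the namespace's name instead of the
namespace environment.  getNamespaceName() is in base R; 'splines' is used
here merely because it ships with every R installation:

```r
loadNamespace("splines")
ns <- getNamespace("splines")
## Pass the name (a character string) rather than the namespace environment
unloadNamespace(getNamespaceName(ns))
"splines" %in% loadedNamespaces()
```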



[Rd] complex NA's match(), etc: not back-compatible change proposal

2016-05-10 Thread Martin Maechler
This is an RFC / announcement related to the 2nd part of PR#16885
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16885
about  complex NA's.

The (somewhat rare) incompatibility in R's 3.3.0 match() behavior for the
case of complex numbers with NA & NaN's {which has been fixed for R 3.3.0
patched in the meantime} triggered some more comprehensive "research".

I found that we have had a long-standing inconsistency at least between the
documented and the real behavior.  I am claiming that the documented
behavior is desirable and hence R's current "real" behavior is buggy, and
I am proposing to change it, in R-devel (to be 3.4.0) for now.

In help(match) we have been saying

 |  Exactly what matches what is to some extent a matter of definition.
 |  For all types, \code{NA} matches \code{NA} and no other value.
 |  For real and complex values, \code{NaN} values are regarded
 |  as matching any other \code{NaN} value, but not matching \code{NA}.

for at least 10 years.  But we don't do that at all in the
complex case (and AFAIK never got a bug report about it).
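
To make the documented rule concrete, here is how it plays out for real
(double) vectors, where the actual behavior does agree with help(match);
the complex case is the one under discussion.  This is only a small
illustration, not part of the attached script:

```r
x <- c(NA, NaN, NA, NaN)
match(NaN, x)   # NaN matches the first NaN, not the NA
match(NA, x)    # NA matches the first NA, not a NaN
unique(x)       # one NA and one NaN remain
```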

Also, e.g., print(.) or format(.) do simply use  "NA" for all
the different complex NA-containing numbers, where OTOH,
non-NA NaN's { <=>  !is.nan(z) & is.na(z) }
in format() or print() do show the NaN in real and/or imaginary
parts; for an example, look at the "format" column of the matrix
below, after 'print(cbind' ...

The current match()---and duplicated(), unique() which are based on the same
C code---*do* distinguish almost all complex NA / NaN's which is
NOT according to documentation. I have found that this is just because
our hashing function for the complex case, chash() in R/src/main/unique.c,
is bogus in the sense that it is not compatible with the above documentation
and also not with the cequal() function (in the same file unique.c) for checking
equality of complex numbers.

As I have found, a *simplified* version of the chash() function
to make it compatible with cequal() does solve all the problems I've
indicated,  and the current plan is to commit that change --- after some
discussion time, here on R-devel ---  to the code base.

My change passes  'make check-all' fine, but I'm 100% sure that there will
be effects in package-space. ... one reason for this posting.

As mentioned above, note that the chash() function has been in
use for all three functions
 match()
 duplicated()
 unique()
and the change will affect all three --- but just for the case of complex
vectors with NA or NaN's.

To show more, a small R session -- using my version of R-devel
== the proposition: 
The R script ('complex-NA-short.R') for (a bit more than) the
session is attached {{you can attach  text/plain easily}}:

> x0 <- c(0,1, NA, NaN); z <- outer(x0,x0, complex, length.out=1); rm(x0)
> ##   --- = NA_real_  but that does not exist e.g., in R 2.3.1
> ##   similarly,  '1L', '2L', .. do not exist e.g., in R 2.3.1
> (z <- z[is.na(z)])
 [1]       NA NaN+  0i       NA NaN+  1i       NA       NA       NA       NA
 [9]   0+NaNi   1+NaNi       NA NaN+NaNi
> outerID <- function(x,y, ...) { ## ugly; can we get outer() to work ?
+     r <- matrix( , length(x), length(y))
+     for(i in seq(along=x))
+         for(j in seq(along=y))
+             r[i,j] <- identical(z[i], z[j], ...)
+     r
+ }
> ## Very strictly - in the sense of identical() -- these 12 complex numbers all differ:
> ## a version that works in older versions of R, where identical() had fewer arguments!
> outerID.picky <- function(x,y) {
+     nF <- length(formals(identical)) - 2
+     do.call("outerID", c(list(x, y), as.list(rep(FALSE, nF))))
+ }
> oldR <- !exists("getRversion") || getRversion() < "3.0.0" ## << FIXME: 3.0.0 is  a wild guess
> symnum(id.z <- outerID.picky(z,z)) ## == Diagonal matrix [newer versions of R]
 
 [1,] | . . . . . . . . . . .
 [2,] . | . . . . . . . . . .
 [3,] . . | . . . . . . . . .
 [4,] . . . | . . . . . . . .
 [5,] . . . . | . . . . . . .
 [6,] . . . . . | . . . . . .
 [7,] . . . . . . | . . . . .
 [8,] . . . . . . . | . . . .
 [9,] . . . . . . . . | . . .
[10,] . . . . . . . . . | . .
[11,] . . . . . . . . . . | .
[12,] . . . . . . . . . . . |
> try(# for older R versions
+ stopifnot(identical(id.z, outerID(z,z)), oldR || identical(id.z, diag(12) == 1))
+ )
> (mz <- match(z, z)) # currently different {NA,NaN} patterns differ - not in print()/format() _FIXME_
 [1] 1 2 1 2 1 1 1 1 2 2 1 2
> zRI <- rbind(Re=Re(z), Im=Im(z)) # and see the pattern :
> print(cbind(format = format(z), t(zRI), mz), quote=FALSE)
      format   Re   Im   mz
 [1,]       NA <NA> 0    1 
 [2,] NaN+  0i NaN  0    2 
 [3,]       NA <NA> 1    1 
 [4,] NaN+  1i NaN  1    2 
 [5,]       NA 0    <NA> 1 
 [6,]       NA 1    <NA> 1 
 [7,]       NA <NA> <NA> 1 
 [8,]       NA NaN  <NA> 1 
 [9,]   0+NaNi 0    NaN  2 
[10,]   1+NaNi 1    NaN  2 
[11,]       NA <NA> NaN  1 
[12,] NaN+NaNi NaN  NaN  2 
>
---
Note that 'mz <- ma

[Rd] recursion problem using do.call(rbind, list(..,,..))

2016-05-10 Thread Martin Maechler
This was originally a bug report about Matrix,
  
https://r-forge.r-project.org/tracker/?func=detail&atid=294&aid=6325&group_id=61
but the bug is rather a "design" bug in R, or a limitation.

This e-mail is a report of the status quo as I see it, and
call for comments, suggestions, help/hints for workarounds,
or even a suggestion for a programming task helping R core to
amend the status {re-writing the methods:::cbind and ...:rbind functions}:


If you read  ?rbind  carefully, you may have learned that rbind() and
cbind() are able to deal with S4 "matrix-like" objects, via the
hidden methods:::rbind /  methods:::cbind functions
where these recursively build on appropriate S4 methods for
rbind2() / cbind2().

That is how cbind() and rbind() work nowadays for Matrix-package
matrices.

However, there is a problem lurking in the above paragraph, and
for experienced programmers / computer scientists that may even
be obvious: recursion.

A simple MRE (minimal reproducible example) for the problem seen with Matrix:

  S <- sparseMatrix(i=1:4, j=9:6, x=pi*1:4)
  n <- 410 # it succeeds for 407 -- sometimes even with 408
  X <- replicate(n, S, simplify = FALSE)
  Xb <- do.call("rbind", X)
  ## -> Error in as(x, "CsparseMatrix") : node stack overflow

But as I said, that's not really the Matrix package. A small
reproducible example, not involving the Matrix package at all:

  MM <- setClass("mmatrix", contains="matrix")
  setMethod("rbind2", signature(x="mmatrix", y="mmatrix"),
            function(x,y) as(base::rbind(unclass(x), unclass(y)), "mmatrix"))

  (m5 <- MM(diag(5)))
  m45 <- MM(matrix(1:20, 4,5))
  rbind(m5, m45) # fine

  ## fine with 400 :
  mL.4c <- replicate(400, m45, simplify=FALSE)
  mmm4 <- do.call(rbind, mL.4c)
  stopifnot(is(mmm4, "mmatrix"), dim(mmm4) == c(1600L, 5L))
  ## not ok with 410 :
  mL.410 <- replicate(410, m45, simplify=FALSE)
  mmm4 <- do.call(rbind, mL.410)
  ## Error in getExportedValue(pkg, name) (from #2) : node stack overflow

and the "node stack overflow"  is not too helpful.
Unfortunately, this is more than one problem, the first one being
that recursive function calls nowadays often end with this
"node stack overflow" error, rather than the somewhat more understandable
error message, namely

  Error: evaluation nested too deeply: infinite recursion / options(expressions=)?

And even worse, that nowadays increasing the option to a higher number N,
options(expressions = N)

does not help at all in this situation, once the code is byte
compiled which of course is true for everything in base R.

But that is basically only the reason for the not-so-helpful
error message (and that raising 'expressions' does not help!),
but the underlying problem is somewhat harder, namely the full
setup the S4-serving methods:::rbind() {and ...cbind()} using
recursion (in the way it does) at all.

There is a simple, in my eyes ugly workaround which also will not scale well,
but works fine and fast enough for the above example:

## Simple ugly workaround .. but well fast enough :
> system.time({
+ r <- mL.410[[1]]
+ for(i in seq_along(mL.410)[-1])
+ r <- rbind2(r, mL.410[[i]])
+ })
   user  system elapsed 
  0.083   0.000   0.083 
> dim(r)
[1] 1640    5
> 

This should help Ben (the OP of the Matrix bug), and maybe
something like that should also guide how to re-write
the  methods:::rbind()  / methods:::cbind()  in a non-recursive
fashion?
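
As one possible shape for such a non-recursive formulation (only a sketch,
not a proposal for the actual methods:::rbind() code), the explicit loop
can also be written with Reduce(), which folds over the list iteratively
so that the call depth does not grow with the number of arguments.  The
sketch re-creates the "mmatrix" toy class so that it is self-contained;
the method is given an explicit '...' formal to match the rbind2() generic.
Each step still copies the accumulated result, so this will not scale well
either:

```r
library(methods)

## Toy S4 class and rbind2() method, as in the small example
MM <- setClass("mmatrix", contains = "matrix")
setMethod("rbind2", signature(x = "mmatrix", y = "mmatrix"),
          function(x, y, ...)
              as(base::rbind(unclass(x), unclass(y)), "mmatrix"))

m45 <- MM(matrix(1:20, 4, 5))
mL.410 <- replicate(410, m45, simplify = FALSE)

## Fold the list pairwise with rbind2(); no recursion over the arguments
r <- Reduce(rbind2, mL.410)
dim(r)
```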

Thank you for your thoughts.

Martin Maechler
ETH Zurich
