At the moment, all of this is probably not that big of a deal yet, but
my suggestion has more of a mid-term/long-term character.
Below you will find a little illustration. I'm probably asking too much, but
it'd be great if we could get a little discussion going on how to
improve the way of loading packages!
Best regards and thanks for R and all its packages!
Janko
################################################################################
# PROOF OF CONCEPT
################################################################################
# 1) PROBLEM
# IMHO, with the number of packages submitted to CRAN constantly increasing,
# we are likely to see more and more problems with name clashes over time.
# The main reasons I see for this are the following:
# a) package developers picking identical names for their exported functions
# b) package developers overwriting base functions in order to add
#    functionality to existing functions
# c) ...
#
# This can create scenarios in which users might not know that they are
# using a 'modified' version of a specific function. What is more, users
# need to carefully read the description of each new package they plan
# to use in order to find out which functions are exported and which existing
# functions might be overwritten. This in turn might imply that the user's
# existing code needs to be refactored (i.e. instead of using 'fun()' it
# might now be necessary to type 'namespace::fun()' to be sure that the
# desired function is called).
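# A minimal sketch of how such a clash can be detected in a running session,
# using base R's find() and conflicts(). The masking 'parse' defined here is
# purely hypothetical (it stands in for a package that masks base::parse) and
# is removed again at the end.

```r
# Hypothetical stand-in for a package that masks base::parse:
# put a function of the same name into the global environment.
assign("parse", function(...) stop("masked version called"),
       envir = globalenv())

# find() lists every search-path entry that defines the name,
# in the order in which R would look it up.
hits <- find("parse")
print(hits)

# conflicts() returns all names defined more than once on the search path.
is.masked <- "parse" %in% conflicts()
print(is.masked)

# Clean up so that later code sees base::parse again.
rm(list = "parse", envir = globalenv())
```

# Running conflicts(detail=TRUE) after attaching a new package gives the same
# information grouped by search-path entry.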
# 2) SUGGESTED SOLUTION
# That being said, why don't we switch to a 'preemptive' coding paradigm
# where the default way of calling a function includes the specification of
# its namespace? In principle, the functionality offered by 'namespace::fun()'
# gets the job done.
# BUT:
# a) it is slower than calling a function directly
#    (see illustration below).
# b) this option is not available throughout the development process of a
#    package, as there is no namespace yet and there's no way to emulate one.
#    This in turn means that even if a package developer bought into strictly
#    using 'mypkg::fun()' throughout the package code, this could only be
#    done at the very final stage of the process, RIGHT before turning the
#    code into a working package (when it is absolutely certain that
#    everything is working as planned). For debugging, the developer would
#    need to go back to using 'fun()'. Pretty cumbersome.
# So how about simply automatically prepending a given function's name with
# the package's name for each package that is built (e.g. 'pkg.fun()' or
# 'pkg_fun()')? In the end, this would just be a small change for new packages
# without a significant decrease in performance, and it could also be realized
# at early stages of the development process (see illustration below).
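# To make the naming idea concrete for code that is not a package yet, here is
# a small sketch. The helper 'prefix.functions', the package name 'mypkg' and
# the environment 'dev' are all hypothetical names chosen for illustration;
# nothing like this exists in base R.

```r
# Hypothetical helper: copy every function found in 'src' into 'dest'
# under the name '<pkg><delim><name>'. Non-function objects are skipped.
prefix.functions <- function(src, pkg, delim = "_", dest = new.env()) {
    for (nm in ls(src)) {
        obj <- get(nm, envir = src, inherits = FALSE)
        if (is.function(obj)) {
            assign(paste(pkg, nm, sep = delim), obj, envir = dest)
        }
    }
    dest
}

# Development code that has not been turned into a package yet
dev <- new.env()
dev$fun <- function(x) x + 1
dev$some.constant <- 42

aliases <- prefix.functions(dev, pkg = "mypkg")
print(ls(aliases))         # only the function gets a prefixed alias
print(aliases$mypkg_fun(1))
```

# The prefixed alias is an ordinary binding, so calling it involves no extra
# lookup step compared to calling 'fun()' directly.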
# 3) ILLUSTRATION
# Example case where base function 'parse.default' is overwritten:
parse(text="a<- 5") # Works
require(R.utils)
require(roxygen)
parse(text="a<- 5") # Does not work anymore
################# START A NEW R SESSION BEFORE YOU CONTINUE ####################
# Inefficiency of 'namespace::fun()':
require(microbenchmark)
res.a<- microbenchmark(eval(parse(text="a<- 5")))
res.b<- microbenchmark(eval(base::parse(text="a<- 5")))
median(res.a$time)/median(res.b$time)
# The overhead can be avoided by explicit assignment:
foo<- base::parse
res.a<- microbenchmark(eval(parse(text="a<- 5")))
res.b<- microbenchmark(eval(foo(text="a<- 5")))
median(res.a$time)/median(res.b$time)
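# If 'microbenchmark' is not installed, a rough version of the same comparison
# can be made with base R's system.time(). This is only a sketch: the
# repetition count 'n' and the helper 'time.calls' are arbitrary choices made
# for illustration, and absolute timings will differ between machines and R
# versions.

```r
n <- 2000L                      # number of repetitions (arbitrary)
code <- "a <- 5"

# Hypothetical helper: elapsed seconds for calling f() n times
time.calls <- function(f) {
    system.time(for (i in seq_len(n)) f())[["elapsed"]]
}

cached <- base::parse           # resolve the '::' lookup once, up front

timings <- c(
    direct    = time.calls(function() parse(text = code)),
    qualified = time.calls(function() base::parse(text = code)),
    cached    = time.calls(function() cached(text = code))
)
print(timings)
```

# '::' is itself a function call that is evaluated every time the expression
# runs, which is where any difference between 'qualified' and the other two
# entries comes from.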
# Automatically prepend function names:
processNamespaces<- function(
    do.global=FALSE,
    do.verbose=FALSE,
    .delim.name="_",
    ...
){
    srch.list.0<- search()
    srch.list<- gsub("package:", "", srch.list.0)
    if(!do.global){
        assign(".NS", new.env(), envir=.GlobalEnv)
    }
    out<- lapply(1:length(srch.list), function(x.pkg){
        pkg<- srch.list[x.pkg]
        # SKIP LIST
        if(pkg %in% c(".GlobalEnv", "Autoloads")){
            return(NULL)
        }
        # /
        # TARGET ENVIR
        if(!do.global){
            # ADD PACKAGE TO .NS ENVIRONMENT
            envir<- eval(substitute(
                assign(PKG, new.env(), envir=.NS),
                list(PKG=pkg)
            ))
            # /
            # envir<- get(pkg, envir=.NS, inherits=FALSE)
            envir.msg<- paste(".NS$", pkg, sep="")
        } else {
            envir<- .GlobalEnv
            envir.msg<- ".GlobalEnv"
        }
        # /
        # PROCESS FUNCTIONS
        cnt<- ls(pos=x.pkg)
        out<- unlist(sapply(cnt, function(x.cnt){
            value<- get(x.cnt, pos=x.pkg, inherits=FALSE)
            obj.mod<- paste(pkg, x.cnt, sep=.delim.name)
            if(!is.function(value)){
                return(NULL)
            }
            if(do.verbose){
                cat(paste("Assigning '", obj.mod, "' to '", envir.msg,
                    "'", sep=""), sep="\n")
            }
            eval(substitute(
                assign(OBJ.MOD, value, envir=ENVIR),
                list(
                    OBJ.MOD=obj.mod,
                    ENVIR=envir
                )
            ))
            return(obj.mod)
        }))
        names(out)<- NULL
        # /
        return(out)
    })
    names(out)<- srch.list
    return(out)
}
# +++++
funs<- processNamespaces(do.verbose=TRUE)
ls(.NS)
ls(.NS$base)
.NS$base$base_parse
res.a<- microbenchmark(eval(parse(text="a<- 5")))
res.b<- microbenchmark(eval(.NS$base$base_parse(text="a<- 5")))
median(res.a$time)/median(res.b$time)
#+++++
funs<- processNamespaces(do.global=TRUE, do.verbose=TRUE)
base_parse
res.a<- microbenchmark(eval(parse(text="a<- 5")))
res.b<- microbenchmark(eval(base_parse(text="a<- 5")))
median(res.a$time)/median(res.b$time)
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel