Re: [Rd] Best style to organize code, namespaces

Duncan Murdoch Tue, 23 Feb 2010 14:43:06 -0800

On 22/02/2010 10:15 PM, Gabor Grothendieck wrote:

As you mention ease of debugging basically precludes subfunctions so
style 1 is left.

I think you are also unaware of setBreakpoint() (and trace(), whichunderlies it). So more detail:


If you put this into file test.R, starting on line 1:

outer <- function(x) {
   inner <- function() {
       if (x > 3) {
          x <- 0
       }
   }

   inner()
   print(x)
}

you might want to debug why outer(10) prints 10, not 0. So set abreakpoint on line 4 to see if you get there:


> setBreakpoint("test.R#4")
c:\temp\test.R#4:
 outer step 2,3,3,2,3,2 in <environment: R_GlobalEnv>
> outer(10)
test.R#4
Called from: eval(expr, envir, enclos)
Browse[1]>

Yes, we got there.  Take a step:

Browse[1]> n
debug: x <- 0

Now recognize we should have used x <<- 0.

Now, it would be nice if we had more flexible debugging (e.g. singlestepping that would stay as single stepping when we exited frominner()), but debug() is certainly not the only possibility fordebugging. It's not even the best choice in a lot of situations whereit does work.


Duncan Murdoch


Functions can be nested in environments rather than in other functions
and this will allow debugging to still occur.

The proto package which makes it particularly convenient to nest
functions in environments giving an analog to #3 while still allowing
debugging.  See http//:r-proto.googlecode.com

library(proto)
# p is proto object with variable a and method f
p <- proto(a = 1, f = function(., x = 1) .$a <- .$a + 1)
with(p, debug(f))
p$f()

debugging in: get("f", env = p, inherits = TRUE)(p, ...)
debug: .$a <- .$a + 1
Browse[2]>
exiting from: get("f", env = p, inherits = TRUE)(p, ...)
[1] 2

p$a

[1] 2


On Mon, Feb 22, 2010 at 9:49 PM, Ben <mi...@emerose.org> wrote:

Hi all,

I'm hoping someone could tell me what best practices are as far as
keeping programs organized in R.  In most languages, I like to keep
things organized by writing small functions.  So, suppose I want to
write a function that would require helper functions or would just be
too big to write in one piece.  Below are three ways to do this:


################### Style 1 (C-style) ###############
Foo <- function(x) {
 ....
}
Foo.subf <- function(x, blah) {
 ....
}
Foo.subg <- function(x, bar) {
 ....
}

################### Style 2 (Lispish?) ##############
Foo <- function(x) {
 Subf <- function(blah) {
   ....
 }
 Subg <- function(bar) {
   ....
 }
 ....
}

################### Object-Oriented #################
Foo <- function(x) {
 Subf <- function(blah) {
   ....
 }
 Subg <- function(bar) {
   ....
 }
 Main <- function() {
   ....
 }
 return(list(subf=subf, subg=subg, foo=foo))
}
################### End examples ####################

Which of these ways is best?  Style 2 seems at first to be the most
natural in R, but I found there are some major drawbacks.  First, it
is hard to debug.  For instance, if I want to debug Subf, I need to
first "debug(Foo)" and then while Foo is debugging, type
"debug(Subf)".  Another big limitation is that I can't write
test-cases (e.g. using RUnit) for Subf and Subg because they aren't
visible in any way at the global level.

For these reasons, style 1 seems to be better than style 2, if less
elegant.  However, style 1 can get awkward because any parameters
passed to the main function are not visible to the others.  In the
above case, the value of "x" must be passed to Foo.subf and Foo.subg
explicitly.  Also there is no enforcement of code isolation
(i.e. anyone can call Foo.subf).

Style 3 is more explicitly object oriented.  It has the advantage of
style 2 in that you don't need to pass x around, and the advantage of
style 1 in that you can still write tests and easily debug the
subfunctions.  However to actually call the main function you have to
type "Foo(x)$Main()" instead of "Foo(x)", or else write a wrapper
function for this.  Either way there is more typing.

So anyway, what is the best way to handle this?  R does not seem to
have a good way of managing namespaces or avoiding collisions, like a
module system or explicit object-orientation.  How should we get
around this limitation?  I've looked at sample R code in the
distribution and elsewhere, but so far it's been pretty
disappointing---most people seem to write very long, hard to
understand functions.

Thanks for any advice!

--
Ben

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Best style to organize code, namespaces

Reply via email to