Re: [R] Google's R Style Guide (has become S3 vs S4, in part)

2009-09-08 Thread Martin Maechler
 Martin Morgan mtmor...@fhcrc.org
 on Tue, 01 Sep 2009 09:07:05 -0700 writes:

 spencerg wrote:
 Bryan Hanson wrote:
 Looks like the discussion is no longer about R Style, but S3 vs S4?

 yes nice topic rename!

 
 To that end, I asked more or less the same question a few weeks ago,
 arising from much the same motivations.  The discussion was helpful,
 here's the link:

 http://www.nabble.com/Need-Advice%3A-Considering-Converting-a-Package-from-S3-to-S4-tc24901482.html#a24904049

 For what it's worth, I decided, but with some ambivalence, to stay
 with S3 for now and possibly move to S4 later.  In the spirit of S4,
 I did write a function that is nearly the equivalent of validObject
 for my S3 object of interest.

 Overall, it looked like I would have to spend a lot of time moving to
 S4, while staying with S3 would allow me to get the project done and
 get results going much faster (see Frank Harrell's comment in the
 thread above).

 Bryan's original post started me thinking about this, but I didn't
 respond. I'd classify myself as an 'S4' 'expert', with my ignorance of
 S3 obvious from Duncan's corrections to my earlier post. It's hard for
 me to make a comparative statement about S3 vs. S4, and hard really to
 know what is 'hard' for someone new to S4, to R, to programming, ... I
 would have classified most of the responses in that thread as coming
 from 'S3' 'experts'.

 As a concrete example (concrete for us non-programmers,
 non-statisticians), I recently decided that I wanted to add a
 descriptive piece of text to a number of my plots, and it made sense
 to include the text with the object.  So I just added a list element
 to the existing S3 object, e.g. Myobject$descrip.  No further work was
 necessary; I could use it right away.  If instead I had made Myobject
 an S4 object, then I would have to go back, redefine the object,
 update validObject, and possibly write some new accessor and
 definitely constructor functions.  At least, that's how I understand
 the way one uses S4 classes.

 This is a variant of Gabor's comment, I guess, that it's easy to modify
 S3 on an as-needed basis. In S3, forgoing any pretext of 'best
 practices', one might

 s3 <- structure(list(x=1:10, y=10:1), class="MyS3Object")
 ## some lines of code...
 if (aTest)
   s3$discraption <- "A description"

 (either 'description' or 'discraption' is a typo, uncaught by S3).

 In S4 I'd have to change my class definition from

 setClass("MyS4Object", representation(x="numeric", y="numeric"))

 to

 setClass("MyS4Object", representation(x="numeric", y="numeric",
   description="character"))

 but the body of the code would look surprisingly similar

 s4 <- new("MyS4Object", x=1:10, y=10:1)
 ## some lines of code...
 if (aTest)
   s4@description <- "A description"

 (no typo, because I'd have been told that the slot 'discraption' didn't
 exist). In the S3 case the (implicit) class definition is a single line,
 perhaps nested deep inside a function. In S4 the class definition is in
 a single location.

 Best practices might make me want to have a validity method (x and y the
 same dimensions? 'description' of length 1?), to use a constructor and
 accessors (to provide an abstraction to separate the interface from its
 implementation), etc., but those issues are about best practices.

 A downstream consequence is that s4 always has a 'description' slot
 (perhaps initialized with an appropriate default in the 'prototype'
 argument of setClass, but that's more advanced), whereas s3 only
 sometimes has 'description'. So I'm forced to check
 is.null(s3$description) whenever I'm expecting a character vector.

 It doesn't stop there:  If you keep the same name for your
 redefined S4 class, I don't know what happens when you try to access
 stored objects of that class created before the change, but it might not
 be pretty.  If you give your redefined S4 class a different name, then

 Actually, the old object is loaded in R. It is not valid
 (validObject(originalS4) would complain about 'slots in class definition
 not in object'). One might write an 'updateObject' generic and method
 that detects and corrects this. This contrasts with S3, where there is
 no knowing whether the object is consistent with the current (implicit)
 class definition.

 you have a lot more code to change before you can use the redefined
 class like you want.

 For slot addition, this is not true -- old code works fine. For slot
 removal / renaming, this is analogous to S3 -- code needs reworking; use
 of accessors might help isolate code using the class from the
 implementation of the class.

 A couple of 

Re: [R] Google's R Style Guide (has become S3 vs S4, in part)

2009-09-01 Thread Bryan Hanson
Looks like the discussion is no longer about R Style, but S3 vs S4?

To that end, I asked more or less the same question a few weeks ago, arising
from much the same motivations.  The discussion was helpful, here's the
link:

http://www.nabble.com/Need-Advice%3A-Considering-Converting-a-Package-from-S3-to-S4-tc24901482.html#a24904049

For what it's worth, I decided, but with some ambivalence, to stay with S3
for now and possibly move to S4 later.  In the spirit of S4, I did write a
function that is nearly the equivalent of validObject for my S3 object of
interest.

Overall, it looked like I would have to spend a lot of time moving to S4,
while staying with S3 would allow me to get the project done and get results
going much faster (see Frank Harrell's comment in the thread above).

As a concrete example (concrete for us non-programmers, non-statisticians),
I recently decided that I wanted to add a descriptive piece of text to a
number of my plots, and it made sense to include the text with the object.
So I just added a list element to the existing S3 object, e.g.
Myobject$descrip.  No further work was necessary; I could use it right away.
If instead I had made Myobject an S4 object, then I would have to go
back, redefine the object, update validObject, and possibly write some new
accessor and definitely constructor functions.  At least, that's how I
understand the way one uses S4 classes.

Back to trying to get something done!  Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA





On 9/1/09 6:16 AM, Duncan Murdoch murd...@stats.uwo.ca wrote:

 Corrado wrote:
 Thanks Duncan, Spencer,
 
 To clarify, the situation is:
 
 1) I have no reason to choose S3 over S4 or vice versa, or any other coding
 convention
 2) Our group has not done any OO development in R and I would be the first,
 so I can set up the standards
 3) I am starting from scratch with a new package, so I do not have any code I
 need to re-use.
 4) I am an R OO newbie, so whatever I can learn from the beginning is
 better and good for me.
 
 So the questions would be two:
 
 1) What coding style guide should we / I follow? Is the google style guide
 good, or is there something better / more prescriptive which makes our
 research group life easier?
   
 
 I don't think I can answer that.  I'd recommend planning to spend some
 serious time on the decision, and then go by your personal impression.
 S4 is definitely harder to learn but richer, so don't make the decision
 too quickly.  Take a look at John Chambers' new book, try small projects
 in each style, etc.
 
 2) What class type should I use? From what you two say, I should use S3
 because it is easier to use.  What are the disadvantages? Is there an
 advantages / disadvantages table for S3 and S4 classes?
   
 
 S3 is much more limited than S4.  It dispatches on just one argument, S4
 can dispatch on several.  S3 allows you to declare things to be of a
 certain class with no checks that anything will actually work; S4 makes
 it easier to be sure that if you say something is of a certain class, it
 really is.  S4 hides more under the hood: if you understand how regular
 R functions work, learning S3 is easy, but there's still a lot to learn
 before you'll be able to use S4 properly.
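
 A minimal sketch of the dispatch difference described above, assuming
 nothing beyond base R and the methods package. The generic names
 describe and combine2 are hypothetical, chosen only for illustration.

```r
library(methods)

# S3: dispatch looks only at the class of the first argument.
describe <- function(x, y) UseMethod("describe")
describe.default <- function(x, y) "S3: default method"
describe.integer <- function(x, y) "S3: integer method, whatever y is"

# S4: a method can be selected by the classes of several arguments.
setGeneric("combine2", function(x, y) standardGeneric("combine2"))
setMethod("combine2", signature("numeric", "character"),
          function(x, y) "S4: numeric, character")
setMethod("combine2", signature("character", "numeric"),
          function(x, y) "S4: character, numeric")

# Changing the class of y does not change which S3 method runs ...
s3_result <- c(describe(1L, "a"), describe(1L, 2))
# ... but swapping the argument classes selects a different S4 method.
s4_result <- c(combine2(1, "a"), combine2("a", 1))
```

 Getting the same effect in S3 would typically mean inspecting the class
 of y by hand inside the method.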
 
 Duncan Murdoch
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] Google's R Style Guide (has become S3 vs S4, in part)

2009-09-01 Thread spencerg

Bryan Hanson wrote:

Looks like the discussion is no longer about R Style, but S3 vs S4?

To that end, I asked more or less the same question a few weeks ago, arising
from much the same motivations.  The discussion was helpful, here's the
link:

http://www.nabble.com/Need-Advice%3A-Considering-Converting-a-Package-from-S3-to-S4-tc24901482.html#a24904049

For what it's worth, I decided, but with some ambivalence, to stay with S3
for now and possibly move to S4 later.  In the spirit of S4, I did write a
function that is nearly the equivalent of validObject for my S3 object of
interest.

Overall, it looked like I would have to spend a lot of time moving to S4,
while staying with S3 would allow me to get the project done and get results
going much faster (see Frank Harrell's comment in the thread above).

As a concrete example (concrete for us non-programmers, non-statisticians),
I recently decided that I wanted to add a descriptive piece of text to a
number of my plots, and it made sense to include the text with the object.
So I just added a list element to the existing S3 object, e.g.
Myobject$descrip.  No further work was necessary; I could use it right away.
If instead I had made Myobject an S4 object, then I would have to go
back, redefine the object, update validObject, and possibly write some new
accessor and definitely constructor functions.  At least, that's how I
understand the way one uses S4 classes.
  
 It doesn't stop there:  If you keep the same name for your 
redefined S4 class, I don't know what happens when you try to access 
stored objects of that class created before the change, but it might not 
be pretty.  If you give your redefined S4 class a different name, then 
you have a lot more code to change before you can use the redefined 
class like you want. 



 By contrast, with S3, if you have any code that tests the number 
of components in a list, that will have to be changed. 



 Spencer

Back to trying to get something done!  Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA





On 9/1/09 6:16 AM, Duncan Murdoch murd...@stats.uwo.ca wrote:

  

Corrado wrote:


Thanks Duncan, Spencer,

To clarify, the situation is:

1) I have no reason to choose S3 over S4 or vice versa, or any other coding
convention
2) Our group has not done any OO development in R and I would be the first,
so I can set up the standards
3) I am starting from scratch with a new package, so I do not have any code I
need to re-use.
4) I am an R OO newbie, so whatever I can learn from the beginning is
better and good for me.

So the questions would be two:

1) What coding style guide should we / I follow? Is the google style guide
good, or is there something better / more prescriptive which makes our
research group life easier?
  
  

I don't think I can answer that.  I'd recommend planning to spend some
serious time on the decision, and then go by your personal impression.
S4 is definitely harder to learn but richer, so don't make the decision
too quickly.  Take a look at John Chambers' new book, try small projects
in each style, etc.



2) What class type should I use? From what you two say, I should use S3
because it is easier to use.  What are the disadvantages? Is there an
advantages / disadvantages table for S3 and S4 classes?
  
  

S3 is much more limited than S4.  It dispatches on just one argument, S4
can dispatch on several.  S3 allows you to declare things to be of a
certain class with no checks that anything will actually work; S4 makes
it easier to be sure that if you say something is of a certain class, it
really is.  S4 hides more under the hood: if you understand how regular
R functions work, learning S3 is easy, but there's still a lot to learn
before you'll be able to use S4 properly.

Duncan Murdoch





  



--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567



Re: [R] Google's R Style Guide (has become S3 vs S4, in part)

2009-09-01 Thread Martin Morgan
spencerg wrote:
 Bryan Hanson wrote:
 Looks like the discussion is no longer about R Style, but S3 vs S4?

yes nice topic rename!


 To that end, I asked more or less the same question a few weeks ago,
 arising from much the same motivations.  The discussion was helpful,
 here's the link:

 http://www.nabble.com/Need-Advice%3A-Considering-Converting-a-Package-from-S3-to-S4-tc24901482.html#a24904049

 For what it's worth, I decided, but with some ambivalence, to stay
 with S3 for now and possibly move to S4 later.  In the spirit of S4,
 I did write a function that is nearly the equivalent of validObject
 for my S3 object of interest.

 Overall, it looked like I would have to spend a lot of time moving to
 S4, while staying with S3 would allow me to get the project done and
 get results going much faster (see Frank Harrell's comment in the
 thread above).

Bryan's original post started me thinking about this, but I didn't
respond. I'd classify myself as an 'S4' 'expert', with my ignorance of
S3 obvious from Duncan's corrections to my earlier post. It's hard for
me to make a comparative statement about S3 vs. S4, and hard really to
know what is 'hard' for someone new to S4, to R, to programming, ... I
would have classified most of the responses in that thread as coming
from 'S3' 'experts'.

 As a concrete example (concrete for us non-programmers,
 non-statisticians), I recently decided that I wanted to add a
 descriptive piece of text to a number of my plots, and it made sense
 to include the text with the object.  So I just added a list element
 to the existing S3 object, e.g. Myobject$descrip.  No further work was
 necessary; I could use it right away.  If instead I had made Myobject
 an S4 object, then I would have to go back, redefine the object,
 update validObject, and possibly write some new accessor and
 definitely constructor functions.  At least, that's how I understand
 the way one uses S4 classes.

This is a variant of Gabor's comment, I guess, that it's easy to modify
S3 on an as-needed basis. In S3, forgoing any pretext of 'best
practices', one might

s3 <- structure(list(x=1:10, y=10:1), class="MyS3Object")
## some lines of code...
if (aTest)
  s3$discraption <- "A description"

(either 'description' or 'discraption' is a typo, uncaught by S3).

In S4 I'd have to change my class definition from

setClass("MyS4Object", representation(x="numeric", y="numeric"))

to

setClass("MyS4Object", representation(x="numeric", y="numeric",
  description="character"))

but the body of the code would look surprisingly similar

s4 <- new("MyS4Object", x=1:10, y=10:1)
## some lines of code...
if (aTest)
  s4@description <- "A description"

(no typo, because I'd have been told that the slot 'discraption' didn't
exist). In the S3 case the (implicit) class definition is a single line,
perhaps nested deep inside a function. In S4 the class definition is in
a single location.
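
A runnable sketch of the contrast just described, assuming the class names from the snippets above. The tryCatch wrapper is only there so the error can be inspected instead of stopping the script.

```r
library(methods)

# S3: the misspelled element name is silently accepted as a new list element.
s3 <- structure(list(x = 1:10, y = 10:1), class = "MyS3Object")
s3$discraption <- "A description"

# S4: the slots are fixed by the class definition.
setClass("MyS4Object",
         representation(x = "numeric", y = "numeric",
                        description = "character"))
s4 <- new("MyS4Object", x = 1:10, y = 10:1)

# The same misspelling on the S4 object is an error; capture its message.
s4_error <- tryCatch({
  s4@discraption <- "A description"
  NA_character_
}, error = function(e) conditionMessage(e))
```

After this runs, s3 carries a 'discraption' element and no 'description', while s4_error holds the complaint about the nonexistent slot.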

Best practices might make me want to have a validity method (x and y the
same dimensions? 'description' of length 1?), to use a constructor and
accessors (to provide an abstraction to separate the interface from its
implementation), etc., but those issues are about best practices.
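
One way these practices might look, as a sketch only. The constructor MyS4Object() and the accessor generic description() are hypothetical names, and the validity rule compares lengths of x and y as a stand-in for "same dimensions".

```r
library(methods)

setClass("MyS4Object",
         representation(x = "numeric", y = "numeric",
                        description = "character"),
         validity = function(object) {
           msg <- character()
           if (length(object@x) != length(object@y))
             msg <- c(msg, "'x' and 'y' must have the same length")
           if (length(object@description) != 1L)
             msg <- c(msg, "'description' must have length 1")
           if (length(msg)) msg else TRUE
         })

# Hypothetical constructor: callers never need to call new() directly.
MyS4Object <- function(x = numeric(), y = numeric(), description = "") {
  new("MyS4Object", x = x, y = y, description = description)
}

# Hypothetical accessor: callers never need to know the slot name.
setGeneric("description", function(object) standardGeneric("description"))
setMethod("description", "MyS4Object", function(object) object@description)

ok  <- MyS4Object(1:10, 10:1, "A description")
# Mismatched lengths fail the validity check when the object is created.
bad <- tryCatch(MyS4Object(1:10, 1:3),
                error = function(e) conditionMessage(e))
```

If the representation later changes, only the constructor and accessor should need to follow; code that calls description(ok) is unaffected.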

A downstream consequence is that s4 always has a 'description' slot
(perhaps initialized with an appropriate default in the 'prototype'
argument of setClass, but that's more advanced), whereas s3 only
sometimes has 'description'. So I'm forced to check
is.null(s3$description) whenever I'm expecting a character vector.
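
A small sketch of that consequence. The choice of NA_character_ as the prototype default is an assumption made for illustration, not something stated in the thread.

```r
library(methods)

setClass("MyS4Object",
         representation(x = "numeric", y = "numeric",
                        description = "character"),
         prototype = prototype(description = NA_character_))

# S4: the slot exists on every instance, here with the prototype default.
s4 <- new("MyS4Object", x = 1:10, y = 10:1)
s4_desc <- s4@description

# S3: the element may simply be absent, so each use needs a guard.
s3 <- structure(list(x = 1:10, y = 10:1), class = "MyS3Object")
s3_desc <- if (is.null(s3$description)) NA_character_ else s3$description
```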

  It doesn't stop there:  If you keep the same name for your
 redefined S4 class, I don't know what happens when you try to access
 stored objects of that class created before the change, but it might not
 be pretty.  If you give your redefined S4 class a different name, then

Actually, the old object is loaded in R. It is not valid
(validObject(originalS4) would complain about 'slots in class definition
not in object'). One might write an 'updateObject' generic and method
that detects and corrects this. This contrasts with S3, where there is
no knowing whether the object is consistent with the current (implicit)
class definition.
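
A sketch of what such an 'updateObject' might look like. The generic, the helper isValid(), and the NA default for the new slot are all hypothetical, and the stale object is simulated by redefining the class in the same session instead of loading one from disk.

```r
library(methods)

# Original definition, and an object created under it.
setClass("MyS4Object", representation(x = "numeric", y = "numeric"))
originalS4 <- new("MyS4Object", x = 1:10, y = 10:1)

# The class is later redefined with an extra slot.
setClass("MyS4Object",
         representation(x = "numeric", y = "numeric",
                        description = "character"))

# Hypothetical helper: TRUE if validObject() succeeds, FALSE if it errors.
isValid <- function(object) {
  tryCatch({ validObject(object); TRUE }, error = function(e) FALSE)
}

old_is_valid <- isValid(originalS4)

# Hypothetical generic and method: rebuild stale objects under the
# current definition, filling the new slot with a default.
setGeneric("updateObject",
           function(object, ...) standardGeneric("updateObject"))
setMethod("updateObject", "MyS4Object", function(object, ...) {
  if (isValid(object)) return(object)
  new("MyS4Object", x = object@x, y = object@y,
      description = NA_character_)
})

updatedS4 <- updateObject(originalS4)
```

Here old_is_valid comes out FALSE because the old object lacks the 'description' slot, and updatedS4 passes validObject() under the new definition.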

 you have a lot more code to change before you can use the redefined
 class like you want.

For slot addition, this is not true -- old code works fine. For slot
removal / renaming, this is analogous to S3 -- code needs reworking; use
of accessors might help isolate code using the class from the
implementation of the class.

A couple of comments on Duncan's

S3Foo <- function(x=numeric(), y=numeric()) {
  structure(list(x=as.numeric(x), y=as.numeric(y)), class="S3Foo")
}

I used makeS3Foo to emphasize that it was a constructor, but in my own
code I use S3Foo(). Realizing that, as Henrik has now also pointed out,
I'm far from perfect, the use of as.numeric() combines validity checking
and coercion, which I think is not usually a good thing (even when
efficient). In particular this

  as.numeric(factor(c("one", "two", "three")))