[Rd] package post install instructions

2009-07-02 Thread Romain Francois

Hello,

I've looked in tools:::.install_packages for some sort of hook that
would let package developers point to further instructions after a
package is installed. For example, some packages need to set up
environment variables, ...


Is there something I have missed ?

Romain

--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/qzSl : using ImageJ from R: the RImageJ package
|- http://tr.im/qzSJ : with semantics for java objects in rJava
`- http://tr.im/qzTs : Better completion popups

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Conditional dependency between packages

2009-07-02 Thread Jon Olav Skoien

Hi Seth,

And thanks for your suggestion! I was not able to do exactly what you
described (I have no earlier experience with using environments), but
your mention of .onLoad was a good starting point. I have now removed all
references to pkg1 from the NAMESPACE, and written the following .onLoad
function:


.onLoad <- function(libname, pkgname) {
  if ("pkg1" %in% rownames(utils:::installed.packages())) {
    library(pkg1)
    info = matrix(c("fun1", "fun2", "fun3", rep("pkg2", 3), rep(NA, 3)),
                  ncol = 3)
    registerS3methods(info, package = "pkg1", env = environment(funInPkg2))
  }
}

New methods for functions fun1, fun2 and fun3 seem to be available if 
pkg1 is installed, while they are ignored if pkg1 is not installed. The 
function above loads pkg1 automatically if installed (I would prefer 
this to be optional), but at least it will not be necessary to download 
pkg1 (with all its dependencies) for users without interest in it.


I have not found any description of registerS3methods (except for an
old .Rd file stating that it is not intended to be called directly), so
there might be better ways of doing this... And I am sure there is a
better way of assigning the right environment...
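
For readers trying this today, here is a minimal sketch of the same idea using registerS3method() (singular) and requireNamespace(), both available in current R. It is not the code from this message. In the sketch, stats and its generic median stand in for pkg1 and fun1, and the class name pkg2obj and the helper names are hypothetical.

```r
# Hypothetical method that pkg2 would supply for a generic owned by pkg1.
# It is deliberately NOT named median.pkg2obj, so dispatch can only find it
# through explicit registration.
median_for_pkg2obj <- function(x, na.rm = FALSE, ...) {
  stats::median(x$values, na.rm = na.rm)
}

# Register the method only when the optional package can be loaded.
# "stats" stands in for pkg1 here; that choice is an assumption.
register_optional_methods <- function(pkg = "stats") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))
  }
  registerS3method("median", "pkg2obj", median_for_pkg2obj,
                   envir = asNamespace(pkg))
  invisible(TRUE)
}

# In a package this call would sit inside .onLoad().
.onLoad <- function(libname, pkgname) register_optional_methods()

obj <- structure(list(values = c(1, 5, 9)), class = "pkg2obj")
register_optional_methods()
median(obj)
```

Because requireNamespace() loads the namespace without attaching it, the optional package does not appear on the user's search path, which may be closer to the "optional" behaviour wished for above.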


Thanks again,
Jon



Seth Falcon wrote:

Hi Jon,

* On 2009-06-30 at 15:27 +0200 Jon Olav Skoien wrote:
  
I work on two packages, pkg1 and pkg2 (in two different projects). pkg1 is
quite generic, pkg2 tries to solve a particular problem within the same field
(geostatistics). Therefore, there might be users who want to use pkg2 as an
add-on package to increase the functionality of pkg1. In other words,
functions in pkg1 are based on the S3 class system, and I want pkg2 to
offer methods for pkg2-objects to functions defined in pkg1, for users
having both packages installed. Merging the packages or making pkg2 always
depend on pkg1 would be the easiest solution, but it is not preferred as most
users will only be interested in one of the packages.



I'm not sure I understand the above, I think you may have a pkg2 where
you meant pkg1, but I'm not sure it matters.

I think the short version is, pkg2 can be used on its own but will do
more if pkg1 is available.  I don't think R's packaging system
currently supports conditional dependencies as you might like.
However, I think you can get the behavior you want by following a
recipe like:

* In pkg2 DESCRIPTION, list Suggests: pkg1.

* In pkg2 code, you might define a package-level environment and 
  in .onLoad check to see if pkg1 is available.


     PKG_INFO <- new.env(parent=emptyenv())
     .onLoad <- function(libname, pkgname) {
         if (check if pkg1 is available) {
            PKG_INFO[["pkg1"]] <- TRUE
         }
     }

* Then your methods can check PKG_INFO[["pkg1"]].
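
The recipe above can be sketched in runnable form as follows. This is only an illustration under stated assumptions: the availability check uses requireNamespace() from current R, the helper names note_optional_package and has_optional are hypothetical, and "stats" and "noSuchPackage12345" stand in for an installed and a missing pkg1.

```r
# Package-level environment holding availability flags.
PKG_INFO <- new.env(parent = emptyenv())

# Record whether an optional package can be loaded.
# requireNamespace() loads the namespace without attaching it.
note_optional_package <- function(pkg) {
  assign(pkg, requireNamespace(pkg, quietly = TRUE), envir = PKG_INFO)
  invisible(PKG_INFO[[pkg]])
}

# In pkg2 this would be called from .onLoad() with "pkg1".
.onLoad <- function(libname, pkgname) note_optional_package("pkg1")

# Code elsewhere in pkg2 branches on the flag; unknown names count as FALSE.
has_optional <- function(pkg) isTRUE(PKG_INFO[[pkg]])

note_optional_package("stats")               # stands in for an installed pkg1
note_optional_package("noSuchPackage12345")  # stands in for a missing pkg1
has_optional("stats")
has_optional("noSuchPackage12345")
```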


 
+ seth






Re: [Rd] V2.9.0 changes [Sec=Unclassified]

2009-07-02 Thread Martin Morgan
Troy Robertson wrote:
 Well...
 
 My performance problems were in the pass-by-value semantics of R.
 
 I have just changed my classes to inherit from .environment and then moved 
 data members from S4 slots to the .xData objects as Martin suggested.

Actually, I had hoped the take-home message would be in the final paragraph:

 Of course I haven't seen your code, but a different interpretation of
 your performance issues is that, within the rules of S4, you've chosen
 to implement functionality in an inefficient way. So it might be
 instructive to step back a bit and try to reformulate your data
 structures and methods. This is hard to do.

Martin

 That meant I could remove all my returns and assignments on all method calls.
 
 This has sped up execution time for my model by more than an order of
 magnitude, e.g. one test simulation went from 1931 secs down to 175 secs.
 
 Not bad, seeing that the class structure, functionality and logic have not
 been touched.
 
 I really do think S4 could benefit from having its slots stored in the
 environment when the class inherits from .environment.  It would be a lot
 more sensible if my data members were still declared as S4 slots instead of
 having to hide them in .xData.
 
 Troy
 
 
 Troy Robertson
 Database and Computing Support Provider
 Southern Ocean Ecosystems, ERM/Fish
 Australian Antarctic Division
 Channel Highway, Kingston 7050
 PH: 03 62323571
 troy.robert...@aad.gov.au
 
 
 -Original Message-
 From: Martin Morgan [mailto:mtmor...@fhcrc.org]
 Sent: Tuesday, 23 June 2009 11:25 PM
 To: Troy Robertson
 Cc: 'r-devel@R-project.org'
 Subject: Re: [Rd] V2.9.0 changes [Sec=Unclassified]

 Troy Robertson wrote:
 Hi all,



 Prefix: I am a frustrated Java coder in R.
 ah good, if you're frustrated with Java you'll find R very different ;)



 I am coding a medium sized ecosystem modelling program in R.  I have
 changed to using S4 objects and it has cost me an order of magnitude in
 execution speed over the functional model.  I cannot afford this penalty
 and have found that it is the result of all the passing-by-value of
 objects.


 I see that you can now safely inherit from environment in V2.9.0.

 That got me all excited that I would now be able to pass objects by
 reference.


 But...

 That doesn't seem to be the case.

 It seems that only passing an environment which holds the object allows
 for pass-by-reference, and that passing an object which inherits from
 environment doesn't.
 Why is this the case? Either an object inherits the properties of its
 parent or it doesn't.

 The object inherits slots from its parent, and the methods defined on
 the parent class. Maybe this example helps?

 setClass("Ticker", contains=".environment")

 ## initialize essential, so each instance gets its own environment
 setMethod("initialize", "Ticker",
           function(.Object, ..., .xData=new.env(parent=emptyenv()))
 {
     .xData[["count"]] <- 0
     callNextMethod(.Object, ..., .xData=.xData)
 })

 ## tick: increment (private) counter by n
 setGeneric("tick", function(reference, n=1L) standardGeneric("tick"),
            signature="reference")

 setMethod("tick", "Ticker", function(reference, n=1L) {
     reference[["count"]] <- reference[["count"]] + n
 })

 ## tock: report current value of counter
 setGeneric("tock", function(reference) standardGeneric("tock"))

 setMethod("tock", "Ticker", function(reference) {
     reference[["count"]]
 })

 and then

 e <- new("Ticker")
 tock(e)
 [1] 0
 tick(e); tick(e, 10); tock(e)
 [1] 11
 f <- e
 tock(f); tick(e); tock(f)
 [1] 11
 [1] 12

 The data inside .environment could be structured, too, using S4.
 Probably it would be more appropriate to have the environment as a slot,
 rather than the class that is being extended. And in terms of inherited
 'properties', e.g., the [[ function as defined on environments is
 available:

 e[["count"]]

 Of course I haven't seen your code, but a different interpretation of
 your performance issues is that, within the rules of S4, you've chosen
 to implement functionality in an inefficient way. So it might be
 instructive to step back a bit and try to reformulate your data
 structures and methods. This is hard to do.

 Martin

 Has anyone else had a play with this?  Or have I got it all wrong.



 I tried the below:

 ----------------------------------------------------------------
 setClass('foo',
          representation=representation(stuff='list', bar='numeric'),
          prototype=list(stuff=list(), bar=0),
          contains='.environment')

 setGeneric('doit', function(.Object, newfoo='environment')
     standardGeneric('doit'))

 setMethod('doit', 'foo', function(.Object, newfoo){newfoo@bar <- 10})

 z <- new('foo')
 z@stuff$x <- new('foo')
 doit(z, z@stuff$x)
 z@stuff$x@bar
 [1] 0
 ----------------------------------------------------------------


 Can anyone help with a better way of doing this.
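
 A small sketch of why the slot assignment above has no visible effect, and how an environment differs. It assumes nothing from the model code; the class ValueBox and both helper functions are hypothetical names used only for illustration.

```r
library(methods)

# A plain S4 class: slot assignment inside a function changes a local copy.
setClass("ValueBox", representation(bar = "numeric"), prototype(bar = 0))

set_bar_value <- function(box) {
  box@bar <- 10   # modifies the function's own copy
  box             # the caller must capture this return value
}

# An environment has reference semantics: changes are visible to the caller.
set_bar_ref <- function(env) {
  assign("bar", 10, envir = env)
  invisible(NULL)
}

v <- new("ValueBox")
invisible(set_bar_value(v))  # result discarded, so v is unchanged
v@bar

v <- set_bar_value(v)        # reassigning keeps the change
v@bar

r <- new.env(parent = emptyenv())
assign("bar", 0, envir = r)
set_bar_ref(r)               # no reassignment needed
get("bar", envir = r)
```

 The first pattern is the "returns and assignments on all method calls" style; the second is what storing data in .xData buys.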

 I'm trying to avoid all the 

Re: [Rd] V2.9.0 changes [Sec=Unclassified]

2009-07-02 Thread Gabor Grothendieck
On Thu, Jul 2, 2009 at 1:37 AM, Troy Robertson <troy.robert...@aad.gov.au> wrote:
 Well...

 My performance problems were in the pass-by-value semantics of R.

 I have just changed my classes to inherit from .environment and then moved 
 data members from S4 slots to the .xData objects as Martin suggested.


Note that the R.oo and proto packages already use environments for
their storage. e.g.

library(proto)
p <- proto(a = 1, incr = function(.) .$a <- .$a + 1)
class(p) # c("proto", "environment")

p$a # 1
p$incr()
p$a # 2

p$ls() # c("a", "incr")
ls(p) # same



[Rd] OOP performance, was: V2.9.0 changes

2009-07-02 Thread Thomas Petzoldt

Hi Troy,

first of all a question, what kind of ecosystem models are you
developing in R? Differential equations or individual-based?

You write that you are a frustrated Java developer in R. I have had a
similar experience; however, I still like Java, and I'm now happier
with R as it is much more efficient (i.e. sum(programming + runtime))
for the things I usually do: ecological data analysis and modelling.

After using functional R for quite some time, and Java in parallel,
I had the same idea: to make R more Java-like and to model ecosystems in
an object-oriented manner. At that time I took a look into R.oo (thanks
Henrik Bengtsson) and was one of the co-authors of proto. I still think
that R.oo is very good and that proto is a cool idea, but finally I
switched to the recommended S4 for my ecological simulation package.

Note also that my solution was *not* to model the ecosystems as objects
(habitat -> populations -> individuals), but instead to model ecological
models (equations, inputs, parameters, time steps, outputs, ...).

This works quite well with S4. A speed test (see useR!2006 poster on
http://simecol.r-forge.r-project.org/) showed that all OOP flavours had
quite comparable performance.

The only things I have to keep in mind are a few rules:

- avoid unnecessary copying of large objects. Sometimes it helps to
prefer matrices over data frames.

- use vectorization. This means for an individual-based model that one
has to re-think how to model an individual: not many [S4] objects
like in Java, but R structures (arrays, lists, data frames) on which
vectorized functions (e.g. arithmetic or subset) can work.

- avoid interpolation (i.e. approx) and if unavoidable, minimize the tables.
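
The vectorization rule can be illustrated with a toy growth step. This is a hedged sketch only: the population, the weight vector and the growth rate are made-up values, and grow_loop and grow_vec are hypothetical names, not functions from simecol.

```r
# Hypothetical individual-based state: one numeric entry per individual.
set.seed(1)
n <- 1000
weight <- runif(n, 1, 5)
growth_rate <- 0.1

# Loop version: one individual at a time.
grow_loop <- function(w, rate) {
  out <- numeric(length(w))
  for (i in seq_along(w)) {
    out[i] <- w[i] * (1 + rate)
  }
  out
}

# Vectorized version: the whole population in one arithmetic operation.
grow_vec <- function(w, rate) w * (1 + rate)

# Both give the same result; the vectorized form avoids the R-level loop.
all.equal(grow_loop(weight, growth_rate), grow_vec(weight, growth_rate))
```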

If all these things do not help, I write core functions in C (others use
Fortran). This can be done in a mixed style and even a full C to C
communication is possible (see the deSolve documentation for how to do this
with differential equation models).


Thomas P.



--
Thomas Petzoldt
Technische Universitaet Dresden
Institut fuer Hydrobiologie        thomas.petzo...@tu-dresden.de
01062 Dresden                      http://tu-dresden.de/hydrobiologie/
GERMANY



Re: [Rd] Conditional dependency between packages

2009-07-02 Thread Henrik Bengtsson
if ("pkg1" %in% rownames(utils:::installed.packages())) {
  library(pkg1)
  ...
}

can be replaced by:

if (require(pkg1)) {
  ...
}

/Henrik



Re: [Rd] V2.9.0 changes [Sec=Unclassified]

2009-07-02 Thread Troy Robertson

 -Original Message-
 From: Martin Morgan [mailto:mtmor...@fhcrc.org]
 Sent: Thursday, 2 July 2009 10:58 PM
 To: Troy Robertson
 Cc: 'r-devel@R-project.org'
 Subject: Re: [Rd] V2.9.0 changes [Sec=Unclassified]

 Troy Robertson wrote:
  Well...
 
  My performance problems were in the pass-by-value semantics of R.
 
  I have just changed my classes to inherit from .environment and then
 moved data members from S4 slots to the .xData objects as Martin
 suggested.

 Actually, I had hoped the take-home message would be in the final
 paragraph:

  Of course I haven't seen your code, but a different interpretation of
  your performance issues is that, within the rules of S4, you've chosen
  to implement functionality in an inefficient way. So it might be
  instructive to step back a bit and try to reformulate your data
  structures and methods. This is hard to do.


Yes, it just takes a little time and playing around to work out the rules of R 
(and S4) though, and how to use them to your advantage rather than be limited 
by them.  I know I am still probably not using this functional language in the 
best way, but there you go.




Re: [Rd] OOP performance, was: V2.9.0 changes [SEC=Unclassified]

2009-07-02 Thread Troy Robertson
Hi Thomas,

It is a population-based model, but I didn't develop the work.  I am just the 
programmer who has been given the job of coding it.  The goal is to allow for a 
plug and play type approach by users to construction of the model (of both 
elements and functionality).  Hence my focus on OO.

You are right about avoiding the copying of large objects.  That is what was 
killing things.  I am now working on vectorizing more of the number crunching 
and removing some of the nested for loops.  That should step things up a little 
too.

I do also need to investigate how to move some of the more expensive code to C.

Had a quick look at simecol which looks interesting.  Will point it out to my 
boss to check out too.

Thanks

Troy

___

Australian Antarctic Division - Commonwealth of Australia
IMPORTANT: This transmission is intended for the addressee only. If you are not
the intended recipient, you are notified that use or dissemination of this
communication is strictly prohibited by Commonwealth law. If you have received
this transmission in error, please notify the sender immediately by e-mail or
by telephoning +61 3 6232 3209 and DELETE the message.
Visit our web site at http://www.antarctica.gov.au/
___



Re: [Rd] [R] ggplot2 x axis question

2009-07-02 Thread Deepayan Sarkar
On Mon, Jun 29, 2009 at 9:05 AM, hadley wickham <h.wick...@gmail.com> wrote:
 In that case, try:

 qplot(reorder(factor(model),delta),delta,data=growthm.bic)

 Deepayan: do you think there should also be a numeric method for reorder?

r-devel now has a reorder.default (replacing reorder.factor and
reorder.character), so reorder() should also work for numeric data.
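
A quick way to check this in an R version that has reorder.default is sketched below. The data frame is a made-up stand-in for growthm.bic (its real contents are not shown in this thread), so the values are assumptions.

```r
# Hypothetical stand-in for the growthm.bic data from the thread.
growthm.bic <- data.frame(model = c(1, 2, 3), delta = c(0.5, 0.1, 0.3))

# reorder() on a numeric vector returns a factor whose levels are
# sorted by the second argument (by default via the mean per level).
ordered_model <- with(growthm.bic, reorder(model, delta))
is.factor(ordered_model)
levels(ordered_model)
```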

-Deepayan

 Hadley

 On Mon, Jun 29, 2009 at 10:39 AM, Christopher
 Desjardins <cddesjard...@gmail.com> wrote:
 Hi Hadley,
 Thanks for the reply and the great graphing package. That code is giving me
 the following error:

 qplot(reorder(model,delta),delta,data=growthm.bic)
 Error in UseMethod("reorder") : no applicable method for "reorder"

 Cheers,
 Chris

[...]



Re: [Rd] OOP performance, was: V2.9.0 changes

2009-07-02 Thread Gabor Grothendieck
In terms of performance if you want the fastest
performance in R go with S3 and if you want
even faster performance rewrite your inner loops
in C.  All the other approaches will usually be slower.
Also S3 is simple, elegant and will result in less code
and take you much less time to design, program and
debug.

For 100% R code, particularly for simulations,
proto can sometimes be even faster than pure R code based
S3 as proto supports hand optimizations that cannot readily
be done in other systems.  (For unoptimized code it would
be slower.)  The key trick is based on its ability
to separate dispatching from calling so that if method f and
object p are unchanged in a loop
   for(...) p$f(...)
then the loop can be rewritten
   f <- p$f; for(...) f(...)
Note that this still retains dynamic dispatch but
just factors it out of the loop.  With S3 the best you could
get would be for(...) f.p(...) where f is a method of class p
but this is really tantamount to not using OO at all since
no dispatch is done at all.
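
The hoisting trick can be sketched without proto, using a plain environment as the object. This is an assumption-laden illustration, not proto's implementation: make_counter, total and f are hypothetical names, and a real proto method would also receive the object as its first argument.

```r
# A hand-rolled environment object standing in for a proto object.
make_counter <- function() {
  self <- new.env()
  self$total <- 0
  self$f <- function(x) {
    assign("total", self$total + x, envir = self)
    invisible(NULL)
  }
  self
}

p <- make_counter()

# Lookup inside the loop: p$f is resolved on every iteration.
for (i in 1:100) p$f(i)
total_inner <- p$total

# Lookup hoisted out of the loop: resolve once, call many times.
p$total <- 0
f <- p$f
for (i in 1:100) f(i)
total_hoisted <- p$total
```

Both loops leave the same total in the object; the second simply does the method lookup once instead of 100 times.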



Re: [Rd] OOP performance, was: V2.9.0 changes [SEC=Unclassified]

2009-07-02 Thread Troy Robertson
Hi Gabor,

Look, I agree with you about S3 and have at times wished I had chosen that path
rather than S4.  It seems to do the things I struggle to find answers for with
S4.  But, knowing little about R before engaging with this project, I
decided to go with the latest OO framework, S4.  I do now find that I am
undoing some of it, such as the use of data member slots, in order to implement
pass-by-ref via environments and improve performance.  But it's all a learning
experience.

Troy

