[R-pkg-devel] how to document method arguments that aren't in the signature
What's the best way to document an S4 method that takes arguments beyond those in the signature? Consider

    setGeneric("sim", function(simP, dataP, ...) standardGeneric("sim"))
    setMethod("sim", signature = "SimParameters",
              function(simP, dataP) {
                  lapply(seq(simP@NIter), function(i) do.one(simP, dataP, i))
              })

for which promptClass generates

    \section{Methods}{
      \describe{
        \item{sim}{\code{signature(simP = "SimParameters")}: ... }
      }
    }

I turned that into

    \item{sim}{\code{signature(simP = "SimParameters", dataP)}: ... }

which seems a little funny, since the real signature only mentions one argument. R CMD check does not complain about it, however. Since omitted arguments are effectively of class "ANY", one alternative is

    \item{sim}{\code{signature(simP = "SimParameters", dataP = "ANY")}: ... }

I also considered describing the non-signature arguments in the text. Finally, although dataP is formally untyped, there are requirements on what kind of object it can be. In practice it is likely to come from one of two classes, but I want to allow users to define their own.

Thanks for your thoughts.
Ross Boylan

P.S. And what would I do if a particular method actually used an argument in ..., e.g.,

    setMethod("sim", signature = "SimParameters",
              function(simP, dataP, bar, ...) ...)

How would one document the bar argument?

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
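On the P.S.: a method can accept an extra argument through the generic's "..." and give it a default; that argument can then be described in the Rd text. A minimal sketch follows; the class definition, NIter field, and method body are illustrative stand-ins for the post's code (do.one is omitted):

```r
library(methods)

## illustrative class standing in for the post's SimParameters
setClass("SimParameters", representation(NIter = "numeric"))

setGeneric("sim", function(simP, dataP, ...) standardGeneric("sim"))

## the method adds 'bar', which callers supply through '...' of the generic
setMethod("sim", "SimParameters", function(simP, dataP, bar = 1, ...) {
  lapply(seq_len(simP@NIter), function(i) list(iter = i, bar = bar))
})

res <- sim(new("SimParameters", NIter = 2), dataP = NULL, bar = 99)
length(res)   # 2
```

Because the generic has "...", setMethod accepts the extra formal and wraps the method so that bar is matched by name at the call site.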
[Rd] modifying a persistent (reference) class
I saved objects that were defined using several reference classes. Later I modified the definition of the reference classes a bit, creating new functions and deleting old ones. The total number of functions did not change. When I read the objects back, I could only access some of the original data.

I asked on the user list, and someone suggested sticking with the old class definitions, creating new classes, reading in the old data, and converting it to the new classes. This would be awkward (I want the new classes to have the same names as the old ones), and I can probably just leave the old definitions and define the new functions I need outside of the reference classes. Are there any better alternatives?

On reflection, it's a little surprising that changing the code for a reference class makes any difference to an existing instance, since all the function definitions seem to be attached to the instance. One problem I've had in the past was precisely that redefining a method in a reference class did not change the behavior of existing instances. So I've tried to follow the advice to keep the methods light-weight. In this case I was trying to move from a show method (that just printed) to a summary method that returned a summary object. So I wanted to add a summary method and redefine show to call summary in the base class, removing all the subclass definitions of show.

Regular S4 classes are obviously not as sensitive, since they usually don't include the functions that operate on them, but I suppose if you changed the slots you'd be in similar trouble. Some systems keep track of versions of class definitions and allow one to write code to migrate old to new forms automatically when the data are read in. Does R have anything like that?

The system on which I encountered the problems was running R 2.15.

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
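The conversion route suggested on the user list can be sketched as follows: snapshot the old instance's fields into a plain list, redefine the class (same name), and rebuild the instance under the new definition. The class name "Rec" and its field are hypothetical, not from the saved data:

```r
library(methods)

## hypothetical old class definition
Gen1 <- setRefClass("Rec", fields = list(x = "numeric"))
old <- Gen1$new(x = 1:3)

## snapshot the field values into a plain list (an "intermediate format")
state <- lapply(names(Gen1$fields()), old$field)
names(state) <- names(Gen1$fields())

## redefine the class under the same name, with a new method
Gen2 <- setRefClass("Rec", fields = list(x = "numeric"),
                    methods = list(total = function() sum(x)))

## rebuild the instance under the new definition
fresh <- do.call(Gen2$new, state)
fresh$total()
```

With saved data, the snapshot step would run in a session using the old class definitions, and the rebuild step in a session using the new ones.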
Re: [Rd] modifying a persistent (reference) class
On Fri, 2014-08-01 at 14:42 -0400, Brian Lee Yung Rowe wrote:
> Ross,
>
> This is generally a hard problem in software systems. The only language
> I know of that explicitly addresses it is Erlang. Ultimately you need a
> system upgrade process, which defines how to update the data in your
> system to match a new version of the system. You could do this by
> writing a script that 1) loads the old version of your library, 2) loads
> your data/serialized reference classes, 3) exports the data to some
> intermediate format (e.g., a list), 4) loads the new version of the
> library, and 5) imports the data from the intermediate format.

My recollection is that in GemStone's Smalltalk database you can define methods associated with a class that describe how to change an instance from one version to another. You also have the choice of upgrading all persistent objects at once or doing so lazily, i.e., as they are retrieved.

The brittleness of the representation depends partly on the details. If a class has 2 slots, a and b, and the only thing on disk is the contents of a and the contents of b, almost any change will screw things up. However, if the slot names are persisted with the instances, it's much easier to reconstruct an instance if the class changes (if slot c is added and is not on disk, set it to nil; if b is removed, throw it out when reading from disk). One could also persist the class definition, or key elements of it, with individual instances referring to that definition. I don't know which, if any, of these strategies R uses for reference or other classes.

> Once you've gone through the upgrade process, arguably it's better to
> persist the data in a format that is decoupled from your objects, since
> then future upgrades would simply be 1) load the new library, 2) import
> the data from the intermediate format ...

Arguably :) As I said, some representations could do this automatically. And there are still issues, such as a change in the type of a slot, or rules for filling new slots, that would require intervention.
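The decoupled-persistence idea above might look like the following sketch in R; MyClass and its fields are made up for illustration, and the "intermediate format" is just a plain list written with saveRDS:

```r
library(methods)

## hypothetical class; only its field values are ever written to disk
MyClass <- setRefClass("MyClass",
                       fields = list(a = "numeric", b = "character"))

obj_to_list <- function(obj) list(a = obj$a, b = obj$b)
list_to_obj <- function(l) MyClass$new(a = l$a, b = l$b)

f <- tempfile(fileext = ".rds")
obj <- MyClass$new(a = 1, b = "hi")
saveRDS(obj_to_list(obj), f)       # persist the decoupled representation
obj2 <- list_to_obj(readRDS(f))    # rebuild under the current class definition
```

Because only a plain list hits the disk, later changes to MyClass affect just list_to_obj, not the stored data.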
In my experience with other object systems, methods are usually attributes of the class. For R reference classes they appear to be attributes of the instance, potentially modifiable on a per-instance basis.

Ross

> ... which is no different from day-to-day operation of your app/system
> (i.e., you're always writing to and reading from the intermediate
> format).
>
> Warm regards,
> Brian
>
> Brian Lee Yung Rowe
> Founder, Zato Novo
> Professor, M.S. Data Analytics, CUNY
>
> On Aug 1, 2014, at 1:54 PM, Ross Boylan <r...@biostat.ucsf.edu> wrote:
> > [original message quoted in full; see the first post in this thread]
Re: [Rd] modifying a persistent (reference) class
On Fri, 2014-08-01 at 16:06 -0400, Brian Lee Yung Rowe wrote:
> Ross,
>
> Ah, I didn't think about Smalltalk. Doesn't surprise me that they
> supported upgrades of this sort. That aside, I think the question is
> whether it's realistic for a language like R to support such a mechanism
> automatically. Smalltalk and Erlang both have tight semantics that would
> be hard to establish in R (given the multiple object systems and
> dispatching systems). I'm a functional guy, so to me it's natural to
> separate the data from the functions/methods. Having spent years writing
> OOP code, I walked away concluding that OOP makes things more complicated
> for the sake of being OOP (e.g., no first-class functions).

In Smalltalk everything is an object, and that includes functions, including class methods.

> Obviously that's changing, and in a language like R it's less of an
> issue. However, something like object serialization smells suspiciously
> similar. If you know that serializing objects is brittle, why not look
> for an alternative approach instead of chasing that rainbow?

My immediate problem is/was that I have serialized objects representing weeks of CPU time. I have to work with them, not with some other representation they might have had. And it's much more natural to work with R's native persistence than with some other scheme I cook up. I think persistence requires serialization; the serialization can be more or less brittle, but I don't think there is an alternative to serialization.

Since I just worked around my immediate problem a few minutes ago (by retaining the original class definitions and using setMethod to create summary methods), my interests are now somewhat theoretical. First, I'd like to understand more about exactly what is saved to disk for reference and other classes, in particular how much meta-information they contain.
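On what is saved to disk: for an ordinary S4 object, the serialized form is just the slot values (stored as attributes) plus an attribute naming the class; the class definition itself is not written out. A small illustration, using a hypothetical class P:

```r
library(methods)

setClass("P", representation(a = "numeric"))
x <- new("P", a = c(1, 2))

## the instance carries slot values and a class-name attribute, nothing more
str(attributes(x))

f <- tempfile(fileext = ".rds")
saveRDS(x, f)            # what goes to disk is this attribute structure
y <- readRDS(f)          # readRDS does not need, or check, setClass("P", ...)
```

This is consistent with the observed brittleness: on re-reading, slot/field contents are matched against whatever definition is current in the session.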
And my mental model of reference-class persistence is clearly wrong, because on that model instances based on old definitions would come back intact (albeit without the new method definitions or other new slots), whereas mine seemed to come back damaged.

Second, I'm still hoping for some elegant way around this problem (how to redefine classes and still use saved versions from the older definitions) in the future, with both reference and regular classes. Or at least some rules about what changes, if any, are safe to make in class definitions after an instance has been persisted.

Third, if changes to R could make things better, I'm hoping some developers might take them up. I realize that is unlikely to happen, for many good reasons, but I can still hope :)

Ross

> Warm regards,
> Brian
>
> Brian Lee Yung Rowe
> Founder, Zato Novo
> Professor, M.S. Data Analytics, CUNY
>
> On Aug 1, 2014, at 3:33 PM, Ross Boylan <r...@biostat.ucsf.edu> wrote:
> > [previous messages quoted in full; see above]
Re: [Rd] modifying data in a package [a solution]
On Wed, 2014-03-19 at 19:22 -0700, Ross Boylan wrote:
> I've tweaked Rmpi and want to have some variables that hold data in the
> package. One of the R files starts
>
>     mpi.isend.obj <- vector("list", 500)   # mpi.request.maxsize()
>     mpi.isend.inuse <- rep(FALSE, 500)     # mpi.request.maxsize()
>
> and then functions update those variables with <-. When run:
>
>     Error in mpi.isend.obj[[i]] <- .force.type(x, type) :
>       cannot change value of locked binding for 'mpi.isend.obj'
>
> I'm writing to ask the proper way to accomplish this objective (getting
> a variable I can update in the package namespace, or at least somewhere
> useful and hidden from the outside).

I've discovered one way to do it. In one of the regular R files:

    mpi.global <- new.env()

Then at the end of .onLoad in zzz.R:

    assign("mpi.isend.obj", vector("list", mpi.request.maxsize()), mpi.global)

and similarly for the logical vector mpi.isend.inuse. Access with functions like this:

    ## The next 2 functions have 3 modes:
    ##   foo()               returns foo from mpi.global
    ##   foo(request)        returns foo[request] from mpi.global
    ##   foo(request, value) sets foo[request] to value
    mpi.isend.inuse <- function(request, value) {
        if (missing(request))
            return(get("mpi.isend.inuse", mpi.global))
        i <- request + 1L
        parent.env(mpi.global) <- environment()
        if (missing(value))
            return(evalq(mpi.isend.inuse[i], mpi.global))
        return(evalq(mpi.isend.inuse[i] <- value, mpi.global))
    }

    ## request, if present, must be a single value
    mpi.isend.obj <- function(request, value) {
        if (missing(request))
            return(get("mpi.isend.obj", mpi.global))
        i <- request + 1L
        parent.env(mpi.global) <- environment()
        if (missing(value))
            return(evalq(mpi.isend.obj[[i]], mpi.global))
        return(evalq(mpi.isend.obj[[i]] <- value, mpi.global))
    }

This is pretty awkward; I'd love to know a better way. Some of the names probably should change too: mpi.isend.obj() sounds too much as if it actually sends something, like mpi.isend.Robj().

Ross
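A possibly less awkward pattern, sketched with illustrative names (not Rmpi's actual internals): keep the state in a package-local environment and index it with ordinary R code, which avoids the parent.env/evalq contortions because `$` and `[<-` on an environment's contents work directly:

```r
## package-local mutable state; the namespace locks the binding '.mpi.state'
## itself, but the environment's contents stay writable
.mpi.state <- new.env(parent = emptyenv())
.mpi.state$isend.inuse <- rep(FALSE, 500)

## get (one argument) or set (two arguments) a slot of the in-use vector;
## requests are 0-based, as in the post
isend_inuse <- function(request, value) {
  i <- request + 1L
  if (missing(value))
    .mpi.state$isend.inuse[i]
  else
    .mpi.state$isend.inuse[i] <- value
}

isend_inuse(0L, TRUE)
isend_inuse(0L)   # TRUE
```

The complex assignment `.mpi.state$isend.inuse[i] <- value` updates the vector inside the environment in place, no evalq needed.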
[Rd] modifying data in a package
I've tweaked Rmpi and want to have some variables that hold data in the package. One of the R files starts

    mpi.isend.obj <- vector("list", 500)   # mpi.request.maxsize()
    mpi.isend.inuse <- rep(FALSE, 500)     # mpi.request.maxsize()

and then functions update those variables with <-. When run:

    Error in mpi.isend.obj[[i]] <- .force.type(x, type) :
      cannot change value of locked binding for 'mpi.isend.obj'

I'm writing to ask the proper way to accomplish this objective (getting a variable I can update in the package namespace, or at least somewhere useful and hidden from the outside). I think the problem is that the package namespace is locked. So how do I achieve the same effect?

http://www.r-bloggers.com/package-wide-variablescache-in-r-packages/ recommends creating an environment and then updating it. Is that the preferred route? (It seems odd that the list should be locked but the environment would be manipulable. I know environments are special.)

The comments indicate that 500 should be mpi.request.maxsize(). That doesn't work because mpi.request.maxsize calls a C function, and there is an error that the function isn't loaded. I guess the R code is evaluated before the C libraries are loaded. The package's zzz.R starts

    .onLoad <- function(lib, pkg) {
        library.dynam("Rmpi", pkg, lib)

So would moving the code into .onLoad, after that point, work? In that case, how do I get the environment into the proper scope? Would

    .onLoad <- function(lib, pkg) {
        library.dynam("Rmpi", pkg, lib)
        assign("mpi.globals", new.env(), environment(mpi.isend))
        assign("mpi.isend.obj", vector("list", mpi.request.maxsize()), mpi.globals)
    }

work? (mpi.isend is a function in Rmpi.) But I'd guess the first assign will fail because the environment is locked.

Thanks.
Ross Boylan

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
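A sketch of the environment-based approach the blog post recommends: create the environment at the top level of the package code (so no assign into the locked namespace is needed at load time), and fill it from .onLoad once the compiled code is available. The library.dynam call is commented out and the size is a constant so the fragment runs standalone; in the package it would be mpi.request.maxsize():

```r
## top-level in an R file: the namespace locks the *binding* 'mpi.globals',
## but objects inside the environment remain mutable after loading
mpi.globals <- new.env(parent = emptyenv())

.onLoad <- function(lib, pkg) {
  ## library.dynam("Rmpi", pkg, lib)   # load compiled code first (assumed)
  assign("mpi.isend.obj", vector("list", 500L), envir = mpi.globals)
}

.onLoad(NULL, NULL)   # simulate what R does when the package loads
length(get("mpi.isend.obj", envir = mpi.globals))   # 500
```

Because mpi.globals is defined in the package code, it is already in scope for every package function, including .onLoad; no assign into another environment is required.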
[Rd] Does R ever move objects in memory?
R objects can disappear if they are garbage collected; can they also move, i.e., change their location in memory? I don't see any indication that this might happen in Writing R Extensions or R Internals, but I'd like to be sure.

Context: Rmpi serializes objects into raw vectors for transmission by MPI. Some send operations (isend) return before transmission is complete and so need the bits to remain untouched until transmission completes. If I preserve a reference to the raw vector in R code, that will prevent it from being garbage collected; but if it gets moved, that would invalidate the transfer. I had been using the blocking sends to avoid this problem, but the result is significant delays.

Thanks.
Ross Boylan
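For what it's worth, R's current garbage collector is non-moving: it frees unreachable objects but does not compact the heap, so a live object's data address stays put. That is an implementation property rather than a documented API guarantee. An empirical sketch (tracemem reports an object's address, but needs an R built with memory profiling, so it is wrapped defensively):

```r
## compare a vector's address before and after a garbage collection;
## tracemem may be unavailable on some builds, hence the tryCatch
addr <- function(v) tryCatch(tracemem(v), error = function(e) NA_character_)

x <- raw(1e6)
a1 <- addr(x)
invisible(gc())
a2 <- addr(x)
identical(a1, a2)   # same address after collection on a non-moving GC
```

The usual C-level discipline still applies: the object must be kept reachable (PROTECT or an R-level reference) for the duration of the asynchronous send.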
Re: [Rd] 2 versions of same library loaded
On Thu, 2014-03-13 at 10:46 -0700, Ross Boylan wrote:

1. My premise that R had no references to MPI was incorrect. The logs show

    24312: file=libmpi.so.1 [0];  needed by /home/ross/Rlib-3.0.1/Rmpi/libs/Rmpi.so [0]
    24312: find library=libmpi.so.1 [0]; searching
    24312:  search path=/usr/lib64/R/lib:/home/ross/install/lib  (LD_LIBRARY_PATH)
    24312:   trying file=/usr/lib64/R/lib/libmpi.so.1
    24312:   trying file=/home/ross/install/lib/libmpi.so.1

except there is no file /usr/lib64/R/lib/libmpi.so.1.
Re: [Rd] 2 versions of same library loaded
On Thu, 2014-03-13 at 10:46 -0700, Ross Boylan wrote:
> It seems very odd that the same Rmpi.so is requiring both the old and
> new libmpi.so (compare to the first trace in point 1).

There is this code in Rmpi.c:

    if (!dlopen("libmpi.so.0", RTLD_GLOBAL | RTLD_LAZY)
        && !dlopen("libmpi.so", RTLD_GLOBAL | RTLD_LAZY)) {

So I'm still not sure what it's using, or if there is some mishmash of the two. There is an explanation for the explicit load in the changelog:

    2007-10-24, version 0.5-5: dlopen has been used to load libmpi.so
    explicitly. This is mainly useful for Rmpi under OpenMPI, where one
    might see many error messages:
        mca: base: component_find: unable to open osc pt2pt: file not found (ignored)
    if libmpi.so is not loaded with the RTLD_GLOBAL flag.

http://www.stats.uwo.ca/faculty/yu/Rmpi/changelogs.htm

There is another interesting note about OpenMPI: "It looks like that the option --disable-dlopen is not necessary to install Open MPI 1.6, at least on Debian. This might be R's .onLoad correctly loading dynamic libraries and Open MPI is not required to be compiled with static libraries enabled."
[Rd] 2 versions of same library loaded
Can anyone help me understand how I got 2 versions of the same library loaded, how to prevent it, and what the consequences are? Running under Debian GNU/Linux squeeze. lsof and /proc/xxx/maps both show 2 copies of several libraries loaded:

    /home/ross/install/lib/libmpi.so.1.3.0
    /home/ross/install/lib/libopen-pal.so.6.1.0
    /home/ross/install/lib/libopen-rte.so.7.0.0
    /home/ross/Rlib-3.0.1/Rmpi/libs/Rmpi.so
    /usr/lib/openmpi/lib/libmpi.so.0.0.2
    /usr/lib/openmpi/lib/libopen-pal.so.0.0.0
    /usr/lib/openmpi/lib/libopen-rte.so.0.0.0
    /usr/lib/R/lib/libR.so

The system has the old version of MPI installed under /usr/lib. I built a personal, newer copy in my home directory and then rebuilt Rmpi (an R package) against it. ldd on the personal Rmpi.so and libmpi.so shows all references to MPI libraries on personal paths. R was installed from a Debian package, and presumably compiled without having MPI around.

Before running, I set LD_LIBRARY_PATH to ~/install/lib, and then stuffed the same path at the start of LD_LIBRARY_PATH using Sys.setenv in my profile, because R seems to prepend some libraries to that path when it starts (I'm curious about that too). I also prepended ~/install/bin to my path, though I'm not sure that's relevant.

Does R use ld.so or some other mechanism for loading libraries? Can I assume the highest version number of a library will be preferred? http://cran.r-project.org/doc/manuals/r-devel/R-exts.html#index-Dynamic-loading says "If a shared object/DLL is loaded more than once the most recent version is used." I'm not sure if "most recent" means the one loaded most recently by the program (I don't know which that is) or the highest version number.

Why is /usr/lib/openmpi being looked at in the first place? How can I stop the madness? Some folks on the openmpi list have indicated I need to rebuild R, telling it where my MPI is, but that seems an awfully big hammer for the problem.

Thanks.
Ross Boylan
Re: [Rd] 2 versions of same library loaded
Comments/questions interspersed below.

On Wed, 2014-03-12 at 22:50 -0400, Simon Urbanek wrote:
> Ross,
>
> On Mar 12, 2014, at 5:34 PM, Ross Boylan <r...@biostat.ucsf.edu> wrote:
> > [original question quoted in full; see above]
> >
> > Does R use ld.so or some other mechanism for loading libraries?
>
> R uses dlopen to load package libraries - it is essentially identical to
> using ld.so for dependencies.
>
> > Can I assume the highest version number of a library will be preferred?
>
> No.

Bummer. The fact that Rmpi is not crashing suggests to me that it's using the right version of the MPI libraries (it does produce lots of errors if I run it without setting LD_LIBRARY_PATH, so that only the system MPI libraries are in play), but it would be nice to be certain. Or the two versions could be combined into a big mess.
> > http://cran.r-project.org/doc/manuals/r-devel/R-exts.html#index-Dynamic-loading
> > says "If a shared object/DLL is loaded more than once the most recent
> > version is used." I'm not sure if "most recent" means the one loaded
> > most recently by the program (I don't know which that is) or the
> > highest version number.
>
> The former - whichever you load last wins. Note, however, that this
> refers to explicitly loaded objects, since they are loaded into a flat
> namespace, so a load will overwrite all symbols that get loaded.

It might be good to clarify that in the manual. If I understand the term, the MPI libraries are loaded implicitly; that is, Rmpi.so is loaded explicitly, and then it pulls in dependencies. What are the rules in that case?

> > Why is /usr/lib/openmpi being looked at in the first place?
>
> You'll have to consult your system. The search path (assuming rpath is
> not involved) is governed by LD_LIBRARY_PATH and /etc/ld.so.conf*. Note
> that LD_LIBRARY_PATH is consulted at the time of the resolution (when
> the library is looked up), so you may be changing it too late. Also note
> that you have to expand ~ in the path (it's not a valid path, it's a
> shell expansion feature).

I just used the ~ as a shortcut; the shell expanded it, and the full path ended up in the variable. I assume the loader checks LD_LIBRARY_PATH first; once it finds the MPI libraries there, I don't know why it keeps looking.

I'm not sure I follow the part about "too late", but is it this: all the R processes launched under MPI have the MPI library loaded automatically, and if that happens before my profile is read, resetting LD_LIBRARY_PATH will be irrelevant? I don't know whether the profile or the Rmpi load happens first. The resetting is just a reordering, and since the other elements of LD_LIBRARY_PATH don't contain any MPI libraries, I don't think the order matters.

> R's massaging of the LD_LIBRARY_PATH is typically done in
> $R_HOME/etc/ldpaths so you may want to check it and/or adjust it as
> needed.
> Normally (in stock R), it only prepends its own libraries and Java, so
> it should not be causing any issues, but you may want to check in case
> Debian scripts add anything else.

The extra paths are limited as you describe, and so are probably no threat for loading the wrong MPI library (/usr/lib64/R/lib:/usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server).

> > How can I stop the madness? Some folks on the openmpi list have
> > indicated I need to rebuild R, telling it where my MPI is, but that
> > seems an awfully big hammer for the problem.
>
> I would check LD_LIBRARY_PATH and also check at which point those old
> libraries are loaded to find where they are referenced.

How do I tell the point at which the old libraries are loaded? I assume it happens implicitly when Rmpi is loaded, but I don't know which of the two versions of the libraries is loaded first, and I don't know how to tell.

Thanks for your help.
Ross
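One way to see which copies are actually mapped, from inside the running R process, is to read /proc/self/maps. This is a Linux-only sketch; the helper name and the field-stripping regex are mine, not an R API:

```r
## list the shared objects mapped into the current R process (Linux only);
## each maps line has 5 fields (address, perms, offset, dev, inode) and then
## an optional pathname, which this strips down to
loaded_libs <- function(pattern = "") {
  maps <- readLines("/proc/self/maps")
  paths <- sub("^(\\S+\\s+){5}", "", maps)
  libs <- unique(grep("\\.so", paths, value = TRUE))
  grep(pattern, libs, value = TRUE)
}

loaded_libs("libmpi")   # which libmpi copies, if any, are really loaded
```

Calling it before and after library(Rmpi) would show exactly which load pulls in the /usr/lib/openmpi copies. Running R under LD_DEBUG=libs (an ld.so feature) gives the same information with the search order included.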
Re: [Rd] C++ debugging help needed
On Wed, Oct 02, 2013 at 11:05:19AM -0400, Duncan Murdoch wrote:

Up to entry #4 this all looks normal. If I go into that stack frame, I see this:

    (gdb) up
    #4  Shape::~Shape (this=0x15f8760, __in_chrg=<optimized out>) at Shape.cpp:13
    warning: Source file is more recent than executable.

That warning looks suspicious. Are you sure gdb is finding the right source files, and that the object code has been built from them?

    13          blended(in_material.isTransparent())
    (gdb) p this
    $9 = (Shape * const) 0x15f8760
    (gdb) p *this
    $10 = {_vptr.Shape = 0x72d8e290, mName = 6, mType = {
        static npos = <optimized out>,
        _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>},
          _M_p = 0x7f7f7f7f <Address 0x7f7f7f7f out of bounds>}},
      mShapeColor = {mRed = -1.4044474254567505e+306, mGreen = -1.4044477603031902e+306,
        mBlue = 4.24399170841135e-314, mTransparent = 0},
      mSpecularReflectivity = 0.0078125, mSpecularSize = 1065353216,
      mDiffuseReflectivity = 0.007812501848093234, mAmbientReflectivity = 0}

The things displayed in *this are all wrong. Those field names come from the Shape object in the igraph package, not the Shape object in the rgl package. (The mixOmics package uses both.)

My questions:

- Has my code somehow got mixed up with the igraph code, so I really do have a call out to igraph's Shape::~Shape instead of rgl's Shape::~Shape, or is this just bad info being given to me by gdb?

I don't know, but I think it's possible to give fully qualified type names to gdb to force it to use the right definition. That's assuming that both Shapes are in different namespaces. If they aren't, that's likely the problem.

- If I really do have calls to the wrong destructor in there, how do I avoid this?

Are you invoking the destructor explicitly? An object should know its type, which should result in the right call without much effort.
Duncan Murdoch
Re: [Rd] C++ debugging help needed
On Wed, 2013-10-02 at 16:15 -0400, Duncan Murdoch wrote:
> On 02/10/2013 4:01 PM, Ross Boylan wrote:
> > [earlier exchange quoted in full; see the previous messages]
>
> I'm pretty sure that's a warning about the fact that igraph also has a
> file called Shape.cpp, and the Shape::~Shape destructor was in that
> file, not in my Shape.cpp file.

I guess the notion of "the right source file" is ambiguous in this context. Suppose you have projects A and B, each defining a function f in f.cpp. Use A/f() to mean the binary function defined in project A, found in source A/f.cpp. Then you have some code that means to invoke A/f() but gets B/f() instead. Probably gdb should associate this with B/f.cpp, but your intent was A/f() and A/f.cpp. If gdb happens to find A/f.cpp, and A was built after B, that could provoke the warning shown.
[Earlier questions and answers about the mixed-up destructor quoted; see above.]

Apparently they aren't, even though they are in separately compiled and linked packages. I had been assuming that the fact that rgl knows nothing about igraph meant I didn't need to worry about it. (igraph does list rgl in its Suggests list.) On platforms other than Linux, I don't appear to need to worry about it, but Linux happily loads one, then loads the other and links the call to the wrong .so rather than the local one, without a peep of warning, just an eventual crash.

While various OSes and tricks could provide work-arounds for clashing function definitions (I actually had the impression the R dynamic loading machinery might), those wouldn't necessarily be right. In principle package A might use some functions defined in package B. In that case the need for namespaces would have become obvious.

Supposing I finish my editing of the 100 or so source files and put all of the rgl stuff in an rgl namespace, that still doesn't protect me from what some other developer might do next week, creating their own rgl namespace with a clashing name.

Why doesn't the linking step resolve the calls; why does it leave it until load time? I think there is a "using namespace" directive that might save typing, putting everything into that namespace by default. Maybe just the headers need it.

With dynamic loading you don't know until load time whether you've got a problem. As I said, the system can't simply wall off different libraries, since they may want to call each other.
The usual solution for two developers picking the same name is to have an outer level namespace associated with the developer/company/project, with other namespaces nested inside. This reduces the problem, though obviously it can still exist higher up. Ross

- If I really do have calls to the wrong destructor in there, how do I avoid this?

Are you invoking the destructor explicitly? An object should know its type, which should result in the right call without much effort.

No, this is an implicit destructor call. I'm deleting an object whose class descends from Shape. Duncan Murdoch __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Makevars and Makeconf sequencing
http://cran.r-project.org/doc/manuals/R-exts.html#Configure-and-cleanup near the start of 1.2.1 "Using Makevars" says:

  There are some macros which are set whilst configuring the building of
  R itself and are stored in R_HOME/etcR_ARCH/Makeconf. That makefile is
  included as a Makefile after Makevars[.win], and the macros it defines
  can be used in macro assignments and make command lines in the latter.

I'm confused. If Makeconf is included after Makevars, then how can Makevars use macros defined in Makeconf? Or is the meaning only that a variable definition in Makeconf can be overridden in Makevars? Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] problems extracting parts of a summary object
summary(x), where x is the output of lm, produces the expected display, including standard errors of the coefficients. summary(x)$coefficients produces a vector (x is r$individual[[2]]):

r$individual[[2]]$coefficients
tX(Intercept)    tXcigspmkr        tXpeld    tXsmkpreve            mn
-2.449188e+04 -4.143249e+00  4.707007e+04 -3.112334e+01  1.671106e-01
   mncigspmkr        mnpeld    mnsmkpreve
 3.580065e+00  2.029721e+05  4.404915e+01

class(r$individual[[2]]$coefficients)
[1] "numeric"

rather than the expected matrix-like object with a column for the se's. When I trace through the summary method, the coefficients value is a matrix. I'm trying to pull out the standard errors for some rearranged output. How can I do that? And what's going on? I suspect this may be a namespace issue. Thanks. Ross Boylan P.S. I would appreciate a cc because of some mail problems I'm having. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] problems extracting parts of a summary object
On Mon, 2010-03-22 at 16:30 -0600, tpl...@cybermesa.com wrote: Are you sure you're extracting the coefficients component of the summary object and not the lm object? Seems to work ok for me:

xm <- lm(y ~ x, data=data.frame(y=rnorm(20), x=rnorm(2)))
summary(xm)$coefficients
            Estimate Std. Error  t value  Pr(>|t|)
(Intercept) 1.908948   1.707145 1.118210 0.2781794
x           1.292263   1.174608 1.100165 0.2857565
xm$coefficients
(Intercept)           x
   1.908948    1.292263

-- Tony Plate

class(summary(r$individual[[2]]))
[1] "summary.lm"

But maybe I'm not following the question. Ross

On Mon, March 22, 2010 4:03 pm, Ross Boylan wrote: summary(x), where x is the output of lm, produces the expected display, including standard errors of the coefficients. summary(x)$coefficients produces a vector (x is r$individual[[2]]):

r$individual[[2]]$coefficients
tX(Intercept)    tXcigspmkr        tXpeld    tXsmkpreve            mn
-2.449188e+04 -4.143249e+00  4.707007e+04 -3.112334e+01  1.671106e-01
   mncigspmkr        mnpeld    mnsmkpreve
 3.580065e+00  2.029721e+05  4.404915e+01

class(r$individual[[2]]$coefficients)
[1] "numeric"

rather than the expected matrix-like object with a column for the se's. When I trace through the summary method, the coefficients value is a matrix. I'm trying to pull out the standard errors for some rearranged output. How can I do that? And what's going on? I suspect this may be a namespace issue. Thanks. Ross Boylan P.S. I would appreciate a cc because of some mail problems I'm having. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] problems extracting parts of a summary object [solved]
On Mon, 2010-03-22 at 16:52 -0600, tpl...@cybermesa.com wrote: what's the output of: summary(r$individual[[2]])$coef ? My question was basically whether you were doing summary(r$individual[[2]])$coef or r$individual[[2]]$coef (the second was what you appeared to be doing from your initial email).

Doh! Thank you; that was it. This was interacting with another error, which is perhaps how I managed to miss it. Ross

-- Tony Plate

On Mon, March 22, 2010 4:43 pm, Ross Boylan wrote: On Mon, 2010-03-22 at 16:30 -0600, tpl...@cybermesa.com wrote: Are you sure you're extracting the coefficients component of the summary object and not the lm object? Seems to work ok for me:

xm <- lm(y ~ x, data=data.frame(y=rnorm(20), x=rnorm(2)))
summary(xm)$coefficients
            Estimate Std. Error  t value  Pr(>|t|)
(Intercept) 1.908948   1.707145 1.118210 0.2781794
x           1.292263   1.174608 1.100165 0.2857565
xm$coefficients
(Intercept)           x
   1.908948    1.292263

-- Tony Plate

class(summary(r$individual[[2]]))
[1] "summary.lm"

But maybe I'm not following the question. Ross

On Mon, March 22, 2010 4:03 pm, Ross Boylan wrote: summary(x), where x is the output of lm, produces the expected display, including standard errors of the coefficients. summary(x)$coefficients produces a vector (x is r$individual[[2]]):

r$individual[[2]]$coefficients
tX(Intercept)    tXcigspmkr        tXpeld    tXsmkpreve            mn
-2.449188e+04 -4.143249e+00  4.707007e+04 -3.112334e+01  1.671106e-01
   mncigspmkr        mnpeld    mnsmkpreve
 3.580065e+00  2.029721e+05  4.404915e+01

class(r$individual[[2]]$coefficients)
[1] "numeric"

rather than the expected matrix-like object with a column for the se's. When I trace through the summary method, the coefficients value is a matrix. I'm trying to pull out the standard errors for some rearranged output. How can I do that? And what's going on? I suspect this may be a namespace issue. Thanks. Ross Boylan P.S. I would appreciate a cc because of some mail problems I'm having.
__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
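The distinction resolved in this thread can be sketched in a couple of lines: the `coefficients` component of the lm object is a plain named numeric vector, while the same-named component of the summary object is the coefficient table that carries the standard errors. (The built-in `cars` data set stands in here for the poster's model.)

```r
# summary(fit)$coefficients, not fit$coefficients, holds the Std. Error column.
fit <- lm(dist ~ speed, data = cars)

is.matrix(fit$coefficients)           # FALSE: just the named estimates
ctab <- summary(fit)$coefficients     # matrix: Estimate, Std. Error, t value, Pr(>|t|)
is.matrix(ctab)                       # TRUE

se <- ctab[, "Std. Error"]            # extract the standard errors by column name
```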
[Rd] y ~ X -1 , X a matrix
While browsing some code I discovered a call to lm that used a formula y ~ X - 1, where X was a matrix. Looking through the documentation of formula, lm, model.matrix and maybe some others I couldn't find this usage (R 2.10.1). Is it anything I can count on in future versions? Is there documentation I've overlooked? For the curious: model.frame on the above equation returns a data.frame with 2 columns. The second column is the whole X matrix. model.matrix on that object returns the expected matrix, with the transition from the odd model.frame to the regular matrix happening in an .Internal call. Thanks. Ross P.S. I would appreciate cc's, since mail problems are preventing me from seeing list mail. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] y ~ X -1 , X a matrix
On Thu, 2010-03-18 at 00:57 +0000, ted.hard...@manchester.ac.uk wrote: On 17-Mar-10 23:32:41, Ross Boylan wrote: While browsing some code I discovered a call to lm that used a formula y ~ X - 1, where X was a matrix. Looking through the documentation of formula, lm, model.matrix and maybe some others I couldn't find this usage (R 2.10.1). Is it anything I can count on in future versions? Is there documentation I've overlooked? For the curious: model.frame on the above equation returns a data.frame with 2 columns. The second column is the whole X matrix. model.matrix on that object returns the expected matrix, with the transition from the odd model.frame to the regular matrix happening in an .Internal call. Thanks. Ross P.S. I would appreciate cc's, since mail problems are preventing me from seeing list mail.

Hmm ... I'm not sure what is the problem with what you describe.

There is no problem in the "it doesn't work" sense. There is a problem in that it seems undocumented--though the help you quote could rather indirectly be taken as a clue--and thus, possibly, subject to change in later releases. Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
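The behavior asked about can be seen directly. A small sketch (X and y are invented for illustration) shows that a matrix on the right-hand side contributes one coefficient per column, with model.frame keeping the matrix as a single "column":

```r
# A matrix X on the RHS of a formula: one coefficient per column; "- 1"
# drops the intercept.
set.seed(1)
X <- matrix(rnorm(60), nrow = 20,
            dimnames = list(NULL, c("a", "b", "c")))
y <- rnorm(20)

fit <- lm(y ~ X - 1)
names(coef(fit))                   # "Xa" "Xb" "Xc" -- no intercept term

# model.frame() keeps X as one matrix column; model.matrix() expands it
# into the ordinary design matrix.
mf <- model.frame(y ~ X - 1)
ncol(mf)                           # 2: the response plus the whole matrix X
dim(model.matrix(y ~ X - 1, mf))   # 20 x 3
```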
Re: [Rd] Rgeneric.py assists in rearranging generic function definitions [inline]
On Thu, 2010-01-21 at 11:38 -0800, Ross Boylan wrote: I've attached a script I wrote that pulls all the setGeneric definitions out of a set of R files and puts them in a separate file, default allGenerics.R. I thought it might help others who find themselves in a similar situation. The situation was that I had to change the order in which files in my package were parsed; the scheme in which the generic definition is in the first file that has the corresponding setMethod breaks under re-ordering. So I pulled out all the definitions and put them first. In retrospect, it is clearly preferable to create allGenerics.R from the start. If you didn't, and discover you should have, the script automates the conversion. Thanks to everyone who helped me with my packaging problems. The package finally made it to CRAN as http://cran.r-project.org/web/packages/mspath/index.html. I'll send a public notice of that to the general R list. Ross Boylan

Apparently the attachment didn't make it through. I've pasted Rgeneric.py below.

#! /usr/bin/python
# python 2.5 required for with statement
from __future__ import with_statement

# Rgeneric.py extracts setGeneric definitions from R sources and
# writes them to a special file, while removing them from the
# original.
#
# Context: In a system with several R files, having generic
# definitions sprinkled throughout, there are errors arising from the
# sequencing of files, or of definitions within files.  In general,
# changing the order in which files are parsed (e.g., by the Collate:
# field in DESCRIPTION) will break things even when they were
# working.  For example, a setMethod may occur before the
# corresponding setGeneric, and then fail.  Given that it is not safe
# to call setGeneric twice for the same function, the cleanest
# solution may be to move all the generic definitions to a separate
# file that will be read before any of the setMethod's.  Rgeneric.py
# helps automate that process.
#
# It is, of course, preferable not to get into this situation in the
# first place, for example by creating an allGenerics.R file as you
# go.
#
# Typical usage: ./Rgeneric.py *.R
# Will create allGenerics.R with all the extracted generic
# definitions, including any preceding comments.
# Rewrites the *.R files, replacing the setGeneric's with comments
# indicating the generic has moved to allGenerics.R.
# *.R.old has the original .R files.
#
# The program does not work for all conceivable styles.  In
# particular, it assumes that
# 1. setGeneric is immediately followed by an open parenthesis and
#    a quoted name of the function.  Subsequent parts of the
#    definition may be split across lines and have interspersed
#    comments.
#
# 2. Comments precede the definition.  They are optional, and will
#    be left in place in the .R file and copied to allGenerics.R.
#
# 3. If you first define an ordinary function foo, and then do
#    setGeneric("foo"), the setGeneric will be moved to
#    allGenerics.R.  It will not work properly there; you should
#    make manual adjustments such as moving it back to the
#    original.  The code at the bottom reports on all such
#    definitions, and then lists all the generic functions processed.
#
# 4. allGenerics.R will contain generic definitions in the order of
#    files examined, and in the order they are defined within the
#    file.  This is to preserve context for the comments, in
#    particular for comments which apply to a block of
#    definitions.  If you would like something else, e.g.,
#    alphabetical ordering, you should post-process the AllForKey
#    object created at the bottom of this file.
#
# There are program (not command line) options to do a read-only scan,
# and a class to hold the results, which can be inspected in various
# ways.
# Copyright 2010 Regents of University of California
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# See http://www.gnu.org/licenses/ for the full license.

# Author: Ross Boylan r...@biostat.ucsf.edu
#
# Revision History:
#
# 1.0  2010-01-21  Initial release.

import os, os.path, re, sys

class ParseGeneric:
    """Extract setGeneric functions and preceding comments in one file.

    states of the parser:
      findComment -- look for start of comment
      inComment   -- found comment; accumulate and look for end
      inGeneric   -- extract setGeneric definition.

    Typical use:
      p = ParseGeneric()
      results = p.parse("myfile.R
[Rd] Rgeneric.py assists in rearranging generic function definitions
I've attached a script I wrote that pulls all the setGeneric definitions out of a set of R files and puts them in a separate file, default allGenerics.R. I thought it might help others who find themselves in a similar situation. The situation was that I had to change the order in which files in my package were parsed; the scheme in which the generic definition is in the first file that has the corresponding setMethod breaks under re-ordering. So I pulled out all the definitions and put them first. In retrospect, it is clearly preferable to create allGenerics.R from the start. If you didn't, and discover you should have, the script automates the conversion. Thanks to everyone who helped me with my packaging problems. The package finally made it to CRAN as http://cran.r-project.org/web/packages/mspath/index.html. I'll send a public notice of that to the general R list. Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] calling setGeneric() twice
Is it safe to call setGeneric twice, assuming some setMethod's for the target function occur in between? By "safe" I mean that all the setMethod's remain in effect, and the 2nd call is, effectively, a no-op. ?setGeneric says nothing explicit about this behavior that I can see. It does say that if there is an existing implicit generic function it will be (re?)used. I also tried ?Methods, google and the mailing list archives. I looked at the code for setGeneric, but I'm not confident how it behaves. It doesn't seem to do a simple return of the existing value if a generic already exists, although it does have special handling for that case. The other problem with looking at the code--or running tests--is that they only show the current behavior, which might change later. This came up because of some issues with the sequencing of code in my package. Adding duplicate setGeneric's seems like the smallest, and therefore safest, change if the duplication is not a problem. Thanks. Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] calling setGeneric() twice
On Tue, 2010-01-19 at 10:05 -0800, Seth Falcon wrote: This came up because of some issues with the sequencing of code in my package. Adding duplicate setGeneric's seems like the smallest, and therefore safest, change if the duplication is not a problem. I'm not sure of the answer to your question, but I think it is the wrong question :-) Perhaps you can provide more detail on why you are using multiple calls to setGeneric. That seems like a very odd thing to do.

My system is defined in a collection of .R files, most of which are organized around classes. So the typical file has a setClass(), setGeneric()'s, and setMethod()'s. If files that were read in later in the sequence extended an existing generic, I omitted the setGeneric(). I had to resequence the order in which the files were read to avoid some "undefined slot classes" warnings. The resequencing created other problems, including some cases in which I had a setMethod without a previous setGeneric. I have seen the advice to sequence the files so that class definitions, then generic definitions, and finally function and method definitions occur. I am trying not to do that for two reasons. First, I'm trying to keep the changes I make small to avoid introducing errors. Second, I prefer to keep all the code related to a single class in a single file. Some of the files were intended for free-standing use, and so it would be useful if they could retain setGeneric()'s even if I also need an earlier setGeneric to make the whole package work. I am also working on a python script to extract all the generic function definitions (that is, setGeneric()), just in case. Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] calling setGeneric() twice (don't; documentation comments)
On Tue, 2010-01-19 at 12:55 -0800, Seth Falcon wrote: I would expect setGeneric to create a new generic function and nuke/mask methods associated with the generic that it replaces.

I tried a test in R 2.7.1, and that is the behavior. I think it would be worthwhile to document it in ?setGeneric. Also, ?setGeneric advocates first defining a regular function (e.g., bar) and then doing a simple setGeneric("bar"). I think the advice for package developers is different, so perhaps some changes there would be a good idea too. I thought I was defining setGeneric twice for a few functions, and thus that it did work OK. It turns out I have no duplicate definitions. Here's the test:

setClass("A", representation(z="ANY"))
[1] "A"
setClass("B", representation(y="ANY"))
[1] "B"
setGeneric("foo", function(x) standardGeneric("foo"))
[1] "foo"
setMethod("foo", signature(x="A"), function(x) return("foo for A"))
[1] "foo"
a = new("A")
b = new("B")
foo(a)
[1] "foo for A"
foo(b)
Error in function (classes, fdef, mtable) :
  unable to find an inherited method for function "foo", for signature "B"
setGeneric("foo", function(x) standardGeneric("foo"))
[1] "foo"
setMethod("foo", signature(x="B"), function(x) return("foo for B"))
[1] "foo"
setGeneric("foo", function(x) standardGeneric("foo"))
[1] "foo"
setMethod("foo", signature(x="B"), function(x) return("foo for B"))
[1] "foo"
foo(a)  # here's where the disappearance of the prior setMethod shows
Error in function (classes, fdef, mtable) :
  unable to find an inherited method for function "foo", for signature "A"
foo(b)
[1] "foo for B"

So I guess I am going to pull the setGeneric's out. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
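Given the behavior demonstrated above — a repeated setGeneric() replaces the generic and silently drops the methods already registered — one defensive idiom (not from the thread, just a common pattern) is to guard each call with isGeneric():

```r
library(methods)

setClass("A", representation(z = "ANY"))

# Create the generic only if it does not already exist, so that methods
# registered earlier survive a re-sourcing of this file.
if (!isGeneric("foo")) {
  setGeneric("foo", function(x) standardGeneric("foo"))
}
setMethod("foo", signature(x = "A"), function(x) "foo for A")

# Running the guarded block again is a no-op ...
if (!isGeneric("foo")) {
  setGeneric("foo", function(x) standardGeneric("foo"))
}

# ... so the method registered above is still in place:
foo(new("A"))   # "foo for A"
```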
Re: [Rd] optional package dependency
On Sat, 2010-01-16 at 07:49 -0800, Seth Falcon wrote: Package authors should be responsible enough to test their codes with and without optional features. It seems unlikely most package authors will have access to a full range of platform types. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] optional package dependency
On Fri, 2010-01-15 at 09:19 +0100, Kurt Hornik wrote: The idea is that maintainers typically want to fully check their functionality, suggesting to force suggests by default. This might be the nub of the problem. There are different audiences, even for R CMD check. The maintainer probably wants to check all functionality. Even then, there is an issue if functionality differs by platform. CRAN probably wants to check all functionality. An individual user just wants to check the functionality they use. For example, if someone doesn't want to run my package in distributed mode, but wants to see if it works (R CMD check), they need to be able to avoid the potentially onerous requirement to install MPI. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] optional package dependency (enhances)
On Fri, 2010-01-15 at 10:48 +0000, Benilton Carvalho wrote: How about using: Enhances: Rmpi ? b

The main reason is that "enhances" seems a peculiar way to describe the relation between a package that (optionally) uses a piece of infrastructure and the infrastructure. Similarly, I would not say that a car enhances metal. The example given in the R extension documentation (e.g., by providing methods for classes from these packages) seems more in line with the usual meaning of "enhance". A secondary reason is that I can not tell from the documentation exactly what putting a package in Enhances does. The example of adding functionality to a class suggests that packages that are enhanced are required. However, clearly one could surround code that enhanced a class from another package with a conditional, so that the code was skipped if the enhanced package was absent. Even that logic isn't quite right if the enhanced package is added later. My package only loads/verifies the presence of Rmpi if one attempts to use the distributed features, so the relation is at run time, not load time. Ross

On Fri, Jan 15, 2010 at 6:00 AM, Ross Boylan r...@biostat.ucsf.edu wrote: I have a package that can use Rmpi, but works fine without it. None of the automatic test code invokes Rmpi functionality. (One test file illustrates how to use it, but has quit() as its first command.) What's the best way to handle this? In particular, what is the appropriate form for upload to CRAN? When I omitted Rmpi from the DESCRIPTION file R CMD check gave

* checking R code for possible problems ... NOTE
alldone: no visible global function definition for 'mpi.bcast.Robj'
alldone: no visible global function definition for 'mpi.exit'

followed by many more warnings. When I add Suggests: Rmpi in DESCRIPTION the check stops if the package is not installed:

* checking package dependencies ... ERROR
Packages required but not available: Rmpi

Rmpi is not required, but I gather from previous discussion on this list that "suggests" basically means "required for R CMD check". NAMESPACE seems to raise similar issues; I don't see any mechanism for optional imports. Also, I have not used namespaces, and am not eager to destabilize things so close to release. At least, I hope I'm close to release :) Thanks for any advice. Ross Boylan P.S. Thanks, Duncan, for your recent advice on my help format problem with R 2.7. I removed the nested \description, and now things look OK. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] optional package dependency (suggestions/wishes)
On Fri, 2010-01-15 at 12:34 -0500, Simon Urbanek wrote: On Jan 15, 2010, at 12:18 , Ross Boylan wrote: On Fri, 2010-01-15 at 09:19 +0100, Kurt Hornik wrote: The idea is that maintainers typically want to fully check their functionality, suggesting to force suggests by default.

This might be the nub of the problem. There are different audiences, even for R CMD check. The maintainer probably wants to check all functionality. Even then, there is an issue if functionality differs by platform. CRAN probably wants to check all functionality. An individual user just wants to check the functionality they use. For example, if someone doesn't want to run my package in distributed mode, but wants to see if it works (R CMD check), they need to be able to avoid the potentially onerous requirement to install MPI.

... that's why you can decide to run check without forcing suggests - it's entirely up to you / the user as Kurt pointed out ... Cheers, Simon

This prompts a series of increasingly ambitious suggestions:

1. DOCUMENTATION CHANGE. I suggest this info about _R_CHECK_FORCE_SUGGESTS_=false be added to R CMD check --help. Until Kurt's email I was unaware of the facility, and it seems to me the average package user will be even less likely to know of it. My concern is that they would run R CMD check; it would fail because a package such as Rmpi is absent; and the user would throw up their hands and give up. I did find a Perl variable with a similar name in section 1.3.3 of Writing R Extensions, but that section does not mention environment variables. It would also be unnatural for a package user to refer to it. Considering there are many variables, maybe the interactive help should just note that customization variables are available (without naming particular ones), and point to the appropriate documentation.

2. NEW BEHAVIOR/OPTIONS. An even more exotic wish would be to allow a list of suggested packages to check. That way, someone using some, but not all, optional facilities could check the ones of interest. Again, even with better documentation it seems likely most people would be unaware of the feature.

3. SIGNIFICANTLY CHANGED BEHAVIOR. I think the optimal behavior would be for the check environment to attempt to load all suggested packages, but continue even if some are missing. It would then be up to package authors to code appropriate conditional tests for the presence or absence of suggested packages; actually, that's probably true even now.

Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] optional package dependency
I have a package that can use Rmpi, but works fine without it. None of the automatic test code invokes Rmpi functionality. (One test file illustrates how to use it, but has quit() as its first command.) What's the best way to handle this? In particular, what is the appropriate form for upload to CRAN? When I omitted Rmpi from the DESCRIPTION file R CMD check gave

* checking R code for possible problems ... NOTE
alldone: no visible global function definition for 'mpi.bcast.Robj'
alldone: no visible global function definition for 'mpi.exit'

followed by many more warnings. When I add Suggests: Rmpi in DESCRIPTION the check stops if the package is not installed:

* checking package dependencies ... ERROR
Packages required but not available: Rmpi

Rmpi is not required, but I gather from previous discussion on this list that "suggests" basically means "required for R CMD check". NAMESPACE seems to raise similar issues; I don't see any mechanism for optional imports. Also, I have not used namespaces, and am not eager to destabilize things so close to release. At least, I hope I'm close to release :) Thanks for any advice. Ross Boylan P.S. Thanks, Duncan, for your recent advice on my help format problem with R 2.7. I removed the nested \description, and now things look OK. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
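For what it's worth, later versions of R than this thread's vintage make the run-time-only relationship straightforward to express: list the package under Suggests and guard its use with requireNamespace(). A sketch, with run_distributed() as an invented stand-in for the package's MPI entry point:

```r
# Guard an optional dependency at run time. "Rmpi" stays in Suggests;
# nothing fails at install time when it is absent, only this entry point.
run_distributed <- function(...) {
  if (!requireNamespace("Rmpi", quietly = TRUE)) {
    stop("distributed execution requires the 'Rmpi' package; ",
         "install it or use the serial interface instead")
  }
  # ... Rmpi-based work would go here ...
}
```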
Re: [Rd] ?setGeneric garbled (PR#14153)
On Thu, 2009-12-17 at 15:24 +0100, Martin Maechler wrote: Ross Boylan r...@biostat.ucsf.edu on Thu, 17 Dec 2009 02:15:12 +0100 (CET) writes: Full_Name: Ross Boylan Version: 2.10.0 OS: Windows XP Submission from: (NULL) (198.144.201.14)

Some of the help for setGeneric seems to have been garbled. In the section Basic Use, 5th paragraph (where the example counts as a single line 3rd paragraph) it says:

  Note that calling 'setGeneric()' in this form is not strictly
  necessary before calling 'setMethod()' for the same function. If the
  function specified in the call to 'setMethod' is not generic,
  'setMethod' will execute the call to 'setGeneric' itself. Declaring
  explicitly that you want the function to be generic can be considered
  better programming style; the only difference in the result, however,
  is that not doing so produces a You cannot (and never need to) create
  an explicit generic version of the primitive functions in the base
  package.

The stuff after the semi-colon of the final sentence is garbled, or at least unparseable by me. Probably something got deleted by mistake.

That's very peculiar. The corresponding methods/man/setGeneric.Rd file has not been changed in a while, but I don't see your problem.

The help from R launched directly from the R shortcut on my desktop looks fine, in both 2.10 and 2.8. I closed all my emacs sessions and restarted, but ?setGeneric produces the same garbled text. I also tried telling ESS to use a different working directory when launching R; it didn't help. The last sentence of this paragraph is also garbled:

  The description above is the effect when the package that owns the
  non-generic function has not created an implicit generic version.
  Otherwise, it is this implicit generic function that is us_same_
  version of the generic function will be created each time.

Weird. P.S. http://bugs.r-project.org was extremely sluggish, even timing out, both yesterday and today for me.
__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] group generics
Thanks for your help. I had two concerns about using as(): that it would impose some overhead, and that it would require me to code an explicit conversion function. I see now that the latter is not true; I don't know if the overhead makes much difference.

On Thu, 2009-12-03 at 13:00 -0800, Martin Morgan wrote:

setMethod("Arith", signature(e1="numeric", e2="B"),
          function(e1, e2) {
              new("B", xb=e1*e2@xb, callGeneric(e1, as(e2, "A")))
          })

Things were getting too weird, so I punted and used explicitly named function calls for the multiplication operation that was causing trouble. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
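For readers landing here: the as() approach suggested in the thread does work end-to-end when callGeneric() is used both for the subclass slot and for the delegation, avoiding the callNextMethod()/primitive interaction reported elsewhere in the thread. A sketch using the thread's class names:

```r
library(methods)

setClass("A", representation(xa = "numeric"))
setClass("B", representation(xb = "numeric"), contains = "A")

# Parent-class Arith method: apply whichever operator was dispatched
# (.Generic) to the slot via callGeneric().
setMethod("Arith", signature(e1 = "numeric", e2 = "A"), function(e1, e2) {
  new("A", xa = callGeneric(e1, e2@xa))
})

# Subclass method: handle its own slot, then delegate the inherited part
# by re-dispatching on as(e2, "A"), per the suggestion in the thread.
setMethod("Arith", signature(e1 = "numeric", e2 = "B"), function(e1, e2) {
  new("B", xb = callGeneric(e1, e2@xb), callGeneric(e1, as(e2, "A")))
})

tb <- new("B", xb = 1:3, new("A", xa = 10))
res <- 3 * tb
res@xb   # 3 6 9
res@xa   # 30
```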
Re: [Rd] group generics
On Thu, 2009-12-03 at 14:25 -0800, John Chambers wrote: I missed the earlier round of this discussion and only am commenting now to say that this doesn't seem weird at all, if I understand what you're trying to do. Martin's basic suggestion, v <- callGeneric(e1, as(e2, "A")), seems the simplest solution. You just want to make another call to the actual generic function, with new arguments, and let method selection take place. In fact, it's pretty much the standard way to use group generics. John

There were 2 weird parts. Mainly I was referring to the fact that identical code (posted earlier) worked sometimes and not others. I could not figure out what the differences were between the 2 scenarios, nor could I create a non-working scenario reliably. The second part that seemed weird was that the code looked as if it should work all the time (the last full version I posted, which used callNextMethod() rather than callGeneric()). Finally, I felt somewhat at sea with the group generics, since I wasn't sure exactly how they worked, how they interacted with primitives, or how they interacted with callNextMethod, selectMethod, etc. I did study what I thought were the relevant help entries. Ross

Ross Boylan wrote: Thanks for your help. I had two concerns about using as(): that it would impose some overhead, and that it would require me to code an explicit conversion function. I see now that the latter is not true; I don't know if the overhead makes much difference. On Thu, 2009-12-03 at 13:00 -0800, Martin Morgan wrote:

setMethod("Arith", signature(e1="numeric", e2="B"),
          function(e1, e2) {
              new("B", xb=e1*e2@xb, callGeneric(e1, as(e2, "A")))
          })

Things were getting too weird, so I punted and used explicitly named function calls for the multiplication operation that was causing trouble. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] group generics
Martin Morgan wrote: Hi Ross -- Ross Boylan r...@biostat.ucsf.edu writes: I have classes A and B, where B contains A. In the implementation of the group generic for B I would like to use the corresponding group generic for A. Is there a way to do that?

setMethod("Arith", signature(e1="numeric", e2="B"),
          function(e1, e2) {
              # the next line does not work right
              v <- selectMethod(callGeneric, signature=c("numeric", "A"))(e1, e2)

v <- callGeneric(e1, as(e2, "A"))

or probably

v <- callNextMethod(e1, e2)

Martin

A different error this time, one that looks a lot like the report from stephen.p...@ubs.com on 2007-12-24 concerning callNextMethod, except this is with callGeneric. HOWEVER, the problem is erratic; when I started from scratch and took this code into a workspace and executed the commands, they worked as expected. I had various false starts and revisions, as well as the real code on which the example is based, when the error occurred. I tried taking in the real code (which defines generics with Arith from my actual classes, and which also fails as below), and the example still worked. My revised code:

setClass("A",
         representation=representation(xa="numeric")
         )
setMethod("Arith", signature(e1="numeric", e2="A"),
          function(e1, e2) {
              new("A", xa=callGeneric(e1, e2@xa))
          }
          )
setClass("B",
         representation=representation(xb="numeric"),
         contains=c("A")
         )
setMethod("Arith", signature(e1="numeric", e2="B"),
          function(e1, e2) {
              new("B", xb=e1*e2@xb, callNextMethod())
          }
          )

Results:

options(error=recover)
tb <- new("B", xb=1:3, new("A", xa=10))
3*tb
Error in get(fname, envir = envir) : object '.nextMethod' not found

Enter a frame number, or 0 to exit

 1: 3 * tb
 2: 3 * tb
 3: test.R#16: new("B", xb = e1 * e2@xb, callNextMethod())
 4: initialize(value, ...)
 5: initialize(value, ...)
 6: callNextMethod()
 7: .nextMethod(e1 = e1, e2 = e2)
 8: test.R#6: new("A", xa = callGeneric(e1, e2@xa))
 9: initialize(value, ...)
10: initialize(value, ...)
11: callGeneric(e1, e2@xa)
12: get(fname, envir = envir)

Selection: 0

The callGeneric in frame 11 is trying to get the primitive for multiplying numeric times numeric. Quoting from Pope's analysis: "[The primitive...] does not get the various magic variables such as .Generic, .Method, etc. defined in its frame. Thus, callGeneric(), failing to find .Generic, then takes the function symbol for the call (which callNextMethod() has constructed to be .nextMethod) and attempts to look it up, which of course also fails, leading to the resulting error seen above."

I'm baffled, and hoping someone on the list has an idea. I'm running R 2.10 under ESS (in particular, I use C-c C-l in the code file to read in the code) on XP. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] group generics
I have classes A and B, where B contains A. In the implementation of the group generic for B I would like to use the corresponding group generic for A. Is there a way to do that? I would also appreciate any comments if what I'm trying to do seems like the wrong approach.

Here's a stripped-down example:

    setClass("A", representation = representation(xa = "numeric"))

    setMethod("Arith", signature(e1 = "numeric", e2 = "A"),
              function(e1, e2) {
                  new("A", xa = e1 * e2@xa)
              })

    setClass("B", representation = representation(xb = "numeric"),
             contains = c("A"))

    setMethod("Arith", signature(e1 = "numeric", e2 = "B"),
              function(e1, e2) {
                  # the next line does not work right
                  v <- selectMethod(callGeneric, signature = c("numeric", "A"))(e1, e2)
                  print(v)
                  new("B", v, xb = e1 * e2@xb)
              })

Results:

    > t1 <- new("B", new("A", xa = 4), xb = 2)
    > t1
    An object of class "B"
    Slot "xb":
    [1] 2

    Slot "xa":
    [1] 4

    > 3 * t1
    Error in getGeneric(f, !optional) : no generic function found for "callGeneric"

Thanks.
Ross Boylan

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
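A runnable sketch of the delegation the replies in this thread suggest: coerce to the parent class with as() and call the generic again, letting dispatch select A's method. The method bodies here are my reconstruction, not verbatim code from the thread.

```r
library(methods)

setClass("A", representation(xa = "numeric"))
setMethod("Arith", signature(e1 = "numeric", e2 = "A"),
          function(e1, e2) new("A", xa = callGeneric(e1, e2@xa)))

setClass("B", representation(xb = "numeric"), contains = "A")
setMethod("Arith", signature(e1 = "numeric", e2 = "B"),
          function(e1, e2)
              # Coerce to A and call the generic again; dispatch then
              # selects A's method, whose result fills B's inherited slots.
              new("B", xb = callGeneric(e1, e2@xb),
                  callGeneric(e1, as(e2, "A"))))

tb <- new("B", xb = 1:3, new("A", xa = 10))
res <- 3 * tb
res@xb   # 3 6 9
res@xa   # 30
```

Because callGeneric() is handed fresh arguments, no .nextMethod machinery is involved, which avoids the erratic failure discussed in the replies.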
[Rd] bug in heatmap?
Using R 2.10 on WinXP,

    heatmap(mymap, Rowv = NA, Colv = NA)

with mymap values of

        0            1           2      3     4
    0   NaN          0.0         0.00621118  0.000  NaN
    10   0.0         0.01041667  0.125  NaN
    20   0.004705882 0.02105263  0.333  NaN
    30   0.004081633 0.0222      0.500  0
    40   0.0         0.01923077  0.167  NaN
    60   0.0         0.          0.000  NaN
    100  0.002840909 0.          0.000  NaN
    200  0.002159827 0.          NaN    NaN
    400  NaN         0.009433962 0.     NaN   NaN

(the first row and column are labels, not data) produces a plot in which all of the row labelled 60 (all 0's and NaN) is white. This is the same color shown for the NaN values. In contrast, all other 0 values appear as dark red. Have I missed some subtlety, or is this a bug?

Ross Boylan

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
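One subtlety worth checking (my conjecture, not a reply from the thread): heatmap() defaults to scale = "row", and a zero-variance row scales to NaN, which is drawn the same way as the true NaN cells. A minimal sketch of just that scaling step:

```r
# heatmap()'s default is scale = "row": each row is centered and scaled.
# A constant row (e.g. all zeros) has standard deviation 0, so every
# entry becomes 0/0 = NaN -- and NaN cells render as background/white.
m <- rbind(zeros = c(0, 0, 0), varied = c(0, 1, 2))
scaled <- t(scale(t(m)))       # the per-row scaling heatmap applies
scaled["zeros", ]              # NaN NaN NaN
```

If this is the cause, passing scale = "none" (or scale = "column") to heatmap() should make the all-zero row render dark red like the other zeros.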
Re: [Rd] mysteriously persistent generic definition
Here's a self-contained example of the problem:

    > foo <- function(obj) {return(3);}
    > setGeneric("foo")
    [1] "foo"
    > removeGeneric("foo")
    [1] TRUE
    > foo <- function(x) {return(4);}
    > args(foo)
    function (x)
    NULL
    > setGeneric("foo")
    [1] "foo"
    > args(foo)
    function (obj)
    NULL

R 2.7.1. I get the same behavior whether or not I use ESS.

The reason this is more than a theoretical problem:

    > setMethod("foo", signature(x = "numeric"), function(x) {return(x + 4);})
    Error in match.call(fun, fcall) : unused argument(s) (x = "numeric")

Ross

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] mysteriously persistent generic definition
R 2.8.1 on Windows behaves as I expected, i.e., the final args(foo) returns a function of x. The previous example (below) was on Debian GNU/Linux.

On Wed, 2009-10-28 at 12:14 -0700, Ross Boylan wrote:

Here's a self-contained example of the problem:

    > foo <- function(obj) {return(3);}
    > setGeneric("foo")
    [1] "foo"
    > removeGeneric("foo")
    [1] TRUE
    > foo <- function(x) {return(4);}
    > args(foo)
    function (x)
    NULL
    > setGeneric("foo")
    [1] "foo"
    > args(foo)
    function (obj)
    NULL

R 2.7.1. I get the same behavior whether or not I use ESS.

The reason this is more than a theoretical problem:

    > setMethod("foo", signature(x = "numeric"), function(x) {return(x + 4);})
    Error in match.call(fun, fcall) : unused argument(s) (x = "numeric")

Ross

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] mysteriously persistent generic definition
Originally I made a function yearStop that took an argument "object". I made a generic, but later changed the argument to "x". R keeps resurrecting the old definition. Could anyone explain what is going on, or how to fix it?

Note particularly the end of the transcript below: I remove the generic, verify that the symbol is undefined, make a new function, and then make a generic. But the generic does not use the argument of the new function definition.

    > args(yearStop)
    function (obj)
    NULL
    > yearStop <- function(x) x@yearStop
    > args(yearStop)
    function (x)
    NULL
    > setGeneric("yearStop")
    [1] "yearStop"
    > args(yearStop)
    function (obj)
    NULL
    > removeGeneric("yearStop")
    [1] TRUE
    > args(yearStop)
    Error in args(yearStop) : object "yearStop" not found
    > yearStop <- function(x) x@yearStop
    > setGeneric("yearStop")
    [1] "yearStop"
    > args(yearStop)
    function (obj)
    NULL

R 2.7.1. I originally read the definitions in from a file with C-c C-l in ESS; however, I typed the commands above by hand.

Thanks.
Ross Boylan

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
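A defensive pattern that sidesteps stale formals (my suggestion, not from the thread; the class "span" and its slot are invented for illustration): pass the definition to setGeneric() explicitly instead of letting it capture whatever implicit generic may be cached.

```r
library(methods)

# Passing the definition explicitly means setGeneric() cannot fall back
# on a previously cached implicit generic whose formals are stale.
setGeneric("yearStop", function(x) standardGeneric("yearStop"))
stopifnot(identical(names(formals(yearStop)), "x"))

setClass("span", representation(yearStop = "numeric"))
setMethod("yearStop", "span", function(x) x@yearStop)
yearStop(new("span", yearStop = 1930))   # 1930
```

With the one-argument form setGeneric("yearStop"), the generic's formals come from whatever definition setGeneric finds, which is where the resurrection seems to happen.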
Re: [Rd] user supplied random number generators
On Sun, 2009-08-16 at 21:24 +0200, Petr Savicky wrote:

Dear Ross Boylan:

Some time ago, you sent an email to R-devel with the following.

I got into this because I'm trying to extend the rsprng code; sprng returns its state as a vector of bytes. Converting these to a vector of integers depends on the integer length, hence my interest in the exact definition of integer. I'm interested in lifetime because I believe those bytes are associated with the stream and become invalid when the stream is freed; furthermore, I probably need to copy them into a buffer that is padded to full word length. This means I allocate the buffer whose address is returned to the core R RNG machinery. Eventually somebody needs to free the memory. Far more of my rsprng adventures are on http://wiki.r-project.org/rwiki/doku.php?id=packages:cran:rsprng. Feel free to read, correct, or extend it.

I am interested to know what the current state of your project is.

I did figure out some of the lifetime issues; SPRNG does allocate memory when you ask it for its state. I also realized that for several reasons it would not be appropriate to hand that buffer to R. I've reworked the page extensively since it had the section you quote. See particularly the "Getting and Setting Stream State" section near the bottom.

I submitted patches to hook rsprng into R's standard machinery for stream state (the user-visible part of which is .Random.seed). The package developer has reservations about applying them.

As a practical matter, I shifted my package's C code to call back to R to get random numbers. If rsprng is loaded and activated, my code will use it. I also eliminated all attempts to set the seed in my code. For rsprng in its current form, the R set.seed() function is a no-op, and you have to use an rsprng function to set the seed (generally when activating the library).

There is a package rngwell19937 with a random number generator, which I develop and use for several parallel processes. Setting a seed may be done by a vector, one of whose components is the process number. The initialization then provides unrelated sequences for different processes.

That sounds interesting; thanks for pointing it out.

Seeding by a vector is also available in the initialization of Mersenne Twister from 2002. See mt19937ar.c ("ar" for array) at http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html. Unfortunately, seeding by a vector is not available in base R. R uses Mersenne Twister, but with an initialization by a single number.

I think one could write to .Random.seed directly to set a vector for many of the generators. ?.Random.seed does not recommend this and notes various limits and hazards of this strategy.

Ross

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
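A small sketch of the write-to-.Random.seed route mentioned above, here restricted to saving and restoring a known-good Mersenne-Twister state; hand-building an arbitrary vector is exactly what ?.Random.seed warns against.

```r
# Save and restore the full Mersenne-Twister state via .Random.seed.
# Element 1 encodes the RNG kind; the remaining 625 integers are the
# MT state (position index plus 624 words), so the vector has length 626.
set.seed(1)                    # put MT in a known, valid state
saved <- .Random.seed
stopifnot(length(saved) == 626)
x1 <- runif(3)
.Random.seed <- saved          # restore the state...
x2 <- runif(3)
identical(x1, x2)              # ...and the stream replays: TRUE
```

This only round-trips a state the generator itself produced; constructing a state vector from scratch would require knowing each generator's internal layout.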
[Rd] user supplied random number generators
?Random.user says (in svn trunk):

    Optionally, functions \code{user_unif_nseed} and
    \code{user_unif_seedloc} can be supplied which are called with no
    arguments and should return pointers to the number of seeds and to
    an integer array of seeds.  Calls to \code{GetRNGstate} and
    \code{PutRNGstate} will then copy this array to and from
    \code{.Random.seed}.

And it offers as an example:

    void user_unif_init(Int32 seed_in) { seed = seed_in; }
    int * user_unif_nseed() { return &nseed; }
    int * user_unif_seedloc() { return (int *) &seed; }

First question: what is the lifetime of the buffers pointed to by the user_unif_* functions, and who is responsible for cleaning them up? In the help file they are static variables, but in general they might be allocated on the heap or might be in structures that only persist as long as the generator does. Since the example uses static variables, it seems reasonable to conclude the core R code is not going to try to free them.

Second, are the types really correct? The documentation seems quite explicit, all the more so because it uses Int32 in places. However, the code in RNG.c (RNG_Init) says

    ns = *((int *) User_unif_nseed());
    if (ns < 0 || ns > 625) {
        warning(_("seed length must be in 0...625; ignored"));
        break;
    }
    RNG_Table[kind].n_seed = ns;
    RNG_Table[kind].i_seed = (Int32 *) User_unif_seedloc();

consistent with the earlier definition of RNG_Table entries as

    typedef struct {
        RNGtype kind;
        N01type Nkind;
        char *name;    /* print name */
        int n_seed;    /* length of seed vector */
        Int32 *i_seed;
    } RNGTAB;

This suggests that the type of user_unif_seedloc is Int32 *, not int *. It also suggests that user_unif_nseed should return the number of 32-bit integers. The code for PutRNGstate(), for example, uses them in just that way. While the dominant model, even on 64-bit hardware, is probably to leave int as 32 bits, it doesn't seem wise to assume that is always the case.

I got into this because I'm trying to extend the rsprng code; sprng returns its state as a vector of bytes. Converting these to a vector of integers depends on the integer length, hence my interest in the exact definition of integer. I'm interested in lifetime because I believe those bytes are associated with the stream and become invalid when the stream is freed; furthermore, I probably need to copy them into a buffer that is padded to full word length. This means I allocate the buffer whose address is returned to the core R RNG machinery. Eventually somebody needs to free the memory.

Far more of my rsprng adventures are on http://wiki.r-project.org/rwiki/doku.php?id=packages:cran:rsprng. Feel free to read, correct, or extend it.

Thanks.
Ross Boylan

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] user supplied random number generators
On Thu, 2009-07-30 at 12:32 +0200, Christophe Dutang wrote:

    This suggests that the type of user_unif_seedloc is Int32 *, not int *. It also suggests that user_unif_nseed should return the number of 32-bit integers. The code for PutRNGstate(), for example, uses them in just that way. While the dominant model, even on 64-bit hardware, is probably to leave int as 32 bits, it doesn't seem wise to assume that is always the case.

You can test the size of an int with a configure script; see for example the package foreign, or the package randtoolbox (in the Rmetrics R-Forge project), which I maintain with Petr Savicky.

http://cran.r-project.org/doc/manuals/R-admin.html#Choosing-between-32_002d-and-64_002dbit-builds says "All current versions of R use 32-bit integers." Also, sizeof(int) works at runtime. But my question was really about whether code for user-defined RNGs should be written using Int32 or int as the target type for the state vector. The R core code suggests to me one should use Int32, but the documentation says int.

Ross

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
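A quick runtime illustration of the point quoted from R-admin: R's integer is a 32-bit signed type on current builds, 64-bit platforms included.

```r
# .Machine describes the compiled-in numeric types; integer.max being
# 2^31 - 1 is the visible consequence of R's 32-bit signed integer.
.Machine$integer.max == 2^31 - 1   # TRUE
```

A C extension can check sizeof(int) directly (or in a configure script, as suggested above); from R code this is the observable equivalent.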
Re: [Rd] beginner's guide to C++ programming with R packages?
On Fri, 2009-06-26 at 16:17 -0400, Whit Armstrong wrote:

But this draws me back to the basic question. I don't want to run R CMD INSTALL 20 times per hour. How do developers actually test their code?

Check out RUnit for tests: http://cran.r-project.org/web/packages/RUnit/index.html

As for testing C++ code, I have taken an approach which is probably different from most: I try to build my package as a C++ library that can be used independently of R. Then you can test your library with whatever C++ test suite you prefer.

Once you are happy, then I also have C++ tests that operate separately from R, though I have a very small library of stub R functions to get the thing to build. There have been some tricky issues with R (if one links to the regular R library) and the test framework fighting over who was main; I think that's why I switched to the stub.

Working only with R-level tests does not permit the kind of lower-level testing that you can get by running your own unit tests. I use the Boost unit test framework. Of course, you want R-level tests too. Some of my upper-level C++ tests are mirror images of R tests; this can help identify whether a problem lies at the interface.

For C++ tests I build my code in conjunction with a main program. I think I also have or had a test building it as a library, but I don't use that much. For R, my modules get built into a library. It's usually cleaner to build the R library from a fresh version of the sources; otherwise scraps of my other builds tend to end up in the R package.

Thanks, Whit, for the pointers to Rcpp and RAbstraction.

Ross Boylan

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] S4 class redefinition
I haven't found much on S4 class redefinition; the little I've seen indicates the following is to be expected:

1. setClass("foo", ...)
2. Create objects of class foo.
3. Execute the same setClass("foo", ...) again (source the same file).
4. Objects from step 2 are now NULL.

Is that the expected behavior (I ran under R 2.7.1)? Assuming it is, it's kind of unfortunate. I can wrap my setClass code like this

    if (!isClass("foo")) setClass("foo", ...)

to avoid this problem. I've seen this in other code; is that the standard solution?

I thought that loading a library was about the same as executing the code in its source files (assuming R-only code). But if this were true my saved objects would be nulled out each time I loaded the corresponding library. That does not seem to be the case. Can anyone explain that?

Do I need to put any protections around setMethod so that it doesn't run if the method is already defined? At the moment I'm not changing the class definition but am changing the methods, so I can simply avoid running setClass. But if I want to change the class, most likely by adding a slot, what do I do? At the moment it looks as if I'd need to make a new class name, define some coerce methods, and then locate and change the relevant instances. Is there a better way?

Thanks.
Ross Boylan

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] S4 class redefinition
On Tue, 2009-06-30 at 12:58 -0700, Ross Boylan wrote:

I haven't found much on S4 class redefinition; the little I've seen indicates the following is to be expected:

1. setClass("foo", ...)
2. Create objects of class foo.
3. Execute the same setClass("foo", ...) again (source the same file).
4. Objects from step 2 are now NULL.

I'm sorry; step 4 is completely wrong. The objects seem to be preserved. Some slightly modified questions remain.

Is it safe to re-execute identical code for setClass or setMethod when you have existing objects of the class around? Is there any protection, such as checking for existing definitions, that is recommended before executing setClass or setMethod? If you want to change a class or method, and have existing objects, how do you do that?

Can scoping rules lead to situations in which some functions or methods end up with references to the older version of the methods? One example, relevant to class constructors, shows they can. Here's a little test:

    > trivial <- function() 3   # stand-in for a class constructor
    > maker <- function(c = trivial)
    +     function(x) x + c()
    > oldf <- maker()
    > oldf(4)
    [1] 7
    > trivial <- function() 20
    > oldf(4)
    [1] 7
    > newf <- maker()
    > newf(8)
    [1] 28

So the old definition is frozen in the inner function, for which it was captured by lexical scope. Although the definition of maker is not redone after trivial is redefined, maker's default argument does get the new value of trivial.

Methods add another layer. I'm hoping those with a deeper understanding than mine can clarify where the danger spots are, and how to deal with them.

Thanks.
Ross

Is that the expected behavior (I ran under R 2.7.1)? Assuming it is, it's kind of unfortunate. I can wrap my setClass code like this

    if (!isClass("foo")) setClass("foo", ...)

to avoid this problem. I've seen this in other code; is that the standard solution? I thought that loading a library was about the same as executing the code in its source files (assuming R-only code). But if this were true my saved objects would be nulled out each time I loaded the corresponding library. That does not seem to be the case. Can anyone explain that?

Do I need to put any protections around setMethod so that it doesn't run if the method is already defined? At the moment I'm not changing the class definition but am changing the methods, so I can simply avoid running setClass. But if I want to change the class, most likely by adding a slot, what do I do? At the moment it looks as if I'd need to make a new class name, define some coerce methods, and then locate and change the relevant instances. Is there a better way?

Thanks.
Ross Boylan

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
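For the last question, a sketch of the new-name-plus-coercion route the poster describes, with setAs() doing the per-instance conversion; the class and slot names here are invented for illustration.

```r
library(methods)

# Old definition and a pre-existing instance.
setClass("fooOld", representation(a = "numeric"))
obj <- new("fooOld", a = 1)

# New definition adds a slot; a setAs() method migrates old objects.
setClass("fooNew", representation(a = "numeric", b = "character"))
setAs("fooOld", "fooNew",
      function(from) new("fooNew", a = from@a, b = ""))

obj2 <- as(obj, "fooNew")
stopifnot(is(obj2, "fooNew"), obj2@a == 1)
```

Locating every saved instance and re-saving the converted objects is still manual, as the original message anticipates.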
Re: [Rd] could not find function in R CMD check [solved, but is this an R bug?]
On Wed, 2007-09-12 at 11:31 +0200, Uwe Ligges wrote:

    Perhaps Namespace issues? But no further ideas. You might want to make your package available at some URL so that people can look at it and help...

    Uwe Ligges

Thanks. The problem lay elsewhere. I was able to fix it by adding library(mspath) to the top of the .R file in data/ that defined some data using the package's functions. In other words,

    library(mspath)

    ### from various papers on errors in reading fibrosis scores
    rousselet.jr <- readingError(c(.91, .09, 0, 0, 0,
                                   .11, .78, .11, 0, 0,
                                   0, .17, .75, .08, 0,
                                   0, .06, .44, .50, 0,
                                   0, 0, 0, .07, .93),
                                 byrow = TRUE, nrow = 5, ncol = 5)

works as data/readingErrorData.R, but without the library() call I get the error shown in my original message (see below). readingError() is a function defined in my package.

Does any of this indicate a bug or undesirable feature in R? First, it seems a little odd that I need to load the library inside data that is defined in the same library. I think I've noticed similar behavior in other places, for example the code snippets that accompany the documentation pages (that is, one needs library(mypackage) in order for the snippets to check out). Second, should R CMD check fail so completely and opaquely in this situation?

Ross Boylan wrote:

    During R CMD check I get this:

        ** building package indices ...
        Error in eval(expr, envir, enclos) : could not find function "readingError"
        Execution halted
        ERROR: installing package indices failed

    The check aborts there. readingError is a function I just added; for reference

        setClass("readingError", contains = "matrix")
        readingError <- function(...) new("readingError", matrix(...))

    which is in readingError.R in the project's R subdirectory. Some code in the data directory invokes readingError, and the .Rd file includes \alias{readingError}, \alias{readingError-class}, \name{readingError-class} and an example invoking readingError. I'm using R 2.5.1 as packaged for Debian GNU/Linux.

Does anyone have an idea what's going wrong here, or how to fix or debug it? The code seems to work OK when I use it from ESS.

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] could not find function in R CMD check
During R CMD check I get this:

    ** building package indices ...
    Error in eval(expr, envir, enclos) : could not find function "readingError"
    Execution halted
    ERROR: installing package indices failed

The check aborts there. readingError is a function I just added; for reference

    setClass("readingError", contains = "matrix")
    readingError <- function(...) new("readingError", matrix(...))

which is in readingError.R in the project's R subdirectory. Some code in the data directory invokes readingError, and the .Rd file includes \alias{readingError}, \alias{readingError-class}, \name{readingError-class} and an example invoking readingError. I'm using R 2.5.1 as packaged for Debian GNU/Linux.

Does anyone have an idea what's going wrong here, or how to fix or debug it? The code seems to work OK when I use it from ESS.

--
Ross Boylan                                     wk: (415) 514-8146
185 Berry St #5700                              [EMAIL PROTECTED]
Dept of Epidemiology and Biostatistics          fax: (415) 514-8150
University of California, San Francisco
San Francisco, CA 94107-1739                    hm: (415) 550-1062

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] codetools really optional for R CMD check?
After upgrading to R 2.5.1 on Debian, R CMD check gives

    * checking Rd cross-references ... WARNING
    Error in .find.package(package, lib.loc) : there is no package called 'codetools'
    Execution halted
    * checking for missing documentation entries ... WARNING

etc. The NEWS file says (for 2.5.0; I was on 2.4 before the recent upgrade):

    o  New recommended package 'codetools' by Luke Tierney provides
       code-analysis tools.  This can optionally be used by 'R CMD
       check' to detect problems, especially symbols which are not
       visible.

This sounds as if R CMD check should run OK without the package, but it doesn't seem to. Have I misunderstood something, or is there a problem with R CMD check's handling of the case of missing codetools?

I don't have codetools installed because the Debian r-recommended package was missing a dependency; I see that's already been fixed (wow!).

Ross

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Reported invalid memory references
While testing for leaks in my own code I noticed some reported memory problems from valgrind, invoked with

    $ R --vanilla -d "valgrind --leak-check=full"

This is on Debian GNU/Linux (testing, aka lenny) with a 2.6 kernel, R package version 2.4.1-2. I was running in an emacs shell.

The immediate source of all the problems before I get to the prompt is the system dynamic loader ld-2.5.so, invoked from R. Then, when I exit, there are a bunch of reported leaks, some of which appear to come more directly from R (though some involve, e.g., readline).

Are these reported errors actually problems? If so, do they indicate problems in R or in some other component (e.g., ld.so)? Put more practically, should I file one or more bugs, and if so, against what?

Thanks.
Ross Boylan

    ==30551== Invalid read of size 4
    ==30551==    at 0x4016503: (within /lib/ld-2.5.so)
    ==30551==    by 0x4006009: (within /lib/ld-2.5.so)
    ==30551==    by 0x40084F5: (within /lib/ld-2.5.so)
    ==30551==    by 0x40121D4: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4011C5D: (within /lib/ld-2.5.so)
    ==30551==    by 0x44142E1: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4414494: __libc_dlopen_mode (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43EF73E: __nss_lookup_function (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43EF82F: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43F1595: __nss_passwd_lookup (in /lib/i686/cmov/libc-2.5.so)
    ==30551==  Address 0x4EFB560 is 32 bytes inside a block of size 34 alloc'd
    ==30551==    at 0x40234B0: malloc (vg_replace_malloc.c:149)
    ==30551==    by 0x4008AF3: (within /lib/ld-2.5.so)
    ==30551==    by 0x40121D4: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4011C5D: (within /lib/ld-2.5.so)
    ==30551==    by 0x44142E1: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4414494: __libc_dlopen_mode (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43EF73E: __nss_lookup_function (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43EF82F: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43F1595: __nss_passwd_lookup (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x439D87D: getpwuid_r (in /lib/i686/cmov/libc-2.5.so)
    ==30551==
    ==30551== Invalid read of size 4
    ==30551==    at 0x4016530: (within /lib/ld-2.5.so)
    ==30551==    by 0x4006009: (within /lib/ld-2.5.so)
    ==30551==    by 0x40084F5: (within /lib/ld-2.5.so)
    ==30551==    by 0x400C616: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x400CBDA: (within /lib/ld-2.5.so)
    ==30551==    by 0x4012234: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4011C5D: (within /lib/ld-2.5.so)
    ==30551==    by 0x44142E1: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4414494: __libc_dlopen_mode (in /lib/i686/cmov/libc-2.5.so)
    ==30551==  Address 0x4EFB8A8 is 24 bytes inside a block of size 27 alloc'd
    ==30551==    at 0x40234B0: malloc (vg_replace_malloc.c:149)
    ==30551==    by 0x4008AF3: (within /lib/ld-2.5.so)
    ==30551==    by 0x400C616: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x400CBDA: (within /lib/ld-2.5.so)
    ==30551==    by 0x4012234: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4011C5D: (within /lib/ld-2.5.so)
    ==30551==    by 0x44142E1: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4414494: __libc_dlopen_mode (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43EF73E: __nss_lookup_function (in /lib/i686/cmov/libc-2.5.so)
    ==30551==
    ==30551== Conditional jump or move depends on uninitialised value(s)
    ==30551==    at 0x400B3CC: (within /lib/ld-2.5.so)
    ==30551==    by 0x401230B: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4011C5D: (within /lib/ld-2.5.so)
    ==30551==    by 0x44142E1: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4414494: __libc_dlopen_mode (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43EF73E: __nss_lookup_function (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43EF82F: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43F1595: __nss_passwd_lookup (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x439D87D: getpwuid_r (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x439D187: getpwuid (in /lib/i686/cmov/libc-2.5.so)
    ==30551==
    ==30551== Conditional jump or move depends on uninitialised value(s)
    ==30551==    at 0x400B0CA: (within /lib/ld-2.5.so)
    ==30551==    by 0x401230B: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4011C5D: (within /lib/ld-2.5.so)
    ==30551==    by 0x44142E1: (within /lib/i686/cmov/libc-2.5.so
[Rd] undefined symbol: Rf_rownamesgets
I get the error "undefined symbol: Rf_rownamesgets" when I try to load my package, which includes C++ code that calls that function. This is particularly strange since the code also calls Rf_classgets, and it loaded OK with just that. Can anyone tell me what's going on?

For the record, I worked around this with the general-purpose attribute-setting commands and R_RowNamesSymbol. I discovered that even with that I wasn't constructing a valid data.frame, and fell back to returning a list of results.

I notice Rinternals.h defines

    LibExtern SEXP  R_RowNamesSymbol;   /* "row.names" */

twice in the same block of code.

I'm using R 2.4.1 on Debian. The symbol seems to be there:

    $ nm -D /usr/lib/R/lib/libR.so | grep classgets
    00032e70 T Rf_classgets
    $ nm -D /usr/lib/R/lib/libR.so | grep namesgets
    00031370 T Rf_dimnamesgets
    00034500 T Rf_namesgets

The source includes

    #define R_NO_REMAP 1
    #include <R.h>
    #include <Rinternals.h>

and later

    #include <memory>   // I think this is why I needed R_NO_REMAP

I realize this is not a complete example, but I'm hoping this will ring a bell with someone. I encountered this while running R CMD check. The link line generated was

    g++ -shared -o mspath.so AbstractTimeStepsGenerator.o Coefficients.o CompositeHistoryComputer.o CompressedTimeStepsGenerator.o Covariates.o Data.o Environment.o Evaluator.o FixedTimeStepsGenerator.o LinearProduct.o Manager.o Model.o ModelBuilder.o Path.o PathGenerator.o PrimitiveHistoryComputer.o SimpleRecorder.o Specification.o StateTimeClassifier.o SuccessorGenerator.o TimePoint.o TimeStepsGenerator.o mspath.o mspathR.o -L/usr/lib/R/lib -lR

Thanks.
Ross

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] undefined symbol: Rf_rownamesgets
On Tue, Apr 17, 2007 at 11:07:12PM -0400, Duncan Murdoch wrote:

    On 4/17/2007 10:43 PM, Ross Boylan wrote:

        I get the error "undefined symbol: Rf_rownamesgets" when I try to load my package, which includes C++ code that calls that function. This is particularly strange since the code also calls Rf_classgets, and it loaded OK with just that. Can anyone tell me what's going on?

        For the record, I worked around this with the general-purpose attribute-setting commands and R_RowNamesSymbol. I discovered that even with that I wasn't constructing a valid data.frame, and fell back to returning a list of results.

        I notice Rinternals.h defines

            LibExtern SEXP  R_RowNamesSymbol;   /* "row.names" */

        twice in the same block of code.

        I'm using R 2.4.1 on Debian. The symbol seems to be there:

            $ nm -D /usr/lib/R/lib/libR.so | grep classgets
            00032e70 T Rf_classgets
            $ nm -D /usr/lib/R/lib/libR.so | grep namesgets
            00031370 T Rf_dimnamesgets
            00034500 T Rf_namesgets

    I don't see Rf_rownamesgets there, or in the R Externals manual among the API entry points listed.

You're right; sorry. So does this function just not exist? If so, it would be good to remove the corresponding entries in Rinternals.h.

    Can't you use the documented dimnamesgets?

I did one better and didn't use anything!

I thought presence in Rinternals.h constituted (terse) documentation, since the R Externals manual says ("Handling R objects in C"):

    There are two approaches that can be taken to handling R objects
    from within C code.  The first (historically) is to use the macros
    and functions that have been used to implement the core parts of R
    through `.Internal' calls.  A public subset of these is defined in
    the header file `Rinternals.h' ...

So is relying on Rinternals.h a bad idea?

In this case, accessing the row names through dimnamesgets looks a little awkward, since it requires navigating to the right spot in dimnames. I would need Rf_dimnamesgets since I disabled the shortcut names.

Ross

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] future plans for missing() in inner functions
Currently, if one wants to test whether an argument to an outer function is missing from within an inner function, this works:

    > g5 <- function(a) {
    +     inner <- function(a) {
    +         if (missing(a))
    +             "outer arg is missing"
    +         else
    +             "found outer arg!"
    +     }
    +     inner(a)
    + }
    > g5(3)
    [1] "found outer arg!"
    > g5()
    [1] "outer arg is missing"

While if inner is defined as function() (without arguments) one gets an error: 'missing' can only be used for arguments.

However, ?missing contains a note that this behavior is subject to change. I'm particularly interested in whether the code shown above will continue to work. While it does what I want in this case, the behavior seems a bit surprising, since textually the call to inner does provide an argument. So it seems possible that might change.

Can anyone provide more insight into how things may change?

Thanks.

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Replacing slot of S4 class in method of S4 class?
On Fri, Mar 30, 2007 at 10:45:38PM +0200, cstrato wrote:
> Dear all,
> Assume that I have an S4 class MyClass with a slot myname, which is initialized to myname="" in method initialize:
>     myclass <- new("MyClass", myname="")
> Assume that class MyClass has a method mymethod:
>     mymethod.MyClass <- function(object, myname=character(0), ...) {
>        object@myname <- myname;   # or: myName(object) <- myname
>     }
>     setMethod("mymethod", "MyClass", mymethod.MyClass);
> Furthermore, I have a replacement method:
>     setReplaceMethod("myName", signature(object="MyClass", value="character"),
>        function(object, value) {
>           object@myname <- value;
>           return(object);
>        }
>     )
> I know that it is possible to call: myName(myclass) <- "newname"
> However, I want to replace the value of slot myname for object myclass in method mymethod:
>     mymethod(myclass, myname="newname")
> Sorrowly, the above code in method mymethod does not work. Is there a possibility to change the value of a slot in the method of a class?

Yes, but to make the effect persistent ("visible" might be a more accurate description) that method must return the object being updated, and you must use the return value. R uses call-by-value semantics, so in the definition of mymethod.MyClass, when you change object you only change a local copy. It needs to be

    mymethod.MyClass <- function(object, myname=character(0), ...) {
       object@myname <- myname;
       object
    }

Further, if you invoke it with mymethod(myclass, "new name") you will discover myclass is unchanged. You need

    myclass <- mymethod(myclass, "new name")

You might consider using the R.oo package, which probably has semantics closer to what you're expecting. Alternately, you could study more about R and functional programming.

Ross Boylan

P.S. Regarding the follow-up saying that this is the wrong list, the guide to mailing lists says of R-devel: "This list is intended for questions and discussion about code development in R."
"Questions likely to prompt discussion unintelligible to non-programmers or topics that are too technical for R-help's audience should go to R-devel."

The question seems to fall under this description to me, though I am not authoritative. It is true that further study would have disclosed what is going on. Since the same thing tripped me up too, I thought I'd share the answer.
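Putting the pieces of the answer together, here is a minimal self-contained sketch of the corrected pattern (the slot and method names follow the question; the class definition itself is a guess at the original):

```r
setClass("MyClass", representation(myname = "character"))

setGeneric("mymethod", function(object, ...) standardGeneric("mymethod"))

setMethod("mymethod", "MyClass", function(object, myname = character(0), ...) {
  object@myname <- myname  # changes only the local copy ...
  object                   # ... so the updated copy must be returned
})

myclass <- new("MyClass", myname = "")
mymethod(myclass, myname = "newname")  # return value discarded: myclass unchanged
myclass@myname                         # still ""
myclass <- mymethod(myclass, myname = "newname")  # reassign the result
myclass@myname                         # now "newname"
```

The key point is the pair of steps: the method returns the modified object, and the caller reassigns it.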
Re: [Rd] Rmpi and OpenMPI ?
On Fri, Mar 30, 2007 at 03:01:19PM -0500, Dirk Eddelbuettel wrote: On 30 March 2007 at 12:48, Ei-ji Nakama wrote: | Prof. Nakano(ism Japan) and I wrestled in Rmpi on HP-MPI. | Do not know a method to distinguish MPI well? | It is an ad-hoc patch at that time as follows. There are some autoconf snippets for figuring out how to compile various MPI versions; it's not clear to me they are much help in figuring out which version you've got. Perhaps they are some help: http://autoconf-archive.cryp.to/ax_openmp.html http://autoconf-archive.cryp.to/acx_mpi.html Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R CMD check ignores .Rbuildignore?
On Mon, Mar 19, 2007 at 11:33:37AM +0100, Martin Maechler wrote:
> >>>>> "RossB" == Ross Boylan [EMAIL PROTECTED] on Sun, 18 Mar 2007 12:39:14 -0700 writes:
>
>     RossB> The contents of .Rbuildignore seems to affect
>     RossB>     R CMD build
>     RossB> but not
>     RossB>     R CMD check.
>     RossB> I'm using R 2.4.0 on Debian. Is my understanding correct?
>
> yes. That's why it's called 'buildignore'. It's a big feature for me as package developer:

It's more of a bug for me :(. I was thinking of check as answering the question "If I build this package, will it work?"

> E.g., I can have extra tests (which e.g. only apply on my specific platform) in addition to those which are used when the package is built (to be checked on all possible platforms). Some have proposed to additionally define a '.Rcheckignore' but they haven't been convincing enough.

How about an option to have check use the buildignore file? If there are 2 separate files, there's always the risk they will get out of sync. Of course, in your case, you want them out of sync...

>     RossB> And is there anything I can do about it?
>
> First build, then check is one way; something which is recommended anyway in some cases, e.g., if you have an (Sweave-based) vignette.

Kurt Hornick, offlist, also advised this, as well as noting that using R CMD check directly on the main development directory isn't really supported. From my perspective, needing to do a build before a check is extra friction, which would reduce the amount of checking I do during development. Also, doesn't build do some of the same checks as check?

Minimally, I think some advice in the R Extensions manual needs to be qualified. In 1.3 Checking and building packages:

    Before using these tools, please check that your package can be
    installed and loaded. `R CMD check' will _inter alia_ do this, but you
    will get more informative error messages doing the checks directly.

There seem to be a couple of problems with this advice, aside from the fact that it says to check first.
One problem is that the advice seems internally inconsistent. "Before using these tools" seems to refer to the build and check tools in the section. Since check is one of the tools, it can't be used before using the tools. Also, I still can't figure out what "doing the checks directly" refers to.

Another section, 1.3.2 Building packages:

    [2 paragraphs skipped]
    Run-time checks whether the package works correctly should be
    performed using `R CMD check' prior to invoking the build procedure.

Since this advice, check then build, is the exact opposite of the current recommendation, build then check, it probably needs to be changed.

R CMD check --help (and other spots) refers to the command as checking R packages "from package sources, which can be directories or gzipped package 'tar' archives with extension '.tar.gz' or '.tgz'". I read this as referring to my source directory, although I guess other readings are possible (i.e., package source = the source bundle as distributed).

>     RossB> In my case, some of the excluded files contain references to other libraries, so linking fails under R CMD check. I realize I could add the library to the build (with Makevars, I guess), but I do not want to introduce the dependency.
>
> It depends on the circumstances how I would solve this problem. [Why have these files as part of the package sources at all?]

I have 2 kinds of checks: those at the R level, which get executed as part of R CMD check, and C++ unit tests, which I execute separately. The latter use the boost test library, part of which is a link-time library that ordinary users should not need. All the C++ sources come from a web (as in Knuth's web) file, so the main files and the testing files all get produced at once. I worked around the problem by using the top level config script to delete the test files. I suppose I could also look into moving those files into another directory when they are produced, but that would further complicate my build system.
Ross
Re: [Rd] R CMD check ignores .Rbuildignore? [correction]
On Mon, Mar 19, 2007 at 10:38:02AM -0700, Ross Boylan wrote:
> Kurt Hornick, offlist, also advised this, as well as noting that using

Sorry. That should be Kurt Hornik.

Ross
Re: [Rd] R/C++/memory leaks
On Mon, 2007-02-26 at 16:08 +, Ernest Turro wrote:
> Thanks for your comments Ross. A couple more comments/queries below:
> On 26 Feb 2007, at 06:43, Ross Boylan wrote:
> [details snipped]
>> The use of the R api can be confined to a wrapper function. But I can think of no reason that a change to the alternate approach I outlined would solve the apparent leaking you describe.
>
> I'm not sure I see how a wrapper function using the R API would suffice. Example:

It doesn't sound as if it would suffice. I was responding to your original remark that

> Since this is a standalone C++ program too, I'd rather use the R API as little as possible... But I will look at your solution if I find it is really necessary.. Thanks

I thought that was expressing a concern about using the alternate approach I outlined because it would use the R API. If you need to use that API for other reasons, you're still stuck with it :)

> During heavy computation in the C++ function I need to allow interrupts from R. This means that R_CheckUserInterrupt needs to be called during the computation. Therefore, use of the R API can't be confined to just the wrapper function. In fact, I'm worried that some of the libraries I'm using are failing to release memory after interrupt and that that is the problem. I can't see what I could do about that... E.g.
>
>     #include <valarray>
>     valarray<double> foo;
>     // I don't know 100% that the foo object hasn't allocated some
>     // memory; if the program is interrupted it wouldn't be released

That's certainly possible, but you seem to be overlooking the possibility that all the code is releasing memory appropriately, but the process's memory footprint isn't going down correspondingly. In my experience that's fairly typical behavior. In that case, depending on your point of view, you either don't have a problem or you have a hard problem. If you really want the memory released back to the system, it's a hard problem. If you don't care, as long as you have no leaks, all's well.
> I find it's very unfortunate that R_CheckUserInterrupt doesn't return a value. If it did (e.g. if it returned true if an interrupt has occurred), I could just branch off somewhere, clean up properly and return to R. Any ideas on how this could be achieved?

I can't tell from the info page what function gets called in R if there is an interrupt, but it sounds as if you could do the following hack: the R interrupt handler gets a function that calls a C function of your devising. The C function sets a flag meaning "interrupt requested". Then in your main code, you periodically call R_CheckUserInterrupt. When it returns you check the flag; if it's set, you clean up and exit.

Ross
Re: [Rd] R/C++/memory leaks
On Sun, Feb 25, 2007 at 05:37:24PM +, Ernest Turro wrote:
> Dear all,
> I have wrapped a C++ function in an R package. I allocate/deallocate memory using C++ 'new' and 'delete'. In order to allow user interrupts without memory leaks I've moved all the delete statements required after an interrupt to a separate C++ function freeMemory(), which is called using on.exit() just before the .C() call.

Do you mean that you call on.exit() before the .C, and the call to on.exit() sets up the handler? Your last sentence sounds as if you invoke freeMemory() before the .C call.

Another approach is to associate your C objects with an R object, and have them cleaned up when the R object gets garbage collected. However, this requires switching to a .Call interface from the more straightforward .C interface. The finalizer call I used doesn't assure cleanup on exit. The optional argument to R_RegisterCFinalizerEx might provide such assurance, but I couldn't tell what it really does. Since all memory should be released by the OS when the process ends, I wasn't so worried about that. Here's the pattern:
    // I needed R_NO_REMAP to avoid name collisions. You may not.
    #define R_NO_REMAP 1
    #include <R.h>
    #include <Rinternals.h>

    extern "C" {
      // returns an |ExternalPtr|
      SEXP makeManager( @makeManager args@ );
      // user should not need to call; cleanup
      void finalizeManager(SEXP ptr);
    }

    SEXP makeManager( @makeManager args@ ){
      // stuff
      Manager* pmanager = new Manager(pd, pm.release(),
                                      *INTEGER(stepNumerator),
                                      *INTEGER(stepDenominator),
                                      (*INTEGER(isexact)) != 0);
      // one example didn't use |PROTECT()|
      SEXP ptr;
      Rf_protect(ptr = R_MakeExternalPtr(pmanager, R_NilValue, R_NilValue));
      R_RegisterCFinalizer(ptr, (R_CFinalizer_t) finalizeManager);
      Rf_unprotect(1);
      return ptr;
    }

    void finalizeManager(SEXP ptr){
      Manager *pmanager = static_cast<Manager *>(R_ExternalPtrAddr(ptr));
      delete pmanager;
      R_ClearExternalPtr(ptr);
    }

I'd love to hear from those more knowledgeable about whether I did that right, and whether the FinalizerEx call can assure cleanup on exit. makeManager needs to be called from R like this:

    mgr <- .Call("makeManager", args)

> I am concerned about the following. In square brackets you see R's total virtual memory use (VIRT in `top`):
> 1) Load library and data [178MB] (if I run gc(), then [122MB])
> 2) Just before .C [223MB]
> 3) Just before freeing memory [325MB]

So you explicitly call your freeMemory() function?

> 4) Just after freeing memory [288MB]

There are at least 3 possibilities:
* your C++ code is leaking
* C++ memory is never really returned (commonly, at least in C, the amount of memory allocated to the process never goes down, even if you do a free; this may depend on the OS and the specific calls the program makes)
* you did other stuff in R that's still around. After all, you went up +45MB between 1 and 2; maybe it's not so odd you went up +65MB between 2 and 4.

> 5) After running gc() [230MB]
> So although the freeMemory function works (frees 37MB), R ends up using 100MB more after the function call than before it. ls() only returns the data object so no new objects have been added to the workspace.
> Do any of you have any idea what could be eating this memory?
> Many thanks,
> Ernest
> PS: it is not practical to use R_alloc et al because C++ allocation/deallocation involves constructors/destructors and because the C++ code is also compiled into a standalone binary (I would rather avoid maintaining two separate versions).

I use regular C++ new's too (except for the external pointer that's returned). However, you can override the operator new in C++ so that it uses your own allocator, e.g., R_alloc. I'm not sure about all the implications that might make that dangerous (e.g., can the memory be garbage collected? can it be moved?). Overriding new is a bit tricky since there are several variants. In particular, there is one with and one without an exception. Also, individual classes can define their own new operators; if you have any, you'd need to change those too.

Ross Boylan
Re: [Rd] R/C++/memory leaks
Here are a few small follow-up comments:

On Sun, Feb 25, 2007 at 11:18:56PM +, Ernest Turro wrote:
> On 25 Feb 2007, at 22:21, Ross Boylan wrote:
>> On Sun, Feb 25, 2007 at 05:37:24PM +, Ernest Turro wrote:
>>> Dear all,
>>> I have wrapped a C++ function in an R package. I allocate/deallocate memory using C++ 'new' and 'delete'. In order to allow user interrupts without memory leaks I've moved all the delete statements required after an interrupt to a separate C++ function freeMemory(), which is called using on.exit() just before the .C() call.
>>
>> Do you mean that you call on.exit() before the .C, and the call to on.exit() sets up the handler? Your last sentence sounds as if you invoke freeMemory() before the .C call.
>
> 'on.exit' records the expression given as its argument as needing to be executed when the current function exits (either naturally or as the result of an error). This means you call on.exit() somewhere at the top of the function. You are guaranteed the expression you pass to on.exit() will be executed before the function returns. So, even though you call on.exit() before .C(), the expression you pass it will actually be called after .C(). This means you can be sure that freeMemory() is called even if an interrupt or other error occurs.
>
>> Another approach is to associate your C objects with an R object, and have them cleaned up when the R object gets garbage collected. However, this requires switching to a .Call interface from the more straightforward .C interface.
>> [details snipped]
>
> Since this is a standalone C++ program too, I'd rather use the R API as little as possible... But I will look at your solution if I find it is really necessary.. Thanks

The use of the R api can be confined to a wrapper function. But I can think of no reason that a change to the alternate approach I outlined would solve the apparent leaking you describe.

>>> I am concerned about the following.
>>> In square brackets you see R's total virtual memory use (VIRT in `top`):
>>> 1) Load library and data [178MB] (if I run gc(), then [122MB])
>>> 2) Just before .C [223MB]
>>> 3) Just before freeing memory [325MB]
>>
>> So you explicitly call your freeMemory() function?
>
> This is called thanks to on.exit()
>
>>> 4) Just after freeing memory [288MB]
>>
>> There are at least 3 possibilities:
>> * your C++ code is leaking
>
> The number of news and deletes are the same, and so is their branching... I don't think it is this.
>
>> * C++ memory is never really returned (commonly, at least in C, the amount of memory allocated to the process never goes down, even if you do a free; this may depend on the OS and the specific calls the program makes)
>
> OK, but the memory should be freed after the process completes, surely?

Most OS's I know will free memory when a process finishes, except for shared memory. But is that relevant? I assume the process doesn't complete until you exit R. Your puzzle seems to involve different stages within the life of a single process.

>> * You did other stuff in R that's still around. After all you went up +45MB between 1 and 2; maybe it's not so odd you went up +65MB between 2 and 4.
>
> Yep, I do stuff before .C and that accounts for the increase before .C. But all the objects created before .C go out of scope by 4) and so, after gc(), we should be back to 122MB. As I mentioned, ls() after 5) returns only the data loaded in 1).

In principle (and according to ?on.exit) the expression registered by on.exit is evaluated when the relevant function is exited. In principle garbage collection reclaims all unused space (though with no guarantee of when). It may be that the practice is looser than the principle. For example, Python always nominally managed memory for you, but I think for quite a while it didn't really reclaim the memory (because garbage collection didn't exist or had been turned off).
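The on.exit() guarantee described above can be seen in a toy example (my own sketch; the names and message are made up), where the registered expression still runs even though the body fails partway through:

```r
f <- function() {
  on.exit(cat("freeMemory() would run here\n"))  # registered before the risky work
  stop("simulated error during the .C computation")
}

res <- try(f(), silent = TRUE)  # the on.exit output appears even though f() failed
inherits(res, "try-error")      # TRUE
```

The same thing happens if the error is replaced by a user interrupt: the expression registered with on.exit runs before control leaves f().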
>>> 5) After running gc() [230MB]
>>> So although the freeMemory function works (frees 37MB), R ends up using 100MB more after the function call than before it. ls() only returns the data object so no new objects have been added to the workspace.
>>> Do any of you have any idea what could be eating this memory?
>>> Many thanks,
>>> Ernest
>>> PS: it is not practical to use R_alloc et al because C++ allocation/deallocation involves constructors/destructors and because the C++ code is also compiled into a standalone binary (I would rather avoid maintaining two separate versions).
>>
>> I use regular C++ new's too (except for the external pointer that's returned). However, you can override the operator new in C++ so that it uses your own allocator, e.g., R_alloc. I'm not sure about all the implications that might make that dangerous (e.g., can the memory be garbage collected? can it be moved?). Overriding new is a bit tricky since there are several variants. In particular, there is one with and one without an exception. Also, individual classes can define their own new operators; if you have any, you'd need to change those too.
Re: [Rd] trying to understand condition handling
On Tue, Feb 20, 2007 at 07:35:51AM +, Prof Brian Ripley wrote:
> Since you have not told us what 'the documents' are (and only vaguely named one), do you not think your own documentation is inadequate?

I mean the command description produced by ?tryCatch.

> There are documents about the condition system on developer.r-project.org: please consult them.

OK, though I would hope the user level documentation would suffice.

> I guess 'ctl-C' is your private abbreviation for 'control C' (and not a type of cancer): that generates an interrupt in most (but not all) R ports. Where it does, you can set up interrupt handlers (as the help page said)

My P.S. concerned whether the code that was interrupted could continue from the point of interruption. As far as I can tell from ?tryCatch, there is not.

> On Mon, 19 Feb 2007, Ross Boylan wrote:
>> I'm confused by the page documenting tryCatch and friends. I think it describes 3 separate mechanisms: tryCatch (in which control returns to the invoking tryCatch), withCallHandlers (in which control goes up to the calling handler/s but then continues from the point at which signalCondition() was invoked), and withRestarts (I can't tell where control ends up).
>> For tryCatch the docs say the arguments ... provide handlers, and that these are matched to the condition. It appears that matching works by providing entries in ... as named arguments, and the handler matches if the name is one of the classes of the condition. Is that right? I don't see the matching rule explicitly stated. And then the handler itself is a single argument function, where the argument is the condition?
>> My reading is that if some code executes signalCondition and it is running inside a tryCatch, control will not return to the line after the signalCondition. Whereas, if the context is withCallHandlers, the call to signalCondition does return (with a NULL) and execution continues. That seems odd; do I have it right?
>> Also, the documents don't explicitly say that the abstract subclasses of 'error' and 'warning' are subclasses of 'condition', though that seems to be implied and true. It appears that for tryCatch only the first matching handler is executed, while for withCallHandlers all matching handlers are executed. And, finally, with restarts there is again the issue of how the name in the name=function form gets matched to the condition, and the more basic question of what happens. My guess is that control stays with the handler, but then this mechanism seems very similar to tryCatch (with the addition of being able to pass extra arguments to the handler and maybe a more flexible handler specification). Can anyone clarify any of this?
>> P.S. Is there any mechanism that would allow one to trap an interrupt, like a ctl-C, so that if the user hit ctl-C some state would be changed but execution would then continue where it was? I have in mind the ctl-C handler setting a "time to finish up" flag which the main code checks from time to time.
>> Thanks.
>> Ross Boylan
Re: [Rd] trying to understand condition handling (interrupts)
[resequencing and deleting for clarity]

On Tue, Feb 20, 2007 at 01:15:25PM -0600, Luke Tierney wrote:
> On Tue, 20 Feb 2007, Ross Boylan wrote:
>> P.S. Is there any mechanism that would allow one to trap an interrupt, like a ctl-C, so that if the user hit ctl-C some state would be changed but execution would then continue where it was? I have in mind the ctl-C handler setting a "time to finish up" flag which the main code checks from time to time.

On Tue, Feb 20, 2007 at 07:35:51AM +, Prof Brian Ripley wrote:
> ... I guess 'ctl-C' is your private abbreviation for 'control C' [yes, ctl-C = control C, RB] (and not a type of cancer): that generates an interrupt in most (but not all) R ports. Where it does, you can set up interrupt handlers (as the help page said)

My P.S. concerned whether the code that was interrupted could continue from the point of interruption. As far as I can tell from ?tryCatch, there is not.

> Currently interrupts cannot be handled in a way that allows them to continue at the point of interruption. On some platforms that is not possible in all cases, and coming close to it is very difficult. So for all practical purposes only tryCatch is currently useful for interrupt handling. At some point disabling interrupts will be possible from the R level but currently I believe it is not.
> Best,
> luke

I had suspected that, since R is not thread-safe, handling asynchronous events might be challenging. I tried the following experiment on Linux:

    > h <- function(e) print("Got You!")
    > f <- function(n, delay) for (i in seq(n)) {Sys.sleep(delay); print(i)}
    > withCallingHandlers(f(7,1), interrupt=h)
    [1] 1
    [1] "Got You!"

So in this case the withCallingHandlers acts like a tryCatch, in that control does not return to the point of interruption. However, sys.calls() within h does show where things were just before the interrupt:

    > h <- function(e) {print("Got You!"); print(sys.calls());}
    > withCallingHandlers(f(7,1), interrupt=h)
    [1] 1
    [1] 2
    [1] 3
    [1] "Got You!"
    [[1]]
    withCallingHandlers(f(7, 1), interrupt = h)

    [[2]]
    f(7, 1)

    [[3]]
    Sys.sleep(delay)

    [[4]]
    function (e)
    {
        print("Got You!")
        print(sys.calls())
    }(list())

Ross
Re: [Rd] trying to understand condition handling
Thanks; your response is very helpful. This message has some remarks on my questions relative to the developer docs, one additional question, and some documentation comments. I'm really glad to hear you plan to revise the exception/condition docs, since I found the existing ones a bit murky. Below, [1] means http://www.stat.uiowa.edu/~luke/R/exceptions/simpcond.html, one of the documents Prof Ripley referred to. That page also has a nice illustration of using the restart facility.

On Tue, Feb 20, 2007 at 01:40:11PM -0600, Luke Tierney wrote:
> On Mon, 19 Feb 2007, Ross Boylan wrote:
>> I'm confused by the page documenting tryCatch and friends. I think it describes 3 separate mechanisms: tryCatch (in which control returns to the invoking tryCatch), withCallHandlers (in which
>
> should have been withCallingHandlers
>
>> control goes up to the calling handler/s but then continues from the point at which signalCondition() was invoked),
>
> unless a handler does a non-local exit, typically by invoking a restart
>
>> and withRestarts (I can't tell where control ends up).
>
> at the withRestarts call

>> For tryCatch the docs say the arguments ... provide handlers, and that these are matched to the condition. It appears that matching works by providing entries in ... as named arguments, and the handler matches if the name is one of the classes of the condition. Is that right? I don't see the matching rule explicitly stated. And then the handler itself is a single argument function, where the argument is the condition?

From [1], while discussing tryCatch: "Handlers are specified as name = fun where name specifies an exception class and fun is a function of one argument, the condition that is to be handled." ...

>> Also, the documents don't explicitly say that the abstract subclasses of 'error' and 'warning' are subclasses of 'condition', though that seems to be implied and true.

The class relations are explicit in [1].
>> It appears that for tryCatch only the first matching handler is executed, while for withCallHandlers all matching handlers are executed.
>
> All handlers are executed, most recently established first, until there are none left or there is a transfer of control. Conceptually, exiting handlers established with tryCatch execute a transfer of control and then run their code.

Here's the one point of clarification: does the preceding paragraph about "all handlers are executed" apply only to withCallingHandlers, or does it include tryCatch as well? Rereading ?tryCatch, it still looks as if the first match only will fire.

> Hopefully a more extensive document on this will get written in the next few months; for now the notes available off the developer page may be useful.
> best,
> luke

Great. FWIW, here are some suggestions about the documentation. I would find a presentation that provided an overall orientation and then worked down easiest to follow. So, going from the top down:

1. There are 3 forms of exception handling: try/catch, calling handlers and restarts.
2. The characteristic behavior of each is ... (i.e., what's the flow of control). Maybe give a snippet of typical uses of each.
3. The details (exact calling environment of the handler(s), matching rules, syntax...).
4. try() is basically a convenient form of tryCatch.
5. Other relations between these 3 forms: what happens if they are nested; how restarts alter the standard control flow of the other forms.

I also found the info that the restart mechanism is the most general and complicated useful for orientation (that might go under point 1). It might be appropriate to document each form on a separate manual page; I'm not sure if they are too linked (particularly by the use of conditions and the control flow of restart) to make that a good idea. I notice that some of the outline above is not the standard R manual format; maybe the big picture should go in the language manual or on a concept page (?Exceptions maybe).
Be explicit about the relations between conditions (class inheritance relations). Be explicit about how handlers are chosen and which forms they take.

It might be worth mentioning stuff that is a little surprising. The fact that the value of the finally is not the value of the tryCatch was a little surprising, since usually the value of a series of statements or expression is that of the last one. The fact that signalCondition can participate in two different flows of control (discussion snipped above) was also surprising to me. In both cases the current ?tryCatch is pretty explicit already, so that's not the issue.

I found the current language (for ?tryCatch) about the calling context of different handlers a bit obscure. For example, discussing tryCatch:

    If a handler is found then control is transferred to the 'tryCatch'
    call that established the handler, the handler found and all more
    recent handlers are disestablished, the handler is called
[Rd] trying to understand condition handling
I'm confused by the page documenting tryCatch and friends. I think it describes 3 separate mechanisms: tryCatch (in which control returns to the invoking tryCatch), withCallHandlers (in which control goes up to the calling handler/s but then continues from the point at which signalCondition() was invoked), and withRestarts (I can't tell where control ends up). For tryCatch the docs say the arguments ... provide handlers, and that these are matched to the condition. It appears that matching works by providing entries in ... as named arguments, and the handler matches if the name is one of the classes of the condition. Is that right? I don't see the matching rule explicitly stated. And then the handler itself is a single argument function, where the argument is the condition? My reading is that if some code executes signalCondition and it is running inside a tryCatch, control will not return to the line after the signalCondition. Whereas, if the context is withCallHandlers, the call to signalCondition does return (with a NULL) and execution continues. That seems odd; do I have it right? Also, the documents don't explicitly say that the abstract subclasses of 'error' and 'warning' are subclasses of 'condition', though that seems to be implied and true. It appears that for tryCatch only the first matching handler is executed, while for withCallHandlers all matching handlers are executed. And, finally, with restarts there is again the issue of how the name in the name=function form gets matched to the condition, and the more basic question of what happens. My guess is that control stays with the handler, but then this mechanism seems very similar to tryCatch (with the addition of being able to pass extra arguments to the handler and maybe a more flexible handler specification). Can anyone clarify any of this? P.S. 
Is there any mechanism that would allow one to trap an interrupt, like a ctl-C, so that if the user hit ctl-C some state would be changed but execution would then continue where it was? I have in mind the ctl-C handler setting a "time to finish up" flag which the main code checks from time to time. Thanks.

Ross Boylan
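A small experiment (my own sketch; myCond is a made-up condition class) separates the two control flows asked about above: under withCallingHandlers, signalCondition() returns NULL and execution continues, while under tryCatch control transfers to the handler and the rest of the body is abandoned:

```r
cond <- structure(class = c("myCond", "condition"),
                  list(message = "demo", call = NULL))

# Calling handler: runs, then control returns after signalCondition()
r1 <- withCallingHandlers({
  signalCondition(cond)   # returns NULL here
  "body finished"
}, myCond = function(c) cat("calling handler saw:", conditionMessage(c), "\n"))

# tryCatch handler: the body never resumes after the signal
r2 <- tryCatch({
  signalCondition(cond)
  "never reached"
}, myCond = function(c) "tryCatch handler result")

r1  # [1] "body finished"
r2  # [1] "tryCatch handler result"
```

This also illustrates the matching rule in the question: the handler's argument name ("myCond") is matched against the classes of the condition object.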
[Rd] pinning down symbol values (Scoping/Promises) question
I would like to define a function using symbols, but freeze the symbols at their current values at the time of definition. Both symbols referring to the global scope and symbols referring to arguments are at issue. Consider this (R 2.4.0):

    > k1 <- 5
    > k
    [1] 100
    > a <- function(z) function() z+k
    > a1 <- a(k1)
    > k1 <- 2
    > k <- 3
    > a1()
    [1] 5
    > k <- 10
    > k1 <- 100
    > a1()
    [1] 12

First, I'm a little surprised that the value for k1 seems to get pinned by the initial evaluation of a1. I expected the final value to be 110, because the z in z+k is a promise. Second, how do I pin the values to the ones that obtain when the different functions are invoked? In other words, how should a be defined so that a1() gets me 5+100 in the previous example?

I have a partial solution (for k), but it's ugly. With k = 1 and k1 = 100,

    > a <- eval(substitute(function(z) function() z+x, list(x=k)))
    > k <- 20
    > a1 <- a(k1)
    > a1()
    [1] 101

(By the way, I thought a <- eval(substitute(function(z) function() z+k)) would work, but it didn't.) This seems to pin the passed-in argument as well, though it's even uglier:

    > a <- eval(substitute(function(z) { z; function() z+x}, list(x=k)))
    > a1 <- a(k1)
    > k1 <- 5
    > a1()
    [1] 120

--
Ross Boylan                          wk: (415) 514-8146
185 Berry St #5700                   [EMAIL PROTECTED]
Dept of Epidemiology and Biostatistics   fax: (415) 514-8150
University of California, San Francisco
San Francisco, CA 94107-1739         hm: (415) 550-1062
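For what it's worth, a less ugly way to get the freezing asked about here (a sketch using force(); k_frozen is a made-up name, and this pins both the argument and the global at the time a() is called):

```r
k1 <- 5
k  <- 100

a <- function(z) {
  force(z)         # force the promise now, pinning z to the current value of k1
  k_frozen <- k    # copy the global now, pinning k
  function() z + k_frozen
}

a1 <- a(k1)
k1 <- 2; k <- 3    # later changes no longer matter
a1()               # [1] 105, i.e. 5 + 100
```

force(z) does explicitly what the `{ z; ... }` trick in the last example does implicitly, and copying the global into a local avoids the eval(substitute(...)) machinery entirely.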
[Rd] R Language Manual: possible error
The R Language manual, section 4.3.4 (Scope), has

f <- function(x) {
  y <- 10
  g <- function(x) x + y
  return(g)
}
h <- f()
h(3)

... "When `h(3)' is evaluated we see that its body is that of `g'. Within that body `x' and `y' are unbound." Is that last sentence right? It looks to me as if x is a bound variable, and the definitions given in the elided material seem to say so too. I guess there is a hidden, outer x that is unbound. Maybe the example was meant to be g <- function(a) a + y? The front page of the manual says "The current version of this document is 2.4.0 (2006-11-25) DRAFT." -- Ross Boylan
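A quick session supports the reading that x is bound and only y is free (a sketch using the manual's own example):

```r
# x is g's formal argument (bound); y is free in g and found in f's frame.
f <- function(x) {
  y <- 10
  g <- function(x) x + y
  return(g)
}
h <- f()
h(3)   # x = 3 is bound at the call; y = 10 comes from f's environment
```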
Re: [Rd] Problem using ofstream in C++ class in package for MacOS X
On Sun, Feb 04, 2007 at 10:47:37PM +0100, cstrato wrote: Seth Falcon wrote: cstrato [EMAIL PROTECTED] writes: Thank you for your fast answer. Sadly, I don't know how to use a debugger on MacOS X; I am using old-style print commands. You should be able to use gdb on OS X (works for me, YMMV). So you could try:

R -d gdb
(gdb) run
# source a script that causes the crash
# back in gdb, use backtrace, etc.

+ seth

Dear Seth, Thank you for this tip, I just tried it and here is the result:

Welcome to MyClass
writeFileCpp("myout_fileCpp.txt")
[1] "outfile = myout_fileCpp.txt"
Writing file myout_fileCpp.txt using C++ style.
---MyClassA::MyClassA()---
---MyClassA::WriteFileCpp---
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x0006
0x020fe231 in std::ostream::flush (this=0x214f178) at /Builds/unix/o403/i686-apple-darwin8/libstdc++-v3/include/bits/ostream.tcc:395
395 /Builds/unix/o403/i686-apple-darwin8/libstdc++-v3/include/bits/ostream.tcc: No such file or directory.
in /Builds/unix/o403/i686-apple-darwin8/libstdc++-v3/include/bits/ostream.tcc
(gdb)

It seems that it cannot find ostream.tcc, whatever this extension means. Best regards, Christian

I also don't see what the problem is, but have a couple of thoughts. Under OS X there is an environment variable you can define to get the dynamic linker to load debug versions of libraries. I can't remember what it is, but maybe something like DYLD_DEBUG (but probably DEBUG is part of the value of the variable). For that, or the tracing above, to be fully informative you need to have installed the appropriate debugging libraries and sources. You may need to set an explicit source search path in gdb to pick up the source files. Try stepping through the code from the write before the crash to determine exactly where it runs into trouble. Does the output file you are trying to create exist? Unfortunately, none of this really gets at your core bug, but it might help track it down. 
Ross
[Rd] One possible use for threads in R
I have been using R on a cluster with some work that does not parallelize neatly because the time individual computations take varies widely and unpredictably. So I've considered implementing a work-stealing arrangement, in which idle nodes grab tasks from busy nodes. It might also be useful for nodes to communicate results with each other. My first thought on handling this was to have one R thread that managed the communication, and 2 that managed computation (each node is dual-processor). Previous discussion has noted that R is not multi-threaded, and also asked what use cases multi-threading might address. So here's a use case. The advantage of having R doing the communication is that it's easy to pass R-level objects around using, e.g., Rmpi. The advantage of having the communicator and the calculators share the same thread is that work and information the communicator got would be immediately available to the calculators. Other comments suggested IPC is fast (though one comment referred specifically to Linux, and the cluster is OS-X), so it may be quite workable to have each thread in a separate process. I'm not at all sure the implementation I sketched above is the best approach to this problem (or even that it would be if R were multi-threaded), but it does seem to me this might be one area where threads would be handy in R. Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
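For readers arriving later: this pattern is now largely covered by dynamic load balancing in the parallel package (which postdates this post); a hedged sketch of the simplest form:

```r
# Dynamic load balancing for tasks of unpredictable duration.
# parLapplyLB hands tasks to workers one at a time, so an idle worker
# picks up new work as soon as it finishes -- a simple work-sharing scheme.
library(parallel)

cl <- makeCluster(2L)
res <- parLapplyLB(cl, 1:8, function(i) i^2)
stopCluster(cl)
unlist(res)   # 1 4 9 16 25 36 49 64
```

This does not give work *stealing* between busy nodes, but it removes the need for a hand-rolled communication thread in the common case.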
Re: [Rd] Problem using ofstream in C++ class in package for MacOS X
On Thu, Feb 08, 2007 at 11:53:21PM +0100, cstrato wrote: ... Maybe there's some subtle linker problem, or a problem with the representation of strings What do you mean with linker problem? Nothing very specific, but generically wrong options, wrong objects/libraries, or wrong order of the first 2. Wrong includes omitting something that should be there or including something that shouldn't. Linking on OS-X is unconventional relative to other systems I have used. In particular, one usually gets lots of errors about duplicate symbols (which can be turned off, at some risk) and needs to specify flat rather than 2-level namespace. There's lots more if you look at the linker page (man ld). Similar issues can arise at the compiler phase too. Another fun thing on OS-X is that they have a libtool that is different from the GNU libtool, and your project might use both. So you need to be sure to get the right one. But it's unlikely you could even build if that were an issue. If different parts (e.g., R vs your code) are built with different options, that can cause trouble. For example, my Makefile has MAINCXXFLAGS := $(shell R CMD config --cppflags) -std=c++98 -Wall -I$(TRUESRCDIR) This relies on GNU make features. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] encoding issues even w/o accents (background on single quotes)
On Wed, Jan 17, 2007 at 11:56:15PM -0800, Ross Boylan wrote: An earlier thread (in 10/2006) discussed encoding issues in the context of R data and the desire to represent accented characters. It matters in another setting: the output generated by R and the seemingly ordinary character ' (single quote). In particular, R CMD check runs test code and compares the generated output to a saved file of expected output. This does not work reliably across encoding schemes. This is unfortunate, since it seems the expected output files will necessarily be wrong for someone. The problem for me was triggered by the single-quote character '. On my older systems, this is encoded by 0x27, a perfectly fine ASCII character. That is on a Debian GNU/Linux system with LANG=en_US. On a newer system I have LANG=en_US.UTF-8. I don't recall whether this was a deliberate choice on my part, or simply reflects changing defaults for the installer. (Note the earlier thread referred to the Debian-derived Ubuntu systems as having switched to UTF-8.) Under UTF-8 the quote that R prints is encoded in the 3-byte sequence 0xE2 0x80 0x98 (which seemed odd at first; I thought the point of UTF-8 was that ASCII was a legitimate subset). Apparently quoting, particularly single quotes, is a can of worms: http://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html When Unicode is available (which would be the case with UTF-8), particular non-ASCII characters are recommended for single quoting. The 3-byte sequence is the UTF-8 encoding of U+2018, the recommended left single quote mark; so ASCII remains a subset of UTF-8, and R is simply emitting a different, non-ASCII character in UTF-8 locales. See http://en.wikipedia.org/wiki/UTF-8 on UTF-8 encoding. This is more than I or, probably, you ever wanted to know about this issue! Ross The coefficient printing methods in the stats package use the single quote in the key explaining significance levels: Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1 I suppose one possible work-around for R CMD check would be to set the encoding to some standard value before it runs tests, but that has some drawbacks. It doesn't work for packages needing a different encoding (but perhaps the package could specify an encoding to use by default?)(*), It will leave the output files looking weird on systems with a different encoding. It will get messed up if one generates the files under the wrong encoding. And none of this addresses stuff beyond the context of output file comparison in R CMD check. Any thoughts? Ross Boylan * From the R Extensions document, discussing the DESCRIPTION file: If the `DESCRIPTION' file is not entirely in ASCII it should contain an `Encoding' field specifying an encoding. This is currently used as the encoding of the `DESCRIPTION' file itself, and may in the future be taken as the encoding for other documentation in the package. Only encoding names `latin1', `latin2' and `UTF-8' are known to be portable. I would not expect that the test output files be considered documentation, but I suppose that's subject to interpretation. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
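The byte-level difference is easy to verify from R itself (a small sketch; the three-byte result assumes the string carries its UTF-8 encoding mark, as "\u2018" does):

```r
# The ASCII apostrophe really is a one-byte UTF-8 character...
charToRaw("'")          # 27
# ...while the directional left single quote U+2018 takes three bytes:
charToRaw("\u2018")     # e2 80 98
```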
Re: [Rd] C vs. C++ as learning and development tool for R
On Fri, Jan 19, 2007 at 03:55:30AM -0500, Kimpel, Mark William wrote: I have 3 years of experience with R and have an interest in becoming a better programmer so that I might someday be able to contribute packages. Other than R, my only experience was taking Lisp from Daniel Friedman in the 1970's. I would like to learn either C or C++ for several reasons: To gain a better concept of object-oriented programming so that I can begin to use S4 methods in R. To perhaps speed up some things I do repeatedly in R. To be able to contribute a package someday. I have been doing some reading and from what I can tell R is more compatible with C, but C++ has much greater capabilities for OO programming. I have just started reading The C++ Programming Language: Special Edition by Bjarne Stroustrup; he recommends first learning C++ and then C if necessary, but as a developer of C++, he is probably biased. I would greatly appreciate the advice of the R developers and package contributors on this subject. C or C++? To echo several other comments, if your goal is to work in R, it would be best to go straight to R. I haven't used Lisp much, but I believe it is much closer to R than most other languages you could pick. It has a functional style, and I recall reading that R's scoping rules were directly inspired by Scheme, a Lisp variant. In fact, I didn't feel I fully grasped them until I looked at Abelson and Sussman's Structure and Interpretation of Computer Programs (which uses Scheme). The functional OO of R is significantly different from the class-based OO found in most languages calling themselves object-oriented, including C++, Java, Python and Smalltalk. Learning those other languages to understand R could actually interfere with learning R. If and when you need speed, you can program in any language that supports Fortran or C interfaces, which is almost all of them. 
As for general education: I use C++ in R, and I have to say that programming in C++ is a wretched experience. You have to make a major commitment to learning the language, which is a minefield of gotchas, to use it in full OO style. As others on this list and Stroustrup suggest, you can use it and just incrementally add features over what you would do in C. It can also be speedy and powerful (to run, not to program in!), which is why I'm using it. For pure OO, I think you can't beat Smalltalk, which is freely available at www.squeak.org (there is also a GNU and several commercial versions). The language rules and syntax fit on one page. The catch is that to use it you need to learn the environment and the class library; these too are big tasks. Objective-C is a much more lightweight C'ish OO than C++ (the author moved Smalltalk concepts into C). It's available as part of the GNU compilers. Unlike Smalltalk, you might use it if you cared about performance, and it's the native language of Mac OS X. It has a relatively small learning curve. Python and Java are other choices for OO, both significantly simpler than C++. I find Python to be simple and elegant; it's also nifty for scripting random tasks. Java is widely used on the web and in the enterprise. Eiffel is also interesting. I can't say much about libraries already on other machines, but the C runtime is probably the one you can count on being there the most. Of course, another route would be to explore other functional languages, a terrain I barely know: Haskell, ML, OCaml... In particular, some of them have lazy evaluation of arguments, which R also employs. And there are the functional/object languages like CLOS (I think the O in OCaml is for Object). Anyway, this risks becoming a general language thread. My main point, as someone who's been there, is: don't use C++ unless you have a compelling reason and a lot of time! 
Ross Boylan (Among the languages listed, the ones I've used extensively are C, C++, Objective-C, Python, R, and Smalltalk.)
[Rd] Problems with checking documentation vs data, and a proposal
I have a single data file inputs.RData that contains 3 objects. I generated an Rd page for each object using prompt(). When I run R CMD check I get

* checking for code/documentation mismatches ... WARNING
Warning in utils::data(list = al, envir = data_env) :
  data set 'gold' not found

(gold is one of the objects). This appears to be coming from the codocData function defined in src/library/tools/R/QC.R (this is in the Debianised source 2.4.1, so the path might be a little different). According to the help on this function, it will only attempt a match when there is a single alias in the documentation file, although I'm not sure that's what the code does (it seems to check only if there is more than one format section). At any rate, the central logic appears to gather up names of data objects and then to load them with

## Try loading the data set into data_env.
utils::data(list = al, envir = data_env)
if (exists(al, envir = data_env, mode = "list", inherits = FALSE)) {
    al <- get(al, envir = data_env, mode = "list")
}

Since there is no gold.RData, this is failing. This leads to 2 issues: what should I do now, and how might this work better in the future? Taking the future first, how about having the code first load all the data files that it finds somewhere near the beginning? If it did so, the code

## Try finding the variable or data set given by the alias.
al <- aliases[i]
if (exists(al, envir = code_env, mode = "list", inherits = FALSE)) {

which precedes the earlier snippet, would find the symbol was defined and be happy. I suppose the data could be loaded into code_env, although using it seems to risk deciding that a data symbol is defined when the symbol refers to a code object. I'm not sure if attempting to load the data objects individually should still be attempted under this scenario, if the symbol is not already present. 
In the short run, since I would like the code to pass R CMD check with versions of R that don't include this possible enhancement, what can I do? I see several options, none of them beautiful: 1) Delete inputs.RData and create 3 separate data objects. However, I have code that relies on inputs being present, and the 3 data items go together naturally. 2) Make a single document describing inputs.RData. First problem: the page would be awkward, combining all 3 things. Second, it looks as if codocData might still try loading the individual data objects, since it tries to pull data names out of the documentation, even out of individual items inside \describe. 3) Attempt to disable the checks by adding multiple aliases or something else to be revealed by closer inspection of the code. This is a hack that bypasses the checking altogether (unless it turns out I still get a complaint about missing documentation). 4) Create gold.RData and others as symlinks to inputs.RData. Fragile across operating systems, version control systems, and versions of tar. Might get errors about multiple data definitions. Usual caveats: this is all based on my imperfect understanding of the code. So, any comments on the possible modification to codocData or the work-arounds? -- Ross Boylan
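Option 1 is mechanically simple; a sketch of splitting a combined .RData into one file per object (gold is named in the post; the other object names and the temporary paths here are made up):

```r
# Split a combined RData file into one file per object, the layout that
# R CMD check's data/documentation matching expects.
gold <- data.frame(x = 1); other1 <- 2; other2 <- 3   # stand-in objects
td <- tempfile(); dir.create(td)
save(gold, other1, other2, file = file.path(td, "inputs.RData"))

e <- new.env()
load(file.path(td, "inputs.RData"), envir = e)
for (nm in ls(e))
  save(list = nm, envir = e, file = file.path(td, paste0(nm, ".RData")))

file.exists(file.path(td, "gold.RData"))   # TRUE
```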
Re: [Rd] Problems with checking documentation vs data, and a proposal
On Tue, 2007-01-16 at 14:03 -0800, Ross Boylan wrote: I have a single data file inputs.RData that contains 3 objects. ... 4) Create gold.RData and others as symlinks to inputs.RData. Fragile across operating systems, version control systems, and versions of tar. Might get errors about multiple data definitions.

Option 4 worked, though the symlinks were converted to regular files by R CMD check.
Re: [Rd] Am I missing something about debugging?
On Thu, 2007-01-04 at 17:06 +1100, [EMAIL PROTECTED] wrote: It is possible to do some of these things with the 'debug' package -- the article in R News 2003 #3 shows a few of the tricks. Suppose 'b1' calls 'c1'. If 'c1' exists as a permanent function defined outside 'b1' (which I generally prefer, for clarity), then you can call 'mtrace(c1)' and the debugger will be invoked whenever 'c1' is called -- you don't have to first 'mtrace' 'b1' and then manually call 'mtrace(c1)' while inside 'b1'.

Is the effect of mtrace permanent? For example, if

b1 <- function() {
  # stuff
  c1()
  # stuff
  c1()
}

and you mtrace(c1), will both calls to c1, as well as any outside of b1, bring up the debugger? I ask because sometimes the normal step semantics in debugging is more useful, i.e., debug into the next call to c1 only. As I understand it, the debug package can put a one-time-only breakpoint (with go), but only in the body of the currently active function. Am I correct that both the debug package and the regular debug() require explicitly removing debugging from a function to turn off debugging?

Even if 'c1' is defined inside the body of 'b1', you can get something similar by using conditional breakpoints, like this:

mtrace(b1)
# whatever you type to get 'b1' going
D(17)> # now look at the code window for 'b1' and find the line just
D(17)> # after the definition of 'c1' ... say that's on line 11
D(17)> bp(11, {mtrace(c1); FALSE})
# which will auto-mtrace 'c1' without stopping; of course you could
# hardwire this in the code of 'b1' too

If you invoke b1 multiple times, will the previous procedure result in wrapping c1 multiple times, e.g., the first time through c1 is replaced by mtrace(c1), and the second time rewrites the already rewritten function? Is that a problem?

The point is that you can stick all sorts of code inside a conditional breakpoint to do other things -- if the expression returns FALSE then the breakpoint won't be triggered, but the side-effects will still happen. 
You can also use conditional breakpoints and the 'skip' command to patch code on-the-fly, but I generally find it's too much trouble. Note also the trick of

D(17)> bp(1, FALSE)

which is useful if 'b1' will be called again within the lifetime of the current top-level expression and you actually don't want to stop.

Is bp(1, FALSE) equivalent to mtrace(f, FALSE), if one is currently debugging in f?

The point about context is subtle because of R's scoping rules -- should one look at lexical scope, or at things defined in calling functions? The former happens by default in the 'debug' package (i.e., if you type the name of something that can be seen from the current function, then the debugger will find it, even if it's not defined in the current frame). For the latter, though, if you are currently inside c1, then one way to do it is to use 'sys.parent()' or 'sys.parent(2)' or whatever to figure out the frame number of the context you want; then you could do e.g.

D(18)> sp <- sys.frame(sys.parent(2))
D(18)> evalq(ls(), sp)

etc., which is not too bad. It's worth experimenting with sys.call etc. while inside my debugger, too -- I have gone to some lengths to try to ensure that those functions work the way that might be expected (even though they actually don't... long story).

sys.parent and friends didn't seem to work for me in vanilla R debugging, so this sounds really useful to me.

If you are 'mtrace'ing one of the calling functions as well, then you can also look at the frame numbers in the code windows to work out where to 'evalq'.

I thought the frame numbers shown in the debugger are numbered successively for the call stack, and that these are not necessarily the same as the frame numbers in R. My understanding is that the latter are not guaranteed to be consecutive (relative to the call stack). From the description of sys.parent: The parent frame of a function evaluation is the environment in which the function was called. 
It is not necessarily numbered one less than the frame number of the current evaluation.

The current 'debug' package doesn't include a watch window (even though it's something I rely on heavily in Delphi, my main other language), mainly because R can get stuck figuring out what to display in that window.

Just out of curiosity, how does that problem arise? I'd expect showing the variables in a frame to be straightforward.

It's not that hard to do (I used to have one in the S-PLUS version of my debugger) and I might add one in future if demand is high enough. It would help if there was some way to time out a calculation -- e.g. a 'time.try' function a la

result <- time.try({ do.some.big.calculation }, 0.05)

which would return an object of class "too-slow" if the calculation takes more than 0.05s. I'm certainly willing to consider adding other features to the 'debug' package if they are easy enough and demand is high enough! [And if I have time, which I mostly
Re: [Rd] Which programming paradigm is the most used for make R packages?
On Wed, Jan 03, 2007 at 11:46:16AM -0600, Ricardo Rios wrote: Hi wizards, does somebody know which programming paradigm is the most used to make R packages? Thanks in advance. You need to explain what you mean by the question, for example what paradigms you have in mind. R is a functional language; as I've discovered, this means some standard OO programming approaches don't carry over too naturally. In particular, functions don't really belong to classes. R purists would probably want that to say that class-based OO programming doesn't fit, since R is function-based OO. There is a package that permits a more traditional (class-based) OO style; I think it's called R.oo. Ross Boylan
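To illustrate the distinction, a sketch: S4 (functional style) below, contrasted with reference classes, which arrived in later versions of R and give class-based, mutable semantics much like R.oo (the Counter classes and field names are made up):

```r
library(methods)

# Functional (S4) style: methods belong to generics, objects are copied.
setClass("Counter", representation(n = "numeric"))
setGeneric("bump", function(x) standardGeneric("bump"))
setMethod("bump", "Counter", function(x) { x@n <- x@n + 1; x })

c1 <- new("Counter", n = 0)
c1 <- bump(c1)     # must rebind the returned copy
c1@n               # 1

# Class-based style: methods live in the class, objects mutate in place.
CounterR <- setRefClass("CounterR",
  fields  = list(n = "numeric"),
  methods = list(bump = function() { n <<- n + 1 }))
c2 <- CounterR$new(n = 0)
c2$bump()          # no rebinding needed
c2$n               # 1
```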
[Rd] Am I missing something about debugging?
I would like to be able to trace execution into calls below the current function, or to follow execution as calls return. This is roughly the distinction between "step" and "next" in many debuggers. I would also like to be able to switch to a location further up the call stack than the location at which I entered the debugger, to see the context of the current operations. Are there ways to do these things with the R debugger? I've studied the man pages and FAQs, and looked at the debug package, but I don't see a way, except for manually calling debug() on the function that is about to be called if I want to descend. That's quite awkward, particularly since it must be manually undone (the debug package may be better on that score). I'm also not entirely sure that such recursion (essentially, debugging within the debugger) is OK. I tried looking up the stack with things like sys.calls() from within the browser, but they operated as if I were at the top level (e.g., sys.function(-1) gets an error that it can't go there). I was doing this in ESS, and there's some chance the "can't write .Last.value" error (wording approximate), caused by having an old version, was screwing things up. Since R is interpreted I would expect debugging to be a snap, but these limitations make me suspect there is something about the language design that makes implementing these facilities hard. For example, the browser as documented in the Green Book has up and down functions to change the frame (p. 265); these are conspicuously absent in R. -- Ross Boylan
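For what it's worth, the stack is inspectable from ordinary code even though the browser makes it awkward; a small sketch using sys.calls() (f, g, h are made-up functions):

```r
# Each frame on the call stack is visible via sys.calls(); here h()
# reports the chain of callers that led to it.
f <- function() g()
g <- function() h()
h <- function() vapply(sys.calls(), function(cl) deparse(cl[[1]]), "")

stack <- f()
tail(stack, 3)   # "f" "g" "h"
```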
Re: [Rd] Am I missing something about debugging?
On Tue, 2007-01-02 at 17:24 -0500, Duncan Murdoch wrote: I don't think you're missing anything with the debug() function. It needs updating.

Bummer!

I don't think there's any structural reason why you shouldn't be able to do the things you're talking about in R, but they haven't been implemented.

That's good to know. I was wondering if the lexical scoping was complicating things. At least the way I think of it, every call has two sets of (potentially) nested environments: the lexical scopes of the function definition and the dynamic scopes of the call. But since the dynamic scopes are available, using them seems possible.

Mark Bravington put together a package (called debug) that does more than debug() does, but I haven't used it much, and I don't know if it does what you want.

It looked to me as if it was some help, but no advance on the investigating-dynamic-frames front.

I recently added things to the R parser to keep track of connections between R code and source files; that was partly meant as a first step towards improving the debugging facilities. I'd be happy to help anyone who wants to do the hard work, but I don't think I'll be able to work on it before next summer. (If you do decide to work on it, please let me know, just in case I do get a chance: no point duplicating effort.)

I didn't even realize such a facility was needed, which shows how much I know! Working on the debugger is probably not in my job description, unless I get really annoyed. The Smalltalk debugger is the standard by which I judge all others; it's just amazing. You can go up and down the stack, graphically examine variables (and follow links), and change code in the middle of debugging and then continue. Ross
[Rd] Capturing argument values
I would like to preserve the values of all the arguments to a function in a results object. Say

foo <- function(a, b=1) ...
foo(x, 3)

match.call() looks promising, but it records that a is x, while I want the value of x (in the calling frame). Also, if the invocation is foo(x), then match.call doesn't record that b is 1. So I tried this (inside the function definition):

myargs <- lapply(names(formals()), function(x) eval(as.name(x)))

That's pretty close. However, my function has an optional argument in this sense:

bar <- function(x, testing) ...

where code in the body is

if (!missing(testing)) # do stuff

When the eval in the previous lapply runs for a function call in which testing is not supplied, I get

Error in eval(expr, envir, enclos) : argument "testing" is missing, with no default

exposing a weakness in both my implementation and problem specification. I think I could simply screen testing out of the formals and be happy, but are there better ways of handling this situation? I realize I could capture the function's entire local frame, but that has quite a bit of stuff I don't want in it. I suspect some of the items in it might be promises, and so would not have the values I needed as well. (Also the frame could later change, though I guess I could convert it to a list to avoid that problem.) Ross Boylan
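One workable pattern (a sketch; capture_args is my own construction, and calling missing() via eval in another frame has documented caveats, so treat it as illustrative):

```r
# Capture the evaluated values of the calling function's arguments,
# silently skipping arguments that were not supplied.
capture_args <- function() {
  f   <- sys.function(-1)        # the function that called capture_args()
  env <- parent.frame()          # its evaluation frame
  nms <- setdiff(names(formals(f)), "...")
  supplied <- nms[!vapply(nms, function(n)
    eval(call("missing", as.name(n)), env), logical(1))]
  mget(supplied, envir = env)    # forces the promises, returns plain values
}

bar <- function(x, testing) capture_args()
bar(1 + 1)            # list(x = 2): 'testing' is screened out, no error
bar(2, testing = 3)   # list(x = 2, testing = 3)
```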
[Rd] setGeneric and file order in a package
Are there any assumptions I can make about the order in which the different files in the package's R subdirectory will be evaluated? For example, will it be alphabetical in the file names? Will case distinctions be ignored? I ask because I would like to use setGeneric, as in

setGeneric("foo", function(x) standardGeneric("foo"))

and am wondering where that should go. I realize I could explicitly test for the existence of the generic in each spot, and execute setGeneric as needed. That seems wasteful and error-prone. Is there other recommended practice in setting generics? For example, should I test for existence of a generic in the one spot I create it? Since that seems like a half-measure (if a generic exists it may well have different arguments), I suppose I should use namespaces... Thanks. Ross Boylan
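Two facts for later readers: the Collate field in DESCRIPTION lets a package fix the source-file order explicitly (see Writing R Extensions), and a guard like the one below avoids redefining an existing generic. A sketch, with foo and class A as stand-in names:

```r
library(methods)

# Create the generic only if no generic of that name exists yet.
if (!isGeneric("foo")) {
  setGeneric("foo", function(x) standardGeneric("foo"))
}

setClass("A", representation(x = "numeric"))
setMethod("foo", "A", function(x) x@x * 2)

foo(new("A", x = 21))   # 42
```

As the post notes, this only checks existence, not that the arguments match, so in a package the Collate field (or a single file defining all generics, named to sort first) is the more robust arrangement.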
Re: [Rd] promptClass misses methods
On Sat, Dec 02, 2006 at 05:11:22PM +0100, Martin Maechler wrote: RossB == Ross Boylan [EMAIL PROTECTED] on Fri, 1 Dec 2006 11:33:21 -0800 writes: RossB On Fri, Dec 01, 2006 at 11:37:46AM +0100, Martin RossB Maechler wrote: RossB == Ross Boylan [EMAIL PROTECTED] on Thu, 30 Nov 2006 22:29:06 -0800 writes: RossB I've had repeated problems with promptClass missing RossB methods, usually telling me a class has no methods RossB when it does. RossB In my current case, I've defined an S4 class RossB mspathCoefficients with a print method RossB setMethod("print", signature(x = "mspathCoefficients"), RossB function(x, ...) { # etc. You should *not* define print methods for S4 classes; rather you should define show methods. RossB Is that because print is used by the S3 system? no, not really. RossB And is the general rule to avoid using S3 methods for S4 RossB classes? Well, your wording is murky, but no: you *should* define (S4) methods for S3 generics; that works very well. The S3 generics are automagically promoted to S4 generics as soon as you define an S4 method for one. That answers my question. The meaning was: if foo is an S3 method, should one avoid defining foo as an S4 method? And the answer is no, it's OK. I assume one should strive to use the same argument names, although since S3 methods don't need to use the same argument names I'm not sure how that works (e.g. for S3, foo.class1 <- function(x, a, b) but foo.class2 <- function(x, c)). print() is just a big exception. How come? Is it an exception in the sense that it is not automatically used to display the object, or in the sense that one should never define S4 print methods at all? (Looks like the first alternative, based on the example later.) The print methods I have seem to work OK, provided I don't expect them to be called automatically and provided I don't expect promptClass to pick them up. 
RossB For example, http://www.omegahat.org/RSMethods/Intro.pdf, which is RossB referenced in the package help for methods, discusses RossB show, print and plot as 3 alternatives in S4 (p. 9, RossB though a footnote says that at that time--2001--R RossB didn't recognize formal methods for printing RossB objects.) 2001 is way in the past concerning S4 implementation in R. Perhaps the reference should be removed then. Maybe the newer http://developer.r-project.org/howMethodsWork.pdf would be better? However, that is pitched more toward the internals, and is already referenced in ?Methods. Specifically, using S4 in R: we'd **very strongly** recommend R 2.4.0 (and ideally even R-patched) because of several recent good developments. Fortunately that's what I'm using. I wonder if this is so important I should require R >= 2.4 for my package. It was working fine in earlier versions. The main user-visible changes I'm aware of are those in the object forms (i.e., binary incompatibility) and some improvements in the algorithm for choosing which method to dispatch to (semantically, sometimes a different method gets called; it sounds faster too). I'm not distributing any data files with S4 class objects, and don't have any corner cases on method dispatch. RossB I've been unable to locate much information about RossB combining S3 and S4 methods, though I recall seeing a RossB note saying this issue was still to be addressed in RossB R. Perhaps it has been now, with setOldClass? At RossB any rate, the help for that method addresses classes RossB rather than methods, and I didn't see anything in RossB ?Methods, ?setMethod, or ?setGeneric. RossB show() raises two additional issues for me. First, RossB it takes a single argument, and I want to be able to RossB pass in additional arguments via ... . That's not possible currently. In the expected use of show(), namely automatically showing an object, additional arguments don't make sense (since there's no chance to provide them). 
If the only problem in my use of print is that it's not called automatically, then perhaps I should leave it as is and define a show method that invokes print(). That seems to be the pattern in the example you provided below. And I agree that in certain cases, I would want to have the flexibility of print(..) there; one case is for printing/showing fitted LMER objects; the following code is used:

## This is modeled a bit after print.summary.lm :
printMer <- function(x, digits = max(3, getOption("digits") - 3),
                     correlation = TRUE, symbolic.cor = x$symbolic.cor,
                     signif.stars = getOption("show.signif.stars"), ...)
{
    ...
    ...
    invisible(x)
}
setMethod("print", "mer", printMer)
setMethod("show", "mer", function(object) printMer(object))

RossB Second, I read
Re: [Rd] promptClass misses methods
On Fri, Dec 01, 2006 at 11:37:46AM +0100, Martin Maechler wrote: RossB == Ross Boylan [EMAIL PROTECTED] on Thu, 30 Nov 2006 22:29:06 -0800 writes: RossB I've had repeated problems with promptClass missing RossB methods, usually telling me a class has no methods RossB when it does. RossB In my current case, I've defined an S4 class RossB mspathCoefficients with a print method RossB setMethod("print", signature(x="mspathCoefficients"), RossB function(x, ...) { # etc You should *not* define print methods for S4 classes; rather you should define show methods. Is that because print is used by the S3 system? And is the general rule to avoid using S3 methods for S4 classes? For example, http://www.omegahat.org/RSMethods/Intro.pdf, which is referenced in the package help for methods, discusses show, print and plot as 3 alternatives in S4 (p. 9, though a footnote says that at that time--2001--R didn't recognize formal methods for printing objects.) I've been unable to locate much information about combining S3 and S4 methods, though I recall seeing a note saying this issue was still to be addressed in R. Perhaps it has been now, with setOldClass? At any rate, the help for that function addresses classes rather than methods, and I didn't see anything in ?Methods, ?setMethod, or ?setGeneric. show() raises two additional issues for me. First, it takes a single argument, and I want to be able to pass in additional arguments via ... . Second, I read some advice somewhere, which I can no longer find, that show methods should return an object, and that object in turn should be the thing that is printed. I don't understand the motivation for that rule, at least in this case, because my object is already a results object. RossB The file promptClass creates has no methods in it. 
showMethods(classes="mspathCoefficients") RossB Function: initialize (package methods) RossB .Object=mspathCoefficients (inherited from: RossB .Object=ANY) so it's just inherited from ANY RossB Function: print (package base) RossB x=mspathCoefficients that's the one So why isn't promptClass picking it up? RossB Function: show (package methods) RossB object=mspathCoefficients RossB (inherited from: object=ANY) so it's just inherited from ANY Ross, it would really be more polite to your readers if you followed the posting guide and posted complete, fully reproducible code... I thought it might be overkill in this case. At any rate, it sounds as if I may be trying to do the wrong thing, so I'd appreciate guidance on what the right thing to do is. Here's a toy example:

setClass("A", representation(x="numeric"))
setMethod("print", signature(x="A"),
          function(x, ...) print(x@x, ...))
promptClass("A")

The generated file has no print method. getGeneric("print") RossB standardGeneric for "print" defined from package RossB "base" RossB function (x, ...) standardGeneric("print") RossB environment: 0x84f2d88 Methods may be defined for RossB arguments: x RossB I've looked through the code for promptClass, but RossB nothing popped out at me. RossB It may be relevant that I'm running under ESS in RossB emacs. However, I get the same results running R RossB from the command line. RossB Can anyone tell me what's going on here? This is RossB with R 2.4, and I'm not currently using any namespace RossB for my definitions. [and not a package either?] The code is part of a package, but I'm developing code snippets in ESS without loading the whole package. I'm very willing to look at this, once you've provided what the posting guide asks for, see above. Regards, Martin Thank you. 
For completeness, here's some system info: sessionInfo() R version 2.4.0 (2006-10-03) i486-pc-linux-gnu locale: LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C attached base packages: [1] methods stats graphics grDevices utils datasets [7] base The Debian package is r-base-core 2.4.0.20061103-1. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] printing coefficients with text
On Fri, Dec 01, 2006 at 10:34:45AM +0100, Martin Maechler wrote: RossB == Ross Boylan [EMAIL PROTECTED] on Thu, 30 Nov 2006 12:17:55 -0800 writes: RossB I want to print the coefficient estimates of a model RossB in a way as consistent with other output in R as RossB possible. stats provides the printCoefmat function RossB for doing this, but there is one problem. I have an RossB additional piece of textual information I want to put RossB on the line with the other info on each coefficient. that's not a real problem, see below RossB The documentation for printCoefmat says the first RossB argument must be numeric, which seems to rule this out. it does say that (it says x: a numeric matrix-like object, which includes data frames with factors) but you are right that it does preclude a column of character. Having gone through the code, it's clear the code itself requires all numerics. RossB I just realized I might be able to cheat by inserting RossB the text into the name of the variable (fortunately RossB there is just one item of text). I think that's in RossB the names of the matrix given as the first argument RossB to the function. yes; it's the rownames(); i.e., you'd do something like rownames(myx) <- paste(rownames(myx), format(mytext_var)) which seems simple enough to me, but it only works when the text is the first column This actually worked out great for me. RossB Are there any better solutions? Obviously I could RossB just copy the method and modify it, but that creates RossB duplicate code and loses the ability to track future RossB changes to printCoefmat. As the original author of printCoefmat(), I'm quite willing to accept and incorporate a patch to the current function definition (in https://svn.r-project.org/R/trunk/src/library/R/anova.R), if it's well written. As a matter of fact, I think I already see how to generalize printCoefmat() to work for the case of a data frame with character columns Yes, that seems as if it would be a good generalization. 
However, there is code that makes inferences based on the number of columns of data, and I'm not sure how that should work. Probably it should ignore the non-numeric data. [I would not want a character matrix however; since that would mean going numeric - character - numeric - formatting (i.e character) for the 'coefficients' themselves]. Can you send me a reproducible example? You mean of an input data frame? Or something else? The input isn't currently a data frame, but I could certainly make one. Do you think generalizing to other types (factor, logical) would make sense too? or at least an *.Rda file of a save()d such data frame? RossB Thanks. Ross Boylan You're welcome, Martin Maechler, ETH Zurich __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
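The rownames() workaround discussed in this thread can be sketched as follows (the coefficient values and the per-row text are invented, and this assumes a matrix carrying only estimate and standard-error columns):

```r
cm <- cbind(Estimate = c(1.23, -0.57),
            `Std. Error` = c(0.31, 0.22))
rownames(cm) <- c("age", "sex")
## splice the per-coefficient text into the row labels
rownames(cm) <- paste(rownames(cm), c("(fixed)", "(free)"))
printCoefmat(cm)
```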
Re: [Rd] Web site link problems (PR#9401)
On Thu, Nov 30, 2006 at 10:59:13AM +0100, Peter Dalgaard wrote: [EMAIL PROTECTED] wrote: 2. http://www.r-project.org/posting-guide.html includes in the section Surprising behavior and bugs, make sure you read R Bugs in the R-faq. The latter is the link http://cran.r-project.org/doc/FAQ/R-FAQ.html#R%20Bugs, which takes me to the page but not the section. The link on the FAQ page to that section is http://cran.r-project.org/doc/FAQ/R-FAQ.html#R-Bugs (i.e., no %20). Desired state: update the link. Yes. (It's only a half-page scroll plus an extra click though...) The risk is that someone will just conclude it's a bad link and stop there. You also might want to consider footers on your web pages saying to report problems with this web page do x. The pages I looked at didn't have this info, as far as I can tell. Maybe, if it is easy. The whole bug repository is overdue for replacement, so things that are not critical and/or easy to fix may be left alone... I hope this is an appropriate place to let you know! It'll do. Don't report other website issues to the bug repository though. OK. Where should such reports go? Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] printing coefficients with text
I want to print the coefficient estimates of a model in a way as consistent with other output in R as possible. stats provides the printCoefmat function for doing this, but there is one problem. I have an additional piece of textual information I want to put on the line with the other info on each coefficient. The documentation for printCoefmat says the first argument must be numeric, which seems to rule this out. I just realized I might be able to cheat by inserting the text into the name of the variable (fortunately there is just one item of text). I think that's in the names of the matrix given as the first argument to the function. Are there any better solutions? Obviously I could just copy the method and modify it, but that creates duplicate code and loses the ability to track future changes to printCoefmat. Thanks. Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] promptClass misses methods
I've had repeated problems with promptClass missing methods, usually telling me a class has no methods when it does. In my current case, I've defined an S4 class mspathCoefficients with a print method

setMethod("print", signature(x="mspathCoefficients"),
          function(x, ...) { # etc

The file promptClass creates has no methods in it.

showMethods(classes="mspathCoefficients")
Function: initialize (package methods)
.Object=mspathCoefficients
    (inherited from: .Object=ANY)

Function: print (package base)
x=mspathCoefficients

Function: show (package methods)
object=mspathCoefficients
    (inherited from: object=ANY)

getGeneric("print")
standardGeneric for "print" defined from package "base"
function (x, ...)
standardGeneric("print")
environment: 0x84f2d88
Methods may be defined for arguments: x

I've looked through the code for promptClass, but nothing popped out at me. It may be relevant that I'm running under ESS in emacs. However, I get the same results running R from the command line. Can anyone tell me what's going on here? This is with R 2.4, and I'm not currently using any namespace for my definitions. Thanks. Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] promptClass misses methods (addendum)
On Thu, Nov 30, 2006 at 10:29:06PM -0800, Ross Boylan wrote: I've had repeated problems with promptClass missing methods, usually telling me a class has no methods when it does. In my current case, I've defined an S4 class mspathCoefficients with a print method setMethod("print", signature(x="mspathCoefficients"), function(x, ...) { # etc It may also be relevant that there is a mspathCoefficients function, which constructs a member of the class. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Missing values for S4 slots [One Solution]
On Fri, Nov 24, 2006 at 11:23:14AM -0800, Ross Boylan wrote: Using R 2.4, the following fails:

setClass("testc", representation(a="ANY"))
makeC <- function(myarg) new("testc", a=myarg)
makeC()
Error in initialize(value, ...) : argument "myarg" is missing, with no default

I suspect there's something I could do to get the constructor arguments, modify the list (i.e., delete args that were missing and insert new ones), and do.call(new, myArgList). Not only am I unsure how to do that (or if it would work), I'm hoping there's a better way. I didn't find a way to get all the arguments easily(*), but manually constructing the list works. Here are fragments of the code:

mspathCoefficients <- function(
    aMatrix,
    params,
    offset=0,
    baseConstrVec,
    covLabels
    # other args omitted
) {
    # innerArgs are used with do.call(new, innerArgs)
    innerArgs <- list(
        Class = "mspathCoefficients",
        aMatrix = aMatrix,
        baseConstrVec = as.integer(baseConstrVec),
        params = params
    )
    # the next block inserts the covLabels argument
    # only if it is non-missing
    if (missing(covLabels)) {
        #
    } else {
        innerArgs$covLabels <- covLabels
    }
    #...
    map <- list()
    # fill in the map
    # add it to the arguments
    innerArgs$map <- map
    do.call(new, innerArgs)
}

This calls new("mspathCoefficients", ...) with just the non-missing arguments. The constructed object has appropriately missing values in the slots (e.g., character(0) given a slot of type character). (*) Inside the function, arg <- list(...) will capture the unnamed arguments, but I don't have any. as.list(sys.frame(sys.nframe())) is closer to what I was looking for, though all the values are promises. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
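A condensed version of the same pattern, for any constructor that should pass only its non-missing arguments on to new() (class and argument names are the toy ones from the message above):

```r
setClass("testc", representation(a = "ANY"))

makeC <- function(myarg) {
  args <- list(Class = "testc")
  if (!missing(myarg)) args$a <- myarg  # include the slot only when supplied
  do.call(new, args)
}

makeC()      # works: slot 'a' left at its prototype
makeC(1:3)   # slot 'a' set to 1:3
```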
[Rd] tail recursion in R
Apparently Scheme is clever and can turn certain apparently recursive function calls into non-recursive evaluations. Does R do anything like that? I could find no reference to it in the language manual. What I'm wondering is whether there are desirable ways to express recursion in R. Thanks. -- Ross Boylan wk: (415) 514-8146 185 Berry St #5700 [EMAIL PROTECTED] Dept of Epidemiology and Biostatistics fax: (415) 514-8150 University of California, San Francisco San Francisco, CA 94107-1739 hm: (415) 550-1062 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
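For readers of the archive: R does not eliminate tail calls the way Scheme does, so a deeply tail-recursive function grows the call stack where an equivalent loop would not. A small illustration (factorial is just a stand-in example):

```r
## tail-recursive form: still consumes one stack frame per call in R
fact_rec <- function(n, acc = 1) {
  if (n <= 1) acc else fact_rec(n - 1, acc * n)
}

## iterative form: constant stack depth
fact_iter <- function(n) {
  acc <- 1
  while (n > 1) {
    acc <- acc * n
    n <- n - 1
  }
  acc
}

fact_rec(10) == fact_iter(10)   # TRUE
```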
[Rd] \link to another package
In the documentation for my package I would like to reference the Rmpi documentation. I started with \link{Rmpi}, which caused R CMD check to complain that it could not resolve the link. Since Rmpi wasn't loaded, this isn't surprising. Ideally the user would see Rmpi, but the link would go to Rmpi's Rmpi-pkg. It's not clear to me if this is possible. I've combined two documented features to say \link[Rmpi=Rmpi-pkg]{Rmpi}. Is that OK? According to the 2.3.1 documentation \link[Rmpi]{Rmpi} gets me the Rmpi link in the Rmpi package and \link[=Rmpi-pkg]{Rmpi} gets me Rmpi-pkg. There is no mention of combining these two syntaxes. There is also a discussion of \link[Rmpi:bar]{Rmpi}, but that appears to refer to a *file* bar.html rather than an internal value (i.e., \alias{bar}). Not only R CMD check but also some users of the package may not have Rmpi loaded, so I'd like the documentation to work gracefully in that situation. Gracefully means that if Rmpi is not loaded the help still shows; it does not mean that clicking on the link magically produces the Rmpi documentation. -- Ross Boylan wk: (415) 514-8146 185 Berry St #5700 [EMAIL PROTECTED] Dept of Epidemiology and Biostatistics fax: (415) 514-8150 University of California, San Francisco San Francisco, CA 94107-1739 hm: (415) 550-1062 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] documenation duplication and proposed automatic tools
I've been looking at documenting S4 classes and methods, though I have a feeling many of these issues apply to S3 as well. My impression is that the documentation system requires or recommends creating basically the same information in several places. I'd like to explain that, see if I'm correct, and suggest that a more automated framework might make life easier. PROBLEM Consider a class A and a method foo that operates on A. As I understand it, I must document the generic function foo (or ?foo will not produce a response) and the method foo (or methods ? foo will not produce a response). Additionally, it appears to be recommended that I document foo in the Methods section of the documentation for class A. Finally, I may want to document the method foo with specific arguments (particularly if it uses unusual arguments, but presumably also if the semantics are different in a class that extends A). This seems like a lot of work to me, and it also seems error prone and subject to synchronization errors. R CMD check checks vanilla function documentation for agreement with the code, but I'm not sure that method documentation in other contexts gets much help. To complete the picture, suppose there is another function, bar, that operates on A. B extends A, and reimplements foo, but not bar. I think the suggestion is that I go back and add the B-flavored method foo to the general methods documentation for foo. I also have a choice whether I should mention bar in the documentation for the class B. If I mention it, it's easier for the reader to grasp the whole interface that B presents. However, I make it harder to determine which methods implement new functionality. 
SOLUTION There are a bunch of things users of OO systems typically want to know: 1) the relations between classes 2) the methods implemented by a class (for B, just foo) 3) the interface provided by a class (for B, foo and bar) 4) the various implementations of a particular method All of these can be discovered dynamically by the user. The problem is that the current documentation system attempts to reproduce this dynamic information in static pages. prompt, promptClass and promptMethods functions generate templates that contain much of the information (or at least they're supposed to; they seem to miss stuff for me, for example saying there are no methods when there are methods). This is helpful, but has two weaknesses. First, the class developer must enter very similar information in multiple places (specifically, function, methods, and class documentation). Second, that information is likely to get dated as the classes are modified and extended. I think it would be better if the class developer could enter the information once, and the documentation system assemble it dynamically when the user asks a question. For example, if the user asks for documentation on a class, the resulting page would be constructed by pulling together the class description, appropriate method descriptions, and links to classes the focal class extends (as well, possibly, as classes that extend it). Similarly, a request for methods could assemble a page out of the snippets documenting the individual methods, including links to the relevant classes. I realize that implementing this is not trivial, and I'm not necessarily advocating it as a priority. But I wonder how it strikes people. -- Ross Boylan wk: (415) 514-8146 185 Berry St #5700 [EMAIL PROTECTED] Dept of Epidemiology and Biostatistics fax: (415) 514-8150 University of California, San Francisco San Francisco, CA 94107-1739 hm: (415) 550-1062 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
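The four dynamic queries listed above already have counterparts in the methods package; a sketch, assuming classes A and B and a generic foo as in the PROBLEM section:

```r
extends("B", "A")            # 1) relations between classes
showMethods(classes = "B")   # 2) methods defined for class B
existsMethod("foo", "B")     # does B reimplement foo, or inherit it?
showMethods("foo")           # 4) all implementations of foo
```

(Item 3, the full interface including inherited methods, takes a little more work, e.g. querying showMethods for each class B extends.)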
Re: [Rd] S4 accessors
I'm trying to understand what the underlying issues are here--with the immediate goal of how that affects my design and documentation decisions. On Wed, Sep 27, 2006 at 02:08:34PM -0400, John Chambers wrote: Seth Falcon wrote: John Chambers [EMAIL PROTECTED] writes: There is a point that needs to be remembered in discussions of accessor functions (and more generally). We're working with a class/method mechanism in a _functional_ language. Simple analogies made from class-based languages such as Java are not always good guides. In the example below, a function foo that only operates on that class is not usually a meaningful concept in R. The sense of meaningful here is hard for me to pin down, even with the subsequent discussion. I think the import is more than formal: R is not strongly typed, so you can hand any argument to any function and the language will not complain. If foo is a generic and the only method defined is for class Bar, then the statement seems meaningful enough? This is not primarily a question about implementation but about what the user understands. IMO, a function should have an intuitive meaning to the user. Its name is taking up a global place in the user's brain, and good software design says not to overload users with too many arbitrary names to remember. It's true that clashing uses of the same name may lead to confusion, but that need not imply that functions must be applicable to all objects. Many functions only make sense in particular contexts, and sometimes those contexts are quite narrow. One of the usual motivations for an OO approach is precisely to limit the amount of global space taken up by, for example, functions that operate on the class (global in both the syntactic sense and in the inside your brain sense). Understanding a traditional OO system, at least for me, is fundamentally oriented to understanding the objects first, with the operations on them as auxiliaries. 
As you point out, this is just different from the orientation of a functional language, which starts with the functions. To be a bit facetious, if flag is a slot in class Bar, it's really not a good idea to define the accessor for that slot as flag <- function(object) object@flag Nor is the situation much improved by having flag() be a generic, with the only method being for class Bar. We're absconding with a word that users might think has a general meaning. OK, if need be we will have different flag() functions in different packages that have _different_ intuitive interpretations, but it seems to me that we should try to avoid that problem when we can. OTOH, it's not such an imposition to have accessor functions with a syntax that includes the name of the slot in a standardized way: get_flag(object) (I don't have any special attachment to this convention, it's just there for an example) I don't see why get_flag differs from flag; if flag lends itself to multiple interpretations or meanings, wouldn't get_flag have the same problem? Or are you referring to the fact that flag sounds as if it's a verb or action? That's a significant ambiguity, but there's nothing about it that is specific to a functional approach. Functions are first-class objects and in principle every function should have a function, a purpose. Methods implement that purpose for particular combinations of arguments. If this is a claim that every function should make sense for every object, it's asking too much. If it's not, I don't really see how a function can avoid having a purpose. The purpose of accessor functions is to get or set the state of the object. Accessor functions are therefore a bit anomalous. How? A given accessor function has the purpose of returning the expected data contained in an instance. It provides an abstract interface that decouples the structure of the class from the data it needs to provide to users. See above. 
That's true _if_ the name or some other syntactic sugar makes it clear that this is indeed an accessor function, but not otherwise. Aside from the fact that I don't see why get_flag is so different from flag, the syntactic sugar argument has another problem. The usually conceived purpose of accessors is to hide from the client the internals of the object. To take an example that's pretty close to one of my classes, I want startTime, endTime, and duration. Internally, the object only needs to hold 2 of these quantities to get the 3rd, but I don't want the client code to be aware of which choice I made. In particular, I don't want the client code to change from duration to get_duration if I switch to a representation that stores the duration as a slot. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
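The startTime/endTime/duration point can be made concrete: below, duration is derived from two stored slots, but nothing in the interface reveals which two quantities the class actually holds (the class name is invented; this is a sketch, not code from the thread):

```r
setClass("runTimes", representation(start = "numeric", end = "numeric"))

setGeneric("duration", function(object) standardGeneric("duration"))

## computed from the two stored slots; clients cannot tell it is not a slot,
## so the representation can change without breaking client code
setMethod("duration", "runTimes", function(object) object@end - object@start)

duration(new("runTimes", start = 5, end = 12))   # 7
```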
Re: [Rd] S4 accessors
On Tue, 2006-09-26 at 10:43 -0700, Seth Falcon wrote: Ross Boylan [EMAIL PROTECTED] writes: If anyone else is going to extend your classes, then you are doing them a disservice by not making these proper methods. It means that you can control what happens when they are called on a subclass. My style has been to define a function, and then use setMethod if I want to redefine it for an extension. That way the original version becomes the generic. So I don't see what I'm doing as being a barrier to adding methods. Am I missing something? You are not, but someone else might be: suppose you release your code and I would like to extend it. I am stuck until you decide to make generics. This may be easier to do concretely. I have an S4 class A. I have defined a function foo that only operates on that class. You make a class B that extends A. You wish to give foo a different implementation for B. Does anything prevent you from doing setMethod("foo", "B", function(x) blah blah) (which is the same thing I do when I make a subclass)? This turns my original foo into the catchall method. Of course, foo is not appropriate for random objects, but that was true even when it was a regular function. Originally I tried defining the original using setMethod, but this generates a complaint about a missing function; that's one reason I fell into this style. You have to create the generic first if it doesn't already exist: setGeneric("foo", function(x) standardGeneric("foo")) I wonder if it might be worth changing setMethod so that it does this by default when no existing function exists. Personally, that would fit the style I'm using better. For accessors, I like to document them in the methods section of the class documentation. This is for accessors that really are methods, not my fake function-based accessors, right? Which might be a further argument not to have the distinction in the first place ;-) To me, simple accessors are best documented with the class. 
If I have an instance, I will read help on it and find out what I can do with it. If you use foo as an accessor method, where do you define the associated function (i.e., \alias{foo})? I believe such a definition is expected by R CMD check and is desirable for users looking for help on foo (?foo) without paying attention to the fact that it's a method. Yes, you need an alias for the _generic_ function. You can either add the alias to the class man page where one of its methods is documented or you can have separate man pages for the generics. This is painful. S4 documentation, in general, is rather difficult, and IMO this is in part a consequence of the more general (read: more powerful) generic-function-based system. As my message indicates, I too am struggling with an appropriate documentation style for S4 classes and methods. Since Writing R Extensions has said Structure of and special markup for documenting S4 classes and methods are still under development. for as long as I can remember, perhaps I'm not the only one. Some of the problem may reflect the tension between conventional OO and functional languages, since R remains the latter even under S4. I'm not sure if it's the tools or my approach that is making things awkward; it could be both! Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
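The pattern Seth describes, written out with the thread's toy names: create the generic explicitly, give the base class the catchall implementation, and let a subclass override it:

```r
setClass("A", representation(x = "numeric"))
setClass("B", contains = "A")

## the generic must exist before setMethod() will accept a method for it
setGeneric("foo", function(x, ...) standardGeneric("foo"))

setMethod("foo", "A", function(x, ...) x@x)        # catchall for A and subclasses
setMethod("foo", "B", function(x, ...) rev(x@x))   # B's override

foo(new("B", x = 1:3))   # dispatches to the "B" method
```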
[Rd] S4 accessors
I have a small S4 class for which I've written a page grouping many of the accessors and replacement functions together. I would be interested in people's comments on the approach I've taken. The code has a couple of decisions for which I could imagine alternatives. First, even simple get/set operations on class elements are wrapped in functions. I suppose I could just use object@slot directly to do some of these operations, though that is considered bad style in more traditional OO contexts. Second, even though the functions are tied to the class, I've defined them as free functions rather than methods. I suppose I could create a generic that would reject most arguments, and then make methods appropriately. For the documentation, I've created a single page that groups many of the functions together. This is a bit awkward, since the return values are necessarily the same. Things are worse for replacement functions; as I understand it, they must use value for their final argument, but the value has different meanings and types in different contexts. Any suggestions or comments? I've attached the .Rd file in case more specifics would help. -- Ross Boylan wk: (415) 514-8146 185 Berry St #5700 [EMAIL PROTECTED] Dept of Epidemiology and Biostatistics fax: (415) 514-8150 University of California, San Francisco San Francisco, CA 94107-1739 hm: (415) 550-1062

\name{runTime-accessors}
\alias{startTime}
\alias{endTime}
\alias{wallTime}
\alias{waitTime}
\alias{cpuTime}
\alias{mpirank}
\alias{mpirank<-}
\alias{remoteTime<-}
\title{Accessors for runTime class}
\description{
Set and get runTime related information.
}
\usage{
startTime(runTime)
endTime(runTime)
wallTime(runTime)
waitTime(runTime)
cpuTime(runTime)
mpirank(runTime)
mpirank(runTime) <- value
remoteTime(runTime) <- value
}
%- maybe also 'usage' for other objects documented here.
\arguments{
  \item{runTime}{a \code{runTime} object}
  \item{value}{for \code{mpirank}, the MPI rank of the associated job.
    For \code{remoteTime}, a vector of statistics from the remote
    processor: user cpu, system cpu, wall clock time for main job, wall
    clock time waiting for the root process.}
}
\details{
All times are measured from start of job. The sequence of events is
that the job is created locally, started remotely, finished remotely,
and completed locally. Scheduling and transmission delays may occur.
\describe{
  \item{startTime}{When the job was created, locally.}
  \item{endTime}{When the job finished locally.}
  \item{wallTime}{How many seconds between local start and completion.}
  \item{cpuTime}{Remote cpu seconds used, both system and user.}
  \item{waitTime}{Remote seconds waiting for response from the local
    system after the remote computation finished.}
  \item{mpirank}{The rank of the execution unit that handled the remote
    computation.}
}
}
\value{
Generally seconds, at a system-dependent resolution. \code{mpirank} is
an integer. Replacement functions return the \code{runTime} object
itself.
}
\author{Ross Boylan}
\note{Clients that use replacement functions should respect the
semantics above.}
\seealso{\code{\link{runTime-class}}}
\keyword{programming}
\keyword{environment}

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
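For readers following the thread, the accessor/replacement pair behind the \alias{mpirank} and \alias{mpirank<-} entries would look roughly like this (the runTime class definition is not shown in the message, so the slot name is assumed):

```r
mpirank <- function(runTime) runTime@mpirank

`mpirank<-` <- function(runTime, value) {
  runTime@mpirank <- as.integer(value)
  runTime   # replacement functions return the modified object
}

## usage, assuming rt is a runTime instance:
##   mpirank(rt) <- 3
##   mpirank(rt)
```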
Re: [Rd] S4 accessors (corrected)
On Tue, 2006-09-26 at 00:20 +, Ross Boylan wrote: I have a small S4 class for which I've written a page grouping many of the accessors and replacement functions together. I would be interested in people's comments on the approach I've taken. The code has a couple of decisions for which I could imagine alternatives. First, even simple get/set operations on class elements are wrapped in functions. I suppose I could just use object@slot directly to do some of these operations, though that is considered bad style in more traditional OO contexts. Second, even though the functions are tied to the class, I've defined them as free functions rather than methods. I suppose I could create a generic that would reject most arguments, and then make methods appropriately. For the documentation, I've created a single page that groups many of the functions together. This is a bit awkward, since the return values are NOT necessarily the same. Things are worse for replacement functions; as I understand it, they must use value for their final argument, but the value has different meanings and types in different contexts. Any suggestions or comments? I've attached the .Rd file in case more specifics would help. Sorry! __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] S4 classes and C
On Fri, 2006-05-19 at 11:46 +0200, Martin Maechler wrote: Seth == Seth Falcon [EMAIL PROTECTED] on Thu, 18 May 2006 12:22:36 -0700 writes: Seth Ross Boylan [EMAIL PROTECTED] writes: Is there any good source of information on how S4 classes (and methods) work from C? Hmm, yes; there's nothing in the Writing R Extensions manual, and there's not so much in the ``The Green Book'' (Chambers 1998), which is prominently cited by Doug Bates (and Saikat Debroy) in the paper given at DSC 2003 (mentioned earlier in this thread, http://www.ci.tuwien.ac.at/Conferences/DSC-2003/Drafts/BatesDebRoy.pdf ) Particularly relevant is Appendix A.6 of the Green book. The index isn't particularly helpful finding it. Of course, Appendix A can't answer the question of how the R implementation might differ. For example, A.6 discusses GET_SLOT_OFFSET, which does not appear to be available in R. Thanks for the pointers. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] S4 classes and C
On Thu, 2006-05-18 at 13:53 -0400, McGehee, Robert wrote:

> I believe the paper on which those lecture notes were based can be found here: http://www.ci.tuwien.ac.at/Conferences/DSC-2003/Drafts/BatesDebRoy.pdf

Thank you. It looks as if it has some useful stuff in it.

Ross
[Rd] Recommended style with calculator and persistent data
I have some calculations that require persistent state. For example, they retain most of the data across calls with different parameters; they retain parameters across calls with different subsets of the cases (this is for distributed computation); and they retain early analysis of the problem to speed later computations.

I've created an S4 object, and the stylized code looks like this:

    calc <- makeCalculator(a, b, c)
    calc <- setParams(calc, d, e)
    calc <- compute(calc)
    results <- results(calc)

The client code (such as that above) must remember to do the assignments, not just invoke the functions. I notice this does not seem to be the usual style, which is more like

    results <- compute(calc)

possibly using assignment operators like

    params(calc) <- x

(actually, I have a call like that, but some of the updates take multiple arguments).

Another route would be to use lexical scoping to bundle all the functions together (umm, I'm not sure how that would work with S4 methods) to ensure persistence without requiring assignment by the client code. Obviously this would decrease portability to S, but I'm already using lexical scope a bit.

Is there a recommended R'ish way to approach the problem? My current approach works, but I'm concerned it is non-standard, and so would be unnatural for users.

--
Ross Boylan                                      wk: (415) 514-8146
185 Berry St #5700                               [EMAIL PROTECTED]
Dept of Epidemiology and Biostatistics           fax: (415) 514-8150
University of California, San Francisco
San Francisco, CA 94107-1739                     hm: (415) 550-1062
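The lexical-scoping route can be sketched with a plain closure, so the state updates happen inside the object and the client never reassigns `calc`. The function names follow the post; the arithmetic is a placeholder:

```r
## Sketch of the lexical-scoping route: state lives in the enclosing
## environment, so client code never reassigns `calc`. Function names
## follow the post; the arithmetic is a stand-in for the real work.
makeCalculator <- function(a, b, c) {
  params <- NULL
  result <- NULL
  list(
    setParams = function(d, e) params <<- list(d, e),
    compute   = function() result <<- a + b + c + sum(unlist(params)),
    results   = function() result
  )
}

calc <- makeCalculator(1, 2, 3)
calc$setParams(4, 5)
calc$compute()
calc$results()  # 15
```

The `<<-` assignments persist across calls because each function captures the environment created by the makeCalculator() call, which is the standard R idiom for mutable state without reference classes.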
[Rd] S4 documentation
1. promptClass generated a file that included

    \section{Methods}{
    No methods defined with class mspathDistributedCalculator in the signature.
    }

Yet there are such methods. Is this a not-working-yet feature, or is something funny going on (maybe I have definitions in the library and in the global workspace...)?

2. Is \code{\link{myS4class-class}} the proper way to cross-reference a class, and \code{\link{myS4method-method}} the right way to refer to a method? I looked for something like \class or \method to correspond to \code, but didn't see anything.

3. This question goes beyond documentation. I have been approaching things like this:

    setClass("A", ...)
    foo <- function(a) ...
    setClass("B", ...)
    setMethod("foo", "B", ...)

so the first foo turns into the default function for the generic. This was primarily motivated by discovering that setMethod("foo", "A"), where I have the first function definition, produced an error. Is this a reasonable way to proceed? Then do I document the generic with standard function documentation for foo? Are there some examples I could look at? When I want to refer to the function generically, how do I do that?

I'm using R 2.2.1, and I've found the documentation on documenting S4 a bit too brief (even after looking at the recommended links, which in some cases don't have much on documentation). Since the docs say it's a work in progress, I'm hoping to get the latest word here. Thanks.
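Point 3 can be written out as runnable code: an ordinary function becomes the default method when setGeneric() is called with just its name. Class names follow the post; the bodies are placeholders:

```r
library(methods)

## Sketch of point 3: an existing plain function becomes the default
## method of the generic. Class names follow the post; bodies are
## placeholders.
setClass("A", representation(x = "numeric"))
setClass("B", representation(x = "numeric"))

foo <- function(a) "default foo"
setGeneric("foo")                      # existing foo becomes the default
setMethod("foo", "B", function(a) "foo for B")

foo(new("A", x = 1))  # "default foo" -- no "A" method, falls back
foo(new("B", x = 1))  # "foo for B"
```

With this arrangement the generic is documented like an ordinary function under \name{foo}, and the "B" behavior can be noted either there or on the class page.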
Re: [Rd] checkpointing
Here's some code I put together for checkpointing a function being optimized. Hooking directly into optim would require modifying its C code, so this seemed the easiest route. I've wanted more information on the iterations than is currently provided, so this stuffs some info back into the calling environment (by default).

    # wrapper to do checkpointing
    # Ross Boylan [EMAIL PROTECTED]
    # 06-Jan-2006
    # (C) 2006 Regents of University of California
    # Distributed under the Gnu Public License v2 or later at your option
    #
    # If you want to checkpoint the optimization of a function f,
    # use checkpoint(f) instead. See below for other possible arguments.
    # Default operation for checkpoint(fnfoo) is to record the iterations
    # in fnfoo.trace in the calling environment.
    # WARNING: Any existing variable with the name in argument `name`
    # will be deleted from the indicated frame.
    checkpoint <- function(f, name = paste(substitute(f), ".trace", sep=""),
                           fileName = substitute(f), nCalls = 1,
                           nTime = 60*15, frame = parent.frame()) {
        # f is the objective function
        # frame is where to put the variable `name`
        # `name` will be a data.frame with rows containing
        #   iteration, time, value, parameters
        # fileName is the stem of the name to save for checkpointing;
        # saving will alternate between files with 0 and 1 appended.
        # Saving to disk will happen every nCalls calls or nTime seconds,
        # whichever comes first.
        if (exists(name, where=frame)) rm(list=name, pos=frame)
        ckpt.lastSave <- 0            # alternate 0/1 for file to write to
        ckpt.lastTime <- Sys.time()   # last time saved
        function(params, ...) {
            p <- as.list(params)
            names(p) <- seq(length(params))
            if (exists(name, where=frame, inherits=FALSE)) {
                progress <- get(name, pos=frame)
                progress <- rbind(progress,
                                  data.frame(row.names=dim(progress)[1]+1,
                                             time=Sys.time(), val=NA, p),
                                  deparse.level=0)
            } else
                progress <- data.frame(row.names=1, time=Sys.time(), val=NA, p)
            n <- dim(progress)[1]
            # write to disk
            if (n %% nCalls == 0 || progress[n, 1] - ckpt.lastTime > nTime) {
                ckpt.lastSave <<- (ckpt.lastSave+1) %% 2
                save(progress, file=paste(fileName, ckpt.lastSave, sep=""))
                ckpt.lastTime <<- progress[n, 1]
            }
            v <- f(params, ...)
            progress[n, 2] <- v
            assign(name, progress, pos=frame)
            v
        }
    }
[Rd] checkpointing
I would like to checkpoint some of my calculations in R, specifically those using optim. As far as I can tell, R doesn't have this facility, and there seems to have been little discussion of it. Checkpointing is saving enough of the current state so that work can resume where things were left off if, to take my own example, the system crashes after 8 days of calculation.

My thought is that this could be added as an option to optim as one of the control parameters. I thought I'd check here to see if anyone is aware of any work in this area or has any thoughts about how to proceed. In particular, is save a reasonable way to save a few variables to disk?

I could also make the code available when/if I get it working.
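On the save question: save()/load() does handle the restore-a-few-variables case directly. A minimal sketch, where the file name and the write-then-rename guard are my own choices, not anything from optim:

```r
## Minimal sketch of checkpointing a few variables with save()/load().
## Writing to a temporary name and then renaming guards against a crash
## mid-write corrupting the previous checkpoint.
ckfile <- file.path(tempdir(), "checkpoint.RData")

par  <- c(0.5, 1.2)   # current parameter vector
iter <- 184           # iteration counter

tmp <- paste(ckfile, ".tmp", sep = "")
save(par, iter, file = tmp)
file.rename(tmp, ckfile)

## on restart:
rm(par, iter)
load(ckfile)          # restores `par` and `iter` by name
iter  # 184
```

Since load() restores objects under their saved names into the current environment, the resume path only needs to know the checkpoint file name, not the variable list.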
[Rd] external pointers
I have some C data I want to pass back to R opaquely, and then back to C. I understand external pointers are the way to do so. I'm trying to find out how they interact with garbage collection and object lifetime, and what I need to do so that the memory lives until the calling R process ends. Could anyone give me some pointers? I haven't found much documentation. An earlier message suggested looking at simpleref.nw, but I can't find that file.

So the overall pattern, from R, would look like

    opaque <- setup(arg1, arg2, ...)   # setup calls a C fn
    docompute(arg1, argb, opaque)      # many times; docompute also calls C
    # and then when I return, opaque and the memory it's wrapping get
    # cleaned up. If necessary I could do
    teardown(opaque)                   # at the end

C is actually C++ via a C interface, if that matters. In particular, the memory allocated will likely be from the C++ run-time, and needs C++ destructors.
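A sketch of the C side, under the assumption that R_MakeExternalPtr plus a registered finalizer is the intended mechanism. The setup name follows the post; the malloc'd block is a stand-in for the real C++ state, whose teardown would go in the finalizer. This compiles only against R's headers:

```c
#include <stdlib.h>
#include <Rinternals.h>

/* Finalizer: runs when the external pointer is garbage collected
 * (and, with onexit = TRUE below, when the R session ends). */
static void opaque_finalizer(SEXP ptr)
{
    void *state = R_ExternalPtrAddr(ptr);
    if (state) {
        free(state);              /* run the C++ destructor here instead */
        R_ClearExternalPtr(ptr);  /* guard against a double free */
    }
}

SEXP setup(SEXP arg1, SEXP arg2)
{
    void *state = malloc(100 * sizeof(double));  /* stand-in for C++ state */
    SEXP ptr = PROTECT(R_MakeExternalPtr(state, R_NilValue, R_NilValue));
    R_RegisterCFinalizerEx(ptr, opaque_finalizer, TRUE);
    UNPROTECT(1);
    return ptr;
}
```

docompute would recover the state with R_ExternalPtrAddr(opaque); an explicit teardown() can call the same finalizer early, and the R_ClearExternalPtr call keeps the later GC-triggered run from freeing the state twice. The pointer stays alive as long as some R object references it, so keeping `opaque` bound in the session is what keeps the memory alive.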
Re: [Rd] problem with \eqn (PR#8322)
On Mon, 2005-11-21 at 10:27 +, Hin-Tak Leung wrote:

> Kurt Hornik wrote: [snipped]
>> Definitely a problem in Rdconv. E.g.,
>>
>>     $ cat foo.Rd
>>     \description{
>>     \eqn{{A}}{B}
>>     }
>>     $ R-d CMD Rdconv -t latex foo.Rd | grep eqn
>>     \eqn{{A}}{A}{{B}
>>
>> shows what is going on.
>
> There is a work-around - putting extra spaces between the two braces:
>
>     $ cat foo.Rd
>     \description{
>     \eqn{ {A} }{B}
>     }
>     $ R CMD Rdconv -t latex foo.Rd
>     \HeaderA{}{}{}
>     \begin{Description}\relax
>     \eqn{ {A} }{B}
>     \end{Description}
>
> HT

Terrific! I can confirm that works for me and, in a way, a work-around is better than a fix. With the work-around, I can distribute the package without needing to require that people get some not-yet-released version of R that fixes the problem. I do hope the problem gets fixed, though :)

By the way, I couldn't see how the perl code excerpted earlier paid any attention to {}. But perl is not my native tongue.

Ross