[R-pkg-devel] how to document method arguments that aren't in the signature
What's the best way to document an S4 method that takes arguments beyond those in the signature? Consider

    setGeneric("sim", function(simP, dataP, ...) standardGeneric("sim"))
    setMethod("sim", signature = "SimParameters",
              function(simP, dataP) {
                  lapply(seq(simP@NIter), function(i) do.one(simP, dataP, i))
              })

for which promptClass generates

    \section{Methods}{
      \describe{
        \item{sim}{\code{signature(simP = "SimParameters")}: ... }
      }
    }

I turned that into

    \item{sim}{\code{signature(simP = "SimParameters", dataP)}: ... }

which seems a little funny, since the real signature only mentions one argument. R CMD check does not complain about it, however. Since omitted arguments are effectively of class "ANY", one alternative is

    \item{sim}{\code{signature(simP = "SimParameters", dataP = "ANY")}: ... }

I also considered describing the non-signature arguments in the text. Finally, although dataP is formally untyped, there are requirements on what kind of object it can be. In practice it is likely to come from one of two classes, but I want to allow users to define their own.

Thanks for your thoughts.
Ross Boylan

P.S. And what would I do if a particular method actually used an argument in ..., e.g.,

    setMethod("sim", signature = "SimParameters",
              function(simP, dataP, bar, ...) ...)

How would one document the bar argument?

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
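On the P.S.: a method can accept an extra argument through the generic's "..." and give it a default; that argument can then be described in the Rd text. A minimal sketch follows; the class definition, NIter field, and method body are illustrative stand-ins for the post's code (do.one is omitted):

```r
library(methods)

## illustrative class standing in for the post's SimParameters
setClass("SimParameters", representation(NIter = "numeric"))

setGeneric("sim", function(simP, dataP, ...) standardGeneric("sim"))

## the method adds 'bar', which callers supply through '...' of the generic
setMethod("sim", "SimParameters", function(simP, dataP, bar = 1, ...) {
  lapply(seq_len(simP@NIter), function(i) list(iter = i, bar = bar))
})

res <- sim(new("SimParameters", NIter = 2), dataP = NULL, bar = 99)
length(res)   # 2
```

Because the generic has "...", setMethod accepts the extra formal and wraps the method so that bar is matched by name at the call site.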
[Rd] modifying a persistent (reference) class
I saved objects that were defined using several reference classes. Later I modified the definition of the reference classes a bit, creating new functions and deleting old ones. The total number of functions did not change. When I read the objects back, I could only access some of the original data.

I asked on the user list, and someone suggested sticking with the old class definitions, creating new classes, reading in the old data, and converting it to the new classes. This would be awkward (I want the new classes to have the same names as the old ones), and I can probably just leave the old definitions and define the new functions I need outside of the reference classes. Are there any better alternatives?

On reflection, it's a little surprising that changing the code for a reference class makes any difference to an existing instance, since all the function definitions seem to be attached to the instance. One problem I've had in the past was precisely that redefining a method in a reference class did not change the behavior of existing instances. So I've tried to follow the advice to keep the methods light-weight. In this case I was trying to move from a show method (that just printed) to a summary method that returned a summary object. So I wanted to add a summary method and redefine show to call summary in the base class, removing all the subclass definitions of show.

Regular S4 classes are obviously not as sensitive, since they usually don't include the functions that operate on them, but I suppose if you changed the slots you'd be in similar trouble. Some systems keep track of versions of class definitions and allow one to write code to migrate old to new forms automatically when the data are read in. Does R have anything like that?

The system on which I encountered the problems was running R 2.15.

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
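The conversion route suggested on the user list can be sketched as follows: snapshot the old instance's fields into a plain list, redefine the class (same name), and rebuild the instance under the new definition. The class name "Rec" and its field are hypothetical, not from the saved data:

```r
library(methods)

## hypothetical old class definition
Gen1 <- setRefClass("Rec", fields = list(x = "numeric"))
old <- Gen1$new(x = 1:3)

## snapshot the field values into a plain list (an "intermediate format")
state <- lapply(names(Gen1$fields()), old$field)
names(state) <- names(Gen1$fields())

## redefine the class under the same name, with a new method
Gen2 <- setRefClass("Rec", fields = list(x = "numeric"),
                    methods = list(total = function() sum(x)))

## rebuild the instance under the new definition
fresh <- do.call(Gen2$new, state)
fresh$total()
```

With saved data, the snapshot step would run in a session using the old class definitions, and the rebuild step in a session using the new ones.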
Re: [Rd] modifying a persistent (reference) class
On Fri, 2014-08-01 at 14:42 -0400, Brian Lee Yung Rowe wrote:
> Ross,
>
> This is generally a hard problem in software systems. The only language
> I know of that explicitly addresses it is Erlang. Ultimately you need a
> system upgrade process, which defines how to update the data in your
> system to match a new version of the system. You could do this by
> writing a script that 1) loads the old version of your library, 2) loads
> your data/serialized reference classes, 3) exports the data to some
> intermediate format (e.g., a list), 4) loads the new version of the
> library, and 5) imports the data from the intermediate format.

My recollection is that in GemStone's Smalltalk database you can define methods associated with a class that describe how to change an instance from one version to another. You also have the choice of upgrading all persistent objects at once or doing so lazily, i.e., as they are retrieved.

The brittleness of the representation depends partly on the details. If a class has 2 slots, a and b, and the only thing on disk is the contents of a and the contents of b, almost any change will screw things up. However, if the slot names are persisted with the instances, it's much easier to reconstruct an instance if the class changes (if slot c is added and is not on disk, set it to nil; if b is removed, throw it out when reading from disk). One could also persist the class definition, or key elements of it, with individual instances referring to that definition. I don't know which, if any, of these strategies R uses for reference or other classes.

> Once you've gone through the upgrade process, arguably it's better to
> persist the data in a format that is decoupled from your objects, since
> then future upgrades would simply be 1) load the new library, 2) import
> the data from the intermediate format ...

Arguably :) As I said, some representations could do this automatically. And there are still issues, such as a change in the type of a slot, or rules for filling new slots, that would require intervention.
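The decoupled-persistence idea above might look like the following sketch in R; MyClass and its fields are made up for illustration, and the "intermediate format" is just a plain list written with saveRDS:

```r
library(methods)

## hypothetical class; only its field values are ever written to disk
MyClass <- setRefClass("MyClass",
                       fields = list(a = "numeric", b = "character"))

obj_to_list <- function(obj) list(a = obj$a, b = obj$b)
list_to_obj <- function(l) MyClass$new(a = l$a, b = l$b)

f <- tempfile(fileext = ".rds")
obj <- MyClass$new(a = 1, b = "hi")
saveRDS(obj_to_list(obj), f)       # persist the decoupled representation
obj2 <- list_to_obj(readRDS(f))    # rebuild under the current class definition
```

Because only a plain list hits the disk, later changes to MyClass affect just list_to_obj, not the stored data.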
In my experience with other object systems, methods are usually attributes of the class. For R reference classes they appear to be attributes of the instance, potentially modifiable on a per-instance basis.

Ross

> ... which is no different from day-to-day operation of your app/system
> (i.e., you're always writing to and reading from the intermediate
> format).
>
> Warm regards,
> Brian
>
> Brian Lee Yung Rowe
> Founder, Zato Novo
> Professor, M.S. Data Analytics, CUNY
>
> On Aug 1, 2014, at 1:54 PM, Ross Boylan <r...@biostat.ucsf.edu> wrote:
> > [original message quoted in full; see the first post in this thread]
Re: [Rd] modifying a persistent (reference) class
On Fri, 2014-08-01 at 16:06 -0400, Brian Lee Yung Rowe wrote:
> Ross,
>
> Ah, I didn't think about Smalltalk. Doesn't surprise me that they
> supported upgrades of this sort. That aside, I think the question is
> whether it's realistic for a language like R to support such a mechanism
> automatically. Smalltalk and Erlang both have tight semantics that would
> be hard to establish in R (given the multiple object systems and
> dispatching systems). I'm a functional guy, so to me it's natural to
> separate the data from the functions/methods. Having spent years writing
> OOP code, I walked away concluding that OOP makes things more complicated
> for the sake of being OOP (e.g., no first-class functions).

In Smalltalk everything is an object, and that includes functions, including class methods.

> Obviously that's changing, and in a language like R it's less of an
> issue. However, something like object serialization smells suspiciously
> similar. If you know that serializing objects is brittle, why not look
> for an alternative approach instead of chasing that rainbow?

My immediate problem is/was that I have serialized objects representing weeks of CPU time. I have to work with them, not with some other representation they might have had. And it's much more natural to work with R's native persistence than with some other scheme I cook up. I think persistence requires serialization; the serialization can be more or less brittle, but I don't think there is an alternative to serialization.

Since I just worked around my immediate problem a few minutes ago (by retaining the original class definitions and using setMethod to create summary methods), my interests are now somewhat theoretical. First, I'd like to understand more about exactly what is saved to disk for reference and other classes, in particular how much meta-information they contain.
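On what is saved to disk: for an ordinary S4 object, the serialized form is just the slot values (stored as attributes) plus an attribute naming the class; the class definition itself is not written out. A small illustration, using a hypothetical class P:

```r
library(methods)

setClass("P", representation(a = "numeric"))
x <- new("P", a = c(1, 2))

## the instance carries slot values and a class-name attribute, nothing more
str(attributes(x))

f <- tempfile(fileext = ".rds")
saveRDS(x, f)            # what goes to disk is this attribute structure
y <- readRDS(f)          # readRDS does not need, or check, setClass("P", ...)
```

This is consistent with the observed brittleness: on re-reading, slot/field contents are matched against whatever definition is current in the session.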
And my mental model of reference-class persistence is clearly wrong, because on that model instances based on old definitions would come back intact (albeit without the new method definitions or other new slots), whereas mine seemed to come back damaged.

Second, I'm still hoping for some elegant way around this problem (how to redefine classes and still use saved versions from the older definitions) in the future, with both reference and regular classes. Or at least some rules about what changes, if any, are safe to make in class definitions after an instance has been persisted.

Third, if changes to R could make things better, I'm hoping some developers might take them up. I realize that is unlikely to happen, for many good reasons, but I can still hope :)

Ross

> Warm regards,
> Brian
>
> Brian Lee Yung Rowe
> Founder, Zato Novo
> Professor, M.S. Data Analytics, CUNY
>
> On Aug 1, 2014, at 3:33 PM, Ross Boylan <r...@biostat.ucsf.edu> wrote:
> > [previous messages quoted in full; see above]
Re: [Rd] modifying data in a package [a solution]
On Wed, 2014-03-19 at 19:22 -0700, Ross Boylan wrote:
> I've tweaked Rmpi and want to have some variables that hold data in the
> package. One of the R files starts
>
>     mpi.isend.obj <- vector("list", 500)   # mpi.request.maxsize()
>     mpi.isend.inuse <- rep(FALSE, 500)     # mpi.request.maxsize()
>
> and then functions update those variables with <-. When run:
>
>     Error in mpi.isend.obj[[i]] <- .force.type(x, type) :
>       cannot change value of locked binding for 'mpi.isend.obj'
>
> I'm writing to ask the proper way to accomplish this objective (getting
> a variable I can update in the package namespace, or at least somewhere
> useful and hidden from the outside).

I've discovered one way to do it. In one of the regular R files:

    mpi.global <- new.env()

Then at the end of .onLoad in zzz.R:

    assign("mpi.isend.obj", vector("list", mpi.request.maxsize()), mpi.global)

and similarly for the logical vector mpi.isend.inuse. Access with functions like this:

    ## The next 2 functions have 3 modes:
    ##   foo()               returns foo from mpi.global
    ##   foo(request)        returns foo[request] from mpi.global
    ##   foo(request, value) sets foo[request] to value
    mpi.isend.inuse <- function(request, value) {
        if (missing(request))
            return(get("mpi.isend.inuse", mpi.global))
        i <- request + 1L
        parent.env(mpi.global) <- environment()
        if (missing(value))
            return(evalq(mpi.isend.inuse[i], mpi.global))
        return(evalq(mpi.isend.inuse[i] <- value, mpi.global))
    }

    ## request, if present, must be a single value
    mpi.isend.obj <- function(request, value) {
        if (missing(request))
            return(get("mpi.isend.obj", mpi.global))
        i <- request + 1L
        parent.env(mpi.global) <- environment()
        if (missing(value))
            return(evalq(mpi.isend.obj[[i]], mpi.global))
        return(evalq(mpi.isend.obj[[i]] <- value, mpi.global))
    }

This is pretty awkward; I'd love to know a better way. Some of the names probably should change too: mpi.isend.obj() sounds too much as if it actually sends something, like mpi.isend.Robj().

Ross
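A possibly less awkward pattern, sketched with illustrative names (not Rmpi's actual internals): keep the state in a package-local environment and index it with ordinary R code, which avoids the parent.env/evalq contortions because `$` and `[<-` on an environment's contents work directly:

```r
## package-local mutable state; the namespace locks the binding '.mpi.state'
## itself, but the environment's contents stay writable
.mpi.state <- new.env(parent = emptyenv())
.mpi.state$isend.inuse <- rep(FALSE, 500)

## get (one argument) or set (two arguments) a slot of the in-use vector;
## requests are 0-based, as in the post
isend_inuse <- function(request, value) {
  i <- request + 1L
  if (missing(value))
    .mpi.state$isend.inuse[i]
  else
    .mpi.state$isend.inuse[i] <- value
}

isend_inuse(0L, TRUE)
isend_inuse(0L)   # TRUE
```

The complex assignment `.mpi.state$isend.inuse[i] <- value` updates the vector inside the environment in place, no evalq needed.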
[Rd] modifying data in a package
I've tweaked Rmpi and want to have some variables that hold data in the package. One of the R files starts

    mpi.isend.obj <- vector("list", 500)   # mpi.request.maxsize()
    mpi.isend.inuse <- rep(FALSE, 500)     # mpi.request.maxsize()

and then functions update those variables with <-. When run:

    Error in mpi.isend.obj[[i]] <- .force.type(x, type) :
      cannot change value of locked binding for 'mpi.isend.obj'

I'm writing to ask the proper way to accomplish this objective (getting a variable I can update in the package namespace, or at least somewhere useful and hidden from the outside). I think the problem is that the package namespace is locked. So how do I achieve the same effect?

http://www.r-bloggers.com/package-wide-variablescache-in-r-packages/ recommends creating an environment and then updating it. Is that the preferred route? (It seems odd that the list should be locked but the environment would be manipulable. I know environments are special.)

The comments indicate that 500 should be mpi.request.maxsize(). That doesn't work because mpi.request.maxsize calls a C function, and there is an error that the function isn't loaded. I guess the R code is evaluated before the C libraries are loaded. The package's zzz.R starts

    .onLoad <- function(lib, pkg) {
        library.dynam("Rmpi", pkg, lib)

So would moving the code into .onLoad, after that point, work? In that case, how do I get the environment into the proper scope? Would

    .onLoad <- function(lib, pkg) {
        library.dynam("Rmpi", pkg, lib)
        assign("mpi.globals", new.env(), environment(mpi.isend))
        assign("mpi.isend.obj", vector("list", mpi.request.maxsize()), mpi.globals)
    }

work? (mpi.isend is a function in Rmpi.) But I'd guess the first assign will fail because the environment is locked.

Thanks.
Ross Boylan

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
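A sketch of the environment-based approach the blog post recommends: create the environment at the top level of the package code (so no assign into the locked namespace is needed at load time), and fill it from .onLoad once the compiled code is available. The library.dynam call is commented out and the size is a constant so the fragment runs standalone; in the package it would be mpi.request.maxsize():

```r
## top-level in an R file: the namespace locks the *binding* 'mpi.globals',
## but objects inside the environment remain mutable after loading
mpi.globals <- new.env(parent = emptyenv())

.onLoad <- function(lib, pkg) {
  ## library.dynam("Rmpi", pkg, lib)   # load compiled code first (assumed)
  assign("mpi.isend.obj", vector("list", 500L), envir = mpi.globals)
}

.onLoad(NULL, NULL)   # simulate what R does when the package loads
length(get("mpi.isend.obj", envir = mpi.globals))   # 500
```

Because mpi.globals is defined in the package code, it is already in scope for every package function, including .onLoad; no assign into another environment is required.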
[Rd] Does R ever move objects in memory?
R objects can disappear if they are garbage collected; can they also move, i.e., change their location in memory? I don't see any indication that this might happen in Writing R Extensions or R Internals, but I'd like to be sure.

Context: Rmpi serializes objects into raw vectors for transmission by MPI. Some send operations (isend) return before transmission is complete and so need the bits to remain untouched until transmission completes. If I preserve a reference to the raw vector in R code, that will prevent it from being garbage collected; but if it gets moved, that would invalidate the transfer. I had been using the blocking sends to avoid this problem, but the result is significant delays.

Thanks.
Ross Boylan
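For what it's worth, R's current garbage collector is non-moving: it frees unreachable objects but does not compact the heap, so a live object's data address stays put. That is an implementation property rather than a documented API guarantee. An empirical sketch (tracemem reports an object's address, but needs an R built with memory profiling, so it is wrapped defensively):

```r
## compare a vector's address before and after a garbage collection;
## tracemem may be unavailable on some builds, hence the tryCatch
addr <- function(v) tryCatch(tracemem(v), error = function(e) NA_character_)

x <- raw(1e6)
a1 <- addr(x)
invisible(gc())
a2 <- addr(x)
identical(a1, a2)   # same address after collection on a non-moving GC
```

The usual C-level discipline still applies: the object must be kept reachable (PROTECT or an R-level reference) for the duration of the asynchronous send.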
Re: [Rd] 2 versions of same library loaded
On Thu, 2014-03-13 at 10:46 -0700, Ross Boylan wrote:

1. My premise that R had no references to MPI was incorrect. The logs show

    24312: file=libmpi.so.1 [0];  needed by /home/ross/Rlib-3.0.1/Rmpi/libs/Rmpi.so [0]
    24312: find library=libmpi.so.1 [0]; searching
    24312:  search path=/usr/lib64/R/lib:/home/ross/install/lib  (LD_LIBRARY_PATH)
    24312:   trying file=/usr/lib64/R/lib/libmpi.so.1
    24312:   trying file=/home/ross/install/lib/libmpi.so.1

except there is no file /usr/lib64/R/lib/libmpi.so.1.
Re: [Rd] 2 versions of same library loaded
On Thu, 2014-03-13 at 10:46 -0700, Ross Boylan wrote:
> It seems very odd that the same Rmpi.so is requiring both the old and
> new libmpi.so (compare to the first trace in point 1).

There is this code in Rmpi.c:

    if (!dlopen("libmpi.so.0", RTLD_GLOBAL | RTLD_LAZY)
        && !dlopen("libmpi.so", RTLD_GLOBAL | RTLD_LAZY)) {

So I'm still not sure what it's using, or if there is some mishmash of the two. There is an explanation for the explicit load in the changelog:

    2007-10-24, version 0.5-5: dlopen has been used to load libmpi.so
    explicitly. This is mainly useful for Rmpi under OpenMPI, where one
    might see many error messages:
        mca: base: component_find: unable to open osc pt2pt: file not found (ignored)
    if libmpi.so is not loaded with the RTLD_GLOBAL flag.

http://www.stats.uwo.ca/faculty/yu/Rmpi/changelogs.htm

There is another interesting note about OpenMPI: "It looks like that the option --disable-dlopen is not necessary to install Open MPI 1.6, at least on Debian. This might be R's .onLoad correctly loading dynamic libraries and Open MPI is not required to be compiled with static libraries enabled."
[Rd] 2 versions of same library loaded
Can anyone help me understand how I got 2 versions of the same library loaded, how to prevent it, and what the consequences are? Running under Debian GNU/Linux squeeze. lsof and /proc/xxx/maps both show 2 copies of several libraries loaded:

    /home/ross/install/lib/libmpi.so.1.3.0
    /home/ross/install/lib/libopen-pal.so.6.1.0
    /home/ross/install/lib/libopen-rte.so.7.0.0
    /home/ross/Rlib-3.0.1/Rmpi/libs/Rmpi.so
    /usr/lib/openmpi/lib/libmpi.so.0.0.2
    /usr/lib/openmpi/lib/libopen-pal.so.0.0.0
    /usr/lib/openmpi/lib/libopen-rte.so.0.0.0
    /usr/lib/R/lib/libR.so

The system has the old version of MPI installed under /usr/lib. I built a personal, newer copy in my home directory and then rebuilt Rmpi (an R package) against it. ldd on the personal Rmpi.so and libmpi.so shows all references to MPI libraries on personal paths. R was installed from a Debian package, and presumably compiled without having MPI around.

Before running, I set LD_LIBRARY_PATH to ~/install/lib, and then stuffed the same path at the start of LD_LIBRARY_PATH using Sys.setenv in my profile, because R seems to prepend some libraries to that path when it starts (I'm curious about that too). I also prepended ~/install/bin to my path, though I'm not sure that's relevant.

Does R use ld.so or some other mechanism for loading libraries? Can I assume the highest version number of a library will be preferred? http://cran.r-project.org/doc/manuals/r-devel/R-exts.html#index-Dynamic-loading says "If a shared object/DLL is loaded more than once the most recent version is used." I'm not sure if "most recent" means the one loaded most recently by the program (I don't know which that is) or the highest version number.

Why is /usr/lib/openmpi being looked at in the first place? How can I stop the madness? Some folks on the openmpi list have indicated I need to rebuild R, telling it where my MPI is, but that seems an awfully big hammer for the problem.

Thanks.
Ross Boylan
Re: [Rd] 2 versions of same library loaded
Comments/questions interspersed below.

On Wed, 2014-03-12 at 22:50 -0400, Simon Urbanek wrote:
> Ross,
>
> On Mar 12, 2014, at 5:34 PM, Ross Boylan <r...@biostat.ucsf.edu> wrote:
> > [original question quoted in full; see above]
> >
> > Does R use ld.so or some other mechanism for loading libraries?
>
> R uses dlopen to load package libraries - it is essentially identical to
> using ld.so for dependencies.
>
> > Can I assume the highest version number of a library will be preferred?
>
> No.

Bummer. The fact that Rmpi is not crashing suggests to me that it's using the right version of the MPI libraries (it does produce lots of errors if I run it without setting LD_LIBRARY_PATH, so that only the system MPI libraries are in play), but it would be nice to be certain. Or the two versions could be combined into a big mess.
> > http://cran.r-project.org/doc/manuals/r-devel/R-exts.html#index-Dynamic-loading
> > says "If a shared object/DLL is loaded more than once the most recent
> > version is used." I'm not sure if "most recent" means the one loaded
> > most recently by the program (I don't know which that is) or the
> > highest version number.
>
> The former - whichever you load last wins. Note, however, that this
> refers to explicitly loaded objects, since they are loaded into a flat
> namespace, so a load will overwrite all symbols that get loaded.

It might be good to clarify that in the manual. If I understand the term, the MPI libraries are loaded implicitly; that is, Rmpi.so is loaded explicitly, and then it pulls in dependencies. What are the rules in that case?

> > Why is /usr/lib/openmpi being looked at in the first place?
>
> You'll have to consult your system. The search path (assuming rpath is
> not involved) is governed by LD_LIBRARY_PATH and /etc/ld.so.conf*. Note
> that LD_LIBRARY_PATH is consulted at the time of the resolution (when
> the library is looked up), so you may be changing it too late. Also note
> that you have to expand ~ in the path (it's not a valid path, it's a
> shell expansion feature).

I just used the ~ as a shortcut; the shell expanded it, and the full path ended up in the variable. I assume the loader checks LD_LIBRARY_PATH first; once it finds the MPI libraries there, I don't know why it keeps looking.

I'm not sure I follow the part about "too late", but is it this: all the R processes launched under MPI have the MPI library loaded automatically, and if that happens before my profile is read, resetting LD_LIBRARY_PATH will be irrelevant? I don't know whether the profile or the Rmpi load happens first. The resetting is just a reordering, and since the other elements of LD_LIBRARY_PATH don't contain any MPI libraries, I don't think the order matters.

> R's massaging of the LD_LIBRARY_PATH is typically done in
> $R_HOME/etc/ldpaths so you may want to check it and/or adjust it as
> needed.
> Normally (in stock R), it only prepends its own libraries and Java, so
> it should not be causing any issues, but you may want to check in case
> Debian scripts add anything else.

The extra paths are limited as you describe, and so are probably no threat for loading the wrong MPI library (/usr/lib64/R/lib:/usr/lib/jvm/java-6-openjdk/jre/lib/amd64/server).

> > How can I stop the madness? Some folks on the openmpi list have
> > indicated I need to rebuild R, telling it where my MPI is, but that
> > seems an awfully big hammer for the problem.
>
> I would check LD_LIBRARY_PATH and also check at which point those old
> libraries are loaded to find where they are referenced.

How do I tell the point at which the old libraries are loaded? I assume it happens implicitly when Rmpi is loaded, but I don't know which of the two versions of the libraries is loaded first, and I don't know how to tell.

Thanks for your help.
Ross
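One way to see which copies are actually mapped, from inside the running R process, is to read /proc/self/maps. This is a Linux-only sketch; the helper name and the field-stripping regex are mine, not an R API:

```r
## list the shared objects mapped into the current R process (Linux only);
## each maps line has 5 fields (address, perms, offset, dev, inode) and then
## an optional pathname, which this strips down to
loaded_libs <- function(pattern = "") {
  maps <- readLines("/proc/self/maps")
  paths <- sub("^(\\S+\\s+){5}", "", maps)
  libs <- unique(grep("\\.so", paths, value = TRUE))
  grep(pattern, libs, value = TRUE)
}

loaded_libs("libmpi")   # which libmpi copies, if any, are really loaded
```

Calling it before and after library(Rmpi) would show exactly which load pulls in the /usr/lib/openmpi copies. Running R under LD_DEBUG=libs (an ld.so feature) gives the same information with the search order included.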
Re: [Rd] C++ debugging help needed
On Wed, Oct 02, 2013 at 11:05:19AM -0400, Duncan Murdoch wrote:

Up to entry #4 this all looks normal. If I go into that stack frame, I see this:

    (gdb) up
    #4  Shape::~Shape (this=0x15f8760, __in_chrg=<optimized out>) at Shape.cpp:13
    warning: Source file is more recent than executable.

That warning looks suspicious. Are you sure gdb is finding the right source files, and that the object code has been built from them?

    13          blended(in_material.isTransparent())
    (gdb) p this
    $9 = (Shape * const) 0x15f8760
    (gdb) p *this
    $10 = {_vptr.Shape = 0x72d8e290, mName = 6, mType = {
        static npos = <optimized out>,
        _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>},
          _M_p = 0x7f7f7f7f <Address 0x7f7f7f7f out of bounds>}},
      mShapeColor = {mRed = -1.4044474254567505e+306, mGreen = -1.4044477603031902e+306,
        mBlue = 4.24399170841135e-314, mTransparent = 0},
      mSpecularReflectivity = 0.0078125, mSpecularSize = 1065353216,
      mDiffuseReflectivity = 0.007812501848093234, mAmbientReflectivity = 0}

The things displayed in *this are all wrong. Those field names come from the Shape object in the igraph package, not the Shape object in the rgl package. (The mixOmics package uses both.)

My questions:

- Has my code somehow got mixed up with the igraph code, so I really do have a call out to igraph's Shape::~Shape instead of rgl's Shape::~Shape, or is this just bad info being given to me by gdb?

I don't know, but I think it's possible to give fully qualified type names to gdb to force it to use the right definition. That's assuming that both Shapes are in different namespaces. If they aren't, that's likely the problem.

- If I really do have calls to the wrong destructor in there, how do I avoid this?

Are you invoking the destructor explicitly? An object should know its type, which should result in the right call without much effort.
Duncan Murdoch
Re: [Rd] C++ debugging help needed
On Wed, 2013-10-02 at 16:15 -0400, Duncan Murdoch wrote:
> On 02/10/2013 4:01 PM, Ross Boylan wrote:
> > [earlier exchange quoted in full; see the previous messages]
>
> I'm pretty sure that's a warning about the fact that igraph also has a
> file called Shape.cpp, and the Shape::~Shape destructor was in that
> file, not in my Shape.cpp file.

I guess the notion of "the right source file" is ambiguous in this context. Suppose you have projects A and B, each defining a function f in f.cpp. Use A/f() to mean the binary function defined in project A, found in source A/f.cpp. Then you have some code that means to invoke A/f() but gets B/f() instead. Probably gdb should associate this with B/f.cpp, but your intent was A/f() and A/f.cpp. If gdb happens to find A/f.cpp, and A was built after B, that could provoke the warning shown.
[Earlier questions and answers about the mixed-up destructor quoted; see above.]

Apparently they aren't, even though they are in separately compiled and linked packages. I had been assuming that the fact that rgl knows nothing about igraph meant I didn't need to worry about it. (igraph does list rgl in its Suggests list.) On platforms other than Linux, I don't appear to need to worry about it, but Linux happily loads one, then loads the other and links the call to the wrong .so rather than the local one, without a peep of warning, just an eventual crash.

While various OSes and tricks could provide work-arounds for clashing function definitions (I actually had the impression the R dynamic loading machinery might), those wouldn't necessarily be right. In principle package A might use some functions defined in package B. In that case the need for namespaces would have become obvious.

Supposing I finish my editing of the 100 or so source files and put all of the rgl stuff in an rgl namespace, that still doesn't protect me from what some other developer might do next week, creating their own rgl namespace with a clashing name.

Why doesn't the linking step resolve the calls; why does it leave it until load time? I think there is a "using namespace" directive that might save typing, putting everything into that namespace by default. Maybe just the headers need it.

With dynamic loading you don't know until load time whether you've got a problem. As I said, the system can't simply wall off different libraries, since they may want to call each other.
The usual solution for two developers picking the same name is to have an outer level namespace associated with the developer/company/project, with other namespaces nested inside. This reduces the problem, though obviously it can still exist higher up. Ross

- If I really do have calls to the wrong destructor in there, how do I avoid this?

Are you invoking the destructor explicitly? An object should know its type, which should result in the right call without much effort.

No, this is an implicit destructor call. I'm deleting an object whose class descends from Shape. Duncan Murdoch __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Makevars and Makeconf sequencing
http://cran.r-project.org/doc/manuals/R-exts.html#Configure-and-cleanup near the start of 1.2.1 "Using Makevars" says:

  There are some macros which are set whilst configuring the building of
  R itself and are stored in R_HOME/etcR_ARCH/Makeconf. That makefile is
  included as a Makefile after Makevars[.win], and the macros it defines
  can be used in macro assignments and make command lines in the latter.

I'm confused. If Makeconf is included after Makevars, then how can Makevars use macros defined in Makeconf? Or is the meaning only that a variable definition in Makeconf can be overridden in Makevars? Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] problems extracting parts of a summary object
summary(x), where x is the output of lm, produces the expected display, including standard errors of the coefficients. summary(x)$coefficients produces a vector (x is r$individual[[2]]):

r$individual[[2]]$coefficients
tX(Intercept)    tXcigspmkr        tXpeld    tXsmkpreve            mn
-2.449188e+04 -4.143249e+00  4.707007e+04 -3.112334e+01  1.671106e-01
   mncigspmkr        mnpeld    mnsmkpreve
 3.580065e+00  2.029721e+05  4.404915e+01

class(r$individual[[2]]$coefficients)
[1] "numeric"

rather than the expected matrix-like object with a column for the se's. When I trace through the summary method, the coefficients value is a matrix. I'm trying to pull out the standard errors for some rearranged output. How can I do that? And what's going on? I suspect this may be a namespace issue. Thanks. Ross Boylan P.S. I would appreciate a cc because of some mail problems I'm having. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] problems extracting parts of a summary object
On Mon, 2010-03-22 at 16:30 -0600, tpl...@cybermesa.com wrote: Are you sure you're extracting the coefficients component of the summary object and not the lm object? Seems to work ok for me:

xm <- lm(y ~ x, data=data.frame(y=rnorm(20), x=rnorm(2)))
summary(xm)$coefficients
            Estimate Std. Error  t value  Pr(>|t|)
(Intercept) 1.908948   1.707145 1.118210 0.2781794
x           1.292263   1.174608 1.100165 0.2857565
xm$coefficients
(Intercept)           x
   1.908948    1.292263

-- Tony Plate

class(summary(r$individual[[2]]))
[1] "summary.lm"

But maybe I'm not following the question. Ross

On Mon, March 22, 2010 4:03 pm, Ross Boylan wrote: summary(x), where x is the output of lm, produces the expected display, including standard errors of the coefficients. summary(x)$coefficients produces a vector (x is r$individual[[2]]):

r$individual[[2]]$coefficients
tX(Intercept)    tXcigspmkr        tXpeld    tXsmkpreve            mn
-2.449188e+04 -4.143249e+00  4.707007e+04 -3.112334e+01  1.671106e-01
   mncigspmkr        mnpeld    mnsmkpreve
 3.580065e+00  2.029721e+05  4.404915e+01

class(r$individual[[2]]$coefficients)
[1] "numeric"

rather than the expected matrix-like object with a column for the se's. When I trace through the summary method, the coefficients value is a matrix. I'm trying to pull out the standard errors for some rearranged output. How can I do that? And what's going on? I suspect this may be a namespace issue. Thanks. Ross Boylan P.S. I would appreciate a cc because of some mail problems I'm having. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] problems extracting parts of a summary object [solved]
On Mon, 2010-03-22 at 16:52 -0600, tpl...@cybermesa.com wrote: what's the output of: summary(r$individual[[2]])$coef ? My question was basically whether you were doing summary(r$individual[[2]])$coef or r$individual[[2]]$coef (the second was what you appeared to be doing from your initial email).

Doh! Thank you; that was it. This was interacting with another error, which is perhaps how I managed to miss it. Ross

-- Tony Plate

On Mon, March 22, 2010 4:43 pm, Ross Boylan wrote: On Mon, 2010-03-22 at 16:30 -0600, tpl...@cybermesa.com wrote: Are you sure you're extracting the coefficients component of the summary object and not the lm object? Seems to work ok for me:

xm <- lm(y ~ x, data=data.frame(y=rnorm(20), x=rnorm(2)))
summary(xm)$coefficients
            Estimate Std. Error  t value  Pr(>|t|)
(Intercept) 1.908948   1.707145 1.118210 0.2781794
x           1.292263   1.174608 1.100165 0.2857565
xm$coefficients
(Intercept)           x
   1.908948    1.292263

-- Tony Plate

class(summary(r$individual[[2]]))
[1] "summary.lm"

But maybe I'm not following the question. Ross

On Mon, March 22, 2010 4:03 pm, Ross Boylan wrote: summary(x), where x is the output of lm, produces the expected display, including standard errors of the coefficients. summary(x)$coefficients produces a vector (x is r$individual[[2]]):

r$individual[[2]]$coefficients
tX(Intercept)    tXcigspmkr        tXpeld    tXsmkpreve            mn
-2.449188e+04 -4.143249e+00  4.707007e+04 -3.112334e+01  1.671106e-01
   mncigspmkr        mnpeld    mnsmkpreve
 3.580065e+00  2.029721e+05  4.404915e+01

class(r$individual[[2]]$coefficients)
[1] "numeric"

rather than the expected matrix-like object with a column for the se's. When I trace through the summary method, the coefficients value is a matrix. I'm trying to pull out the standard errors for some rearranged output. How can I do that? And what's going on? I suspect this may be a namespace issue. Thanks. Ross Boylan P.S. I would appreciate a cc because of some mail problems I'm having.
__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
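The distinction resolved in this thread can be sketched in a couple of lines: the `coefficients` component of the lm object is a plain named numeric vector, while the same-named component of the summary object is the coefficient table that carries the standard errors. (The built-in `cars` data set stands in here for the poster's model.)

```r
# summary(fit)$coefficients, not fit$coefficients, holds the Std. Error column.
fit <- lm(dist ~ speed, data = cars)

is.matrix(fit$coefficients)           # FALSE: just the named estimates
ctab <- summary(fit)$coefficients     # matrix: Estimate, Std. Error, t value, Pr(>|t|)
is.matrix(ctab)                       # TRUE

se <- ctab[, "Std. Error"]            # extract the standard errors by column name
```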
[Rd] y ~ X -1 , X a matrix
While browsing some code I discovered a call to lm that used a formula y ~ X - 1, where X was a matrix. Looking through the documentation of formula, lm, model.matrix and maybe some others I couldn't find this usage (R 2.10.1). Is it anything I can count on in future versions? Is there documentation I've overlooked? For the curious: model.frame on the above equation returns a data.frame with 2 columns. The second column is the whole X matrix. model.matrix on that object returns the expected matrix, with the transition from the odd model.frame to the regular matrix happening in an .Internal call. Thanks. Ross P.S. I would appreciate cc's, since mail problems are preventing me from seeing list mail. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] y ~ X -1 , X a matrix
On Thu, 2010-03-18 at 00:57 +0000, ted.hard...@manchester.ac.uk wrote: On 17-Mar-10 23:32:41, Ross Boylan wrote: While browsing some code I discovered a call to lm that used a formula y ~ X - 1, where X was a matrix. Looking through the documentation of formula, lm, model.matrix and maybe some others I couldn't find this usage (R 2.10.1). Is it anything I can count on in future versions? Is there documentation I've overlooked? For the curious: model.frame on the above equation returns a data.frame with 2 columns. The second column is the whole X matrix. model.matrix on that object returns the expected matrix, with the transition from the odd model.frame to the regular matrix happening in an .Internal call. Thanks. Ross P.S. I would appreciate cc's, since mail problems are preventing me from seeing list mail.

Hmm ... I'm not sure what is the problem with what you describe.

There is no problem in the "it doesn't work" sense. There is a problem in that it seems undocumented--though the help you quote could rather indirectly be taken as a clue--and thus, possibly, subject to change in later releases. Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
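The behavior asked about can be seen directly. A small sketch (X and y are invented for illustration) shows that a matrix on the right-hand side contributes one coefficient per column, with model.frame keeping the matrix as a single "column":

```r
# A matrix X on the RHS of a formula: one coefficient per column; "- 1"
# drops the intercept.
set.seed(1)
X <- matrix(rnorm(60), nrow = 20,
            dimnames = list(NULL, c("a", "b", "c")))
y <- rnorm(20)

fit <- lm(y ~ X - 1)
names(coef(fit))                   # "Xa" "Xb" "Xc" -- no intercept term

# model.frame() keeps X as one matrix column; model.matrix() expands it
# into the ordinary design matrix.
mf <- model.frame(y ~ X - 1)
ncol(mf)                           # 2: the response plus the whole matrix X
dim(model.matrix(y ~ X - 1, mf))   # 20 x 3
```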
Re: [Rd] Rgeneric.py assists in rearranging generic function definitions [inline]
On Thu, 2010-01-21 at 11:38 -0800, Ross Boylan wrote: I've attached a script I wrote that pulls all the setGeneric definitions out of a set of R files and puts them in a separate file, default allGenerics.R. I thought it might help others who find themselves in a similar situation. The situation was that I had to change the order in which files in my package were parsed; the scheme in which the generic definition is in the first file that has the corresponding setMethod breaks under re-ordering. So I pulled out all the definitions and put them first. In retrospect, it is clearly preferable to create allGenerics.R from the start. If you didn't, and discover you should have, the script automates the conversion. Thanks to everyone who helped me with my packaging problems. The package finally made it to CRAN as http://cran.r-project.org/web/packages/mspath/index.html. I'll send a public notice of that to the general R list. Ross Boylan

Apparently the attachment didn't make it through. I've pasted Rgeneric.py below.

#! /usr/bin/python
# python 2.5 required for with statement
from __future__ import with_statement

# Rgeneric.py extracts setGeneric definitions from R sources and
# writes them to a special file, while removing them from the
# original.
#
# Context: In a system with several R files, having generic
# definitions sprinkled throughout, there are errors arising from the
# sequencing of files, or of definitions within files.  In general,
# changing the order in which files are parsed (e.g., by the Collate:
# field in DESCRIPTION) will break things even when they were
# working.  For example, a setMethod may occur before the
# corresponding setGeneric, and then fail.  Given that it is not safe
# to call setGeneric twice for the same function, the cleanest
# solution may be to move all the generic definitions to a separate
# file that will be read before any of the setMethod's.  Rgeneric.py
# helps automate that process.
#
# It is, of course, preferable not to get into this situation in the
# first place, for example by creating an allGenerics.R file as you
# go.
#
# Typical usage: ./Rgeneric.py *.R
# Will create allGenerics.R with all the extracted generic
# definitions, including any preceding comments.
# Rewrites the *.R files, replacing the setGeneric's with comments
# indicating the generic has moved to allGenerics.R.
# *.R.old has the original .R files.
#
# The program does not work for all conceivable styles.  In
# particular, it assumes that
# 1. setGeneric is immediately followed by an open parenthesis and
#    a quoted name of the function.  Subsequent parts of the
#    definition may be split across lines and have interspersed
#    comments.
#
# 2. Comments precede the definition.  They are optional, and will
#    be left in place in the .R file and copied to allGenerics.R.
#
# 3. If you first define an ordinary function foo, and then do
#    setGeneric("foo"), the setGeneric will be moved to
#    allGenerics.R.  It will not work properly there; you should
#    make manual adjustments such as moving it back to the
#    original.  The code at the bottom reports on all such
#    definitions, and then lists all the generic functions processed.
#
# 4. allGenerics.R will contain generic definitions in the order of
#    files examined, and in the order they are defined within the
#    file.  This is to preserve context for the comments, in
#    particular for comments which apply to a block of
#    definitions.  If you would like something else, e.g.,
#    alphabetical ordering, you should post-process the AllForKey
#    object created at the bottom of this file.
#
# There are program (not command line) options to do a read-only scan,
# and a class to hold the results, which can be inspected in various
# ways.
# Copyright 2010 Regents of University of California
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# See http://www.gnu.org/licenses/ for the full license.

# Author: Ross Boylan r...@biostat.ucsf.edu
#
# Revision History:
#
# 1.0  2010-01-21  Initial release.

import os, os.path, re, sys

class ParseGeneric:
    """Extract setGeneric functions and preceding comments in one file.

    states of the parser:
      findComment -- look for start of comment
      inComment   -- found comment; accumulate and look for end
      inGeneric   -- extract setGeneric definition.

    Typical use:
      p = ParseGeneric()
      results = p.parse("myfile.R
[Rd] Rgeneric.py assists in rearranging generic function definitions
I've attached a script I wrote that pulls all the setGeneric definitions out of a set of R files and puts them in a separate file, default allGenerics.R. I thought it might help others who find themselves in a similar situation. The situation was that I had to change the order in which files in my package were parsed; the scheme in which the generic definition is in the first file that has the corresponding setMethod breaks under re-ordering. So I pulled out all the definitions and put them first. In retrospect, it is clearly preferable to create allGenerics.R from the start. If you didn't, and discover you should have, the script automates the conversion. Thanks to everyone who helped me with my packaging problems. The package finally made it to CRAN as http://cran.r-project.org/web/packages/mspath/index.html. I'll send a public notice of that to the general R list. Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] calling setGeneric() twice
Is it safe to call setGeneric twice, assuming some setMethod's for the target function occur in between? By "safe" I mean that all the setMethod's remain in effect, and the 2nd call is, effectively, a no-op. ?setGeneric says nothing explicit about this behavior that I can see. It does say that if there is an existing implicit generic function it will be (re?)used. I also tried ?Methods, google and the mailing list archives. I looked at the code for setGeneric, but I'm not confident how it behaves. It doesn't seem to do a simple return of the existing value if a generic already exists, although it does have special handling for that case. The other problem with looking at the code--or running tests--is that they only show the current behavior, which might change later. This came up because of some issues with the sequencing of code in my package. Adding duplicate setGeneric's seems like the smallest, and therefore safest, change if the duplication is not a problem. Thanks. Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] calling setGeneric() twice
On Tue, 2010-01-19 at 10:05 -0800, Seth Falcon wrote: This came up because of some issues with the sequencing of code in my package. Adding duplicate setGeneric's seems like the smallest, and therefore safest, change if the duplication is not a problem. I'm not sure of the answer to your question, but I think it is the wrong question :-) Perhaps you can provide more detail on why you are using multiple calls to setGeneric. That seems like a very odd thing to do.

My system is defined in a collection of .R files, most of which are organized around classes. So the typical file has a setClass(), setGeneric()'s, and setMethod()'s. If files that were read in later in the sequence extended an existing generic, I omitted the setGeneric(). I had to resequence the order in which the files were read to avoid some "undefined slot classes" warnings. The resequencing created other problems, including some cases in which I had a setMethod without a previous setGeneric. I have seen the advice to sequence the files so that class definitions, then generic definitions, and finally function and method definitions occur. I am trying not to do that for two reasons. First, I'm trying to keep the changes I make small to avoid introducing errors. Second, I prefer to keep all the code related to a single class in a single file. Some of the files were intended for free-standing use, and so it would be useful if they could retain setGeneric()'s even if I also need an earlier setGeneric to make the whole package work. I am also working on a python script to extract all the generic function definitions (that is, setGeneric()), just in case. Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] calling setGeneric() twice (don't; documentation comments)
On Tue, 2010-01-19 at 12:55 -0800, Seth Falcon wrote: I would expect setGeneric to create a new generic function and nuke/mask methods associated with the generic that it replaces.

I tried a test in R 2.7.1, and that is the behavior. I think it would be worthwhile to document it in ?setGeneric. Also, ?setGeneric advocates first defining a regular function (e.g., bar) and then doing a simple setGeneric("bar"). I think the advice for package developers is different, so perhaps some changes there would be a good idea too. I thought I was defining setGeneric twice for a few functions, and thus that it did work OK. It turns out I have no duplicate definitions. Here's the test:

setClass("A", representation(z="ANY"))
[1] "A"
setClass("B", representation(y="ANY"))
[1] "B"
setGeneric("foo", function(x) standardGeneric("foo"))
[1] "foo"
setMethod("foo", signature(x="A"), function(x) return("foo for A"))
[1] "foo"
a = new("A")
b = new("B")
foo(a)
[1] "foo for A"
foo(b)
Error in function (classes, fdef, mtable) :
  unable to find an inherited method for function "foo", for signature "B"
setGeneric("foo", function(x) standardGeneric("foo"))
[1] "foo"
setMethod("foo", signature(x="B"), function(x) return("foo for B"))
[1] "foo"
setGeneric("foo", function(x) standardGeneric("foo"))
[1] "foo"
setMethod("foo", signature(x="B"), function(x) return("foo for B"))
[1] "foo"
foo(a)  # here's where the disappearance of the prior setMethod shows
Error in function (classes, fdef, mtable) :
  unable to find an inherited method for function "foo", for signature "A"
foo(b)
[1] "foo for B"

So I guess I am going to pull the setGeneric's out. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
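Given the behavior demonstrated above — a repeated setGeneric() replaces the generic and silently drops the methods already registered — one defensive idiom (not from the thread, just a common pattern) is to guard each call with isGeneric():

```r
library(methods)

setClass("A", representation(z = "ANY"))

# Create the generic only if it does not already exist, so that methods
# registered earlier survive a re-sourcing of this file.
if (!isGeneric("foo")) {
  setGeneric("foo", function(x) standardGeneric("foo"))
}
setMethod("foo", signature(x = "A"), function(x) "foo for A")

# Running the guarded block again is a no-op ...
if (!isGeneric("foo")) {
  setGeneric("foo", function(x) standardGeneric("foo"))
}

# ... so the method registered above is still in place:
foo(new("A"))   # "foo for A"
```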
Re: [Rd] optional package dependency
On Sat, 2010-01-16 at 07:49 -0800, Seth Falcon wrote: Package authors should be responsible enough to test their codes with and without optional features. It seems unlikely most package authors will have access to a full range of platform types. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] optional package dependency
On Fri, 2010-01-15 at 09:19 +0100, Kurt Hornik wrote: The idea is that maintainers typically want to fully check their functionality, suggesting to force suggests by default. This might be the nub of the problem. There are different audiences, even for R CMD check. The maintainer probably wants to check all functionality. Even then, there is an issue if functionality differs by platform. CRAN probably wants to check all functionality. An individual user just wants to check the functionality they use. For example, if someone doesn't want to run my package in distributed mode, but wants to see if it works (R CMD check), they need to be able to avoid the potentially onerous requirement to install MPI. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] optional package dependency (enhances)
On Fri, 2010-01-15 at 10:48 +0000, Benilton Carvalho wrote: How about using: Enhances: Rmpi ? b

The main reason is that "enhances" seems a peculiar way to describe the relation between a package that (optionally) uses a piece of infrastructure and the infrastructure. Similarly, I would not say that a car enhances metal. The example given in the R extension documentation (e.g., by providing methods for classes from these packages) seems more in line with the usual meaning of "enhance". A secondary reason is that I can not tell from the documentation exactly what putting a package in Enhances does. The example of adding functionality to a class suggests that packages that are enhanced are required. However, clearly one could surround code that enhanced a class from another package with a conditional, so that the code was skipped if the enhanced package was absent. Even that logic isn't quite right if the enhanced package is added later. My package only loads/verifies the presence of Rmpi if one attempts to use the distributed features, so the relation is at run time, not load time. Ross

On Fri, Jan 15, 2010 at 6:00 AM, Ross Boylan r...@biostat.ucsf.edu wrote: I have a package that can use Rmpi, but works fine without it. None of the automatic test code invokes Rmpi functionality. (One test file illustrates how to use it, but has quit() as its first command.) What's the best way to handle this? In particular, what is the appropriate form for upload to CRAN? When I omitted Rmpi from the DESCRIPTION file R CMD check gave

* checking R code for possible problems ... NOTE
alldone: no visible global function definition for 'mpi.bcast.Robj'
alldone: no visible global function definition for 'mpi.exit'

followed by many more warnings. When I add Suggests: Rmpi in DESCRIPTION the check stops if the package is not installed:

* checking package dependencies ... ERROR
Packages required but not available: Rmpi

Rmpi is not required, but I gather from previous discussion on this list that "suggests" basically means "required for R CMD check". NAMESPACE seems to raise similar issues; I don't see any mechanism for optional imports. Also, I have not used namespaces, and am not eager to destabilize things so close to release. At least, I hope I'm close to release :) Thanks for any advice. Ross Boylan P.S. Thanks, Duncan, for your recent advice on my help format problem with R 2.7. I removed the nested \description, and now things look OK. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] optional package dependency (suggestions/wishes)
On Fri, 2010-01-15 at 12:34 -0500, Simon Urbanek wrote: On Jan 15, 2010, at 12:18 , Ross Boylan wrote: On Fri, 2010-01-15 at 09:19 +0100, Kurt Hornik wrote: The idea is that maintainers typically want to fully check their functionality, suggesting to force suggests by default.

This might be the nub of the problem. There are different audiences, even for R CMD check. The maintainer probably wants to check all functionality. Even then, there is an issue if functionality differs by platform. CRAN probably wants to check all functionality. An individual user just wants to check the functionality they use. For example, if someone doesn't want to run my package in distributed mode, but wants to see if it works (R CMD check), they need to be able to avoid the potentially onerous requirement to install MPI.

... that's why you can decide to run check without forcing suggests - it's entirely up to you / the user as Kurt pointed out ... Cheers, Simon

This prompts a series of increasingly ambitious suggestions:

1. DOCUMENTATION CHANGE. I suggest this info about _R_CHECK_FORCE_SUGGESTS_=false be added to R CMD check --help. Until Kurt's email I was unaware of the facility, and it seems to me the average package user will be even less likely to know of it. My concern is that they would run R CMD check; it would fail because a package such as Rmpi is absent; and the user would throw up their hands and give up. I did find a Perl variable with a similar name in section 1.3.3 of Writing R Extensions, but that section does not mention environment variables. It would also be unnatural for a package user to refer to it. Considering there are many variables, maybe the interactive help should just note that customization variables are available (without naming particular ones), and point to the appropriate documentation.

2. NEW BEHAVIOR/OPTIONS. An even more exotic wish would be to allow a list of suggested packages to check. That way, someone using some, but not all, optional facilities could check the ones of interest. Again, even with better documentation it seems likely most people would be unaware of the feature.

3. SIGNIFICANTLY CHANGED BEHAVIOR. I think the optimal behavior would be for the check environment to attempt to load all suggested packages, but continue even if some are missing. It would then be up to package authors to code appropriate conditional tests for the presence or absence of suggested packages; actually, that's probably true even now.

Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] optional package dependency
I have a package that can use Rmpi, but works fine without it. None of the automatic test code invokes Rmpi functionality. (One test file illustrates how to use it, but has quit() as its first command.) What's the best way to handle this? In particular, what is the appropriate form for upload to CRAN? When I omitted Rmpi from the DESCRIPTION file R CMD check gave

* checking R code for possible problems ... NOTE
alldone: no visible global function definition for 'mpi.bcast.Robj'
alldone: no visible global function definition for 'mpi.exit'

followed by many more warnings. When I add Suggests: Rmpi in DESCRIPTION the check stops if the package is not installed:

* checking package dependencies ... ERROR
Packages required but not available: Rmpi

Rmpi is not required, but I gather from previous discussion on this list that "suggests" basically means "required for R CMD check". NAMESPACE seems to raise similar issues; I don't see any mechanism for optional imports. Also, I have not used namespaces, and am not eager to destabilize things so close to release. At least, I hope I'm close to release :) Thanks for any advice. Ross Boylan P.S. Thanks, Duncan, for your recent advice on my help format problem with R 2.7. I removed the nested \description, and now things look OK. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
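For what it's worth, later versions of R than this thread's vintage make the run-time-only relationship straightforward to express: list the package under Suggests and guard its use with requireNamespace(). A sketch, with run_distributed() as an invented stand-in for the package's MPI entry point:

```r
# Guard an optional dependency at run time. "Rmpi" stays in Suggests;
# nothing fails at install time when it is absent, only this entry point.
run_distributed <- function(...) {
  if (!requireNamespace("Rmpi", quietly = TRUE)) {
    stop("distributed execution requires the 'Rmpi' package; ",
         "install it or use the serial interface instead")
  }
  # ... Rmpi-based work would go here ...
}
```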
Re: [Rd] ?setGeneric garbled (PR#14153)
On Thu, 2009-12-17 at 15:24 +0100, Martin Maechler wrote: Ross Boylan r...@biostat.ucsf.edu on Thu, 17 Dec 2009 02:15:12 +0100 (CET) writes: Full_Name: Ross Boylan Version: 2.10.0 OS: Windows XP Submission from: (NULL) (198.144.201.14)

Some of the help for setGeneric seems to have been garbled. In the section Basic Use, 5th paragraph (where the example counts as a single line 3rd paragraph) it says:

  Note that calling 'setGeneric()' in this form is not strictly
  necessary before calling 'setMethod()' for the same function. If the
  function specified in the call to 'setMethod' is not generic,
  'setMethod' will execute the call to 'setGeneric' itself. Declaring
  explicitly that you want the function to be generic can be considered
  better programming style; the only difference in the result, however,
  is that not doing so produces a You cannot (and never need to) create
  an explicit generic version of the primitive functions in the base
  package.

The stuff after the semi-colon of the final sentence is garbled, or at least unparseable by me. Probably something got deleted by mistake.

That's very peculiar. The corresponding methods/man/setGeneric.Rd file has not been changed in a while, but I don't see your problem.

The help from R launched directly from the R shortcut on my desktop looks fine, in both 2.10 and 2.8. I closed all my emacs sessions and restarted, but ?setGeneric produces the same garbled text. I also tried telling ESS to use a different working directory when launching R; it didn't help. The last sentence of this paragraph is also garbled:

  The description above is the effect when the package that owns the
  non-generic function has not created an implicit generic version.
  Otherwise, it is this implicit generic function that is us_same_
  version of the generic function will be created each time.

Weird. P.S. http://bugs.r-project.org was extremely sluggish, even timing out, both yesterday and today for me.
__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] group generics
Thanks for your help. I had two concerns about using as(): that it would impose some overhead, and that it would require me to code an explicit conversion function. I see now that the latter is not true; I don't know if the overhead makes much difference.

On Thu, 2009-12-03 at 13:00 -0800, Martin Morgan wrote:

setMethod("Arith", signature(e1="numeric", e2="B"),
          function(e1, e2) {
              new("B", xb=e1*e2@xb, callGeneric(e1, as(e2, "A")))
          })

Things were getting too weird, so I punted and used explicitly named function calls for the multiplication operation that was causing trouble. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
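For readers landing here: the as() approach suggested in the thread does work end-to-end when callGeneric() is used both for the subclass slot and for the delegation, avoiding the callNextMethod()/primitive interaction reported elsewhere in the thread. A sketch using the thread's class names:

```r
library(methods)

setClass("A", representation(xa = "numeric"))
setClass("B", representation(xb = "numeric"), contains = "A")

# Parent-class Arith method: apply whichever operator was dispatched
# (.Generic) to the slot via callGeneric().
setMethod("Arith", signature(e1 = "numeric", e2 = "A"), function(e1, e2) {
  new("A", xa = callGeneric(e1, e2@xa))
})

# Subclass method: handle its own slot, then delegate the inherited part
# by re-dispatching on as(e2, "A"), per the suggestion in the thread.
setMethod("Arith", signature(e1 = "numeric", e2 = "B"), function(e1, e2) {
  new("B", xb = callGeneric(e1, e2@xb), callGeneric(e1, as(e2, "A")))
})

tb <- new("B", xb = 1:3, new("A", xa = 10))
res <- 3 * tb
res@xb   # 3 6 9
res@xa   # 30
```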
Re: [Rd] group generics
On Thu, 2009-12-03 at 14:25 -0800, John Chambers wrote: I missed the earlier round of this discussion and only am commenting now to say that this doesn't seem weird at all, if I understand what you're trying to do. Martin's basic suggestion, v <- callGeneric(e1, as(e2, "A")), seems the simplest solution. You just want to make another call to the actual generic function, with new arguments, and let method selection take place. In fact, it's pretty much the standard way to use group generics. John

There were 2 weird parts. Mainly I was referring to the fact that identical code (posted earlier) worked sometimes and not others. I could not figure out what the differences were between the 2 scenarios, nor could I create a non-working scenario reliably. The second part that seemed weird was that the code looked as if it should work all the time (the last full version I posted, which used callNextMethod() rather than callGeneric()). Finally, I felt somewhat at sea with the group generics, since I wasn't sure exactly how they worked, how they interacted with primitives, or how they interacted with callNextMethod, selectMethod, etc. I did study what I thought were the relevant help entries. Ross

Ross Boylan wrote: Thanks for your help. I had two concerns about using as(): that it would impose some overhead, and that it would require me to code an explicit conversion function. I see now that the latter is not true; I don't know if the overhead makes much difference. On Thu, 2009-12-03 at 13:00 -0800, Martin Morgan wrote:

setMethod("Arith", signature(e1="numeric", e2="B"),
          function(e1, e2) {
              new("B", xb=e1*e2@xb, callGeneric(e1, as(e2, "A")))
          })

Things were getting too weird, so I punted and used explicitly named function calls for the multiplication operation that was causing trouble. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] group generics
Martin Morgan wrote: Hi Ross -- Ross Boylan r...@biostat.ucsf.edu writes: I have classes A and B, where B contains A. In the implementation of the group generic for B I would like to use the corresponding group generic for A. Is there a way to do that?

setMethod("Arith", signature(e1="numeric", e2="B"),
          function(e1, e2) {
              # the next line does not work right
              v <- selectMethod(callGeneric, signature=c("numeric", "A"))(e1, e2)

v <- callGeneric(e1, as(e2, "A"))

or probably

v <- callNextMethod(e1, e2)

Martin

A different error this time, one that looks a lot like the report from stephen.p...@ubs.com on 2007-12-24 concerning callNextMethod, except this is with callGeneric. HOWEVER, the problem is erratic; when I started from scratch and took this code into a workspace and executed the commands, they worked as expected. I had various false starts and revisions, as well as the real code on which the example is based, when the error occurred. I tried taking in the real code (which defines generics with Arith from my actual classes, and which also fails as below), and the example still worked. My revised code:

setClass("A",
         representation=representation(xa="numeric")
         )
setMethod("Arith", signature(e1="numeric", e2="A"),
          function(e1, e2) {
              new("A", xa=callGeneric(e1, e2@xa))
          }
          )
setClass("B",
         representation=representation(xb="numeric"),
         contains=c("A")
         )
setMethod("Arith", signature(e1="numeric", e2="B"),
          function(e1, e2) {
              new("B", xb=e1*e2@xb, callNextMethod())
          }
          )

Results:

options(error=recover)
tb <- new("B", xb=1:3, new("A", xa=10))
3*tb
Error in get(fname, envir = envir) : object '.nextMethod' not found

Enter a frame number, or 0 to exit

 1: 3 * tb
 2: 3 * tb
 3: test.R#16: new("B", xb = e1 * e2@xb, callNextMethod())
 4: initialize(value, ...)
 5: initialize(value, ...)
 6: callNextMethod()
 7: .nextMethod(e1 = e1, e2 = e2)
 8: test.R#6: new("A", xa = callGeneric(e1, e2@xa))
 9: initialize(value, ...)
10: initialize(value, ...)
11: callGeneric(e1, e2@xa)
12: get(fname, envir = envir)

Selection: 0

The callGeneric in frame 11 is trying to get the primitive for multiplying numeric times numeric. Quoting from Pope's analysis: "[The primitive...] does not get the various magic variables such as .Generic, .Method, etc. defined in its frame. Thus, callGeneric(), failing to find .Generic, then takes the function symbol for the call (which callNextMethod() has constructed to be .nextMethod) and attempts to look it up, which of course also fails, leading to the resulting error seen above."

I'm baffled, and hoping someone on the list has an idea. I'm running R 2.10 under ESS (in particular, I use C-c C-l in the code file to read in the code) on XP. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] group generics
I have classes A and B, where B contains A. In the implementation of the group generic for B I would like to use the corresponding group generic for A. Is there a way to do that? I would also appreciate any comments if what I'm trying to do seems like the wrong approach.

Here's a stripped-down example:

    setClass("A", representation = representation(xa = "numeric"))

    setMethod("Arith", signature(e1 = "numeric", e2 = "A"),
              function(e1, e2) {
                  new("A", xa = e1 * e2@xa)
              })

    setClass("B", representation = representation(xb = "numeric"),
             contains = c("A"))

    setMethod("Arith", signature(e1 = "numeric", e2 = "B"),
              function(e1, e2) {
                  # the next line does not work right
                  v <- selectMethod(callGeneric, signature = c("numeric", "A"))(e1, e2)
                  print(v)
                  new("B", v, xb = e1 * e2@xb)
              })

Results:

    > t1 <- new("B", new("A", xa = 4), xb = 2)
    > t1
    An object of class "B"
    Slot "xb":
    [1] 2

    Slot "xa":
    [1] 4

    > 3 * t1
    Error in getGeneric(f, !optional) : no generic function found for "callGeneric"

Thanks.
Ross Boylan

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
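A runnable sketch of the delegation the replies in this thread suggest: coerce to the parent class with as() and call the generic again, letting dispatch select A's method. The method bodies here are my reconstruction, not verbatim code from the thread.

```r
library(methods)

setClass("A", representation(xa = "numeric"))
setMethod("Arith", signature(e1 = "numeric", e2 = "A"),
          function(e1, e2) new("A", xa = callGeneric(e1, e2@xa)))

setClass("B", representation(xb = "numeric"), contains = "A")
setMethod("Arith", signature(e1 = "numeric", e2 = "B"),
          function(e1, e2)
              # Coerce to A and call the generic again; dispatch then
              # selects A's method, whose result fills B's inherited slots.
              new("B", xb = callGeneric(e1, e2@xb),
                  callGeneric(e1, as(e2, "A"))))

tb <- new("B", xb = 1:3, new("A", xa = 10))
res <- 3 * tb
res@xb   # 3 6 9
res@xa   # 30
```

Because callGeneric() is handed fresh arguments, no .nextMethod machinery is involved, which avoids the erratic failure discussed in the replies.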
[Rd] bug in heatmap?
Using R 2.10 on WinXP,

    heatmap(mymap, Rowv = NA, Colv = NA)

with mymap values of

        0            1           2      3     4
    0   NaN          0.0         0.00621118  0.000  NaN
    10   0.0         0.01041667  0.125  NaN
    20   0.004705882 0.02105263  0.333  NaN
    30   0.004081633 0.0222      0.500  0
    40   0.0         0.01923077  0.167  NaN
    60   0.0         0.          0.000  NaN
    100  0.002840909 0.          0.000  NaN
    200  0.002159827 0.          NaN    NaN
    400  NaN         0.009433962 0.     NaN   NaN

(the first row and column are labels, not data) produces a plot in which all of the row labelled 60 (all 0's and NaN) is white. This is the same color shown for the NaN values. In contrast, all other 0 values appear as dark red. Have I missed some subtlety, or is this a bug?

Ross Boylan

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
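One subtlety worth checking (my conjecture, not a reply from the thread): heatmap() defaults to scale = "row", and a zero-variance row scales to NaN, which is drawn the same way as the true NaN cells. A minimal sketch of just that scaling step:

```r
# heatmap()'s default is scale = "row": each row is centered and scaled.
# A constant row (e.g. all zeros) has standard deviation 0, so every
# entry becomes 0/0 = NaN -- and NaN cells render as background/white.
m <- rbind(zeros = c(0, 0, 0), varied = c(0, 1, 2))
scaled <- t(scale(t(m)))       # the per-row scaling heatmap applies
scaled["zeros", ]              # NaN NaN NaN
```

If this is the cause, passing scale = "none" (or scale = "column") to heatmap() should make the all-zero row render dark red like the other zeros.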
Re: [Rd] mysteriously persistent generic definition
Here's a self-contained example of the problem:

    > foo <- function(obj) {return(3);}
    > setGeneric("foo")
    [1] "foo"
    > removeGeneric("foo")
    [1] TRUE
    > foo <- function(x) {return(4);}
    > args(foo)
    function (x)
    NULL
    > setGeneric("foo")
    [1] "foo"
    > args(foo)
    function (obj)
    NULL

R 2.7.1. I get the same behavior whether or not I use ESS.

The reason this is more than a theoretical problem:

    > setMethod("foo", signature(x = "numeric"), function(x) {return(x + 4);})
    Error in match.call(fun, fcall) : unused argument(s) (x = "numeric")

Ross

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] mysteriously persistent generic definition
R 2.8.1 on Windows behaves as I expected, i.e., the final args(foo) returns a function of x. The previous example (below) was on Debian GNU/Linux.

On Wed, 2009-10-28 at 12:14 -0700, Ross Boylan wrote:

Here's a self-contained example of the problem:

    > foo <- function(obj) {return(3);}
    > setGeneric("foo")
    [1] "foo"
    > removeGeneric("foo")
    [1] TRUE
    > foo <- function(x) {return(4);}
    > args(foo)
    function (x)
    NULL
    > setGeneric("foo")
    [1] "foo"
    > args(foo)
    function (obj)
    NULL

R 2.7.1. I get the same behavior whether or not I use ESS.

The reason this is more than a theoretical problem:

    > setMethod("foo", signature(x = "numeric"), function(x) {return(x + 4);})
    Error in match.call(fun, fcall) : unused argument(s) (x = "numeric")

Ross

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] mysteriously persistent generic definition
Originally I made a function yearStop that took an argument "object". I made a generic, but later changed the argument to "x". R keeps resurrecting the old definition. Could anyone explain what is going on, or how to fix it?

Note particularly the end of the transcript below: I remove the generic, verify that the symbol is undefined, make a new function, and then make a generic. But the generic does not use the argument of the new function definition.

    > args(yearStop)
    function (obj)
    NULL
    > yearStop <- function(x) x@yearStop
    > args(yearStop)
    function (x)
    NULL
    > setGeneric("yearStop")
    [1] "yearStop"
    > args(yearStop)
    function (obj)
    NULL
    > removeGeneric("yearStop")
    [1] TRUE
    > args(yearStop)
    Error in args(yearStop) : object "yearStop" not found
    > yearStop <- function(x) x@yearStop
    > setGeneric("yearStop")
    [1] "yearStop"
    > args(yearStop)
    function (obj)
    NULL

R 2.7.1. I originally read the definitions in from a file with C-c C-l in ESS; however, I typed the commands above by hand.

Thanks.
Ross Boylan

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
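A defensive pattern that sidesteps stale formals (my suggestion, not from the thread; the class "span" and its slot are invented for illustration): pass the definition to setGeneric() explicitly instead of letting it capture whatever implicit generic may be cached.

```r
library(methods)

# Passing the definition explicitly means setGeneric() cannot fall back
# on a previously cached implicit generic whose formals are stale.
setGeneric("yearStop", function(x) standardGeneric("yearStop"))
stopifnot(identical(names(formals(yearStop)), "x"))

setClass("span", representation(yearStop = "numeric"))
setMethod("yearStop", "span", function(x) x@yearStop)
yearStop(new("span", yearStop = 1930))   # 1930
```

With the one-argument form setGeneric("yearStop"), the generic's formals come from whatever definition setGeneric finds, which is where the resurrection seems to happen.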
Re: [Rd] user supplied random number generators
On Sun, 2009-08-16 at 21:24 +0200, Petr Savicky wrote:

Dear Ross Boylan:

Some time ago, you sent an email to R-devel with the following.

I got into this because I'm trying to extend the rsprng code; sprng returns its state as a vector of bytes. Converting these to a vector of integers depends on the integer length, hence my interest in the exact definition of integer. I'm interested in lifetime because I believe those bytes are associated with the stream and become invalid when the stream is freed; furthermore, I probably need to copy them into a buffer that is padded to full word length. This means I allocate the buffer whose address is returned to the core R RNG machinery. Eventually somebody needs to free the memory. Far more of my rsprng adventures are on http://wiki.r-project.org/rwiki/doku.php?id=packages:cran:rsprng. Feel free to read, correct, or extend it.

I am interested to know what the current state of your project is.

I did figure out some of the lifetime issues; SPRNG does allocate memory when you ask it for its state. I also realized that for several reasons it would not be appropriate to hand that buffer to R. I've reworked the page extensively since it had the section you quote. See particularly the "Getting and Setting Stream State" section near the bottom.

I submitted patches to hook rsprng into R's standard machinery for stream state (the user-visible part of which is .Random.seed). The package developer has reservations about applying them.

As a practical matter, I shifted my package's C code to call back to R to get random numbers. If rsprng is loaded and activated, my code will use it. I also eliminated all attempts to set the seed in my code. For rsprng in its current form, the R set.seed() function is a no-op, and you have to use an rsprng function to set the seed (generally when activating the library).

There is a package rngwell19937 with a random number generator, which I develop and use for several parallel processes. Setting a seed may be done by a vector, one of whose components is the process number. The initialization then provides unrelated sequences for different processes.

That sounds interesting; thanks for pointing it out.

Seeding by a vector is also available in the initialization of Mersenne Twister from 2002. See mt19937ar.c ("ar" for array) at http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html. Unfortunately, seeding by a vector is not available in base R. R uses Mersenne Twister, but with an initialization by a single number.

I think one could write to .Random.seed directly to set a vector for many of the generators. ?.Random.seed does not recommend this and notes various limits and hazards of this strategy.

Ross

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
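A small sketch of the write-to-.Random.seed route mentioned above, here restricted to saving and restoring a known-good Mersenne-Twister state; hand-building an arbitrary vector is exactly what ?.Random.seed warns against.

```r
# Save and restore the full Mersenne-Twister state via .Random.seed.
# Element 1 encodes the RNG kind; the remaining 625 integers are the
# MT state (position index plus 624 words), so the vector has length 626.
set.seed(1)                    # put MT in a known, valid state
saved <- .Random.seed
stopifnot(length(saved) == 626)
x1 <- runif(3)
.Random.seed <- saved          # restore the state...
x2 <- runif(3)
identical(x1, x2)              # ...and the stream replays: TRUE
```

This only round-trips a state the generator itself produced; constructing a state vector from scratch would require knowing each generator's internal layout.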
[Rd] user supplied random number generators
?Random.user says (in svn trunk):

    Optionally, functions \code{user_unif_nseed} and
    \code{user_unif_seedloc} can be supplied which are called with no
    arguments and should return pointers to the number of seeds and to
    an integer array of seeds.  Calls to \code{GetRNGstate} and
    \code{PutRNGstate} will then copy this array to and from
    \code{.Random.seed}.

And it offers as an example:

    void user_unif_init(Int32 seed_in) { seed = seed_in; }
    int * user_unif_nseed() { return &nseed; }
    int * user_unif_seedloc() { return (int *) &seed; }

First question: what is the lifetime of the buffers pointed to by the user_unif_* functions, and who is responsible for cleaning them up? In the help file they are static variables, but in general they might be allocated on the heap or might be in structures that only persist as long as the generator does. Since the example uses static variables, it seems reasonable to conclude the core R code is not going to try to free them.

Second, are the types really correct? The documentation seems quite explicit, all the more so because it uses Int32 in places. However, the code in RNG.c (RNG_Init) says

    ns = *((int *) User_unif_nseed());
    if (ns < 0 || ns > 625) {
        warning(_("seed length must be in 0...625; ignored"));
        break;
    }
    RNG_Table[kind].n_seed = ns;
    RNG_Table[kind].i_seed = (Int32 *) User_unif_seedloc();

consistent with the earlier definition of RNG_Table entries as

    typedef struct {
        RNGtype kind;
        N01type Nkind;
        char *name;    /* print name */
        int n_seed;    /* length of seed vector */
        Int32 *i_seed;
    } RNGTAB;

This suggests that the type of user_unif_seedloc is Int32 *, not int *. It also suggests that user_unif_nseed should return the number of 32-bit integers. The code for PutRNGstate(), for example, uses them in just that way. While the dominant model, even on 64-bit hardware, is probably to leave int as 32 bits, it doesn't seem wise to assume that is always the case.

I got into this because I'm trying to extend the rsprng code; sprng returns its state as a vector of bytes. Converting these to a vector of integers depends on the integer length, hence my interest in the exact definition of integer. I'm interested in lifetime because I believe those bytes are associated with the stream and become invalid when the stream is freed; furthermore, I probably need to copy them into a buffer that is padded to full word length. This means I allocate the buffer whose address is returned to the core R RNG machinery. Eventually somebody needs to free the memory.

Far more of my rsprng adventures are on http://wiki.r-project.org/rwiki/doku.php?id=packages:cran:rsprng. Feel free to read, correct, or extend it.

Thanks.
Ross Boylan

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] user supplied random number generators
On Thu, 2009-07-30 at 12:32 +0200, Christophe Dutang wrote:

    This suggests that the type of user_unif_seedloc is Int32 *, not int *. It also suggests that user_unif_nseed should return the number of 32-bit integers. The code for PutRNGstate(), for example, uses them in just that way. While the dominant model, even on 64-bit hardware, is probably to leave int as 32 bits, it doesn't seem wise to assume that is always the case.

You can test the size of an int with a configure script; see for example the package foreign, or the package randtoolbox (in the Rmetrics R-Forge project), which I maintain with Petr Savicky.

http://cran.r-project.org/doc/manuals/R-admin.html#Choosing-between-32_002d-and-64_002dbit-builds says "All current versions of R use 32-bit integers." Also, sizeof(int) works at runtime. But my question was really about whether code for user-defined RNGs should be written using Int32 or int as the target type for the state vector. The R core code suggests to me one should use Int32, but the documentation says int.

Ross

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
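A quick runtime illustration of the point quoted from R-admin: R's integer is a 32-bit signed type on current builds, 64-bit platforms included.

```r
# .Machine describes the compiled-in numeric types; integer.max being
# 2^31 - 1 is the visible consequence of R's 32-bit signed integer.
.Machine$integer.max == 2^31 - 1   # TRUE
```

A C extension can check sizeof(int) directly (or in a configure script, as suggested above); from R code this is the observable equivalent.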
Re: [Rd] beginner's guide to C++ programming with R packages?
On Fri, 2009-06-26 at 16:17 -0400, Whit Armstrong wrote:

But this draws me back to the basic question. I don't want to run R CMD INSTALL 20 times per hour. How do developers actually test their code?

Check out RUnit for tests: http://cran.r-project.org/web/packages/RUnit/index.html

As for testing C++ code, I have taken an approach which is probably different from most: I try to build my package as a C++ library that can be used independently of R. Then you can test your library with whatever C++ test suite you prefer.

Once you are happy, then I also have C++ tests that operate separately from R, though I have a very small library of stub R functions to get the thing to build. There have been some tricky issues with R (if one links to the regular R library) and the test framework fighting over who was main; I think that's why I switched to the stub.

Working only with R-level tests does not permit the kind of lower-level testing that you can get by running your own unit tests. I use the Boost unit test framework. Of course, you want R-level tests too. Some of my upper-level C++ tests are mirror images of R tests; this can help identify whether a problem lies at the interface.

For C++ tests I build my code in conjunction with a main program. I think I also have or had a test building it as a library, but I don't use that much. For R, my modules get built into a library. It's usually cleaner to build the R library from a fresh version of the sources; otherwise scraps of my other builds tend to end up in the R package.

Thanks, Whit, for the pointers to Rcpp and RAbstraction.

Ross Boylan

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] S4 class redefinition
I haven't found much on S4 class redefinition; the little I've seen indicates the following is to be expected:

1. setClass("foo", ...)
2. Create objects of class foo.
3. Execute the same setClass("foo", ...) again (source the same file).
4. Objects from step 2 are now NULL.

Is that the expected behavior (I ran under R 2.7.1)? Assuming it is, it's kind of unfortunate. I can wrap my setClass code like this

    if (!isClass("foo")) setClass("foo", ...)

to avoid this problem. I've seen this in other code; is that the standard solution?

I thought that loading a library was about the same as executing the code in its source files (assuming R-only code). But if this were true my saved objects would be nulled out each time I loaded the corresponding library. That does not seem to be the case. Can anyone explain that?

Do I need to put any protections around setMethod so that it doesn't run if the method is already defined? At the moment I'm not changing the class definition but am changing the methods, so I can simply avoid running setClass. But if I want to change the class, most likely by adding a slot, what do I do? At the moment it looks as if I'd need to make a new class name, define some coerce methods, and then locate and change the relevant instances. Is there a better way?

Thanks.
Ross Boylan

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] S4 class redefinition
On Tue, 2009-06-30 at 12:58 -0700, Ross Boylan wrote:

I haven't found much on S4 class redefinition; the little I've seen indicates the following is to be expected:

1. setClass("foo", ...)
2. Create objects of class foo.
3. Execute the same setClass("foo", ...) again (source the same file).
4. Objects from step 2 are now NULL.

I'm sorry; step 4 is completely wrong. The objects seem to be preserved. Some slightly modified questions remain.

Is it safe to re-execute identical code for setClass or setMethod when you have existing objects of the class around? Is there any protection, such as checking for existing definitions, that is recommended before executing setClass or setMethod? If you want to change a class or method, and have existing objects, how do you do that?

Can scoping rules lead to situations in which some functions or methods end up with references to the older version of the methods? One example, relevant to class constructors, shows they can. Here's a little test:

    > trivial <- function() 3   # stand-in for a class constructor
    > maker <- function(c = trivial)
    +     function(x) x + c()
    > oldf <- maker()
    > oldf(4)
    [1] 7
    > trivial <- function() 20
    > oldf(4)
    [1] 7
    > newf <- maker()
    > newf(8)
    [1] 28

So the old definition is frozen in the inner function, for which it was captured by lexical scope. Although the definition of maker is not redone after trivial is redefined, maker's default argument does get the new value of trivial.

Methods add another layer. I'm hoping those with a deeper understanding than mine can clarify where the danger spots are, and how to deal with them.

Thanks.
Ross

Is that the expected behavior (I ran under R 2.7.1)? Assuming it is, it's kind of unfortunate. I can wrap my setClass code like this

    if (!isClass("foo")) setClass("foo", ...)

to avoid this problem. I've seen this in other code; is that the standard solution? I thought that loading a library was about the same as executing the code in its source files (assuming R-only code). But if this were true my saved objects would be nulled out each time I loaded the corresponding library. That does not seem to be the case. Can anyone explain that?

Do I need to put any protections around setMethod so that it doesn't run if the method is already defined? At the moment I'm not changing the class definition but am changing the methods, so I can simply avoid running setClass. But if I want to change the class, most likely by adding a slot, what do I do? At the moment it looks as if I'd need to make a new class name, define some coerce methods, and then locate and change the relevant instances. Is there a better way?

Thanks.
Ross Boylan

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
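For the last question, a sketch of the new-name-plus-coercion route the poster describes, with setAs() doing the per-instance conversion; the class and slot names here are invented for illustration.

```r
library(methods)

# Old definition and a pre-existing instance.
setClass("fooOld", representation(a = "numeric"))
obj <- new("fooOld", a = 1)

# New definition adds a slot; a setAs() method migrates old objects.
setClass("fooNew", representation(a = "numeric", b = "character"))
setAs("fooOld", "fooNew",
      function(from) new("fooNew", a = from@a, b = ""))

obj2 <- as(obj, "fooNew")
stopifnot(is(obj2, "fooNew"), obj2@a == 1)
```

Locating every saved instance and re-saving the converted objects is still manual, as the original message anticipates.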
Re: [Rd] could not find function in R CMD check [solved, but is this an R bug?]
On Wed, 2007-09-12 at 11:31 +0200, Uwe Ligges wrote:

    Perhaps Namespace issues? But no further ideas. You might want to make your package available at some URL so that people can look at it and help...

    Uwe Ligges

Thanks. The problem lay elsewhere. I was able to fix it by adding library(mspath) to the top of the .R file in data/ that defined some data using the package's functions. In other words,

    library(mspath)

    ### from various papers on errors in reading fibrosis scores
    rousselet.jr <- readingError(c(.91, .09, 0, 0, 0,
                                   .11, .78, .11, 0, 0,
                                   0, .17, .75, .08, 0,
                                   0, .06, .44, .50, 0,
                                   0, 0, 0, .07, .93),
                                 byrow = TRUE, nrow = 5, ncol = 5)

works as data/readingErrorData.R, but without the library() call I get the error shown in my original message (see below). readingError() is a function defined in my package.

Does any of this indicate a bug or undesirable feature in R? First, it seems a little odd that I need to load the library inside data that is defined in the same library. I think I've noticed similar behavior in other places, for example the code snippets that accompany the documentation pages (that is, one needs library(mypackage) in order for the snippets to check out). Second, should R CMD check fail so completely and opaquely in this situation?

Ross Boylan wrote:

    During R CMD check I get this:

        ** building package indices ...
        Error in eval(expr, envir, enclos) : could not find function "readingError"
        Execution halted
        ERROR: installing package indices failed

    The check aborts there. readingError is a function I just added; for reference

        setClass("readingError", contains = "matrix")
        readingError <- function(...) new("readingError", matrix(...))

    which is in readingError.R in the project's R subdirectory. Some code in the data directory invokes readingError, and the .Rd file includes \alias{readingError}, \alias{readingError-class}, \name{readingError-class} and an example invoking readingError. I'm using R 2.5.1 as packaged for Debian GNU/Linux.

Does anyone have an idea what's going wrong here, or how to fix or debug it? The code seems to work OK when I use it from ESS.

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] could not find function in R CMD check
During R CMD check I get this:

    ** building package indices ...
    Error in eval(expr, envir, enclos) : could not find function "readingError"
    Execution halted
    ERROR: installing package indices failed

The check aborts there. readingError is a function I just added; for reference

    setClass("readingError", contains = "matrix")
    readingError <- function(...) new("readingError", matrix(...))

which is in readingError.R in the project's R subdirectory. Some code in the data directory invokes readingError, and the .Rd file includes \alias{readingError}, \alias{readingError-class}, \name{readingError-class} and an example invoking readingError. I'm using R 2.5.1 as packaged for Debian GNU/Linux.

Does anyone have an idea what's going wrong here, or how to fix or debug it? The code seems to work OK when I use it from ESS.

--
Ross Boylan                                     wk: (415) 514-8146
185 Berry St #5700                              [EMAIL PROTECTED]
Dept of Epidemiology and Biostatistics          fax: (415) 514-8150
University of California, San Francisco
San Francisco, CA 94107-1739                    hm: (415) 550-1062

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] codetools really optional for R CMD check?
After upgrading to R 2.5.1 on Debian, R CMD check gives

    * checking Rd cross-references ... WARNING
    Error in .find.package(package, lib.loc) : there is no package called 'codetools'
    Execution halted
    * checking for missing documentation entries ... WARNING

etc. The NEWS file says (for 2.5.0; I was on 2.4 before the recent upgrade):

    o  New recommended package 'codetools' by Luke Tierney provides
       code-analysis tools.  This can optionally be used by 'R CMD
       check' to detect problems, especially symbols which are not
       visible.

This sounds as if R CMD check should run OK without the package, but it doesn't seem to. Have I misunderstood something, or is there a problem with R CMD check's handling of the case of missing codetools?

I don't have codetools installed because the Debian r-recommended package was missing a dependency; I see that's already been fixed (wow!).

Ross

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Reported invalid memory references
While testing for leaks in my own code I noticed some reported memory problems from valgrind, invoked with

    $ R --vanilla -d "valgrind --leak-check=full"

This is on Debian GNU/Linux (testing, aka lenny) with a 2.6 kernel, R package version 2.4.1-2. I was running in an emacs shell.

The immediate source of all the problems before I get to the prompt is the system dynamic loader ld-2.5.so, invoked from R. Then, when I exit, there are a bunch of reported leaks, some of which appear to come more directly from R (though some involve, e.g., readline).

Are these reported errors actually problems? If so, do they indicate problems in R or in some other component (e.g., ld.so)? Put more practically, should I file one or more bugs, and if so, against what?

Thanks.
Ross Boylan

    ==30551== Invalid read of size 4
    ==30551==    at 0x4016503: (within /lib/ld-2.5.so)
    ==30551==    by 0x4006009: (within /lib/ld-2.5.so)
    ==30551==    by 0x40084F5: (within /lib/ld-2.5.so)
    ==30551==    by 0x40121D4: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4011C5D: (within /lib/ld-2.5.so)
    ==30551==    by 0x44142E1: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4414494: __libc_dlopen_mode (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43EF73E: __nss_lookup_function (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43EF82F: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43F1595: __nss_passwd_lookup (in /lib/i686/cmov/libc-2.5.so)
    ==30551==  Address 0x4EFB560 is 32 bytes inside a block of size 34 alloc'd
    ==30551==    at 0x40234B0: malloc (vg_replace_malloc.c:149)
    ==30551==    by 0x4008AF3: (within /lib/ld-2.5.so)
    ==30551==    by 0x40121D4: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4011C5D: (within /lib/ld-2.5.so)
    ==30551==    by 0x44142E1: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4414494: __libc_dlopen_mode (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43EF73E: __nss_lookup_function (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43EF82F: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43F1595: __nss_passwd_lookup (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x439D87D: getpwuid_r (in /lib/i686/cmov/libc-2.5.so)
    ==30551==
    ==30551== Invalid read of size 4
    ==30551==    at 0x4016530: (within /lib/ld-2.5.so)
    ==30551==    by 0x4006009: (within /lib/ld-2.5.so)
    ==30551==    by 0x40084F5: (within /lib/ld-2.5.so)
    ==30551==    by 0x400C616: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x400CBDA: (within /lib/ld-2.5.so)
    ==30551==    by 0x4012234: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4011C5D: (within /lib/ld-2.5.so)
    ==30551==    by 0x44142E1: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4414494: __libc_dlopen_mode (in /lib/i686/cmov/libc-2.5.so)
    ==30551==  Address 0x4EFB8A8 is 24 bytes inside a block of size 27 alloc'd
    ==30551==    at 0x40234B0: malloc (vg_replace_malloc.c:149)
    ==30551==    by 0x4008AF3: (within /lib/ld-2.5.so)
    ==30551==    by 0x400C616: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x400CBDA: (within /lib/ld-2.5.so)
    ==30551==    by 0x4012234: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4011C5D: (within /lib/ld-2.5.so)
    ==30551==    by 0x44142E1: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4414494: __libc_dlopen_mode (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43EF73E: __nss_lookup_function (in /lib/i686/cmov/libc-2.5.so)
    ==30551==
    ==30551== Conditional jump or move depends on uninitialised value(s)
    ==30551==    at 0x400B3CC: (within /lib/ld-2.5.so)
    ==30551==    by 0x401230B: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4011C5D: (within /lib/ld-2.5.so)
    ==30551==    by 0x44142E1: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4414494: __libc_dlopen_mode (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43EF73E: __nss_lookup_function (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43EF82F: (within /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x43F1595: __nss_passwd_lookup (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x439D87D: getpwuid_r (in /lib/i686/cmov/libc-2.5.so)
    ==30551==    by 0x439D187: getpwuid (in /lib/i686/cmov/libc-2.5.so)
    ==30551==
    ==30551== Conditional jump or move depends on uninitialised value(s)
    ==30551==    at 0x400B0CA: (within /lib/ld-2.5.so)
    ==30551==    by 0x401230B: (within /lib/ld-2.5.so)
    ==30551==    by 0x400E255: (within /lib/ld-2.5.so)
    ==30551==    by 0x4011C5D: (within /lib/ld-2.5.so)
    ==30551==    by 0x44142E1: (within /lib/i686/cmov/libc-2.5.so
[Rd] undefined symbol: Rf_rownamesgets
I get the error "undefined symbol: Rf_rownamesgets" when I try to load my package, which includes C++ code that calls that function. This is particularly strange since the code also calls Rf_classgets, and it loaded OK with just that. Can anyone tell me what's going on?

For the record, I worked around this with the general-purpose attribute-setting commands and R_RowNamesSymbol. I discovered that even with that I wasn't constructing a valid data.frame, and fell back to returning a list of results.

I notice Rinternals.h defines

    LibExtern SEXP  R_RowNamesSymbol;   /* "row.names" */

twice in the same block of code.

I'm using R 2.4.1 on Debian. The symbol seems to be there:

    $ nm -D /usr/lib/R/lib/libR.so | grep classgets
    00032e70 T Rf_classgets
    $ nm -D /usr/lib/R/lib/libR.so | grep namesgets
    00031370 T Rf_dimnamesgets
    00034500 T Rf_namesgets

The source includes

    #define R_NO_REMAP 1
    #include <R.h>
    #include <Rinternals.h>

and later

    #include <memory>   // I think this is why I needed R_NO_REMAP

I realize this is not a complete example, but I'm hoping this will ring a bell with someone. I encountered this while running R CMD check. The link line generated was

    g++ -shared -o mspath.so AbstractTimeStepsGenerator.o Coefficients.o CompositeHistoryComputer.o CompressedTimeStepsGenerator.o Covariates.o Data.o Environment.o Evaluator.o FixedTimeStepsGenerator.o LinearProduct.o Manager.o Model.o ModelBuilder.o Path.o PathGenerator.o PrimitiveHistoryComputer.o SimpleRecorder.o Specification.o StateTimeClassifier.o SuccessorGenerator.o TimePoint.o TimeStepsGenerator.o mspath.o mspathR.o -L/usr/lib/R/lib -lR

Thanks.
Ross

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] undefined symbol: Rf_rownamesgets
On Tue, Apr 17, 2007 at 11:07:12PM -0400, Duncan Murdoch wrote:

    On 4/17/2007 10:43 PM, Ross Boylan wrote:

        I get the error "undefined symbol: Rf_rownamesgets" when I try to load my package, which includes C++ code that calls that function. This is particularly strange since the code also calls Rf_classgets, and it loaded OK with just that. Can anyone tell me what's going on?

        For the record, I worked around this with the general-purpose attribute-setting commands and R_RowNamesSymbol. I discovered that even with that I wasn't constructing a valid data.frame, and fell back to returning a list of results.

        I notice Rinternals.h defines

            LibExtern SEXP  R_RowNamesSymbol;   /* "row.names" */

        twice in the same block of code.

        I'm using R 2.4.1 on Debian. The symbol seems to be there:

            $ nm -D /usr/lib/R/lib/libR.so | grep classgets
            00032e70 T Rf_classgets
            $ nm -D /usr/lib/R/lib/libR.so | grep namesgets
            00031370 T Rf_dimnamesgets
            00034500 T Rf_namesgets

    I don't see Rf_rownamesgets there, or in the R Externals manual among the API entry points listed.

You're right; sorry. So does this function just not exist? If so, it would be good to remove the corresponding entries in Rinternals.h.

    Can't you use the documented dimnamesgets?

I did one better and didn't use anything!

I thought presence in Rinternals.h constituted (terse) documentation, since the R Externals manual says ("Handling R objects in C"):

    There are two approaches that can be taken to handling R objects
    from within C code.  The first (historically) is to use the macros
    and functions that have been used to implement the core parts of R
    through `.Internal' calls.  A public subset of these is defined in
    the header file `Rinternals.h' ...

So is relying on Rinternals.h a bad idea?

In this case, accessing the row names through dimnamesgets looks a little awkward, since it requires navigating to the right spot in dimnames. I would need Rf_dimnamesgets since I disabled the shortcut names.

Ross

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] future plans for missing() in inner functions
Currently, if one wants to test whether an argument to an outer function is missing from within an inner function, this works:

    > g5 <- function(a) {
    +     inner <- function(a) {
    +         if (missing(a))
    +             "outer arg is missing"
    +         else
    +             "found outer arg!"
    +     }
    +     inner(a)
    + }
    > g5(3)
    [1] "found outer arg!"
    > g5()
    [1] "outer arg is missing"

While if inner is defined as function() (without arguments) one gets an error: 'missing' can only be used for arguments.

However, ?missing contains a note that this behavior is subject to change. I'm particularly interested in whether the code shown above will continue to work. While it does what I want in this case, the behavior seems a bit surprising, since textually the call to inner does provide an argument. So it seems possible that might change.

Can anyone provide more insight into how things may change?

Thanks.

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Replacing slot of S4 class in method of S4 class?
On Fri, Mar 30, 2007 at 10:45:38PM +0200, cstrato wrote:
> Dear all,
> Assume that I have an S4 class MyClass with a slot myname, which is initialized to myname="" in method initialize:
>     myclass <- new("MyClass", myname="")
> Assume that class MyClass has a method mymethod:
>     mymethod.MyClass <- function(object, myname=character(0), ...) {
>        object@myname <- myname;   # or: myName(object) <- myname
>     }
>     setMethod("mymethod", "MyClass", mymethod.MyClass);
> Furthermore, I have a replacement method:
>     setReplaceMethod("myName", signature(object="MyClass", value="character"),
>        function(object, value) {
>           object@myname <- value;
>           return(object);
>        }
>     )
> I know that it is possible to call: myName(myclass) <- "newname"
> However, I want to replace the value of slot myname for object myclass in method mymethod:
>     mymethod(myclass, myname="newname")
> Sorrowly, the above code in method mymethod does not work. Is there a possibility to change the value of a slot in the method of a class?

Yes, but to make the effect persistent ("visible" might be a more accurate description) that method must return the object being updated, and you must use the return value. R uses call-by-value semantics, so in the definition of mymethod.MyClass, when you change object you only change a local copy. It needs to be

    mymethod.MyClass <- function(object, myname=character(0), ...) {
       object@myname <- myname;
       object
    }

Further, if you invoke it with mymethod(myclass, "new name") you will discover myclass is unchanged. You need

    myclass <- mymethod(myclass, "new name")

You might consider using the R.oo package, which probably has semantics closer to what you're expecting. Alternately, you could study more about R and functional programming.

Ross Boylan

P.S. Regarding the follow-up saying that this is the wrong list, the guide to mailing lists says of R-devel: "This list is intended for questions and discussion about code development in R."
"Questions likely to prompt discussion unintelligible to non-programmers or topics that are too technical for R-help's audience should go to R-devel."

The question seems to fall under this description to me, though I am not authoritative. It is true that further study would have disclosed what is going on. Since the same thing tripped me up too, I thought I'd share the answer.
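Putting the pieces of the answer together, here is a minimal self-contained sketch of the corrected pattern (the slot and method names follow the question; the class definition itself is a guess at the original):

```r
setClass("MyClass", representation(myname = "character"))

setGeneric("mymethod", function(object, ...) standardGeneric("mymethod"))

setMethod("mymethod", "MyClass", function(object, myname = character(0), ...) {
  object@myname <- myname  # changes only the local copy ...
  object                   # ... so the updated copy must be returned
})

myclass <- new("MyClass", myname = "")
mymethod(myclass, myname = "newname")  # return value discarded: myclass unchanged
myclass@myname                         # still ""
myclass <- mymethod(myclass, myname = "newname")  # reassign the result
myclass@myname                         # now "newname"
```

The key point is the pair of steps: the method returns the modified object, and the caller reassigns it.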
Re: [Rd] Rmpi and OpenMPI ?
On Fri, Mar 30, 2007 at 03:01:19PM -0500, Dirk Eddelbuettel wrote: On 30 March 2007 at 12:48, Ei-ji Nakama wrote: | Prof. Nakano(ism Japan) and I wrestled in Rmpi on HP-MPI. | Do not know a method to distinguish MPI well? | It is an ad-hoc patch at that time as follows. There are some autoconf snippets for figuring out how to compile various MPI versions; it's not clear to me they are much help in figuring out which version you've got. Perhaps they are some help: http://autoconf-archive.cryp.to/ax_openmp.html http://autoconf-archive.cryp.to/acx_mpi.html Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R CMD check ignores .Rbuildignore?
On Mon, Mar 19, 2007 at 11:33:37AM +0100, Martin Maechler wrote:
> >>>>> "RossB" == Ross Boylan [EMAIL PROTECTED] on Sun, 18 Mar 2007 12:39:14 -0700 writes:
>
>     RossB> The contents of .Rbuildignore seems to affect
>     RossB>     R CMD build
>     RossB> but not
>     RossB>     R CMD check.
>     RossB> I'm using R 2.4.0 on Debian. Is my understanding correct?
>
> yes. That's why it's called 'buildignore'. It's a big feature for me as package developer:

It's more of a bug for me :(. I was thinking of check as answering the question "If I build this package, will it work?"

> E.g., I can have extra tests (which e.g. only apply on my specific platform) in addition to those which are used when the package is built (to be checked on all possible platforms). Some have proposed to additionally define a '.Rcheckignore' but they haven't been convincing enough.

How about an option to have check use the buildignore file? If there are 2 separate files, there's always the risk they will get out of sync. Of course, in your case, you want them out of sync...

>     RossB> And is there anything I can do about it?
>
> First build, then check is one way; something which is recommended anyway in some cases, e.g., if you have an (Sweave-based) vignette.

Kurt Hornick, offlist, also advised this, as well as noting that using R CMD check directly on the main development directory isn't really supported. From my perspective, needing to do a build before a check is extra friction, which would reduce the amount of checking I do during development. Also, doesn't build do some of the same checks as check?

Minimally, I think some advice in the R Extensions manual needs to be qualified. In 1.3 Checking and building packages:

    Before using these tools, please check that your package can be
    installed and loaded. `R CMD check' will _inter alia_ do this, but you
    will get more informative error messages doing the checks directly.

There seem to be a couple of problems with this advice, aside from the fact that it says to check first.
One problem is that the advice seems internally inconsistent. "Before using these tools" seems to refer to the build and check tools in the section. Since check is one of the tools, it can't be used before using the tools. Also, I still can't figure out what "doing the checks directly" refers to.

Another section, 1.3.2 Building packages:

    [2 paragraphs skipped]
    Run-time checks whether the package works correctly should be
    performed using `R CMD check' prior to invoking the build procedure.

Since this advice, check then build, is the exact opposite of the current recommendation, build then check, it probably needs to be changed.

R CMD check --help (and other spots) refers to the command as checking R packages "from package sources, which can be directories or gzipped package 'tar' archives with extension '.tar.gz' or '.tgz'". I read this as referring to my source directory, although I guess other readings are possible (i.e., package source = the source bundle as distributed).

>     RossB> In my case, some of the excluded files contain references to other libraries, so linking fails under R CMD check. I realize I could add the library to the build (with Makevars, I guess), but I do not want to introduce the dependency.
>
> It depends on the circumstances how I would solve this problem. [Why have these files as part of the package sources at all?]

I have 2 kinds of checks: those at the R level, which get executed as part of R CMD check, and C++ unit tests, which I execute separately. The latter use the boost test library, part of which is a link-time library that ordinary users should not need. All the C++ sources come from a web (as in Knuth's web) file, so the main files and the testing files all get produced at once. I worked around the problem by using the top level config script to delete the test files. I suppose I could also look into moving those files into another directory when they are produced, but that would further complicate my build system.
Ross
Re: [Rd] R CMD check ignores .Rbuildignore? [correction]
On Mon, Mar 19, 2007 at 10:38:02AM -0700, Ross Boylan wrote:
> Kurt Hornick, offlist, also advised this, as well as noting that using

Sorry. That should be Kurt Hornik.

Ross
Re: [Rd] R/C++/memory leaks
On Mon, 2007-02-26 at 16:08 +, Ernest Turro wrote:
> Thanks for your comments Ross. A couple more comments/queries below:
> On 26 Feb 2007, at 06:43, Ross Boylan wrote:
> [details snipped]
>> The use of the R api can be confined to a wrapper function. But I can think of no reason that a change to the alternate approach I outlined would solve the apparent leaking you describe.
>
> I'm not sure I see how a wrapper function using the R API would suffice. Example:

It doesn't sound as if it would suffice. I was responding to your original remark that

> Since this is a standalone C++ program too, I'd rather use the R API as little as possible... But I will look at your solution if I find it is really necessary.. Thanks

I thought that was expressing a concern about using the alternate approach I outlined because it would use the R API. If you need to use that API for other reasons, you're still stuck with it :)

> During heavy computation in the C++ function I need to allow interrupts from R. This means that R_CheckUserInterrupt needs to be called during the computation. Therefore, use of the R API can't be confined to just the wrapper function. In fact, I'm worried that some of the libraries I'm using are failing to release memory after interrupt and that that is the problem. I can't see what I could do about that... E.g.
>
>     #include <valarray>
>     valarray<double> foo;
>     // I don't know 100% that the foo object hasn't allocated some
>     // memory; if the program is interrupted it wouldn't be released

That's certainly possible, but you seem to be overlooking the possibility that all the code is releasing memory appropriately, but the process's memory footprint isn't going down correspondingly. In my experience that's fairly typical behavior. In that case, depending on your point of view, you either don't have a problem or you have a hard problem. If you really want the memory released back to the system, it's a hard problem. If you don't care, as long as you have no leaks, all's well.
> I find it's very unfortunate that R_CheckUserInterrupt doesn't return a value. If it did (e.g. if it returned true if an interrupt has occurred), I could just branch off somewhere, clean up properly and return to R. Any ideas on how this could be achieved?

I can't tell from the info page what function gets called in R if there is an interrupt, but it sounds as if you could do the following hack: the R interrupt handler gets a function that calls a C function of your devising. The C function sets a flag meaning "interrupt requested". Then in your main code, you periodically call R_CheckUserInterrupt. When it returns you check the flag; if it's set, you clean up and exit.

Ross
Re: [Rd] R/C++/memory leaks
On Sun, Feb 25, 2007 at 05:37:24PM +, Ernest Turro wrote:
> Dear all,
> I have wrapped a C++ function in an R package. I allocate/deallocate memory using C++ 'new' and 'delete'. In order to allow user interrupts without memory leaks I've moved all the delete statements required after an interrupt to a separate C++ function freeMemory(), which is called using on.exit() just before the .C() call.

Do you mean that you call on.exit() before the .C, and the call to on.exit() sets up the handler? Your last sentence sounds as if you invoke freeMemory() before the .C call.

Another approach is to associate your C objects with an R object, and have them cleaned up when the R object gets garbage collected. However, this requires switching to a .Call interface from the more straightforward .C interface. The finalizer call I used doesn't assure cleanup on exit. The optional argument to R_RegisterCFinalizerEx might provide such assurance, but I couldn't tell what it really does. Since all memory should be released by the OS when the process ends, I wasn't so worried about that. Here's the pattern:
    // I needed R_NO_REMAP to avoid name collisions. You may not.
    #define R_NO_REMAP 1
    #include <R.h>
    #include <Rinternals.h>

    extern "C" {
      // returns an |ExternalPtr|
      SEXP makeManager( @makeManager args@ );
      // user should not need to call; cleanup
      void finalizeManager(SEXP ptr);
    }

    SEXP makeManager( @makeManager args@ ){
      // stuff
      Manager* pmanager = new Manager(pd, pm.release(),
                                      *INTEGER(stepNumerator),
                                      *INTEGER(stepDenominator),
                                      (*INTEGER(isexact)) != 0);
      // one example didn't use |PROTECT()|
      SEXP ptr;
      Rf_protect(ptr = R_MakeExternalPtr(pmanager, R_NilValue, R_NilValue));
      R_RegisterCFinalizer(ptr, (R_CFinalizer_t) finalizeManager);
      Rf_unprotect(1);
      return ptr;
    }

    void finalizeManager(SEXP ptr){
      Manager *pmanager = static_cast<Manager *>(R_ExternalPtrAddr(ptr));
      delete pmanager;
      R_ClearExternalPtr(ptr);
    }

I'd love to hear from those more knowledgeable about whether I did that right, and whether the FinalizerEx call can assure cleanup on exit. makeManager needs to be called from R like this:

    mgr <- .Call("makeManager", args)

> I am concerned about the following. In square brackets you see R's total virtual memory use (VIRT in `top`):
> 1) Load library and data [178MB] (if I run gc(), then [122MB])
> 2) Just before .C [223MB]
> 3) Just before freeing memory [325MB]

So you explicitly call your freeMemory() function?

> 4) Just after freeing memory [288MB]

There are at least 3 possibilities:
* your C++ code is leaking
* C++ memory is never really returned (commonly, at least in C, the amount of memory allocated to the process never goes down, even if you do a free; this may depend on the OS and the specific calls the program makes)
* you did other stuff in R that's still around. After all, you went up +45MB between 1 and 2; maybe it's not so odd you went up +65MB between 2 and 4.

> 5) After running gc() [230MB]
> So although the freeMemory function works (frees 37MB), R ends up using 100MB more after the function call than before it. ls() only returns the data object so no new objects have been added to the workspace.
> Do any of you have any idea what could be eating this memory?
> Many thanks,
> Ernest
> PS: it is not practical to use R_alloc et al because C++ allocation/deallocation involves constructors/destructors and because the C++ code is also compiled into a standalone binary (I would rather avoid maintaining two separate versions).

I use regular C++ new's too (except for the external pointer that's returned). However, you can override the operator new in C++ so that it uses your own allocator, e.g., R_alloc. I'm not sure about all the implications that might make that dangerous (e.g., can the memory be garbage collected? can it be moved?). Overriding new is a bit tricky since there are several variants. In particular, there is one with and one without an exception. Also, individual classes can define their own new operators; if you have any, you'd need to change those too.

Ross Boylan
Re: [Rd] R/C++/memory leaks
Here are a few small follow-up comments:

On Sun, Feb 25, 2007 at 11:18:56PM +, Ernest Turro wrote:
> On 25 Feb 2007, at 22:21, Ross Boylan wrote:
>> On Sun, Feb 25, 2007 at 05:37:24PM +, Ernest Turro wrote:
>>> Dear all,
>>> I have wrapped a C++ function in an R package. I allocate/deallocate memory using C++ 'new' and 'delete'. In order to allow user interrupts without memory leaks I've moved all the delete statements required after an interrupt to a separate C++ function freeMemory(), which is called using on.exit() just before the .C() call.
>>
>> Do you mean that you call on.exit() before the .C, and the call to on.exit() sets up the handler? Your last sentence sounds as if you invoke freeMemory() before the .C call.
>
> 'on.exit' records the expression given as its argument as needing to be executed when the current function exits (either naturally or as the result of an error). This means you call on.exit() somewhere at the top of the function. You are guaranteed the expression you pass to on.exit() will be executed before the function returns. So, even though you call on.exit() before .C(), the expression you pass it will actually be called after .C(). This means you can be sure that freeMemory() is called even if an interrupt or other error occurs.
>
>> Another approach is to associate your C objects with an R object, and have them cleaned up when the R object gets garbage collected. However, this requires switching to a .Call interface from the more straightforward .C interface.
>> [details snipped]
>
> Since this is a standalone C++ program too, I'd rather use the R API as little as possible... But I will look at your solution if I find it is really necessary.. Thanks

The use of the R api can be confined to a wrapper function. But I can think of no reason that a change to the alternate approach I outlined would solve the apparent leaking you describe.

>>> I am concerned about the following.
>>> In square brackets you see R's total virtual memory use (VIRT in `top`):
>>> 1) Load library and data [178MB] (if I run gc(), then [122MB])
>>> 2) Just before .C [223MB]
>>> 3) Just before freeing memory [325MB]
>>
>> So you explicitly call your freeMemory() function?
>
> This is called thanks to on.exit()
>
>>> 4) Just after freeing memory [288MB]
>>
>> There are at least 3 possibilities:
>> * your C++ code is leaking
>
> The number of news and deletes are the same, and so is their branching... I don't think it is this.
>
>> * C++ memory is never really returned (commonly, at least in C, the amount of memory allocated to the process never goes down, even if you do a free; this may depend on the OS and the specific calls the program makes)
>
> OK, but the memory should be freed after the process completes, surely?

Most OS's I know will free memory when a process finishes, except for shared memory. But is that relevant? I assume the process doesn't complete until you exit R. Your puzzle seems to involve different stages within the life of a single process.

>> * You did other stuff in R that's still around. After all you went up +45MB between 1 and 2; maybe it's not so odd you went up +65MB between 2 and 4.
>
> Yep, I do stuff before .C and that accounts for the increase before .C. But all the objects created before .C go out of scope by 4) and so, after gc(), we should be back to 122MB. As I mentioned, ls() after 5) returns only the data loaded in 1).

In principle (and according to ?on.exit) the expression registered by on.exit is evaluated when the relevant function is exited. In principle garbage collection reclaims all unused space (though with no guarantee of when). It may be that the practice is looser than the principle. For example, Python always nominally managed memory for you, but I think for quite a while it didn't really reclaim the memory (because garbage collection didn't exist or had been turned off).
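The on.exit() guarantee described above can be seen in a toy example (my own sketch; the names and message are made up), where the registered expression still runs even though the body fails partway through:

```r
f <- function() {
  on.exit(cat("freeMemory() would run here\n"))  # registered before the risky work
  stop("simulated error during the .C computation")
}

res <- try(f(), silent = TRUE)  # the on.exit output appears even though f() failed
inherits(res, "try-error")      # TRUE
```

The same thing happens if the error is replaced by a user interrupt: the expression registered with on.exit runs before control leaves f().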
>>> 5) After running gc() [230MB]
>>> So although the freeMemory function works (frees 37MB), R ends up using 100MB more after the function call than before it. ls() only returns the data object so no new objects have been added to the workspace.
>>> Do any of you have any idea what could be eating this memory?
>>> Many thanks,
>>> Ernest
>>> PS: it is not practical to use R_alloc et al because C++ allocation/deallocation involves constructors/destructors and because the C++ code is also compiled into a standalone binary (I would rather avoid maintaining two separate versions).
>>
>> I use regular C++ new's too (except for the external pointer that's returned). However, you can override the operator new in C++ so that it uses your own allocator, e.g., R_alloc. I'm not sure about all the implications that might make that dangerous (e.g., can the memory be garbage collected? can it be moved?). Overriding new is a bit tricky since there are several variants. In particular, there is one with and one without an exception. Also, individual classes can define their own new operators; if you have any, you'd need to change those too.
Re: [Rd] trying to understand condition handling
On Tue, Feb 20, 2007 at 07:35:51AM +, Prof Brian Ripley wrote:
> Since you have not told us what 'the documents' are (and only vaguely named one), do you not think your own documentation is inadequate?

I mean the command description produced by ?tryCatch.

> There are documents about the condition system on developer.r-project.org: please consult them.

OK, though I would hope the user level documentation would suffice.

> I guess 'ctl-C' is your private abbreviation for 'control C' (and not a type of cancer): that generates an interrupt in most (but not all) R ports. Where it does, you can set up interrupt handlers (as the help page said)

My P.S. concerned whether the code that was interrupted could continue from the point of interruption. As far as I can tell from ?tryCatch, there is not.

> On Mon, 19 Feb 2007, Ross Boylan wrote:
>> I'm confused by the page documenting tryCatch and friends. I think it describes 3 separate mechanisms: tryCatch (in which control returns to the invoking tryCatch), withCallHandlers (in which control goes up to the calling handler/s but then continues from the point at which signalCondition() was invoked), and withRestarts (I can't tell where control ends up).
>> For tryCatch the docs say the arguments ... provide handlers, and that these are matched to the condition. It appears that matching works by providing entries in ... as named arguments, and the handler matches if the name is one of the classes of the condition. Is that right? I don't see the matching rule explicitly stated. And then the handler itself is a single argument function, where the argument is the condition?
>> My reading is that if some code executes signalCondition and it is running inside a tryCatch, control will not return to the line after the signalCondition. Whereas, if the context is withCallHandlers, the call to signalCondition does return (with a NULL) and execution continues. That seems odd; do I have it right?
>> Also, the documents don't explicitly say that the abstract subclasses of 'error' and 'warning' are subclasses of 'condition', though that seems to be implied and true. It appears that for tryCatch only the first matching handler is executed, while for withCallHandlers all matching handlers are executed. And, finally, with restarts there is again the issue of how the name in the name=function form gets matched to the condition, and the more basic question of what happens. My guess is that control stays with the handler, but then this mechanism seems very similar to tryCatch (with the addition of being able to pass extra arguments to the handler and maybe a more flexible handler specification). Can anyone clarify any of this?
>> P.S. Is there any mechanism that would allow one to trap an interrupt, like a ctl-C, so that if the user hit ctl-C some state would be changed but execution would then continue where it was? I have in mind the ctl-C handler setting a "time to finish up" flag which the main code checks from time to time.
>> Thanks.
>> Ross Boylan
Re: [Rd] trying to understand condition handling (interrupts)
[resequencing and deleting for clarity]

On Tue, Feb 20, 2007 at 01:15:25PM -0600, Luke Tierney wrote:
> On Tue, 20 Feb 2007, Ross Boylan wrote:
>> P.S. Is there any mechanism that would allow one to trap an interrupt, like a ctl-C, so that if the user hit ctl-C some state would be changed but execution would then continue where it was? I have in mind the ctl-C handler setting a "time to finish up" flag which the main code checks from time to time.

On Tue, Feb 20, 2007 at 07:35:51AM +, Prof Brian Ripley wrote:
> ... I guess 'ctl-C' is your private abbreviation for 'control C' [yes, ctl-C = control C, RB] (and not a type of cancer): that generates an interrupt in most (but not all) R ports. Where it does, you can set up interrupt handlers (as the help page said)

My P.S. concerned whether the code that was interrupted could continue from the point of interruption. As far as I can tell from ?tryCatch, there is not.

> Currently interrupts cannot be handled in a way that allows them to continue at the point of interruption. On some platforms that is not possible in all cases, and coming close to it is very difficult. So for all practical purposes only tryCatch is currently useful for interrupt handling. At some point disabling interrupts will be possible from the R level but currently I believe it is not.
> Best,
> luke

I had suspected that, since R is not thread-safe, handling asynchronous events might be challenging. I tried the following experiment on Linux:

    > h <- function(e) print("Got You!")
    > f <- function(n, delay) for (i in seq(n)) {Sys.sleep(delay); print(i)}
    > withCallingHandlers(f(7,1), interrupt=h)
    [1] 1
    [1] "Got You!"

So in this case the withCallingHandlers acts like a tryCatch, in that control does not return to the point of interruption. However, sys.calls() within h does show where things were just before the interrupt:

    > h <- function(e) {print("Got You!"); print(sys.calls());}
    > withCallingHandlers(f(7,1), interrupt=h)
    [1] 1
    [1] 2
    [1] 3
    [1] "Got You!"
    [[1]]
    withCallingHandlers(f(7, 1), interrupt = h)

    [[2]]
    f(7, 1)

    [[3]]
    Sys.sleep(delay)

    [[4]]
    function (e)
    {
        print("Got You!")
        print(sys.calls())
    }(list())

Ross
Re: [Rd] trying to understand condition handling
Thanks; your response is very helpful. This message has some remarks on my questions relative to the developer docs, one additional question, and some documentation comments. I'm really glad to hear you plan to revise the exception/condition docs, since I found the existing ones a bit murky. Below, [1] means http://www.stat.uiowa.edu/~luke/R/exceptions/simpcond.html, one of the documents Prof Ripley referred to. That page also has a nice illustration of using the restart facility.

On Tue, Feb 20, 2007 at 01:40:11PM -0600, Luke Tierney wrote:
> On Mon, 19 Feb 2007, Ross Boylan wrote:
>> I'm confused by the page documenting tryCatch and friends. I think it describes 3 separate mechanisms: tryCatch (in which control returns to the invoking tryCatch), withCallHandlers (in which
>
> should have been withCallingHandlers
>
>> control goes up to the calling handler/s but then continues from the point at which signalCondition() was invoked),
>
> unless a handler does a non-local exit, typically by invoking a restart
>
>> and withRestarts (I can't tell where control ends up).
>
> at the withRestarts call

>> For tryCatch the docs say the arguments ... provide handlers, and that these are matched to the condition. It appears that matching works by providing entries in ... as named arguments, and the handler matches if the name is one of the classes of the condition. Is that right? I don't see the matching rule explicitly stated. And then the handler itself is a single argument function, where the argument is the condition?

From [1], while discussing tryCatch: "Handlers are specified as name = fun where name specifies an exception class and fun is a function of one argument, the condition that is to be handled." ...

>> Also, the documents don't explicitly say that the abstract subclasses of 'error' and 'warning' are subclasses of 'condition', though that seems to be implied and true.

The class relations are explicit in [1].
>> It appears that for tryCatch only the first matching handler is executed, while for withCallHandlers all matching handlers are executed.
>
> All handlers are executed, most recently established first, until there are none left or there is a transfer of control. Conceptually, exiting handlers established with tryCatch execute a transfer of control and then run their code.

Here's the one point of clarification: does the preceding paragraph about "all handlers are executed" apply only to withCallingHandlers, or does it include tryCatch as well? Rereading ?tryCatch, it still looks as if the first match only will fire.

> Hopefully a more extensive document on this will get written in the next few months; for now the notes available off the developer page may be useful.
> best,
> luke

Great. FWIW, here are some suggestions about the documentation. I would find a presentation that provided an overall orientation and then worked down easiest to follow. So, going from the top down:

1. There are 3 forms of exception handling: try/catch, calling handlers and restarts.
2. The characteristic behavior of each is ... (i.e., what's the flow of control). Maybe give a snippet of typical uses of each.
3. The details (exact calling environment of the handler(s), matching rules, syntax...).
4. try() is basically a convenient form of tryCatch.
5. Other relations between these 3 forms: what happens if they are nested; how restarts alter the standard control flow of the other forms.

I also found the info that the restart mechanism is the most general and complicated useful for orientation (that might go under point 1). It might be appropriate to document each form on a separate manual page; I'm not sure if they are too linked (particularly by the use of conditions and the control flow of restart) to make that a good idea. I notice that some of the outline above is not the standard R manual format; maybe the big picture should go in the language manual or on a concept page (?Exceptions maybe).
Be explicit about the relations between conditions (class inheritance relations). Be explicit about how handlers are chosen and which forms they take.

It might be worth mentioning stuff that is a little surprising. The fact that the value of the finally is not the value of the tryCatch was a little surprising, since usually the value of a series of statements or expression is that of the last one. The fact that signalCondition can participate in two different flows of control (discussion snipped above) was also surprising to me. In both cases the current ?tryCatch is pretty explicit already, so that's not the issue.

I found the current language (for ?tryCatch) about the calling context of different handlers a bit obscure. For example, discussing tryCatch:

    If a handler is found then control is transferred to the 'tryCatch'
    call that established the handler, the handler found and all more
    recent handlers are disestablished, the handler is called
[Rd] trying to understand condition handling
I'm confused by the page documenting tryCatch and friends. I think it describes 3 separate mechanisms: tryCatch (in which control returns to the invoking tryCatch), withCallHandlers (in which control goes up to the calling handler/s but then continues from the point at which signalCondition() was invoked), and withRestarts (I can't tell where control ends up). For tryCatch the docs say the arguments ... provide handlers, and that these are matched to the condition. It appears that matching works by providing entries in ... as named arguments, and the handler matches if the name is one of the classes of the condition. Is that right? I don't see the matching rule explicitly stated. And then the handler itself is a single argument function, where the argument is the condition? My reading is that if some code executes signalCondition and it is running inside a tryCatch, control will not return to the line after the signalCondition. Whereas, if the context is withCallHandlers, the call to signalCondition does return (with a NULL) and execution continues. That seems odd; do I have it right? Also, the documents don't explicitly say that the abstract subclasses of 'error' and 'warning' are subclasses of 'condition', though that seems to be implied and true. It appears that for tryCatch only the first matching handler is executed, while for withCallHandlers all matching handlers are executed. And, finally, with restarts there is again the issue of how the name in the name=function form gets matched to the condition, and the more basic question of what happens. My guess is that control stays with the handler, but then this mechanism seems very similar to tryCatch (with the addition of being able to pass extra arguments to the handler and maybe a more flexible handler specification). Can anyone clarify any of this? P.S. 
Is there any mechanism that would allow one to trap an interrupt, like a ctl-C, so that if the user hit ctl-C some state would be changed but execution would then continue where it was? I have in mind the ctl-C handler setting a "time to finish up" flag which the main code checks from time to time. Thanks.

Ross Boylan
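A small experiment (my own sketch; myCond is a made-up condition class) separates the two control flows asked about above: under withCallingHandlers, signalCondition() returns NULL and execution continues, while under tryCatch control transfers to the handler and the rest of the body is abandoned:

```r
cond <- structure(class = c("myCond", "condition"),
                  list(message = "demo", call = NULL))

# Calling handler: runs, then control returns after signalCondition()
r1 <- withCallingHandlers({
  signalCondition(cond)   # returns NULL here
  "body finished"
}, myCond = function(c) cat("calling handler saw:", conditionMessage(c), "\n"))

# tryCatch handler: the body never resumes after the signal
r2 <- tryCatch({
  signalCondition(cond)
  "never reached"
}, myCond = function(c) "tryCatch handler result")

r1  # [1] "body finished"
r2  # [1] "tryCatch handler result"
```

This also illustrates the matching rule in the question: the handler's argument name ("myCond") is matched against the classes of the condition object.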
[Rd] pinning down symbol values (Scoping/Promises) question
I would like to define a function using symbols, but freeze the symbols at their current values at the time of definition. Both symbols referring to the global scope and symbols referring to arguments are at issue. Consider this (R 2.4.0):

    > k1 <- 5
    > k
    [1] 100
    > a <- function(z) function() z+k
    > a1 <- a(k1)
    > k1 <- 2
    > k <- 3
    > a1()
    [1] 5
    > k <- 10
    > k1 <- 100
    > a1()
    [1] 12

First, I'm a little surprised that the value for k1 seems to get pinned by the initial evaluation of a1. I expected the final value to be 110, because the z in z+k is a promise. Second, how do I pin the values to the ones that obtain when the different functions are invoked? In other words, how should a be defined so that a1() gets me 5+100 in the previous example?

I have a partial solution (for k), but it's ugly. With k = 1 and k1 = 100,

    > a <- eval(substitute(function(z) function() z+x, list(x=k)))
    > k <- 20
    > a1 <- a(k1)
    > a1()
    [1] 101

(By the way, I thought a <- eval(substitute(function(z) function() z+k)) would work, but it didn't.) This seems to pin the passed-in argument as well, though it's even uglier:

    > a <- eval(substitute(function(z) { z; function() z+x}, list(x=k)))
    > a1 <- a(k1)
    > k1 <- 5
    > a1()
    [1] 120

--
Ross Boylan                          wk: (415) 514-8146
185 Berry St #5700                   [EMAIL PROTECTED]
Dept of Epidemiology and Biostatistics   fax: (415) 514-8150
University of California, San Francisco
San Francisco, CA 94107-1739         hm: (415) 550-1062
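For what it's worth, a less ugly way to get the freezing asked about here (a sketch using force(); k_frozen is a made-up name, and this pins both the argument and the global at the time a() is called):

```r
k1 <- 5
k  <- 100

a <- function(z) {
  force(z)         # force the promise now, pinning z to the current value of k1
  k_frozen <- k    # copy the global now, pinning k
  function() z + k_frozen
}

a1 <- a(k1)
k1 <- 2; k <- 3    # later changes no longer matter
a1()               # [1] 105, i.e. 5 + 100
```

force(z) does explicitly what the `{ z; ... }` trick in the last example does implicitly, and copying the global into a local avoids the eval(substitute(...)) machinery entirely.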
[Rd] R Language Manual: possible error
The R Language manual, section 4.3.4 (Scope), has

f <- function(x) {
  y <- 10
  g <- function(x) x + y
  return(g)
}
h <- f()
h(3)

... "When `h(3)' is evaluated we see that its body is that of `g'. Within that body `x' and `y' are unbound." Is that last sentence right? It looks to me as if x is a bound variable, and the definitions given in the elided material seem to say so too. I guess there is a hidden, outer x that is unbound. Maybe the example was meant to be g <- function(a) a + y? The front page of the manual says "The current version of this document is 2.4.0 (2006-11-25) DRAFT." -- Ross Boylan
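A quick session supports the reading that x is bound and only y is free (a sketch using the manual's own example):

```r
# x is g's formal argument (bound); y is free in g and found in f's frame.
f <- function(x) {
  y <- 10
  g <- function(x) x + y
  return(g)
}
h <- f()
h(3)   # x = 3 is bound at the call; y = 10 comes from f's environment
```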
Re: [Rd] Problem using ofstream in C++ class in package for MacOS X
On Sun, Feb 04, 2007 at 10:47:37PM +0100, cstrato wrote: Seth Falcon wrote: cstrato [EMAIL PROTECTED] writes: Thank you for your fast answer. Sadly, I don't know how to use a debugger on MacOS X; I am using old-style print commands. You should be able to use gdb on OS X (works for me, YMMV). So you could try:

R -d gdb
(gdb) run
# source a script that causes the crash
# back in gdb, use backtrace, etc.

+ seth

Dear Seth, Thank you for this tip, I just tried it and here is the result:

Welcome to MyClass
writeFileCpp("myout_fileCpp.txt")
[1] "outfile = myout_fileCpp.txt"
Writing file myout_fileCpp.txt using C++ style.
---MyClassA::MyClassA()---
---MyClassA::WriteFileCpp---
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x0006
0x020fe231 in std::ostream::flush (this=0x214f178) at /Builds/unix/o403/i686-apple-darwin8/libstdc++-v3/include/bits/ostream.tcc:395
395 /Builds/unix/o403/i686-apple-darwin8/libstdc++-v3/include/bits/ostream.tcc: No such file or directory.
in /Builds/unix/o403/i686-apple-darwin8/libstdc++-v3/include/bits/ostream.tcc
(gdb)

It seems that it cannot find ostream.tcc, whatever this extension means. Best regards, Christian

I also don't see what the problem is, but have a couple of thoughts. Under OS X there is an environment variable you can define to get the dynamic linker to load debug versions of libraries. I can't remember what it is, but maybe something like DYLD_DEBUG (but probably DEBUG is part of the value of the variable). For that, or the tracing above, to be fully informative you need to have installed the appropriate debugging libraries and sources. You may need to set an explicit source search path in gdb to pick up the source files. Try stepping through the code from the write before the crash to determine exactly where it runs into trouble. Does the output file you are trying to create exist? Unfortunately, none of this really gets at your core bug, but it might help track it down. 
Ross
[Rd] One possible use for threads in R
I have been using R on a cluster with some work that does not parallelize neatly because the time individual computations take varies widely and unpredictably. So I've considered implementing a work-stealing arrangement, in which idle nodes grab tasks from busy nodes. It might also be useful for nodes to communicate results with each other. My first thought on handling this was to have one R thread that managed the communication, and 2 that managed computation (each node is dual-processor). Previous discussion has noted that R is not multi-threaded, and also asked what use cases multi-threading might address. So here's a use case. The advantage of having R doing the communication is that it's easy to pass R-level objects around using, e.g., Rmpi. The advantage of having the communicator and the calculators share the same thread is that work and information the communicator got would be immediately available to the calculators. Other comments suggested IPC is fast (though one comment referred specifically to Linux, and the cluster is OS-X), so it may be quite workable to have each thread in a separate process. I'm not at all sure the implementation I sketched above is the best approach to this problem (or even that it would be if R were multi-threaded), but it does seem to me this might be one area where threads would be handy in R. Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
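For readers arriving later: this pattern is now largely covered by dynamic load balancing in the parallel package (which postdates this post); a hedged sketch of the simplest form:

```r
# Dynamic load balancing for tasks of unpredictable duration.
# parLapplyLB hands tasks to workers one at a time, so an idle worker
# picks up new work as soon as it finishes -- a simple work-sharing scheme.
library(parallel)

cl <- makeCluster(2L)
res <- parLapplyLB(cl, 1:8, function(i) i^2)
stopCluster(cl)
unlist(res)   # 1 4 9 16 25 36 49 64
```

This does not give work *stealing* between busy nodes, but it removes the need for a hand-rolled communication thread in the common case.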
Re: [Rd] Problem using ofstream in C++ class in package for MacOS X
On Thu, Feb 08, 2007 at 11:53:21PM +0100, cstrato wrote: ... Maybe there's some subtle linker problem, or a problem with the representation of strings What do you mean with linker problem? Nothing very specific, but generically wrong options, wrong objects/libraries, or wrong order of the first 2. Wrong includes omitting something that should be there or including something that shouldn't. Linking on OS-X is unconventional relative to other systems I have used. In particular, one usually gets lots of errors about duplicate symbols (which can be turned off, at some risk) and needs to specify flat rather than 2-level namespace. There's lots more if you look at the linker page (man ld). Similar issues can arise at the compiler phase too. Another fun thing on OS-X is that they have a libtool that is different from the GNU libtool, and your project might use both. So you need to be sure to get the right one. But it's unlikely you could even build if that were an issue. If different parts (e.g., R vs your code) are built with different options, that can cause trouble. For example, my Makefile has MAINCXXFLAGS := $(shell R CMD config --cppflags) -std=c++98 -Wall -I$(TRUESRCDIR) This relies on GNU make features. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] encoding issues even w/o accents (background on single quotes)
On Wed, Jan 17, 2007 at 11:56:15PM -0800, Ross Boylan wrote: An earlier thread (in 10/2006) discussed encoding issues in the context of R data and the desire to represent accented characters. It matters in another setting: the output generated by R and the seemingly ordinary character ' (single quote). In particular, R CMD check runs test code and compares the generated output to a saved file of expected output. This does not work reliably across encoding schemes. This is unfortunate, since it seems the expected output files will necessarily be wrong for someone. The problem for me was triggered by the single-quote character '. On my older systems, this is encoded by 0x27, a perfectly fine ASCII character. That is on a Debian GNU/Linux system with LANG=en_US. On a newer system I have LANG=en_US.UTF-8. I don't recall whether this was a deliberate choice on my part, or simply reflects changing defaults for the installer. (Note the earlier thread referred to the Debian-derived Ubuntu systems as having switched to UTF-8.) Under UTF-8 the quote that R prints is encoded in the 3-byte sequence 0xE2 0x80 0x98 (which seemed odd at first; I thought the point of UTF-8 was that ASCII was a legitimate subset). Apparently quoting, particularly single quotes, is a can of worms: http://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html When Unicode is available (which would be the case with UTF-8), particular non-ASCII characters are recommended for single quoting. The 3-byte sequence is the UTF-8 encoding of U+2018, the recommended left single quote mark; so ASCII remains a subset of UTF-8, and R is simply emitting a different, non-ASCII character in UTF-8 locales. See http://en.wikipedia.org/wiki/UTF-8 on UTF-8 encoding. This is more than I or, probably, you ever wanted to know about this issue! Ross The coefficient printing methods in the stats package use the single quote in the key explaining significance levels: Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1 I suppose one possible work-around for R CMD check would be to set the encoding to some standard value before it runs tests, but that has some drawbacks. It doesn't work for packages needing a different encoding (but perhaps the package could specify an encoding to use by default?)(*), It will leave the output files looking weird on systems with a different encoding. It will get messed up if one generates the files under the wrong encoding. And none of this addresses stuff beyond the context of output file comparison in R CMD check. Any thoughts? Ross Boylan * From the R Extensions document, discussing the DESCRIPTION file: If the `DESCRIPTION' file is not entirely in ASCII it should contain an `Encoding' field specifying an encoding. This is currently used as the encoding of the `DESCRIPTION' file itself, and may in the future be taken as the encoding for other documentation in the package. Only encoding names `latin1', `latin2' and `UTF-8' are known to be portable. I would not expect that the test output files be considered documentation, but I suppose that's subject to interpretation. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
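The byte-level difference is easy to verify from R itself (a small sketch; the three-byte result assumes the string carries its UTF-8 encoding mark, as "\u2018" does):

```r
# The ASCII apostrophe really is a one-byte UTF-8 character...
charToRaw("'")          # 27
# ...while the directional left single quote U+2018 takes three bytes:
charToRaw("\u2018")     # e2 80 98
```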
Re: [Rd] C vs. C++ as learning and development tool for R
On Fri, Jan 19, 2007 at 03:55:30AM -0500, Kimpel, Mark William wrote: I have 3 years of experience with R and have an interest in becoming a better programmer so that I might someday be able to contribute packages. Other than R, my only experience was taking Lisp from Daniel Friedman in the 1970's. I would like to learn either C or C++ for several reasons: To gain a better concept of object-oriented programming so that I can begin to use S4 methods in R. To perhaps speed up some things I do repeatedly in R. To be able to contribute a package someday. I have been doing some reading and from what I can tell R is more compatible with C, but C++ has much greater capabilities for OO programming. I have just started reading The C++ Programming Language: Special Edition by Bjarne Stroustrup; he recommends first learning C++ and then C if necessary, but as a developer of C++, he is probably biased. I would greatly appreciate the advice of the R developers and package contributors on this subject. C or C++? To echo several other comments, if your goal is to work in R, it would be best to go straight to R. I haven't used Lisp much, but I believe it is much closer to R than most other languages you could pick. It has a functional style, and I recall reading that R's scoping rules were directly inspired by Scheme, a Lisp variant. In fact, I didn't feel I fully grasped them until I looked at Abelson and Sussman's Structure and Interpretation of Computer Programs (which uses Scheme). The functional OO of R is significantly different from the class-based OO found in most languages calling themselves object-oriented, including C++, Java, Python and Smalltalk. Learning those other languages to understand R could actually interfere with learning R. If and when you need speed, you can program in any language that supports Fortran or C interfaces, which is almost all of them. 
As for general education: I use C++ in R, and I have to say that programming in C++ is a wretched experience. You have to make a major commitment to learning the language, which is a minefield of gotchas, to use it in full OO style. As others on this list and Stroustrup suggest, you can use it and just incrementally add features over what you would do in C. It can also be speedy and powerful (to run, not to program in!), which is why I'm using it. For pure OO, I think you can't beat Smalltalk, which is freely available at www.squeak.org (there is also a GNU and several commercial versions). The language rules and syntax fit on one page. The catch is that to use it you need to learn the environment and the class library; these too are big tasks. Objective-C is a much more lightweight C'ish OO than C++ (the author moved Smalltalk concepts into C). It's available as part of the GNU compilers. Unlike Smalltalk, you might use it if you cared about performance, and it's the native language of Mac OS X. It has a relatively small learning curve. Python and Java are other choices for OO, both significantly simpler than C++. I find Python to be simple and elegant; it's also nifty for scripting random tasks. Java is widely used on the web and in the enterprise. Eiffel is also interesting. I can't say much about libraries already on other machines, but the C runtime is probably the one you can count on being there the most. Of course, another route would be to explore other functional languages, a terrain I barely know: Haskell, ML, OCaml... In particular, some of them have lazy evaluation of arguments, which R also employs. And there are the functional/object languages like CLOS (I think the O in OCaml is for Object). Anyway, this risks becoming a general language thread. My main point, as someone who's been there, is: don't use C++ unless you have a compelling reason and a lot of time! 
Ross Boylan (Among the languages listed, the ones I've used extensively are C, C++, Objective-C, Python, R, and Smalltalk.)
[Rd] Problems with checking documentation vs data, and a proposal
I have a single data file inputs.RData that contains 3 objects. I generated an Rd page for each object using prompt(). When I run R CMD check I get

* checking for code/documentation mismatches ... WARNING
Warning in utils::data(list = al, envir = data_env) :
  data set 'gold' not found

(gold is one of the objects). This appears to be coming from the codocData function defined in src/library/tools/R/QC.R (this is in the Debianised source 2.4.1, so the path might be a little different). According to the help on this function, it will only attempt a match when there is a single alias in the documentation file, although I'm not sure that's what the code does (it seems to check only if there is more than one format section). At any rate, the central logic appears to gather up names of data objects and then to load them with

## Try loading the data set into data_env.
utils::data(list = al, envir = data_env)
if (exists(al, envir = data_env, mode = "list", inherits = FALSE)) {
    al <- get(al, envir = data_env, mode = "list")
}

Since there is no gold.RData, this is failing. This leads to 2 issues: what should I do now, and how might this work better in the future? Taking the future first, how about having the code first load all the data files that it finds somewhere near the beginning? If it did so, the code

## Try finding the variable or data set given by the alias.
al <- aliases[i]
if (exists(al, envir = code_env, mode = "list", inherits = FALSE)) {

which precedes the earlier snippet, would find the symbol was defined and be happy. I suppose the data could be loaded into code_env, although using it seems to risk deciding that a data symbol is defined when the symbol refers to a code object. I'm not sure if attempting to load the data objects individually should still be attempted under this scenario, if the symbol is not already present. 
In the short run, since I would like the code to pass R CMD check with versions of R that don't include this possible enhancement, what can I do? I see several options, none of them beautiful: 1) Delete inputs.RData and create 3 separate data objects. However, I have code that relies on inputs being present, and the 3 data items go together naturally. 2) Make a single document describing inputs.RData. First problem: the page would be awkward, combining all 3 things. Second, it looks as if codocData might still try loading the individual data objects, since it tries to pull data names out of the documentation, even out of individual items inside \describe. 3) Attempt to disable the checks by adding multiple aliases or something else to be revealed by closer inspection of the code. This is a hack that bypasses the checking altogether (unless it turns out I still get a complaint about missing documentation). 4) Create gold.RData and others as symlinks to inputs.RData. Fragile across operating systems, version control systems, and versions of tar. Might get errors about multiple data definitions. Usual caveats: this is all based on my imperfect understanding of the code. So, any comments on the possible modification to codocData or the work-arounds? -- Ross Boylan
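Option 1 is mechanically simple; a sketch of splitting a combined .RData into one file per object (gold is named in the post; the other object names and the temporary paths here are made up):

```r
# Split a combined RData file into one file per object, the layout that
# R CMD check's data/documentation matching expects.
gold <- data.frame(x = 1); other1 <- 2; other2 <- 3   # stand-in objects
td <- tempfile(); dir.create(td)
save(gold, other1, other2, file = file.path(td, "inputs.RData"))

e <- new.env()
load(file.path(td, "inputs.RData"), envir = e)
for (nm in ls(e))
  save(list = nm, envir = e, file = file.path(td, paste0(nm, ".RData")))

file.exists(file.path(td, "gold.RData"))   # TRUE
```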
Re: [Rd] Problems with checking documentation vs data, and a proposal
On Tue, 2007-01-16 at 14:03 -0800, Ross Boylan wrote: I have a single data file inputs.RData that contains 3 objects. ... 4) Create gold.RData and others as symlinks to inputs.RData. Fragile across operating systems, version control systems, and versions of tar. Might get errors about multiple data definitions.

Option 4 worked, though the symlinks were converted to regular files by R CMD check.
Re: [Rd] Am I missing something about debugging?
On Thu, 2007-01-04 at 17:06 +1100, [EMAIL PROTECTED] wrote: It is possible to do some of these things with the 'debug' package -- the article in R News 2003 #3 shows a few of the tricks. Suppose 'b1' calls 'c1'. If 'c1' exists as a permanent function defined outside 'b1' (which I generally prefer, for clarity), then you can call 'mtrace(c1)' and the debugger will be invoked whenever 'c1' is called -- you don't have to first 'mtrace' 'b1' and then manually call 'mtrace(c1)' while inside 'b1'.

Is the effect of mtrace permanent? For example, if

b1 <- function() {
  # stuff
  c1()
  # stuff
  c1()
}

and you mtrace(c1), will both calls to c1, as well as any outside of b1, bring up the debugger? I ask because sometimes the normal step semantics in debugging is more useful, i.e., debug into the next call to c1 only. As I understand it, the debug package can put a one-time-only breakpoint (with go), but only in the body of the currently active function. Am I correct that both the debug package and the regular debug() require explicitly removing debugging from a function to turn off debugging?

Even if 'c1' is defined inside the body of 'b1', you can get something similar by using conditional breakpoints, like this:

mtrace(b1)
# whatever you type to get 'b1' going
D(17)> # now look at the code window for 'b1' and find the line just
D(17)> # after the definition of 'c1' ... say that's on line 11
D(17)> bp(11, {mtrace(c1); FALSE})
# which will auto-mtrace 'c1' without stopping; of course you could
# hardwire this in the code of 'b1' too

If you invoke b1 multiple times, will the previous procedure result in wrapping c1 multiple times, e.g., the first time through c1 is replaced by mtrace(c1), and the second time rewrites the already rewritten function? Is that a problem?

The point is that you can stick all sorts of code inside a conditional breakpoint to do other things -- if the expression returns FALSE then the breakpoint won't be triggered, but the side-effects will still happen. 
You can also use conditional breakpoints and the 'skip' command to patch code on-the-fly, but I generally find it's too much trouble. Note also the trick of

D(17)> bp(1, FALSE)

which is useful if 'b1' will be called again within the lifetime of the current top-level expression and you actually don't want to stop.

Is bp(1, FALSE) equivalent to mtrace(f, FALSE), if one is currently debugging in f?

The point about context is subtle because of R's scoping rules -- should one look at lexical scope, or at things defined in calling functions? The former happens by default in the 'debug' package (i.e., if you type the name of something that can be seen from the current function, then the debugger will find it, even if it's not defined in the current frame). For the latter, though, if you are currently inside c1, then one way to do it is to use 'sys.parent()' or 'sys.parent(2)' or whatever to figure out the frame number of the context you want; then you could do e.g.

D(18)> sp <- sys.frame(sys.parent(2))
D(18)> evalq(ls(), sp)

etc., which is not too bad. It's worth experimenting with sys.call etc. while inside my debugger, too -- I have gone to some lengths to try to ensure that those functions work the way that might be expected (even though they actually don't... long story).

sys.parent and friends didn't seem to work for me in vanilla R debugging, so this sounds really useful to me.

If you are 'mtrace'ing one of the calling functions as well, then you can also look at the frame numbers in the code windows to work out where to 'evalq'.

I thought the frame numbers shown in the debugger are numbered successively for the call stack, and that these are not necessarily the same as the frame numbers in R. My understanding is that the latter are not guaranteed to be consecutive (relative to the call stack). From the description of sys.parent: The parent frame of a function evaluation is the environment in which the function was called. 
It is not necessarily numbered one less than the frame number of the current evaluation.

The current 'debug' package doesn't include a watch window (even though it's something I rely on heavily in Delphi, my main other language), mainly because R can get stuck figuring out what to display in that window.

Just out of curiosity, how does that problem arise? I'd expect showing the variables in a frame to be straightforward.

It's not that hard to do (I used to have one in the S-PLUS version of my debugger) and I might add one in future if demand is high enough. It would help if there was some way to time out a calculation -- e.g. a 'time.try' function a la

result <- time.try({ do.some.big.calculation }, 0.05)

which would return an object of class "too-slow" if the calculation takes more than 0.05s. I'm certainly willing to consider adding other features to the 'debug' package if they are easy enough and demand is high enough! [And if I have time, which I mostly
Re: [Rd] Which programming paradigm is the most used for make R packages?
On Wed, Jan 03, 2007 at 11:46:16AM -0600, Ricardo Rios wrote: Hi wizards, does somebody know which programming paradigm is the most used to make R packages? Thanks in advance. You need to explain what you mean by the question, for example what paradigms you have in mind. R is a functional language; as I've discovered, this means some standard OO programming approaches don't carry over too naturally. In particular, functions don't really belong to classes. R purists would probably want that to say that class-based OO programming doesn't fit, since R is function-based OO. There is a package that permits a more traditional (class-based) OO style; I think it's called R.oo. Ross Boylan
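To illustrate the distinction, a sketch: S4 (functional style) below, contrasted with reference classes, which arrived in later versions of R and give class-based, mutable semantics much like R.oo (the Counter classes and field names are made up):

```r
library(methods)

# Functional (S4) style: methods belong to generics, objects are copied.
setClass("Counter", representation(n = "numeric"))
setGeneric("bump", function(x) standardGeneric("bump"))
setMethod("bump", "Counter", function(x) { x@n <- x@n + 1; x })

c1 <- new("Counter", n = 0)
c1 <- bump(c1)     # must rebind the returned copy
c1@n               # 1

# Class-based style: methods live in the class, objects mutate in place.
CounterR <- setRefClass("CounterR",
  fields  = list(n = "numeric"),
  methods = list(bump = function() { n <<- n + 1 }))
c2 <- CounterR$new(n = 0)
c2$bump()          # no rebinding needed
c2$n               # 1
```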
[Rd] Am I missing something about debugging?
I would like to be able to trace execution into calls below the current function, or to follow execution as calls return. This is roughly the distinction between "step" and "next" in many debuggers. I would also like to be able to switch to a location further up the call stack than the location at which I entered the debugger, to see the context of the current operations. Are there ways to do these things with the R debugger? I've studied the man pages and FAQs, and looked at the debug package, but I don't see a way, except for manually calling debug() on the function that is about to be called if I want to descend. That's quite awkward, particularly since it must be manually undone (the debug package may be better on that score). I'm also not entirely sure that such recursion (essentially, debugging within the debugger) is OK. I tried looking up the stack with things like sys.calls() from within the browser, but they operated as if I were at the top level (e.g., sys.function(-1) gets an error that it can't go there). I was doing this in ESS, and there's some chance the "can't write .Last.value" error (wording approximate), caused by having an old version, was screwing things up. Since R is interpreted I would expect debugging to be a snap, but these limitations make me suspect there is something about the language design that makes implementing these facilities hard. For example, the browser as documented in the Green Book has up and down functions to change the frame (p. 265); these are conspicuously absent in R. -- Ross Boylan
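For what it's worth, the stack is inspectable from ordinary code even though the browser makes it awkward; a small sketch using sys.calls() (f, g, h are made-up functions):

```r
# Each frame on the call stack is visible via sys.calls(); here h()
# reports the chain of callers that led to it.
f <- function() g()
g <- function() h()
h <- function() vapply(sys.calls(), function(cl) deparse(cl[[1]]), "")

stack <- f()
tail(stack, 3)   # "f" "g" "h"
```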
Re: [Rd] Am I missing something about debugging?
On Tue, 2007-01-02 at 17:24 -0500, Duncan Murdoch wrote: I don't think you're missing anything with the debug() function. It needs updating.

Bummer!

I don't think there's any structural reason why you shouldn't be able to do the things you're talking about in R, but they haven't been implemented.

That's good to know. I was wondering if the lexical scoping was complicating things. At least the way I think of it, every call has two sets of (potentially) nested environments: the lexical scopes of the function definition and the dynamic scopes of the call. But since the dynamic scopes are available, using them seems possible.

Mark Bravington put together a package (called debug) that does more than debug() does, but I haven't used it much, and I don't know if it does what you want.

It looked to me as if it was some help, but no advance on the investigating-dynamic-frames front.

I recently added things to the R parser to keep track of connections between R code and source files; that was partly meant as a first step towards improving the debugging facilities. I'd be happy to help anyone who wants to do the hard work, but I don't think I'll be able to work on it before next summer. (If you do decide to work on it, please let me know, just in case I do get a chance: no point duplicating effort.)

I didn't even realize such a facility was needed, which shows how much I know! Working on the debugger is probably not in my job description, unless I get really annoyed. The Smalltalk debugger is the standard by which I judge all others; it's just amazing. You can go up and down the stack, graphically examine variables (and follow links), and change code in the middle of debugging and then continue. Ross
[Rd] Capturing argument values
I would like to preserve the values of all the arguments to a function in a results object. Say

foo <- function(a, b=1) ...
foo(x, 3)

match.call() looks promising, but it records that a is x, while I want the value of x (in the calling frame). Also, if the invocation is foo(x), then match.call doesn't record that b is 1. So I tried this (inside the function definition):

myargs <- lapply(names(formals()), function(x) eval(as.name(x)))

That's pretty close. However, my function has an optional argument in this sense:

bar <- function(x, testing) ...

where code in the body is

if (!missing(testing)) # do stuff

When the eval in the previous lapply runs for a function call in which testing is not supplied, I get

Error in eval(expr, envir, enclos) : argument "testing" is missing, with no default

exposing a weakness in both my implementation and problem specification. I think I could simply screen testing out of the formals and be happy, but are there better ways of handling this situation? I realize I could capture the function's entire local frame, but that has quite a bit of stuff I don't want in it. I suspect some of the items in it might be promises, and so would not have the values I needed as well. (Also the frame could later change, though I guess I could convert it to a list to avoid that problem.) Ross Boylan
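One workable pattern (a sketch; capture_args is my own construction, and calling missing() via eval in another frame has documented caveats, so treat it as illustrative):

```r
# Capture the evaluated values of the calling function's arguments,
# silently skipping arguments that were not supplied.
capture_args <- function() {
  f   <- sys.function(-1)        # the function that called capture_args()
  env <- parent.frame()          # its evaluation frame
  nms <- setdiff(names(formals(f)), "...")
  supplied <- nms[!vapply(nms, function(n)
    eval(call("missing", as.name(n)), env), logical(1))]
  mget(supplied, envir = env)    # forces the promises, returns plain values
}

bar <- function(x, testing) capture_args()
bar(1 + 1)            # list(x = 2): 'testing' is screened out, no error
bar(2, testing = 3)   # list(x = 2, testing = 3)
```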
[Rd] setGeneric and file order in a package
Are there any assumptions I can make about the order in which the different files in the package's R subdirectory will be evaluated? For example, will it be alphabetical in the file names? Will case distinctions be ignored? I ask because I would like to use setGeneric, as in

setGeneric("foo", function(x) standardGeneric("foo"))

and am wondering where that should go. I realize I could explicitly test for the existence of the generic in each spot, and execute setGeneric as needed. That seems wasteful and error-prone. Is there other recommended practice in setting generics? For example, should I test for existence of a generic in the one spot I create it? Since that seems like a half-measure (if a generic exists it may well have different arguments), I suppose I should use namespaces... Thanks. Ross Boylan
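Two facts for later readers: the Collate field in DESCRIPTION lets a package fix the source-file order explicitly (see Writing R Extensions), and a guard like the one below avoids redefining an existing generic. A sketch, with foo and class A as stand-in names:

```r
library(methods)

# Create the generic only if no generic of that name exists yet.
if (!isGeneric("foo")) {
  setGeneric("foo", function(x) standardGeneric("foo"))
}

setClass("A", representation(x = "numeric"))
setMethod("foo", "A", function(x) x@x * 2)

foo(new("A", x = 21))   # 42
```

As the post notes, this only checks existence, not that the arguments match, so in a package the Collate field (or a single file defining all generics, named to sort first) is the more robust arrangement.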
Re: [Rd] promptClass misses methods
On Sat, Dec 02, 2006 at 05:11:22PM +0100, Martin Maechler wrote: RossB == Ross Boylan [EMAIL PROTECTED] on Fri, 1 Dec 2006 11:33:21 -0800 writes: RossB On Fri, Dec 01, 2006 at 11:37:46AM +0100, Martin RossB Maechler wrote: RossB == Ross Boylan [EMAIL PROTECTED] on Thu, 30 Nov 2006 22:29:06 -0800 writes: RossB I've had repeated problems with promptClass missing RossB methods, usually telling me a class has no methods RossB when it does. RossB In my current case, I've defined an S4 class RossB mspathCoefficients with a print method RossB setMethod("print", signature(x = "mspathCoefficients"), RossB function(x, ...) { # etc. You should *not* define print methods for S4 classes; rather you should define show methods. RossB Is that because print is used by the S3 system? no, not really. RossB And is the general rule to avoid using S3 methods for S4 RossB classes? Well, your wording is murky, but no: you *should* define (S4) methods for S3 generics; that works very well. The S3 generics are automagically promoted to S4 generics as soon as you define an S4 method for one. That answers my question. The meaning was: if foo is an S3 method, should one avoid defining foo as an S4 method? And the answer is no, it's OK. I assume one should strive to use the same argument names, although since S3 methods don't need to use the same argument names I'm not sure how that works (e.g. for S3, foo.class1 <- function(x, a, b) but foo.class2 <- function(x, c)). print() is just a big exception. How come? Is it an exception in the sense that it is not automatically used to display the object, or in the sense that one should never define S4 print methods at all? (Looks like the first alternative, based on the example later.) The print methods I have seem to work OK, provided I don't expect them to be called automatically and provided I don't expect promptClass to pick them up. 
RossB For example, http://www.omegahat.org/RSMethods/Intro.pdf, which is RossB referenced in the package help for methods, discusses RossB show, print and plot as 3 alternatives in S4 (p. 9, RossB though a footnote says that at that time--2001--R RossB didn't recognize formal methods for printing RossB objects.) 2001 is way in the past concerning S4 implementation in R. Perhaps the reference should be removed then. Maybe the newer http://developer.r-project.org/howMethodsWork.pdf would be better? However, that is pitched more toward the internals, and is already referenced in ?Methods. Specifically, using S4 in R: we'd **very strongly** recommend R 2.4.0 (and ideally even R-patched) because of several recent good developments. Fortunately that's what I'm using. I wonder if this is so important I should require R >= 2.4 for my package. It was working fine in earlier versions. The main user-visible changes I'm aware of are those in the object forms (i.e., binary incompatibility) and some improvements in the algorithm for choosing which method to dispatch to (semantically, sometimes a different method gets called; it sounds faster too). I'm not distributing any data files with S4 class objects, and don't have any corner cases on method dispatch. RossB I've been unable to locate much information about RossB combining S3 and S4 methods, though I recall seeing a RossB note saying this issue was still to be addressed in RossB R. Perhaps it has been now, with setOldClass? At RossB any rate, the help for that method addresses classes RossB rather than methods, and I didn't see anything in RossB ?Methods, ?setMethod, or ?setGeneric. RossB show() raises two additional issues for me. First, RossB it takes a single argument, and I want to be able to RossB pass in additional arguments via ... . That's not possible currently. In the expected use of show(), namely automatically showing an object, additional arguments don't make sense (since there's no chance to provide them). 
If the only problem in my use of print is that it's not called automatically, then perhaps I should leave it as is and define a show method that invokes print(). That seems to be the pattern in the example you provided below. And I agree that in certain cases, I would want to have the flexibility of print(..) there; one case is for printing/showing fitted LMER objects; the following code is used:

## This is modeled a bit after print.summary.lm :
printMer <- function(x, digits = max(3, getOption("digits") - 3),
                     correlation = TRUE, symbolic.cor = x$symbolic.cor,
                     signif.stars = getOption("show.signif.stars"), ...)
{
    ...
    ...
    invisible(x)
}
setMethod("print", "mer", printMer)
setMethod("show", "mer", function(object) printMer(object))

RossB Second, I read
Re: [Rd] promptClass misses methods
On Fri, Dec 01, 2006 at 11:37:46AM +0100, Martin Maechler wrote: RossB == Ross Boylan [EMAIL PROTECTED] on Thu, 30 Nov 2006 22:29:06 -0800 writes: RossB I've had repeated problems with promptClass missing RossB methods, usually telling me a class has no methods RossB when it does. RossB In my current case, I've defined an S4 class RossB mspathCoefficients with a print method RossB setMethod("print", signature(x="mspathCoefficients"), RossB function(x, ...) { # etc You should *not* define print methods for S4 classes; rather you should define show methods. Is that because print is used by the S3 system? And is the general rule to avoid using S3 methods for S4 classes? For example, http://www.omegahat.org/RSMethods/Intro.pdf, which is referenced in the package help for methods, discusses show, print and plot as 3 alternatives in S4 (p. 9, though a footnote says that at that time--2001--R didn't recognize formal methods for printing objects.) I've been unable to locate much information about combining S3 and S4 methods, though I recall seeing a note saying this issue was still to be addressed in R. Perhaps it has been now, with setOldClass? At any rate, the help for that function addresses classes rather than methods, and I didn't see anything in ?Methods, ?setMethod, or ?setGeneric. show() raises two additional issues for me. First, it takes a single argument, and I want to be able to pass in additional arguments via ... . Second, I read some advice somewhere, which I can no longer find, that show methods should return an object, and that object in turn should be the thing that is printed. I don't understand the motivation for that rule, at least in this case, because my object is already a results object. RossB The file promptClass creates has no methods in it. 
showMethods(classes="mspathCoefficients") RossB Function: initialize (package methods) RossB .Object=mspathCoefficients (inherited from: RossB .Object=ANY) so it's just inherited from ANY RossB Function: print (package base) RossB x=mspathCoefficients that's the one So why isn't promptClass picking it up? RossB Function: show (package methods) RossB object=mspathCoefficients RossB (inherited from: object=ANY) so it's just inherited from ANY Ross, it would really be more polite to your readers if you followed the posting guide and posted complete, fully reproducible code... I thought it might be overkill in this case. At any rate, it sounds as if I may be trying to do the wrong thing, so I'd appreciate guidance on what the right thing to do is. Here's a toy example:

setClass("A", representation(x="numeric"))
setMethod("print", signature(x="A"),
          function(x, ...) print(x@x, ...))
promptClass("A")

The generated file has no print method. getGeneric("print") RossB standardGeneric for "print" defined from package RossB "base" RossB function (x, ...) standardGeneric("print") RossB environment: 0x84f2d88 Methods may be defined for RossB arguments: x RossB I've looked through the code for promptClass, but RossB nothing popped out at me. RossB It may be relevant that I'm running under ESS in RossB emacs. However, I get the same results running R RossB from the command line. RossB Can anyone tell me what's going on here? This is RossB with R 2.4, and I'm not currently using any namespace RossB for my definitions. [and not a package either?] The code is part of a package, but I'm developing code snippets in ESS without loading the whole package. I'm very willing to look at this, once you've provided what the posting guide asks for, see above. Regards, Martin Thank you. 
For completeness, here's some system info: sessionInfo() R version 2.4.0 (2006-10-03) i486-pc-linux-gnu locale: LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C attached base packages: [1] methods stats graphics grDevices utils datasets [7] base The Debian package is r-base-core 2.4.0.20061103-1. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] printing coefficients with text
On Fri, Dec 01, 2006 at 10:34:45AM +0100, Martin Maechler wrote: RossB == Ross Boylan [EMAIL PROTECTED] on Thu, 30 Nov 2006 12:17:55 -0800 writes: RossB I want to print the coefficient estimates of a model RossB in a way as consistent with other output in R as RossB possible. stats provides the printCoefmat function RossB for doing this, but there is one problem. I have an RossB additional piece of textual information I want to put RossB on the line with the other info on each coefficient. that's not a real problem, see below RossB The documentation for printCoefmat says the first RossB argument must be numeric, which seems to rule this out. it does say that (it says x: a numeric matrix-like object, which includes data frames with factors) but you are right that it does preclude a column of character. Having gone through the code, it's clear the code itself requires all numerics. RossB I just realized I might be able to cheat by inserting RossB the text into the name of the variable (fortunately RossB there is just one item of text). I think that's in RossB the names of the matrix given as the first argument RossB to the function. yes; it's the rownames(); i.e., you'd do something like rownames(myx) <- paste(rownames(myx), format(mytext_var)) which seems simple enough to me, but it only works when the text is the first column This actually worked out great for me. RossB Are there any better solutions? Obviously I could RossB just copy the method and modify it, but that creates RossB duplicate code and loses the ability to track future RossB changes to printCoefmat. As the original author of printCoefmat(), I'm quite willing to accept and incorporate a patch to the current function definition (in https://svn.r-project.org/R/trunk/src/library/R/anova.R), if it's well written. As a matter of fact, I think I already see how to generalize printCoefmat() to work for the case of a data frame with character columns Yes, that seems as if it would be a good generalization. 
However, there is code that makes inferences based on the number of columns of data, and I'm not sure how that should work. Probably it should ignore the non-numeric data. [I would not want a character matrix however; since that would mean going numeric - character - numeric - formatting (i.e character) for the 'coefficients' themselves]. Can you send me a reproducible example? You mean of an input data frame? Or something else? The input isn't currently a data frame, but I could certainly make one. Do you think generalizing to other types (factor, logical) would make sense too? or at least an *.Rda file of a save()d such data frame? RossB Thanks. Ross Boylan You're welcome, Martin Maechler, ETH Zurich __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
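The rownames() workaround discussed in this thread can be sketched as follows (the coefficient values and the per-row text are invented, and this assumes a matrix carrying only estimate and standard-error columns):

```r
cm <- cbind(Estimate = c(1.23, -0.57),
            `Std. Error` = c(0.31, 0.22))
rownames(cm) <- c("age", "sex")
## splice the per-coefficient text into the row labels
rownames(cm) <- paste(rownames(cm), c("(fixed)", "(free)"))
printCoefmat(cm)
```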
Re: [Rd] Web site link problems (PR#9401)
On Thu, Nov 30, 2006 at 10:59:13AM +0100, Peter Dalgaard wrote: [EMAIL PROTECTED] wrote: 2. http://www.r-project.org/posting-guide.html includes in the section Surprising behavior and bugs, make sure you read R Bugs in the R-faq. The latter is the link http://cran.r-project.org/doc/FAQ/R-FAQ.html#R%20Bugs, which takes me to the page but not the section. The link on the FAQ page to that section is http://cran.r-project.org/doc/FAQ/R-FAQ.html#R-Bugs (i.e., no %20). Desired state: update the link. Yes. (It's only a half-page scroll plus an extra click though...) The risk is that someone will just conclude it's a bad link and stop there. You also might want to consider footers on your web pages saying to report problems with this web page do x. The pages I looked at didn't have this info, as far as I can tell. Maybe, if it is easy. The whole bug repository is overdue for replacement, so things that are not critical and/or easy to fix may be left alone... I hope this is an appropriate place to let you know! It'll do. Don't report other website issues to the bug repository though. OK. Where should such reports go? Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] printing coefficients with text
I want to print the coefficient estimates of a model in a way as consistent with other output in R as possible. stats provides the printCoefmat function for doing this, but there is one problem. I have an additional piece of textual information I want to put on the line with the other info on each coefficient. The documentation for printCoefmat says the first argument must be numeric, which seems to rule this out. I just realized I might be able to cheat by inserting the text into the name of the variable (fortunately there is just one item of text). I think that's in the names of the matrix given as the first argument to the function. Are there any better solutions? Obviously I could just copy the method and modify it, but that creates duplicate code and loses the ability to track future changes to printCoefmat. Thanks. Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] promptClass misses methods
I've had repeated problems with promptClass missing methods, usually telling me a class has no methods when it does. In my current case, I've defined an S4 class mspathCoefficients with a print method

setMethod("print", signature(x="mspathCoefficients"),
          function(x, ...) { # etc

The file promptClass creates has no methods in it.

showMethods(classes="mspathCoefficients")
Function: initialize (package methods)
.Object=mspathCoefficients
    (inherited from: .Object=ANY)

Function: print (package base)
x=mspathCoefficients

Function: show (package methods)
object=mspathCoefficients
    (inherited from: object=ANY)

getGeneric("print")
standardGeneric for "print" defined from package "base"
function (x, ...)
standardGeneric("print")
environment: 0x84f2d88
Methods may be defined for arguments: x

I've looked through the code for promptClass, but nothing popped out at me. It may be relevant that I'm running under ESS in emacs. However, I get the same results running R from the command line. Can anyone tell me what's going on here? This is with R 2.4, and I'm not currently using any namespace for my definitions. Thanks. Ross Boylan __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] promptClass misses methods (addendum)
On Thu, Nov 30, 2006 at 10:29:06PM -0800, Ross Boylan wrote: I've had repeated problems with promptClass missing methods, usually telling me a class has no methods when it does. In my current case, I've defined an S4 class mspathCoefficients with a print method setMethod("print", signature(x="mspathCoefficients"), function(x, ...) { # etc It may also be relevant that there is a mspathCoefficients function, which constructs a member of the class. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Missing values for S4 slots [One Solution]
On Fri, Nov 24, 2006 at 11:23:14AM -0800, Ross Boylan wrote: Using R 2.4, the following fails:

setClass("testc", representation(a="ANY"))
makeC <- function(myarg) new("testc", a=myarg)
makeC()
Error in initialize(value, ...) : argument "myarg" is missing, with no default

I suspect there's something I could do to get the constructor arguments, modify the list (i.e., delete args that were missing and insert new ones), and do.call(new, myArgList). Not only am I unsure how to do that (or if it would work), I'm hoping there's a better way. I didn't find a way to get all the arguments easily(*), but manually constructing the list works. Here are fragments of the code:

mspathCoefficients <- function(
    aMatrix,
    params,
    offset=0,
    baseConstrVec,
    covLabels
    # other args omitted
) {
    # innerArgs are used with do.call(new, innerArgs)
    innerArgs <- list(
        Class = "mspathCoefficients",
        aMatrix = aMatrix,
        baseConstrVec = as.integer(baseConstrVec),
        params = params
    )
    # the next block inserts the covLabels argument
    # only if it is non-missing
    if (missing(covLabels)) {
        #
    } else {
        innerArgs$covLabels <- covLabels
    }
    #...
    map <- list()
    # fill in the map
    # add it to the arguments
    innerArgs$map <- map
    do.call(new, innerArgs)
}

This calls new("mspathCoefficients", ...) with just the non-missing arguments. The constructed object has appropriately missing values in the slots (e.g., character(0) given a slot of type character). (*) Inside the function, arg <- list(...) will capture the unnamed arguments, but I don't have any. as.list(sys.frame(sys.nframe())) is closer to what I was looking for, though all the values are promises. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
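A condensed version of the same pattern, for any constructor that should pass only its non-missing arguments on to new() (class and argument names are the toy ones from the message above):

```r
setClass("testc", representation(a = "ANY"))

makeC <- function(myarg) {
  args <- list(Class = "testc")
  if (!missing(myarg)) args$a <- myarg  # include the slot only when supplied
  do.call(new, args)
}

makeC()      # works: slot 'a' left at its prototype
makeC(1:3)   # slot 'a' set to 1:3
```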
[Rd] tail recursion in R
Apparently Scheme is clever and can turn certain apparently recursive function calls into non-recursive evaluations. Does R do anything like that? I could find no reference to it in the language manual. What I'm wondering is whether there are desirable ways to express recursion in R. Thanks. -- Ross Boylan wk: (415) 514-8146 185 Berry St #5700 [EMAIL PROTECTED] Dept of Epidemiology and Biostatistics fax: (415) 514-8150 University of California, San Francisco San Francisco, CA 94107-1739 hm: (415) 550-1062 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
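For readers of the archive: R does not eliminate tail calls the way Scheme does, so a deeply tail-recursive function grows the call stack where an equivalent loop would not. A small illustration (factorial is just a stand-in example):

```r
## tail-recursive form: still consumes one stack frame per call in R
fact_rec <- function(n, acc = 1) {
  if (n <= 1) acc else fact_rec(n - 1, acc * n)
}

## iterative form: constant stack depth
fact_iter <- function(n) {
  acc <- 1
  while (n > 1) {
    acc <- acc * n
    n <- n - 1
  }
  acc
}

fact_rec(10) == fact_iter(10)   # TRUE
```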
[Rd] \link to another package
In the documentation for my package I would like to reference the Rmpi documentation. I started with \link{Rmpi}, which caused R CMD check to complain that it could not resolve the link. Since Rmpi wasn't loaded, this isn't surprising. Ideally the user would see Rmpi, but the link would go to Rmpi's Rmpi-pkg. It's not clear to me if this is possible. I've combined two documented features to say \link[Rmpi=Rmpi-pkg]{Rmpi}. Is that OK? According to the 2.3.1 documentation \link[Rmpi]{Rmpi} gets me the Rmpi link in the Rmpi package and \link[=Rmpi-pkg]{Rmpi} gets me Rmpi-pkg. There is no mention of combining these two syntaxes. There is also a discussion of \link[Rmpi:bar]{Rmpi}, but that appears to refer to a *file* bar.html rather than an internal value (i.e., \alias{bar}). Not only R CMD check but also some users of the package may not have Rmpi loaded, so I'd like the documentation to work gracefully in that situation. Gracefully means that if Rmpi is not loaded the help still shows; it does not mean that clicking on the link magically produces the Rmpi documentation. -- Ross Boylan wk: (415) 514-8146 185 Berry St #5700 [EMAIL PROTECTED] Dept of Epidemiology and Biostatistics fax: (415) 514-8150 University of California, San Francisco San Francisco, CA 94107-1739 hm: (415) 550-1062 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] documenation duplication and proposed automatic tools
I've been looking at documenting S4 classes and methods, though I have a feeling many of these issues apply to S3 as well. My impression is that the documentation system requires or recommends creating basically the same information in several places. I'd like to explain that, see if I'm correct, and suggest that a more automated framework might make life easier. PROBLEM Consider a class A and a method foo that operates on A. As I understand it, I must document the generic function foo (or ?foo will not produce a response) and the method foo (or methods ? foo will not produce a response). Additionally, it appears to be recommended that I document foo in the Methods section of the documentation for class A. Finally, I may want to document the method foo with specific arguments (particularly if it uses unusual arguments, but presumably also if the semantics are different in a class that extends A). This seems like a lot of work to me, and it also seems error prone and subject to synchronization errors. R CMD check checks vanilla function documentation for agreement with the code, but I'm not sure that method documentation in other contexts gets much help. To complete the picture, suppose there is another function, bar, that operates on A. B extends A, and reimplements foo, but not bar. I think the suggestion is that I go back and add the B-flavored method foo to the general methods documentation for foo. I also have a choice whether I should mention bar in the documentation for the class B. If I mention it, it's easier for the reader to grasp the whole interface that B presents. However, I make it harder to determine which methods implement new functionality. 
SOLUTION There are a bunch of things users of OO systems typically want to know: 1) the relations between classes 2) the methods implemented by a class (for B, just foo) 3) the interface provided by a class (for B, foo and bar) 4) the various implementations of a particular method All of these can be discovered dynamically by the user. The problem is that the current documentation system attempts to reproduce this dynamic information in static pages. prompt, promptClass and promptMethods functions generate templates that contain much of the information (or at least they're supposed to; they seem to miss stuff for me, for example saying there are no methods when there are methods). This is helpful, but has two weaknesses. First, the class developer must enter very similar information in multiple places (specifically, function, methods, and class documentation). Second, that information is likely to get dated as the classes are modified and extended. I think it would be better if the class developer could enter the information once, and the documentation system assemble it dynamically when the user asks a question. For example, if the user asks for documentation on a class, the resulting page would be constructed by pulling together the class description, appropriate method descriptions, and links to classes the focal class extends (as well, possibly, as classes that extend it). Similarly, a request for methods could assemble a page out of the snippets documenting the individual methods, including links to the relevant classes. I realize that implementing this is not trivial, and I'm not necessarily advocating it as a priority. But I wonder how it strikes people. -- Ross Boylan wk: (415) 514-8146 185 Berry St #5700 [EMAIL PROTECTED] Dept of Epidemiology and Biostatistics fax: (415) 514-8150 University of California, San Francisco San Francisco, CA 94107-1739 hm: (415) 550-1062 __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
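The four dynamic queries listed above already have counterparts in the methods package; a sketch, assuming classes A and B and a generic foo as in the PROBLEM section:

```r
extends("B", "A")            # 1) relations between classes
showMethods(classes = "B")   # 2) methods defined for class B
existsMethod("foo", "B")     # does B reimplement foo, or inherit it?
showMethods("foo")           # 4) all implementations of foo
```

(Item 3, the full interface including inherited methods, takes a little more work, e.g. querying showMethods for each class B extends.)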
Re: [Rd] S4 accessors
I'm trying to understand what the underlying issues are here--with the immediate goal of how that affects my design and documentation decisions. On Wed, Sep 27, 2006 at 02:08:34PM -0400, John Chambers wrote: Seth Falcon wrote: John Chambers [EMAIL PROTECTED] writes: There is a point that needs to be remembered in discussions of accessor functions (and more generally). We're working with a class/method mechanism in a _functional_ language. Simple analogies made from class-based languages such as Java are not always good guides. In the example below, a function foo that only operates on that class is not usually a meaningful concept in R. The sense of meaningful here is hard for me to pin down, even with the subsequent discussion. I think the import is more than formal: R is not strongly typed, so you can hand any argument to any function and the language will not complain. If foo is a generic and the only method defined is for class Bar, then the statement seems meaningful enough? This is not primarily a question about implementation but about what the user understands. IMO, a function should have an intuitive meaning to the user. Its name is taking up a global place in the user's brain, and good software design says not to overload users with too many arbitrary names to remember. It's true that clashing uses of the same name may lead to confusion, but that need not imply that functions must be applicable to all objects. Many functions only make sense in particular contexts, and sometimes those contexts are quite narrow. One of the usual motivations for an OO approach is precisely to limit the amount of global space taken up by, for example, functions that operate on the class (global in both the syntactic sense and in the inside your brain sense). Understanding a traditional OO system, at least for me, is fundamentally oriented to understanding the objects first, with the operations on them as auxiliaries. 
As you point out, this is just different from the orientation of a functional language, which starts with the functions. To be a bit facetious, if flag is a slot in class Bar, it's really not a good idea to define the accessor for that slot as flag <- function(object) object@flag Nor is the situation much improved by having flag() be a generic, with the only method being for class Bar. We're absconding with a word that users might think has a general meaning. OK, if need be we will have different flag() functions in different packages that have _different_ intuitive interpretations, but it seems to me that we should try to avoid that problem when we can. OTOH, it's not such an imposition to have accessor functions with a syntax that includes the name of the slot in a standardized way: get_flag(object) (I don't have any special attachment to this convention, it's just there for an example) I don't see why get_flag differs from flag; if flag lends itself to multiple interpretations or meanings, wouldn't get_flag have the same problem? Or are you referring to the fact that flag sounds as if it's a verb or action? That's a significant ambiguity, but there's nothing about it that is specific to a functional approach. Functions are first-class objects and in principle every function should have a function, a purpose. Methods implement that purpose for particular combinations of arguments. If this is a claim that every function should make sense for every object, it's asking too much. If it's not, I don't really see how a function can avoid having a purpose. The purpose of accessor functions is to get or set the state of the object. Accessor functions are therefore a bit anomalous. How? A given accessor function has the purpose of returning the expected data contained in an instance. It provides an abstract interface that decouples the structure of the class from the data it needs to provide to users. See above. 
That's true _if_ the name or some other syntactic sugar makes it clear that this is indeed an accessor function, but not otherwise. Aside from the fact that I don't see why get_flag is so different from flag, the syntactic sugar argument has another problem. The usually conceived purpose of accessors is to hide from the client the internals of the object. To take an example that's pretty close to one of my classes, I want startTime, endTime, and duration. Internally, the object only needs to hold 2 of these quantities to get the 3rd, but I don't want the client code to be aware of which choice I made. In particular, I don't want the client code to change from duration to get_duration if I switch to a representation that stores the duration as a slot. Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
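The startTime/endTime/duration point can be made concrete: below, duration is derived from two stored slots, but nothing in the interface reveals which two quantities the class actually holds (the class name is invented; this is a sketch, not code from the thread):

```r
setClass("runTimes", representation(start = "numeric", end = "numeric"))

setGeneric("duration", function(object) standardGeneric("duration"))

## computed from the two stored slots; clients cannot tell it is not a slot,
## so the representation can change without breaking client code
setMethod("duration", "runTimes", function(object) object@end - object@start)

duration(new("runTimes", start = 5, end = 12))   # 7
```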
Re: [Rd] S4 accessors
On Tue, 2006-09-26 at 10:43 -0700, Seth Falcon wrote: Ross Boylan [EMAIL PROTECTED] writes: If anyone else is going to extend your classes, then you are doing them a disservice by not making these proper methods. It means that you can control what happens when they are called on a subclass. My style has been to define a function, and then use setMethod if I want to redefine it for an extension. That way the original version becomes the generic. So I don't see what I'm doing as being a barrier to adding methods. Am I missing something? You are not, but someone else might be: suppose you release your code and I would like to extend it. I am stuck until you decide to make generics. This may be easier to do concretely. I have an S4 class A. I have defined a function foo that only operates on that class. You make a class B that extends A. You wish to give foo a different implementation for B. Does anything prevent you from doing setMethod("foo", "B", function(x) blah blah) (which is the same thing I do when I make a subclass)? This turns my original foo into the catchall method. Of course, foo is not appropriate for random objects, but that was true even when it was a regular function. Originally I tried defining the original using setMethod, but this generates a complaint about a missing function; that's one reason I fell into this style. You have to create the generic first if it doesn't already exist: setGeneric("foo", function(x) standardGeneric("foo")) I wonder if it might be worth changing setMethod so that it does this by default when no existing function exists. Personally, that would fit the style I'm using better. For accessors, I like to document them in the methods section of the class documentation. This is for accessors that really are methods, not my fake function-based accessors, right? Which might be a further argument not to have the distinction in the first place ;-) To me, simple accessors are best documented with the class. 
If I have an instance, I will read help on it and find out what I can do with it. If you use foo as an accessor method, where do you define the associated function (i.e., \alias{foo})? I believe such a definition is expected by R CMD check and is desirable for users looking for help on foo (?foo) without paying attention to the fact that it's a method. Yes, you need an alias for the _generic_ function. You can either add the alias to the class man page where one of its methods is documented or you can have separate man pages for the generics. This is painful. S4 documentation, in general, is rather difficult, and IMO this is in part a consequence of the more general (read: more powerful) generic-function-based system. As my message indicates, I too am struggling with an appropriate documentation style for S4 classes and methods. Since Writing R Extensions has said Structure of and special markup for documenting S4 classes and methods are still under development. for as long as I can remember, perhaps I'm not the only one. Some of the problem may reflect the tension between conventional OO and functional languages, since R remains the latter even under S4. I'm not sure if it's the tools or my approach that is making things awkward; it could be both! Ross __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
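The pattern Seth describes, written out with the thread's toy names: create the generic explicitly, give the base class the catchall implementation, and let a subclass override it:

```r
setClass("A", representation(x = "numeric"))
setClass("B", contains = "A")

## the generic must exist before setMethod() will accept a method for it
setGeneric("foo", function(x, ...) standardGeneric("foo"))

setMethod("foo", "A", function(x, ...) x@x)        # catchall for A and subclasses
setMethod("foo", "B", function(x, ...) rev(x@x))   # B's override

foo(new("B", x = 1:3))   # dispatches to the "B" method
```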
[Rd] S4 accessors
I have a small S4 class for which I've written a page grouping many of the accessors and replacement functions together. I would be interested in people's comments on the approach I've taken. The code has a couple of decisions for which I could imagine alternatives. First, even simple get/set operations on class elements are wrapped in functions. I suppose I could just use object@slot directly to do some of these operations, though that is considered bad style in more traditional OO contexts. Second, even though the functions are tied to the class, I've defined them as free functions rather than methods. I suppose I could create a generic that would reject most arguments, and then make methods appropriately. For the documentation, I've created a single page that groups many of the functions together. This is a bit awkward, since the return values are necessarily the same. Things are worse for replacement functions; as I understand it, they must use value for their final argument, but the value has different meanings and types in different contexts. Any suggestions or comments? I've attached the .Rd file in case more specifics would help. -- Ross Boylan wk: (415) 514-8146 185 Berry St #5700 [EMAIL PROTECTED] Dept of Epidemiology and Biostatistics fax: (415) 514-8150 University of California, San Francisco San Francisco, CA 94107-1739 hm: (415) 550-1062

\name{runTime-accessors}
\alias{startTime}
\alias{endTime}
\alias{wallTime}
\alias{waitTime}
\alias{cpuTime}
\alias{mpirank}
\alias{mpirank<-}
\alias{remoteTime<-}
\title{Accessors for runTime class}
\description{
Set and get runTime related information.
}
\usage{
startTime(runTime)
endTime(runTime)
wallTime(runTime)
waitTime(runTime)
cpuTime(runTime)
mpirank(runTime)
mpirank(runTime) <- value
remoteTime(runTime) <- value
}
%- maybe also 'usage' for other objects documented here.
\arguments{
  \item{runTime}{a \code{runTime} object}
  \item{value}{for \code{mpirank}, the MPI rank of the associated job.
    For \code{remoteTime}, a vector of statistics from the remote
    processor: user cpu, system cpu, wall clock time for main job, wall
    clock time waiting for the root process.}
}
\details{
All times are measured from start of job. The sequence of events is
that the job is created locally, started remotely, finished remotely,
and completed locally. Scheduling and transmission delays may occur.
\describe{
  \item{startTime}{When the job was created, locally.}
  \item{endTime}{When the job finished locally.}
  \item{wallTime}{How many seconds between local start and completion.}
  \item{cpuTime}{Remote cpu seconds used, both system and user.}
  \item{waitTime}{Remote seconds waiting for response from the local
    system after the remote computation finished.}
  \item{mpirank}{The rank of the execution unit that handled the remote
    computation.}
}
}
\value{
Generally seconds, at a system-dependent resolution. \code{mpirank} is
an integer. Replacement functions return the \code{runTime} object
itself.
}
\author{Ross Boylan}
\note{Clients that use replacement functions should respect the
semantics above.}
\seealso{\code{\link{runTime-class}}}
\keyword{programming}
\keyword{environment}

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
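For readers following the thread, the accessor/replacement pair behind the \alias{mpirank} and \alias{mpirank<-} entries would look roughly like this (the runTime class definition is not shown in the message, so the slot name is assumed):

```r
mpirank <- function(runTime) runTime@mpirank

`mpirank<-` <- function(runTime, value) {
  runTime@mpirank <- as.integer(value)
  runTime   # replacement functions return the modified object
}

## usage, assuming rt is a runTime instance:
##   mpirank(rt) <- 3
##   mpirank(rt)
```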
Re: [Rd] S4 accessors (corrected)
On Tue, 2006-09-26 at 00:20 +, Ross Boylan wrote: I have a small S4 class for which I've written a page grouping many of the accessors and replacement functions together. I would be interested in people's comments on the approach I've taken. The code has a couple of decisions for which I could imagine alternatives. First, even simple get/set operations on class elements are wrapped in functions. I suppose I could just use object@slot directly to do some of these operations, though that is considered bad style in more traditional OO contexts. Second, even though the functions are tied to the class, I've defined them as free functions rather than methods. I suppose I could create a generic that would reject most arguments, and then make methods appropriately. For the documentation, I've created a single page that groups many of the functions together. This is a bit awkward, since the return values are NOT necessarily the same. Things are worse for replacement functions; as I understand it, they must use value for their final argument, but the value has different meanings and types in different contexts. Any suggestions or comments? I've attached the .Rd file in case more specifics would help. Sorry! __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] S4 classes and C
On Fri, 2006-05-19 at 11:46 +0200, Martin Maechler wrote: Seth == Seth Falcon [EMAIL PROTECTED] on Thu, 18 May 2006 12:22:36 -0700 writes: Seth Ross Boylan [EMAIL PROTECTED] writes: Is there any good source of information on how S4 classes (and methods) work from C? Hmm, yes; there's nothing in the Writing R Extensions manual, and there's not so much in the ``The Green Book'' (Chambers 1998), which is prominently cited by Doug Bates (and Saikat Debroy) in the paper given at DSC 2003 (mentioned earlier in this thread, http://www.ci.tuwien.ac.at/Conferences/DSC-2003/Drafts/BatesDebRoy.pdf ) Particularly relevant is Appendix A.6 of the Green book. The index isn't particularly helpful finding it. Of course, Appendix A can't answer the question of how the R implementation might differ. For example, A.6 discusses GET_SLOT_OFFSET, which does not appear to be available in R. Thanks for the pointers. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] S4 classes and C
On Thu, 2006-05-18 at 13:53 -0400, McGehee, Robert wrote:

> I believe the paper on which those lecture notes were based can be found here: http://www.ci.tuwien.ac.at/Conferences/DSC-2003/Drafts/BatesDebRoy.pdf

Thank you. It looks as if it has some useful stuff in it.

Ross
[Rd] Recommended style with calculator and persistent data
I have some calculations that require persistent state. For example, they retain most of the data across calls with different parameters; they retain parameters across calls with different subsets of the cases (this is for distributed computation); and they retain early analysis of the problem to speed later computations.

I've created an S4 object, and the stylized code looks like this:

    calc <- makeCalculator(a, b, c)
    calc <- setParams(calc, d, e)
    calc <- compute(calc)
    results <- results(calc)

The client code (such as that above) must remember to do the assignments, not just invoke the functions. I notice this does not seem to be the usual style, which is more like

    results <- compute(calc)

possibly using assignment operators like

    params(calc) <- x

(actually, I have a call like that, but some of the updates take multiple arguments).

Another route would be to use lexical scoping to bundle all the functions together (umm, I'm not sure how that would work with S4 methods) to ensure persistence without requiring assignment by the client code. Obviously this would decrease portability to S, but I'm already using lexical scope a bit.

Is there a recommended R'ish way to approach the problem? My current approach works, but I'm concerned it is non-standard, and so would be unnatural for users.

--
Ross Boylan                                      wk: (415) 514-8146
185 Berry St #5700                               [EMAIL PROTECTED]
Dept of Epidemiology and Biostatistics           fax: (415) 514-8150
University of California, San Francisco
San Francisco, CA 94107-1739                     hm: (415) 550-1062
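The lexical-scoping route can be sketched with a plain closure, so the state updates happen inside the object and the client never reassigns `calc`. The function names follow the post; the arithmetic is a placeholder:

```r
## Sketch of the lexical-scoping route: state lives in the enclosing
## environment, so client code never reassigns `calc`. Function names
## follow the post; the arithmetic is a stand-in for the real work.
makeCalculator <- function(a, b, c) {
  params <- NULL
  result <- NULL
  list(
    setParams = function(d, e) params <<- list(d, e),
    compute   = function() result <<- a + b + c + sum(unlist(params)),
    results   = function() result
  )
}

calc <- makeCalculator(1, 2, 3)
calc$setParams(4, 5)
calc$compute()
calc$results()  # 15
```

The `<<-` assignments persist across calls because each function captures the environment created by the makeCalculator() call, which is the standard R idiom for mutable state without reference classes.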
[Rd] S4 documentation
1. promptClass generated a file that included

    \section{Methods}{
    No methods defined with class mspathDistributedCalculator in the signature.
    }

Yet there are such methods. Is this a not-working-yet feature, or is something funny going on (maybe I have definitions in the library and in the global workspace...)?

2. Is \code{\link{myS4class-class}} the proper way to cross-reference a class, and \code{\link{myS4method-method}} the right way to refer to a method? I looked for something like \class or \method to correspond to \code, but didn't see anything.

3. This question goes beyond documentation. I have been approaching things like this:

    setClass("A", ...)
    foo <- function(a) ...
    setClass("B", ...)
    setMethod("foo", "B", ...)

so the first foo turns into the default function for the generic. This was primarily motivated by discovering that setMethod("foo", "A"), where I have the first function definition, produced an error. Is this a reasonable way to proceed? Then do I document the generic with standard function documentation for foo? Are there some examples I could look at? When I want to refer to the function generically, how do I do that?

I'm using R 2.2.1, and I've found the documentation on documenting S4 a bit too brief (even after looking at the recommended links, which in some cases don't have much on documentation). Since the docs say it's a work in progress, I'm hoping to get the latest word here. Thanks.
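Point 3 can be written out as runnable code: an ordinary function becomes the default method when setGeneric() is called with just its name. Class names follow the post; the bodies are placeholders:

```r
library(methods)

## Sketch of point 3: an existing plain function becomes the default
## method of the generic. Class names follow the post; bodies are
## placeholders.
setClass("A", representation(x = "numeric"))
setClass("B", representation(x = "numeric"))

foo <- function(a) "default foo"
setGeneric("foo")                      # existing foo becomes the default
setMethod("foo", "B", function(a) "foo for B")

foo(new("A", x = 1))  # "default foo" -- no "A" method, falls back
foo(new("B", x = 1))  # "foo for B"
```

With this arrangement the generic is documented like an ordinary function under \name{foo}, and the "B" behavior can be noted either there or on the class page.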
Re: [Rd] checkpointing
Here's some code I put together for checkpointing a function being optimized. Hooking directly into optim would require modifying its C code, so this seemed the easiest route. I've wanted more information on the iterations than is currently provided, so this stuffs some info back into the calling environment (by default).

    # wrapper to do checkpointing
    # Ross Boylan [EMAIL PROTECTED]
    # 06-Jan-2006
    # (C) 2006 Regents of University of California
    # Distributed under the Gnu Public License v2 or later at your option
    #
    # If you want to checkpoint the optimization of a function f,
    # use checkpoint(f) instead. See below for other possible arguments.
    # Default operation for checkpoint(fnfoo) is to record the iterations
    # in fnfoo.trace in the calling environment.
    # WARNING: Any existing variable with the name in argument `name`
    # will be deleted from the indicated frame.
    checkpoint <- function(f, name = paste(substitute(f), ".trace", sep=""),
                           fileName = substitute(f), nCalls = 1,
                           nTime = 60*15, frame = parent.frame()) {
        # f is the objective function
        # frame is where to put the variable `name`
        # `name` will be a data.frame with rows containing
        #   iteration, time, value, parameters
        # fileName is the stem of the name to save for checkpointing;
        # saving will alternate between files with 0 and 1 appended.
        # Saving to disk will happen every nCalls calls or nTime seconds,
        # whichever comes first.
        if (exists(name, where=frame)) rm(list=name, pos=frame)
        ckpt.lastSave <- 0            # alternate 0/1 for file to write to
        ckpt.lastTime <- Sys.time()   # last time saved
        function(params, ...) {
            p <- as.list(params)
            names(p) <- seq(length(params))
            if (exists(name, where=frame, inherits=FALSE)) {
                progress <- get(name, pos=frame)
                progress <- rbind(progress,
                                  data.frame(row.names=dim(progress)[1]+1,
                                             time=Sys.time(), val=NA, p),
                                  deparse.level=0)
            } else
                progress <- data.frame(row.names=1, time=Sys.time(), val=NA, p)
            n <- dim(progress)[1]
            # write to disk
            if (n %% nCalls == 0 || progress[n, 1] - ckpt.lastTime > nTime) {
                ckpt.lastSave <<- (ckpt.lastSave+1) %% 2
                save(progress, file=paste(fileName, ckpt.lastSave, sep=""))
                ckpt.lastTime <<- progress[n, 1]
            }
            v <- f(params, ...)
            progress[n, 2] <- v
            assign(name, progress, pos=frame)
            v
        }
    }
[Rd] checkpointing
I would like to checkpoint some of my calculations in R, specifically those using optim. As far as I can tell, R doesn't have this facility, and there seems to have been little discussion of it. Checkpointing is saving enough of the current state so that work can resume where things were left off if, to take my own example, the system crashes after 8 days of calculation.

My thought is that this could be added as an option to optim as one of the control parameters. I thought I'd check here to see if anyone is aware of any work in this area or has any thoughts about how to proceed. In particular, is save a reasonable way to save a few variables to disk?

I could also make the code available when/if I get it working.
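On the save question: save()/load() does handle the restore-a-few-variables case directly. A minimal sketch, where the file name and the write-then-rename guard are my own choices, not anything from optim:

```r
## Minimal sketch of checkpointing a few variables with save()/load().
## Writing to a temporary name and then renaming guards against a crash
## mid-write corrupting the previous checkpoint.
ckfile <- file.path(tempdir(), "checkpoint.RData")

par  <- c(0.5, 1.2)   # current parameter vector
iter <- 184           # iteration counter

tmp <- paste(ckfile, ".tmp", sep = "")
save(par, iter, file = tmp)
file.rename(tmp, ckfile)

## on restart:
rm(par, iter)
load(ckfile)          # restores `par` and `iter` by name
iter  # 184
```

Since load() restores objects under their saved names into the current environment, the resume path only needs to know the checkpoint file name, not the variable list.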
[Rd] external pointers
I have some C data I want to pass back to R opaquely, and then back to C. I understand external pointers are the way to do so. I'm trying to find out how they interact with garbage collection and object lifetime, and what I need to do so that the memory lives until the calling R process ends. Could anyone give me some pointers? I haven't found much documentation. An earlier message suggested looking at simpleref.nw, but I can't find that file.

So the overall pattern, from R, would look like

    opaque <- setup(arg1, arg2, ...)   # setup calls a C fn
    docompute(arg1, argb, opaque)      # many times; docompute also calls C
    # and then when I return, opaque and the memory it's wrapping get
    # cleaned up. If necessary I could do
    teardown(opaque)                   # at the end

C is actually C++ via a C interface, if that matters. In particular, the memory allocated will likely be from the C++ run-time, and needs C++ destructors.
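A sketch of the C side, under the assumption that R_MakeExternalPtr plus a registered finalizer is the intended mechanism. The setup name follows the post; the malloc'd block is a stand-in for the real C++ state, whose teardown would go in the finalizer. This compiles only against R's headers:

```c
#include <stdlib.h>
#include <Rinternals.h>

/* Finalizer: runs when the external pointer is garbage collected
 * (and, with onexit = TRUE below, when the R session ends). */
static void opaque_finalizer(SEXP ptr)
{
    void *state = R_ExternalPtrAddr(ptr);
    if (state) {
        free(state);              /* run the C++ destructor here instead */
        R_ClearExternalPtr(ptr);  /* guard against a double free */
    }
}

SEXP setup(SEXP arg1, SEXP arg2)
{
    void *state = malloc(100 * sizeof(double));  /* stand-in for C++ state */
    SEXP ptr = PROTECT(R_MakeExternalPtr(state, R_NilValue, R_NilValue));
    R_RegisterCFinalizerEx(ptr, opaque_finalizer, TRUE);
    UNPROTECT(1);
    return ptr;
}
```

docompute would recover the state with R_ExternalPtrAddr(opaque); an explicit teardown() can call the same finalizer early, and the R_ClearExternalPtr call keeps the later GC-triggered run from freeing the state twice. The pointer stays alive as long as some R object references it, so keeping `opaque` bound in the session is what keeps the memory alive.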
Re: [Rd] problem with \eqn (PR#8322)
On Mon, 2005-11-21 at 10:27 +, Hin-Tak Leung wrote:

> Kurt Hornik wrote: [snipped]
>> Definitely a problem in Rdconv. E.g.,
>>
>>     $ cat foo.Rd
>>     \description{
>>     \eqn{{A}}{B}
>>     }
>>     $ R-d CMD Rdconv -t latex foo.Rd | grep eqn
>>     \eqn{{A}}{A}{{B}
>>
>> shows what is going on.
>
> There is a work-around - putting extra spaces between the two braces:
>
>     $ cat foo.Rd
>     \description{
>     \eqn{ {A} }{B}
>     }
>     $ R CMD Rdconv -t latex foo.Rd
>     \HeaderA{}{}{}
>     \begin{Description}\relax
>     \eqn{ {A} }{B}
>     \end{Description}
>
> HT

Terrific! I can confirm that works for me and, in a way, a work-around is better than a fix. With the work-around, I can distribute the package without needing to require that people get some not-yet-released version of R that fixes the problem. I do hope the problem gets fixed, though :)

By the way, I couldn't see how the perl code excerpted earlier paid any attention to {}. But perl is not my native tongue.

Ross