[Rd] S4 Objects [Sec=Unclassified]

2009-06-02 Thread Troy Robertson
I am new to R programming but have dived into a medium sized modelling software 
development project.

Having come from a Java OO background I have a couple of questions about S4 
objects.



Is there a way to make S4 slots (and methods) private and hence force the use 
of accessor methods?



Is there a straight-forward way to implement pass-by-reference for method 
parameters?

I am currently returning and overwriting updated objects, which is clunky and 
costly, and would like a more efficient way of doing this.



Can anyone point me to some useful texts on S4 programming apart from the 
following:

Chambers - Software for Data Analysis: Programming with R

Venables - S Programming



Thanks heaps



Troy






___

Australian Antarctic Division - Commonwealth of Australia
IMPORTANT: This transmission is intended for the addressee only. If you are not 
the
intended recipient, you are notified that use or dissemination of this 
communication is
strictly prohibited by Commonwealth law. If you have received this transmission 
in error,
please notify the sender immediately by e-mail or by telephoning +61 3 6232 
3209 and
DELETE the message.
Visit our web site at http://www.antarctica.gov.au/
___

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] reference counting bug: overwriting for loop 'seq' variable

2009-06-02 Thread Wacek Kusnierczyk
William Dunlap wrote:
 It looks like the 'seq' variable to 'for' can be altered from
 within the loop, leading to incorrect answers.  E.g., in
 the following I'd expect 'sum' to be 1+2=3, but R 2.10.0
 (svn 48686) gives 44.5.

 x = c(1,2);  sum = 0; for (i in x) { x[i+1] = i + 42.5; sum = sum +
 i }; sum
[1] 44.5
 or, with debugging cat()s,
 x = c(1,2);  sum = 0; for (i in x) { cat("before, i=", i, "\n");
 x[i+1] = i + 42.5; cat("after, i=", i, "\n"); sum = sum + i }; sum
before, i= 1
after, i= 1
before, i= 43.5
after, i= 43.5
[1] 44.5
  
 If I force the for's 'seq' to be a copy of x by adding 0 to it, then I
 do get the expected answer.

 x = c(1,2);  sum = 0; for (i in x+0) { x[i+1] = i + 42.5; sum = sum
 + i }; sum
[1] 3

 It looks like an error in reference counting. 
   

indeed;  seems like you've hit the issue of when r triggers data
duplication and when it doesn't, discussed some time ago in the context
of names() etc.  consider:

x = 1:2
for (i in x)
   x[i+1] = i-1
x
# 1 0 1

y = c(1, 2)
for (i in y)
   y[i+1] = i-1
y
# -1 0
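
For comparison, here is a minimal sketch of the behaviour one would expect once 'for' iterates over a protected copy of its 'seq'. It assumes an R build in which the reference counting problem is fixed; the variable names are only for illustration.

```r
# Modify x inside the loop; the loop should still iterate over the
# values 1 and 2 that x held when the loop started.
x <- c(1, 2)
total <- 0
for (i in x) {
  x[i + 1] <- i + 42.5   # changes x, but should not change the loop sequence
  total <- total + i
}
total
```

If the loop sequence is protected, total is the sum of the original elements of x, no matter what the loop body assigns into x.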


vQ



Re: [Rd] formal argument envir matched by multiple actual arguments

2009-06-02 Thread hpages

In fact reg.finalizer() looks like a dangerous feature.

If the finalizer itself triggers (implicitly or
explicitly) garbage collection, then bad things happen.
In the following example, garbage collection is triggered
explicitly (using R-2.9.0):

   setClass("B", representation(bb="environment"))

   newB <- function()
   {
     ans <- new("B", bb=new.env())
     reg.finalizer(ans@bb,
                   function(e)
                   {
                       gc()
                       cat("cleaning", class(ans), "object...\n")
                   }
     )
     return(ans)
   }

   for (i in 1:500) {cat(i, "\n"); b1 <- newB()}
   1
   2
   3
   4
   5
   6
   ...
   13
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   14
   ...
   169
   170
   171
   Error: not a weak reference
   Error: not a weak reference
   [repeat the above line thousands of times]
   ...
   Error: not a weak reference
   Error: not a weak reference
   cleaning B object...
   Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
   Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
   [repeat the above line thousands of times]
   ...
   Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
   Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
   172
   ...
   246
   247
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...
   cleaning B object...

*** caught segfault ***
   address 0x41, cause 'memory not mapped'

   Traceback:
1: gc()
2: function (e) {    gc()    cat("cleaning", class(ans), "object...\n")}(<environment>)


   Possible actions:
   1: abort (with core dump, if enabled)
   2: normal R exit
   3: exit R without saving workspace
   4: exit R saving workspace
   Selection: 2
   Save workspace image? [y/n/c]: n
   Segmentation fault

So apparently, if the finalizer triggers garbage collection,
then we can end up with a corrupted session. Then anything can
happen, from the strange 'formal argument envir matched by
multiple actual arguments' error I reported in the previous post,
to a segfault. In the worst case, nothing apparently happens but
the output produced by the code is wrong.

Maybe garbage collection requests should be ignored during the
execution of the finalizer? (and more generally during garbage
collection itself)
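
Until the cause is understood, one cautious workaround is to keep finalizers small and free of explicit gc() calls. The following is only a sketch under that assumption, not a confirmed fix; the names .cleaned and count_cleanup are hypothetical and do not appear in the code above.

```r
library(methods)

setClass("B", representation(bb = "environment"))

# Hypothetical bookkeeping environment: counts how many anchors were finalized
.cleaned <- new.env(parent = emptyenv())
assign("n", 0L, envir = .cleaned)

# The finalizer only updates a counter: no gc(), no reference to the B object
count_cleanup <- function(e) {
  assign("n", get("n", envir = .cleaned) + 1L, envir = .cleaned)
}

newB <- function() {
  anchor <- new.env()
  reg.finalizer(anchor, count_cleanup)
  new("B", bb = anchor)
}

for (i in 1:100) b1 <- newB()
rm(b1)
invisible(gc())             # collection requested from top level, not from a finalizer
get("n", envir = .cleaned)  # number of anchors finalized so far
```

Defining the finalizer at top level also means it does not capture the object being constructed, unlike the closure over 'ans' in the example above.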

Cheers,
H.



[Rd] The default position of plot title

2009-06-02 Thread Ronggui Huang
Dear R-developers,

It seems to me that the position of title is usually at the bottom of
a plot in sociological and political science books and articles. I
wonder if the same convention applies in other disciplines. If yes, is
it reasonable to change the default position of main title of plot
function?

-- 
HUANG Ronggui, Wincent
PhD Candidate
Dept of Public and Social Administration
City University of Hong Kong
Home page: http://asrr.r-forge.r-project.org/rghuang.html



Re: [Rd] formal argument envir matched by multiple actual arguments

2009-06-02 Thread Henrik Bengtsson
Hi.

2009/6/1 Hervé Pagès hpa...@fhcrc.org:
 Hi list,

 This looks similar to the problem reported here
  https://stat.ethz.ch/pipermail/r-devel/2006-April/037199.html
 by Henrik Bengtsson a long time ago. It is very sporadic and
 non-reproducible.
 Henrik, do you remember if your code was using reg.finalizer()?
 I tend to suspect it but I'm not sure.

Yes.  This was/is observed with objects extending the Object class of
R.oo, and the constructor of Object uses reg.finalizer() [which then
calls finalize(), which can be overloaded].  The fact that the garbage
collector is involved could explain why this bug(?) is hard to
reproduce.

It's been a while since I saw this problem (and we do instantiate way
more Object:s these days).  Looking at my source code comments and the
post you refer to, I suspect that I managed to circumvent the issue by
the following trick (looking at my code, I have several of those
statements):

envir2 <- envir
get(name, envir=envir2)

Also, on March 6, 2008 I reported to R-devel on a related problem with '%in%':

  http://tolstoy.newcastle.edu.au/R/e4/devel/08/03/0708.html

That one I circumvent by now only using is.element(a,b) instead of a %in% b.
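
For ordinary vectors the two forms should give the same answer, so the substitution is cheap to make. A small sketch with made-up vectors:

```r
# Made-up example vectors
a <- c(2L, 5L, 9L)
b <- 1:5

res_in <- a %in% b           # infix form
res_el <- is.element(a, b)   # function form used as the workaround

identical(res_in, res_el)
```

Both forms rest on match(), so the workaround changes how the call is spelled rather than what is computed.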

Maybe this gives you further clues.

/Henrik

BTW. You need to be careful when you register a finalizer and that
uses code in a package, which may have been detached.  This may cause
an error in the finalizer which can give further side effects.  See
here:

  http://tolstoy.newcastle.edu.au/R/e2/devel/07/08/4251.html



 I've been hunting this bug for months but today, with the help of other
 Bioconductor users, I was able to isolate it and to write some code that
 seems to almost reproduce it (i.e. not systematically but most of the
 time).

 (Just to put some context to the code below: it's a simplified version
 of some more complex code that we use in Bioconductor to manage memory
 caching of some big objects stored on disk. The idea is that objects of
 class A can be named. All A objects with the same name form a group.
 The code below implements a simple mechanism to trigger some action when
 a group is completely removed from memory i.e. when the last object in
 a group is garbage collected.)


  setClassUnion("environmentORNULL", c("environment", "NULL"))

  setClass("A",
    representation(
      aa="integer",
      groupname="character",
      groupanchor="environmentORNULL"
    )
  )

  .A.group.sizes <- new.env(hash=TRUE, parent=emptyenv())

  .inc.A.group.size <- function(groupname)
  {
    group.size <- 1L
    if (exists(groupname, envir=.A.group.sizes, inherits=FALSE))
        group.size <- group.size +
                      get(groupname, envir=.A.group.sizes, inherits=FALSE)
    assign(groupname, group.size, envir=.A.group.sizes, inherits=FALSE)
  }

  .dec.A.group.size <- function(groupname)
  {
    group.size <- get(groupname, envir=.A.group.sizes, inherits=FALSE) - 1L
    assign(groupname, group.size, envir=.A.group.sizes, inherits=FALSE)
    return(group.size)
  }

  newA <- function(groupname="")
  {
    a <- new("A", groupname=groupname)
    if (!identical(groupname, "")) {
        .inc.A.group.size(groupname)
        groupanchor <- new.env(parent=emptyenv())
        reg.finalizer(groupanchor,
                      function(e)
                      {
                          group.size <- .dec.A.group.size(groupname)
                          if (group.size == 0L) {
                              cat("no more object of group",
                                  groupname, "in memory\n")
                              # take some action
                          }
                      }
        )
        a@groupanchor <- groupanchor
    }
    return(a)
  }


 The following commands seem to trigger the problem:

   for (i in 1:2000) {a1 <- newA("group1")}
   as.list(.A.group.sizes)
   gc()
   as.list(.A.group.sizes)
   for (i in 1:2000) {a2 <- newA("group2")}
  Error in assign(".Method", method, envir = envir) :
    formal argument "envir" matched by multiple actual arguments

 If it doesn't, then adding more rounds should finally do it:

  gc()
  for (i in 1:2000) {a3 <- newA("group3")}
  gc()
  for (i in 1:2000) {a4 <- newA("group4")}

  etc...

 Thanks in advance for any help with this!

 H.

 sessionInfo()
 R version 2.9.0 (2009-04-17)
 x86_64-unknown-linux-gnu

 locale:
 LC_CTYPE=en_CA.UTF-8;LC_NUMERIC=C;LC_TIME=en_CA.UTF-8;LC_COLLATE=en_CA.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_CA.UTF-8;LC_PAPER=en_CA.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_CA.UTF-8;LC_IDENTIFICATION=C

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base


 --
 Hervé Pagès

 Program in Computational Biology
 Division of Public Health Sciences
 Fred Hutchinson Cancer Research Center
 1100 Fairview Ave. N, M2-B876
 P.O. Box 19024
 Seattle, WA 98109-1024

 E-mail: hpa...@fhcrc.org
 Phone:  (206) 667-5791
 Fax:    (206) 667-1319


[Rd] Bug in so_strsplit (PR#13742)

2009-06-02 Thread waku
Full_Name: Wacek Kusnierczyk
Version: 2.10.0 r48689
OS: Ubuntu 8.04 Linux 32b
Submission from: (NULL) (129.241.199.78)


src/main/character.c:435-438 (do_strsplit) contains the following code:

    for (i = 0; i < tlen; i++)
        if (getCharCE(STRING_ELT(tok, 0)) == CE_UTF8) use_UTF8 = TRUE;
    for (i = 0; i < len; i++)
        if (getCharCE(STRING_ELT(x, 0)) == CE_UTF8) use_UTF8 = TRUE;

both loops iterate over loop-invariant expressions and statements.
either the loops are redundant, or the fixed index '0' is copied over from some
other place and should be replaced with 'i'.

the bug can be fixed with 

    for (i = 0; i < tlen; i++)
        if (getCharCE(STRING_ELT(tok, i)) == CE_UTF8) {
            use_UTF8 = TRUE;
            break; }
    for (i = 0; i < len; i++)
        if (getCharCE(STRING_ELT(x, i)) == CE_UTF8) {
            use_UTF8 = TRUE;
            break; }

or with

    #define CHECK_CE(CHARACTER, LENGTH, USEUTF8) \
        for (i = 0; i < (LENGTH); i++) \
            if (getCharCE(STRING_ELT((CHARACTER), i)) == CE_UTF8) { \
                (USEUTF8) = TRUE; \
                break; }
    CHECK_CE(tok, tlen, use_UTF8)
    CHECK_CE(x, len, use_UTF8)
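
the difference between checking only element 0 and checking every element can be sketched at the R level. this is an illustration of the logic only, not of the C internals; the vector and the names first_only and any_utf8 are made up for the example.

```r
# Only the second element should carry a UTF-8 mark; plain ASCII strings
# are reported as "unknown".
x <- c("abc", "\u00e9t\u00e9")
Encoding(x)

# Checking only the first element (the effect of the fixed index 0)
first_only <- Encoding(x[1]) == "UTF-8"

# Checking every element (the effect of indexing with i)
any_utf8 <- any(Encoding(x) == "UTF-8")

c(first_only = first_only, any_utf8 = any_utf8)
```

with the fixed index, a vector whose first element is ASCII would never set the flag, whatever the remaining elements contain.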



Re: [Rd] formal argument envir matched by multiple actual arguments

2009-06-02 Thread Henrik Bengtsson
Nice case - I think you're onto something. /Henrik



[Rd] cryptic error message from R CMD check

2009-06-02 Thread Marco Scutari
Dear R developers,

I've run into a very cryptic error message from R CMD check
while working on a new package. This is the relevant output:

[fiz...@~/Rmap]:R CMD check Rmap
* checking for working pdflatex ... OK
* using log directory '/home/fizban/Rmap/Rmap.Rcheck'
* using R version 2.9.0 (2009-04-17)
* using session charset: UTF-8
* checking for file 'Rmap/DESCRIPTION' ... OK
* checking extension type ... Package
* this is package 'Rmap' version '0.1'
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking for .dll and .exe files ... OK
* checking whether package 'Rmap' can be installed ... ERROR
Installation failed.
See '/home/fizban/Rmap/Rmap.Rcheck/00install.out' for details.
[fiz...@~/Rmap]:cat /home/fizban/Rmap/Rmap.Rcheck/00install.out
* Installing *source* package ‘Rmap’ ...
** R
** preparing package for lazy loading
** help
*** installing help indices
Error in `[.data.frame`(M, , 4) : undefined columns selected
* Removing ‘/home/fizban/Rmap/Rmap.Rcheck/Rmap’

R CMD build + INSTALL fails in the same way:

[fiz...@~/Rmap]:R CMD build Rmap
* checking for file 'Rmap/DESCRIPTION' ... OK
* preparing 'Rmap':
* checking DESCRIPTION meta-information ... OK
* removing junk files
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
WARNING: directory 'Rmap/man' is empty
* building 'Rmap_0.1.tar.gz'
[fiz...@~/Rmap]:sudo R CMD INSTALL Rmap_0.1.tar.gz
* Installing to library ‘/usr/local/lib/R/site-library’
* Installing *source* package ‘Rmap’ ...
** R
** preparing package for lazy loading
** help
*** installing help indices
Error in `[.data.frame`(M, , 4) : undefined columns selected
* Removing ‘/usr/local/lib/R/site-library/Rmap’

The error is easily reproducible, as it's caused by the lack
of Rd documentation in the man directory; adding even one
Rd file solves the problem. It's clearly a corner case (yes, I'm
lazy and should have written the documentation a long time
ago), but if you think it's worth the time it would be better to
have a clearer error message from R CMD check.
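
The message itself is what data frame indexing gives when a column that does not exist is requested. A minimal sketch follows; the data frame M here is a hypothetical stand-in (I have not looked at what the installer actually builds), and check_index is an invented helper showing the kind of guard that would give a clearer message.

```r
# Hypothetical stand-in for an index with fewer than four columns
M <- data.frame(name = character(0), alias = character(0))

# Asking for column 4 reproduces the cryptic message
msg <- tryCatch(M[, 4], error = function(e) conditionMessage(e))
msg

# A guard of this kind would report the likely cause instead
check_index <- function(M) {
  if (ncol(M) < 4L)
    stop("index has only ", ncol(M),
         " columns; is the 'man' directory empty?")
  M[, 4]
}
guarded <- tryCatch(check_index(M), error = function(e) conditionMessage(e))
guarded
```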

P.S.: this is on an updated Debian Sid:

[fiz...@~/Rmap]:dpkg -l | grep "ii  r-" | grep -v cran
ii  r-base          2.9.0-4  GNU R statistical computation and graphics system
ii  r-base-core     2.9.0-4  GNU R core of statistical computation and graphics system
ii  r-recommended   2.9.0-4  GNU R collection of recommended packages [metapackage]

Thanks for your time,
  Marco Scutari

-- 
Marco Scutari, Ph.D. Student
Department of Statistical Sciences
University of Padova, Italy
Facts don't care if you feel good about them. Slashdot, 25/10/07



Re: [Rd] S4 Objects [Sec=Unclassified]

2009-06-02 Thread Martin Morgan
Hi Troy --

Troy Robertson wrote:
 I am new to R programming but have dived into a medium sized modelling 
 software development project.
 
 Having come from a Java OO background I have a couple of questions about S4 
 objects.
 
 
 
 Is there a way to make S4 slots (and methods) private and hence force the use 
 of accessor methods?


No, except by convention (e.g., 'don't directly access slots
name-mangled in this way'; 'non-package code must never directly access
slots').
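
A sketch of what such a convention might look like in practice. The class, the slot name and the generics are all invented for illustration; nothing here stops a caller from using '@' directly, it only makes the intended interface obvious.

```r
library(methods)

# Invented class; the 'priv_' prefix marks the slot as "do not touch directly"
setClass("Account", representation(priv_balance = "numeric"))

# Getter
setGeneric("balance", function(x) standardGeneric("balance"))
setMethod("balance", "Account", function(x) x@priv_balance)

# Setter, with a check that direct slot assignment would skip
setGeneric("balance<-", function(x, value) standardGeneric("balance<-"))
setReplaceMethod("balance", "Account", function(x, value) {
  stopifnot(is.numeric(value), length(value) == 1L, value >= 0)
  x@priv_balance <- value
  x
})

acct <- new("Account", priv_balance = 10)
balance(acct) <- 25
balance(acct)
```

Checks that must always hold are probably better placed in a validity method, since those also apply when new() is called.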

 
 
 
 Is there a straight-forward way to implement pass-by-reference for method 
 parameters?
 
 I am currently returning and overwriting updated objects, which is clunky and 
 costly, and would like a more efficient way of doing this.

no, copy-on-change is the most common semantic in R; using an
'environment' provides some flexibility, but use with S4 introduces
twists. See this concurrent thread

  https://stat.ethz.ch/pipermail/r-help/2009-June/200038.html

(my 2 cents:) embracing rather than avoiding the paradigm might lead to
different designs, e.g., 'column-oriented' (an S4 instance representing
an entire table) rather than row-oriented (an S4 instance for each row)
data structures.
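
For what it is worth, a sketch of the 'environment' approach. The class and function names are invented; the environment is created explicitly for each instance, on the assumption that a default from the class prototype could end up shared between instances (one of the twists).

```r
library(methods)

# Invented class holding its mutable state in an environment slot
setClass("Counter", representation(state = "environment"))

newCounter <- function() {
  st <- new.env(parent = emptyenv())   # fresh environment per instance
  assign("n", 0L, envir = st)
  new("Counter", state = st)
}

# Updates the state in place; the caller does not need to reassign
increment <- function(counter) {
  st <- counter@state
  assign("n", get("n", envir = st) + 1L, envir = st)
  invisible(counter)
}

cnt <- newCounter()
increment(cnt)
increment(cnt)
get("n", envir = cnt@state)

alias <- cnt    # copies the S4 object, but both share one environment
increment(alias)
get("n", envir = cnt@state)
```

The last two lines show the price: ordinary assignment no longer gives an independent copy, which can surprise code written with copy-on-change in mind.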


 
 
 Can anyone point me to some useful texts on S4 programming apart from the 
 following:
 
 Chambers - Software for Data Analysis: Programming with R
 
 Venables - S Programming
 

Gentleman, R Programming for Bioinformatics.

Hope that helps.

Martin
 
 


[Rd] warning: some HTML links ...

2009-06-02 Thread Daniel Wright
Hi Everyone,

I am getting a warning when I build a package from Windows. The warning is 
"some HTML links may not be found". Through some searching I found this 
probably has to do with the Microsoft HTML Help Workshop, but I have installed 
it (a few times) from the different locations in the manual and have the path 
set as listed. When I install the package in my R session all the help files (which 
I assume is what the warning is about) work fine, as do the functions.

Has anybody else had this problem and found a solution? 

Dan


Daniel B. Wright
Psychology 
Florida International University
11200 S.W. 8th Street 
Miami, FL 33199, USA 

http://www.fiu.edu/~dwright/   
dwri...@fiu.edu


Re: [Rd] The default position of plot title

2009-06-02 Thread Ben Bolker



ronggui-2 wrote:
 
 Dear R-developers,
 
 It seems to me that the position of title is usually at the bottom of
 a plot in sociological and political science books and articles. I
 wonder if the same convention applies in other disciplines. If yes, is
 it reasonable to change the default position of main title of plot
 function?
 
 

  I'm not an R developer, but:

 * in my field (biology/ecology), it's at the top.
 * it doesn't seem that hard to put the titles where you want with mtext()
 * changing this would have major backward-compatibility/surprise issues
for all the other users who expect the title to be at the top ...
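
  A rough sketch of the mtext() route, assuming the default margins; the
  wrapper name plot_with_bottom_title is made up:

```r
# Draw a plot with no main title and put the title in the bottom margin.
# side = 1 is the bottom; line = 4 sits just below the default x-axis label.
plot_with_bottom_title <- function(x, y, title, ...) {
  plot(x, y, main = "", ...)
  mtext(title, side = 1, line = 4, font = 2)
  invisible(title)
}

pdf(NULL)  # null device, so the sketch runs without opening a window or file
plot_with_bottom_title(1:10, (1:10)^2, "Figure 1: y = x^2")
invisible(dev.off())
```

  With non-default margins the 'line' value would need adjusting via par(mar=...).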

  Ben Bolker



Re: [Rd] Recommendations for a quick UI.

2009-06-02 Thread Thomas Baier
Alex,

Kevin R. Coombes wrote:
 The following idea only partially answers your question
 
 I have successfully written a GUI using the tcl/tk package
 that ships with standard R. It is then possible (in Windows)
 to create a shortcut icon that runs the following command:
 C:\R\R-2.8.1\bin\R.exe --vanilla -e "library(SuperCurveGUI);sc()"
 Note two features:
 [1] the first part of the -e switch loads the library containing the GUI
 [2] the second part (after the semicolon) launches the GUI
 
 If you make a normal shortcut this way, a batch window will
 open showing the ongoing R session, which is not quite what you want.
 However, if you adjust the shortcut to Run: Minimized, then
 (most) users will never see the batch window, and will only
 see the GUI.
 
 The reasons that this only partially answers your question
 are [1] It is Windows-specific [2] I do not know how to set
 up the shortcut automatically upon installation.

depending on how deep you want to dig into programming (aside from R), you
could use any COM client (on Windows), e.g., Visual Basic, C# using
statconnDCOM (just download and install the package rcom) or using Java
(RServe)

Thomas



[Rd] qpois documentation (PR#13743)

2009-06-02 Thread Jerry . Lewis
Full_Name: Jerry W. Lewis
Version: 2.9.0
OS: Windows XP Professional
Submission from: (NULL) (166.186.168.103)


Quantiles for discrete distributions are consistently implemented, but
inconsistently documented.  Help for qpois incorrectly states in the Details
section that
  The quantile is left continuous: qgeom(q, prob) is the largest integer x such
that P(X <= x) < q.
which disagrees with the implementation; it should read
  The quantile is defined as the smallest value x such that F(x) >= p, where F
is the distribution function.
Also, this definition should be added to Help for qhyper.
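
The proposed wording can be checked numerically. A small sketch, with lambda and the probabilities chosen arbitrarily:

```r
lambda <- 3
p <- c(0.05, 0.25, 0.5, 0.75, 0.95)

q <- qpois(p, lambda)

# "smallest x such that F(x) >= p": F(q) reaches p, and F(q - 1) does not
at_q    <- ppois(q, lambda) >= p
below_q <- ppois(q - 1, lambda) < p

all(at_q & below_q)
```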



Re: [Rd] setdiff bizarre (was: odd behavior out of setdiff)

2009-06-02 Thread Stavros Macrakis
On Sat, May 30, 2009 at 11:59 AM, Stavros Macrakis macra...@alum.mit.edu wrote:

 Since R is object-oriented, data frame set operations should be the natural
 operations for their class.  There are, I suppose, two natural ways: the
 column-wise (variable-wise) and the row-wise (observation-wise) one.  The
 row-wise one seems more natural and more useful to me.
 ...

 The row-wise interpretation makes sense in cases where observations with
 the same values for all variables can be considered redundant.  That seems
 to me a much more useful interpretation.  The union, intersection, and set
 difference of two sets of observations would seem to all be highly useful.


Another argument for the row-wise interpretation: the `subset` function
(also part of base) works that way on data frames.

Interestingly, %in%/match appears to work neither row-wise nor column-wise:

 1 %in% data.frame(a=1:3)  # FALSE  (would be true if row-wise)
 1:3 %in% data.frame(a=1:3) # FALSE FALSE FALSE (would be true if
column-wise)

but simply treats the data frame as a *character* list:

 1 %in% data.frame(a=2,b=1)  # TRUE
 '1' %in% data.frame(a=2,b=1)  # TRUE
 1 %in% data.frame(a=2:3,b=1:2) # FALSE
 1:3 %in% data.frame(a=2:4,b=1:3)  # FALSE FALSE FALSE
 '1:3' %in% data.frame(a=2:4,b=1:3)  # TRUE

This specification is clearly documented in ? match, but I am mystified by
it.  Perhaps someone from R core can shed light on the rationale?
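
To make the row-wise interpretation concrete, here is a rough sketch of what I have in mind. The helpers row_key, row_in and row_setdiff are hypothetical, not part of base, and they assume both data frames have the same columns in the same order.

```r
# Collapse each row into a single string key, one key per row
row_key <- function(df) do.call(paste, c(unname(as.list(df)), sep = "\r"))

# Row-wise membership: is each row of x also a row of table?
row_in <- function(x, table) row_key(x) %in% row_key(table)

# Row-wise set difference: distinct rows of x that do not occur in y
row_setdiff <- function(x, y) unique(x[!row_in(x, y), , drop = FALSE])

df1 <- data.frame(a = 1:3, b = c("x", "y", "z"), stringsAsFactors = FALSE)
df2 <- data.frame(a = 2:4, b = c("y", "z", "w"), stringsAsFactors = FALSE)

row_in(df1, df2)
row_setdiff(df1, df2)
```

Pasting rows into strings is crude (it ignores column types and would be slow on large data), but it shows the semantics.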

  -s



Re: [Rd] reference counting bug: overwriting for loop 'seq' variable

2009-06-02 Thread luke

Thanks for the report.  Should be fixed in the devel and 2.9 branches.

luke

On Mon, 1 Jun 2009, William Dunlap wrote:


It looks like the 'seq' variable to 'for' can be altered from
within the loop, leading to incorrect answers.  E.g., in
the following I'd expect 'sum' to be 1+2=3, but R 2.10.0
(svn 48686) gives 44.5.

   x = c(1,2);  sum = 0; for (i in x) { x[i+1] = i + 42.5; sum = sum +
i }; sum
  [1] 44.5
or, with debugging cat()s,
   x = c(1,2);  sum = 0; for (i in x) { cat("before, i=", i, "\n");
x[i+1] = i + 42.5; cat("after, i=", i, "\n"); sum = sum + i }; sum
  before, i= 1
  after, i= 1
  before, i= 43.5
  after, i= 43.5
  [1] 44.5

If I force the for's 'seq' to be a copy of x by adding 0 to it, then I
do get the expected answer.

   x = c(1,2);  sum = 0; for (i in x+0) { x[i+1] = i + 42.5; sum = sum
+ i }; sum
  [1] 3

It looks like an error in reference counting.

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com




--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  l...@stat.uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu



Re: [Rd] Recommendations for a quick UI.

2009-06-02 Thread Stavros Macrakis
On Mon, Jun 1, 2009 at 9:21 AM, Martin Maechler
maech...@stat.math.ethz.ch wrote:

  AB == Alex Bokov bo...@uthscsa.edu  on Mon, 01 Jun 2009 00:24:58
 -0500 writes:

AB I'm trying to wrap my R package in a GUI such that when
AB the user launches the app, they see my GUI window and
AB never interact with the R console at all



 There's a dedicated Special Interest Group mailing list for
 answering / discussing such questions : R-SIG-GUI


I would also be interested in the answer to this.

My impression is that the R-sig-gui is mostly about graphical programming
environments for R rather than about building GUI applications on top of R,
though of course there is some overlap.

I have recently started playing with R.rsp and it seems to provide a fairly
simple solution for developing GUIs if you have some familiarity with
generating Web pages dynamically (cf. ASP, JSP, etc.); R.rsp lets you build
a dynamic Web page powered by R.  It includes its own asynchronous Web
server.  To get started:

 install.packages('R.rsp')
 library(R.rsp)
 browseRsp()

This will bring up the R.rsp documentation in a Web browser.

You can then edit rsp files in   .../r/win-library/2.8/R.rsp/rsp and run
them.  It is even pretty straightforward to include plotting output, though
the solution demonstrated in figures.rsp has a problem: either all users of
the server share the same set of plot files (so one user's output will
overwrite another's) or there will be an ever-growing collection of old plot
files, with no mechanism for culling them.  You can imagine various ways
around this, but as far as I know R.rsp doesn't support them directly.
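
One such way, sketched in plain R and independent of R.rsp itself: give every request its own file name and cull files past a certain age. The directory name and both helper functions are hypothetical.

```r
# Hypothetical directory for generated plot files
plot_dir <- file.path(tempdir(), "rsp-plots")
dir.create(plot_dir, showWarnings = FALSE)

# A fresh, unique file name per request, so users do not overwrite each other;
# the result would be passed to png() or a similar device.
new_plot_file <- function(dir = plot_dir) {
  tempfile(pattern = "plot-", tmpdir = dir, fileext = ".png")
}

# Remove plot files whose modification time is older than max_age_secs
cull_old_plots <- function(dir = plot_dir, max_age_secs = 3600) {
  files <- list.files(dir, pattern = "^plot-.*\\.png$", full.names = TRUE)
  if (length(files) == 0L) return(invisible(character(0)))
  age <- as.numeric(difftime(Sys.time(), file.info(files)$mtime,
                             units = "secs"))
  stale <- files[age > max_age_secs]
  file.remove(stale)
  invisible(stale)
}
```

Calling cull_old_plots() at the start of each request would keep the collection bounded without needing a separate cleanup process.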

  -s



Re: [Rd] formal argument envir matched by multiple actual arguments

2009-06-02 Thread luke

On Tue, 2 Jun 2009, Henrik Bengtsson wrote:


Nice case - I think you're onto something. /Henrik

2009/6/2  hpa...@fhcrc.org:

In fact reg.finalizer() looks like a dangerous feature.

If the finalizer itself triggers (implicitely or
explicitely) garbage collection, then bad things happen.
In the following example, garbage collection is triggered
explicitely (using R-2.9.0):

  setClass(B, representation(bb=environment))

  newB - function()
  {
    ans - new(B, bb=new.env())
    reg.finalizer(a...@bb,
                  function(e)
                  {
                      gc()
                      cat(cleaning, class(ans), object...\n)
                  }
    )
    return(ans)
  }

   for (i in 1:500) {cat(i, \n); b1 - newB()}
  1
  2
  3
  4
  5
  6
  ...
  13
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  14
  ...
  169
  170
  171
  Error: not a weak reference
  Error: not a weak reference
  [repeat the above line thousands of times]
  ...
  Error: not a weak reference
  Error: not a weak reference
  cleaning B object...
  Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
  Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
  [repeat the above line thousands of times]
  ...
  Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
  Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
  172
  ...
  246
  247
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...
  cleaning B object...

   *** caught segfault ***
  address 0x41, cause 'memory not mapped'

  Traceback:
   1: gc()
   2: function (e) {    gc()    cat("cleaning", class(ans), "object...\n")}(<environment>)

  Possible actions:
  1: abort (with core dump, if enabled)
  2: normal R exit
  3: exit R without saving workspace
  4: exit R saving workspace
  Selection: 2
  Save workspace image? [y/n/c]: n
  Segmentation fault

So apparently, if the finalizer triggers garbage collection,
then we can end up with a corrupted session. Then anything can
happen, from the strange 'formal argument envir matched by
multiple actual arguments' error I reported in the previous post,
to a segfault. In the worst case, nothing apparently happens but
the output produced by the code is wrong.

Maybe garbage collection requests should be ignored during the
execution of the finalizer? (and more generally during garbage
collection itself)


Thanks for the report.  The gc proper does not (or should not) do
anything that could cause allocation or trigger another gc.  The gc
proper only identifies objects ready for finalization; running the
finalizers happens outside the gc proper where allocation and gc calls
should be safe.  This looks like either a missing PROTECT call in the
code for running finalizers or possibly a more subtle bug in managing
the lists of objects in different states of finalization. I will look
more carefully when I get a chance.
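
For comparison, here is a minimal sketch (not from the thread) of a finalizer that does no garbage collection of its own and only records that it ran. The names `cleaned` and `make_resource` are made up for illustration, and the sketch assumes that an explicit gc() runs the pending finalizers of unreachable environments, as the reg.finalizer help page suggests.

```r
# Counter kept in an environment so the finalizer can update it in place.
cleaned <- new.env()
cleaned$n <- 0L

# Hypothetical constructor: returns an environment with a finalizer attached.
make_resource <- function() {
  e <- new.env()
  reg.finalizer(e, function(env) {
    # No gc() call in here, just bookkeeping.
    assign("n", get("n", envir = cleaned) + 1L, envir = cleaned)
  })
  e
}

for (i in 1:3) r <- make_resource()  # earlier environments become garbage
rm(r)                                # drop the last reference as well
invisible(gc())                      # explicit collection, then finalizers run
```

Whether this pattern would also have avoided the crash above is not something the thread establishes; it only shows a finalizer that stays away from gc().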

luke




Cheers,
H.




--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  l...@stat.uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu


Re: [Rd] setdiff bizarre

2009-06-02 Thread Wacek Kusnierczyk
Stavros Macrakis wrote:

  '1:3' %in% data.frame(a=2:4,b=1:3)  # TRUE
   

utterly weird.  so what would x have to be so that

x %in% data.frame('a')
# TRUE

hint: 

'1' %in% data.frame(1)
# TRUE

vQ



Re: [Rd] setdiff bizarre

2009-06-02 Thread William Dunlap
%in% is a thin wrapper on a call to match().  match() is
not a generic function (and is not documented to be one),
so it treats data.frames as lists, as their underlying
representation is a list of columns.  match is documented
to convert lists to character and to then run the character
version of match on that character data.  match does not
bail out if the types of the x and table arguments don't match
(that would be undesirable in the integer/numeric mismatch case).
Hence
   '1' %in% data.frame(1) # -> TRUE
is acting consistently with
   match(as.character(pi), c(1, pi, exp(1))) # -> 2
and
   1L %in% c(1.0, 2.0, 3.0) # -> TRUE

The related functions, duplicated() and unique(), do have
row-wise data.frame methods.  E.g.,
duplicated(data.frame(x=c(1,2,2,3,3),y=letters[c(1,1,2,2,2)]))
   [1] FALSE FALSE FALSE FALSE  TRUE
Perhaps match() ought to have one also.  S+'s match is generic
and has a data.frame method (which is row-oriented) so there we get:
 match(data.frame(x=c(1,3,5), y=letters[c(1,3,5)]),
data.frame(x=1:10,y=letters[1:10]))
   [1] 1 3 5
is.element(data.frame(x=1:10,y=letters[1:10]),
data.frame(x=c(1,3,5), y=letters[c(1,3,5)]))
[1]  TRUE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE

I think that %in% and is.element() ought to remain calls to match()
and that if you want them to work row-wise on data.frames then
match should get a data.frame method.


Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com  

 -Original Message-
 From: r-devel-boun...@r-project.org 
 [mailto:r-devel-boun...@r-project.org] On Behalf Of Wacek Kusnierczyk
 Sent: Tuesday, June 02, 2009 9:11 AM
 To: Stavros Macrakis
 Cc: r-devel@r-project.org; dwinsem...@comcast.net
 Subject: Re: [Rd] setdiff bizarre
 
 Stavros Macrakis wrote:
 
   '1:3' %in% data.frame(a=2:4,b=1:3)  # TRUE

 
 utterly weird.  so what would x have to be so that
 
 x %in% data.frame('a')
 # TRUE
 
 hint: 
 
 '1' %in% data.frame(1)
 # TRUE
 
 vQ
 


Re: [Rd] setdiff bizarre (was: odd behavior out of setdiff)

2009-06-02 Thread Barry Rowlingson
On Tue, Jun 2, 2009 at 4:13 PM, Stavros Macrakis macra...@alum.mit.edu wrote:

 but simply treats the data frame as a *character* list:

     1 %in% data.frame(a=2,b=1)  # TRUE
     '1' %in% data.frame(a=2,b=1)  # TRUE
     1 %in% data.frame(a=2:3,b=1:2) # FALSE
     1:3 %in% data.frame(a=2:4,b=1:3)  # FALSE FALSE FALSE
     '1:3' %in% data.frame(a=2:4,b=1:3)  # TRUE

It applies as.character to the dataframe:

  z=data.frame(a=2:4,b=1:3)
  as.character(z)
 [1] "2:4" "1:3"

  The as.character method for data frames seems to spot integer
sequences (but only for int types and not num types) and show the a:b
notation:

  x=data.frame(z=as.integer(c(1,2,3,4,5)))
  str(x)
 'data.frame':  5 obs. of  1 variable:
  $ z: int  1 2 3 4 5
  as.character(x)
 [1] "1:5"

 Obviously it doesn't do this for vectors:

  as.character(x$z)
 [1] "1" "2" "3" "4" "5"

 I suspect it's using 'deparse()' to get the character representation.
This function is mentioned in ?as.character, but as.character.default
disappears into the infernal .Internal and I don't have time to chase
source code - it's sunny outside!
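
A small sketch of that suspicion, assuming current R still behaves as in the examples above: deparsing each column by hand gives the same strings that as.character() returns for the whole data frame. The variable names `whole` and `by_column` are only illustrative.

```r
z <- data.frame(a = 2:4, b = 1:3)

# as.character() on the data frame treats it as a list of columns.
whole <- as.character(z)

# Deparse each column by hand; integer runs deparse to the a:b form.
by_column <- vapply(z, function(col) paste(deparse(col), collapse = ""),
                    character(1), USE.NAMES = FALSE)

print(whole)
print(by_column)

# Because match() compares against these strings, this is TRUE.
print('1:3' %in% z)
```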

Barry



Re: [Rd] setdiff bizarre

2009-06-02 Thread Wacek Kusnierczyk
William Dunlap wrote:
 %in% is a thin wrapper on a call to match().  match() is
 not a generic function (and is not documented to be one),
 so it treats data.frames as lists, as their underlying
 representation is a list of columns.  match is documented
 to convert lists to character and to then run the character
 version of match on that character data.  match does not
 bail out if the types of the x and table arguments don't match
 (that would be undesirable in the integer/numeric mismatch case).
   

yes, i understand that this is documented behaviour, and that it's not a
bug.  nevertheless, the example is odd, and hints that there's a design
flaw.  i also do not understand why the following should be useful and
desirable:

as.character(list('a'))
# "a"

as.character(data.frame('a'))
# "1"

and hence

'a' %in% list('a')
# TRUE

while

'a' %in% data.frame('a')
# FALSE
'1' %in% data.frame('a')
# TRUE

there is a mechanistic explanation for how this works, but is there one
for why this works this way?


 Hence
 '1' %in% data.frame(1) # -> TRUE
 is acting consistently with
 match(as.character(pi), c(1, pi, exp(1))) # -> 2
 and
 1L %in% c(1.0, 2.0, 3.0) # -> TRUE

 The related functions, duplicated() and unique(), do have
 row-wise data.frame methods.  E.g.,
 duplicated(data.frame(x=c(1,2,2,3,3),y=letters[c(1,1,2,2,2)]))
[1] FALSE FALSE FALSE FALSE  TRUE
 Perhaps match() ought to have one also.  S+'s match is generic
 and has a data.frame method (which is row-oriented) so there we get:
  match(data.frame(x=c(1,3,5), y=letters[c(1,3,5)]),
 data.frame(x=1:10,y=letters[1:10]))
[1] 1 3 5
 is.element(data.frame(x=1:10,y=letters[1:10]),
 data.frame(x=c(1,3,5), y=letters[c(1,3,5)]))
 [1]  TRUE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE

 I think that %in% and is.element() ought to remain calls to match()
 and that if you want them to work row-wise on data.frames then
 match should get a data.frame method.
   

sounds good to me.  how is

'a' %in% data.frame('a')

in S+?

thanks for the response.

regards,
vQ



Re: [Rd] setdiff bizarre

2009-06-02 Thread William Dunlap

  ...
  The related functions, duplicated() and unique(), do have
  row-wise data.frame methods.  E.g.,
  duplicated(data.frame(x=c(1,2,2,3,3),y=letters[c(1,1,2,2,2)]))
 [1] FALSE FALSE FALSE FALSE  TRUE
  Perhaps match() ought to have one also.  S+'s match is generic
  and has a data.frame method (which is row-oriented) so there we get:
   match(data.frame(x=c(1,3,5), y=letters[c(1,3,5)]),
  data.frame(x=1:10,y=letters[1:10]))
 [1] 1 3 5
  is.element(data.frame(x=1:10,y=letters[1:10]),
  data.frame(x=c(1,3,5), y=letters[c(1,3,5)]))
  [1]  TRUE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
 
  I think that %in% and is.element() ought to remain calls to match()
  and that if you want them to work row-wise on data.frames then
  match should get a data.frame method.

 
 sounds good to me.  how is
 
 'a' %in% data.frame('a')
 
 in S+?
 
 thanks for the response.

S+ gives:
 'a' %in% data.frame(letters)
   [1] TRUE
'a' %in% data.frame(letters[2:26])
   [1] FALSE
but that special case, x a scalar and table a data.frame with
one column, gets by more or less by accident.
'a' %in% data.frame(letters, num=1:26)
   Problem in match.data.frame(x, table, nomatch, incom..: table must be
a list the same length as x
c('a', 'b') %in% data.frame(letters)
   Problem in match.data.frame(x, table, nomatch, incom..: table must be
a list the same length as x
The intent is that the x and table arguments to match be
compatible data.frames.

S+'s match works differently on lists than R's does.  It is set
up to work on data.frame-like things: x and table must be
lists of the same length and within each list, each element
must have the same length.  It acts like
  match(do.call(paste,x), do.call(paste,table))
but doesn't actually do the conversion to character implied in
that (it hashes all the entries in each 'row' into one hash table
entry, using the usual type-specific hash number computation
on each entry and combining them to make the row hash number).
E.g.,
match(list(c(3,2), c(1,7), c(4,1)),
list(c(1,4,2,3),c(0,6,7,1),c(0,5,1,4)))
   [1] 4 3

(Its match.data.frame() doesn't actually call this, for historical/inertial
reasons.  It goes the paste() route.)
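
The paste() route can be sketched in R roughly as follows. This is an illustration only, not S+'s actual match.data.frame(); the helper name `match_rows` and the choice of separator are assumptions.

```r
# Hypothetical helper: row-wise match of two data frames with compatible
# columns, by pasting each row into a single string key.
match_rows <- function(x, table, sep = "\r") {
  key <- function(df) do.call(paste, c(unname(as.list(df)), list(sep = sep)))
  match(key(x), key(table))
}

x     <- data.frame(x = c(1, 3, 5), y = letters[c(1, 3, 5)])
table <- data.frame(x = 1:10, y = letters[1:10])

pos  <- match_rows(x, table)            # positions of x's rows in table
elem <- !is.na(match_rows(table, x))    # row-wise analogue of is.element()
```

Going through character keys is simple but loses type distinctions (the number 1 and the string '1' produce the same key), which is the trade-off the hashing approach described above avoids.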
 
Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com 



Re: [Rd] Recommendations for a quick UI.

2009-06-02 Thread Greg Snow
Some possibilities:

The Rcmdr package is a very good example of a GUI built using Tk (it does not 
hide the R program, but lets you do analyses using menus and dialogs).  Rcmdr 
also has a plug-in mechanism for writing extensions to it; depending on what 
you want to do, writing a simple extension to Rcmdr may be enough and a lot 
less work than creating your own GUI from scratch.

There are tools (Rpad, Rserve, and others) that allow web interfaces to R, 
which may work for you.

The Rcom project uses R as a background tool for other programs; the most 
developed tool uses MS Excel as the GUI with R doing the heavy work behind 
the scenes.  Various examples of tools using the Excel interface are 
available.


There is a lot of info at: http://www.sciviews.org/_rgui/

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-
 project.org] On Behalf Of Alex Bokov
 Sent: Sunday, May 31, 2009 11:25 PM
 To: r-devel@r-project.org
 Subject: [Rd] Recommendations for a quick UI.
 
 Hi. This is my first post to this list; I seem to be graduating from
 the r-help list. :-)
 
 I'm trying to wrap my R package in a GUI such that when the user
 launches the app, they see my GUI window and never interact with the R
 console at all. I don't have any fancy requirements for the GUI itself:
 all it needs to do is collect input from the user and pass the input as
 arguments to an R function, which writes the results to a file.
 
 I read the R Extensions Manual section about GUIs, and it seems like
 overkill to write the thing in a compiled language and link against R
 as a library when there are dozens of different interpreted cross-platform
 GUI toolkits out there. Does anybody know of any functioning examples
 of packages (or other add-ons) with GUIs that run R silently in the
 background which I can study? Do they use the R CMD BATCH mechanism,
 or something else?
 
 Thanks.
 


Re: [Rd] setdiff bizarre

2009-06-02 Thread Wacek Kusnierczyk
Barry Rowlingson wrote:

[...]

 I suspect it's using 'deparse()' to get the character representation.
 This function is mentioned in ?as.character, but as.character.default
 disappears into the infernal .Internal and I don't have time to chase
 source code - it's sunny outside!
   

on the side, as.character triggers do_ascharacter, which in turn calls
DispatchOrEval, a function with the following beautiful comment:

To call this an ugly hack would be to insult all existing ugly hacks at
large in the world.

a fortune?

vQ



Re: [Rd] setdiff bizarre

2009-06-02 Thread Stavros Macrakis
On Tue, Jun 2, 2009 at 1:18 PM, William Dunlap wdun...@tibco.com wrote:

 %in% is a thin wrapper on a call to match().


Yes, as I mentioned in my email, all this is clearly documented in ? match.


 match() is not a generic function (and is not documented to be one),
 so it treats data.frames as lists, as their underlying representation is a
 list of columns.


Yes, I understand that this is the proximal cause of the current strange
behavior.  What I don't understand is why the current behavior is a good
idea.


 match is documented to convert lists to character and to then run
 the character version of match on that character data


Yes, this peculiar behavior is documented.  What I don't get is its
rationale.


 match does not bail out if the types of the x and table arguments don't
 match
 (that would be undesirable in the integer/numeric mismatch case).


Why would it 'bail out'?

The related functions, duplicated() and unique(), do have
 row-wise data.frame methods.  E.g.,
duplicated(data.frame(x=c(1,2,2,3,3),y=letters[c(1,1,2,2,2)]))
   [1] FALSE FALSE FALSE FALSE  TRUE
 Perhaps match() ought to have one also


I think that %in% and is.element() ought to remain calls to match()
 and that if you want them to work row-wise on data.frames then
 match should get a data.frame method.


After all that, it sounds like we agree...!

 -s



[Rd] statutes of R Foundation

2009-06-02 Thread Christophe Dutang

Good evening all,

I realised yesterday that the R Foundation statutes doc was only
available in English and in German.

I tried to translate it into French: a first version is available here:
http://dutangc.free.fr/pub/statut%20R.pdf

Could you please tell me what you think of my translation?

Thanks in advance



Christophe

PS : Tex sources are here http://dutangc.free.fr/pub/statut%20R.tex

--
Christophe Dutang
Ph. D. student at ISFA, Lyon, France
website: http://dutangc.free.fr



[Rd] How to generate R objects in C?

2009-06-02 Thread Kynn Jones
I'm in the process of coding a parser (in C) to generate R entities
(vectors, lists, etc.) from a text description (different from R).
The basic parser works, and now I need to tell it how to create R
entities.  I need to be able to create character vectors (for unicode
strings), integers, floats, unnamed lists, named lists, boolean
values, and NA.  With the exception of the two types of lists and the
character vectors, all the other objects I need to generate are
scalars, so I suppose they will correspond to 1-element vectors in
R.  I also need to be able to add R entities to both kinds of lists.

I've been staring at various official documents (ch 5 of Writing R
Extensions, R Internals, Rinternals.h) for this kind of work for some
time, but I can't find the constructors for such objects (here I'm
using the term constructor loosely).  I'm even further from finding
the C equivalent of my.list[[ length(my.list) + 1 ]] <- new.thing.

Can someone point me in the right direction?

Thanks!

Kynn



[Rd] Vectorize fails for function with ... arglist

2009-06-02 Thread Stavros Macrakis
Vectorize is defined to return a function that acts as if 'mapply' was
called.

So we have:

 mapply(dput,1:2)        # mapply form
1L                       # calls dput on each element of 1:2
2L
[1] 1 2
 Vectorize(dput)(1:2)    # Vectorize form
1L                       # same behavior
2L
[1] 1 2

Same thing with a named argument:

 mapply(function(a)dput(a),1:2)
1L
2L
[1] 1 2
 Vectorize(function(a)dput(a))(1:2)
1L
2L
[1] 1 2

But though mapply has no problem with function(...):

 mapply(function(...)dput(list(...)),1:2)
list(1L)
list(2L)
[[1]]
[1] 1

[[2]]
[1] 2

 mapply(function(...)dput(list(...)),1:2,11:12)
list(1L, 11L)
list(2L, 12L)
     [,1] [,2]
[1,] 1    2
[2,] 11   12

Vectorize fails silently in this case:

 Vectorize(function(...)dput(list(...)))(1:2)
list(1:2)                # calls dput with entire vector
                         # invisible result inherited from dput

 Vectorize(function(...)dput(list(...)))(1:2,11:12)
list(1:2, 11:12)

and sure enough:

 Vectorize(function(...)list(...))
function(...)list(...)   # returns arg unmodified!

I looked at the code, and ... args are *explicitly* rejected. I see no
logical reason for this inconsistency, and the documentation doesn't require
it.

  -s

PS This is not an artificial example concocted to demonstrate
inconsistencies.  I had written the following function which wraps another
function in a tryCatch:

   catcher <- function(f) function(...)
       tryCatch(do.call(f,list(...)),error=function(e) NA)

(The '...' argument list allows this to work with a function of any number
of arguments.)

but instead of catching individual errors in
Vectorize(catcher(fun))(1:10,1:10), it caught them all as one big error,
which was not at all the goal.
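
One possible workaround, sketched under the assumption that calling mapply() directly is acceptable: wrap the function by hand instead of going through Vectorize. The helper `vectorize_dots` and the test function `fragile` are made-up names, not part of base R.

```r
# Hypothetical helper: vectorize a function whose only formal is `...`
# by handing all arguments straight to mapply().
vectorize_dots <- function(f) function(...) mapply(f, ...)

# The wrapper from the message above.
catcher <- function(f) function(...)
  tryCatch(do.call(f, list(...)), error = function(e) NA)

# A function that fails for one particular input.
fragile <- function(x, y) if (x == 3) stop("bad input") else x + y

# Errors are now caught element by element rather than as one big error.
res <- vectorize_dots(catcher(fragile))(1:5, 1:5)
print(res)
```

Only the third element comes back as NA here; the other calls are unaffected by the failure.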



[Rd] reference counting bug related to break and next in loops

2009-06-02 Thread William Dunlap
One of our R users here just showed me the following problem while
investigating the return value of a while loop.  I added some information
on a similar bug in for loops.  I think he was using 2.9.0
but I see the same problem on today's development version of 2.10.0
(svn 48703).

Should the semantics of while and for loops be changed slightly to avoid
the memory buildup that fixing this to reflect the current docs would
entail?  S+'s loops return nothing useful - that change was made long ago
to avoid memory buildup resulting from semantics akin to R's present
semantics.

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com 

---- Forwarded (and edited) message below ----

 I think I have found another reference counting bug.

If you type in the following in R you get what I think is the wrong
result.

 i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { break }; i = i + 1; y}; q
 [1] 42 42 42 42 42 42 42 42  9 10

I had expected  [1] 42 42 42 42 42 42 42  8  9 10 which is what you get
if you add 0 to y in the last statement in the while loop:

 i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { break }; i = i + 1; y + 0}; q
 [1] 42 42 42 42 42 42 42  8  9 10

Also, 

 i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { break };
   i<-i+1 ; if (i<=8&&i>3)next ; cat("Completing iteration", i, "\n"); y}; q
Completing iteration 2
Completing iteration 3
 [1] 42 42 42 42 42 42 42 42  9 10

but if the last statement in the while loop is y+0 instead of y I get
the expected result:

 i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { break };
   i<-i+1 ; if (i<=8&&i>3)next ; cat("Completing iteration", i, "\n"); y+0L}; q
Completing iteration 2
Completing iteration 3
 [1] 42 42  3  4  5  6  7  8  9 10

A background to the problem is that in R a while-loop returns the value
of the last iteration. However there is an exception if an iteration is
terminated by a break or a next. Then the value is the value of the
previously completed iteration that did not execute a break or next.
Thus in an extreme case the value of the while may be the value of the
very first iteration even though it executed a million iterations. 

Thus to implement that correctly one needs to keep a reference to the
value of the last non-terminated iteration. It seems as if the current R
implementation does that but does not increase the reference counter
which explains the odd behavior.

The for loop example is

 z<-{ tmp<-rep(pi,10);for(i in 1:10){ tmp[i]<-i^2;if(i==9)break ; if (i<9&&i>3)next ; tmp } }
 z
 [1]  1.00  4.00  9.00 16.00 25.00 36.00 49.00
 [8] 64.00 81.00  3.141593
 z<-{ tmp<-rep(pi,10);for(i in 1:10){ tmp[i]<-i^2;if(i==9)break ; if (i<9&&i>3)next ; tmp+0 } }
 z
 [1] 1.00 4.00 9.00 3.141593 3.141593 3.141593 3.141593 3.141593
 [9] 3.141593 3.141593

I can think of a couple of ways to solve this.

1.   Increment the reference counter. This solves the bug but may
have serious performance implications. In the while example above it
needs to copy y in every iteration.

2.   Change the semantics of while loops by getting rid of the
exception described above. When a loop is terminated with a break the
value of the loop would be NULL. Thus there is no need to keep a
reference to the value of the last non-terminated iteration.
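
Under option 2, code that needs the value of the last completed iteration would have to keep it explicitly. A minimal sketch of the first while example written that way; the variable name `last` is only illustrative.

```r
i <- 1
y <- 1:10
last <- NULL          # explicit holder for the last completed iteration's value

while (TRUE) {
  y[i] <- 42
  if (i == 8) break
  i <- i + 1
  last <- y           # ordinary assignment, so later changes to y leave `last` alone
}

print(last)
print(y)
```

Here `last` ends up as the value the original poster expected from the loop, independent of what the loop expression itself returns.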

Any opinions?
