Re: [Rd] promptFunctions() to handle multiple names

2008-04-14 Thread Daniel Sabanés Bové
Hi John,

thanks for the pointer to promptAll()! Judging by the description
of your function, it seems to be exactly what I've been searching for.

Well, this is yet another reason to be looking forward to your new book.
(When will it be in the stores?)

Daniel


John Chambers schrieb:
> Daniel,
> 
> Check out the promptAll() function in the SoDA package on CRAN.  
> (Because it was written as an example for my new book, it's not the 
> fanciest imaginable, but seems to work OK.)
> 
> John
> 
> 
> Daniel Sabanés Bové wrote:
> Hi all,
> 
> I wanted to set up my first (private) R-package and wondered
> if there was a function to prompt() for multiple aliases in one Rd-file,
> e.g. to create something like the normal distribution manual page
> encompassing rnorm, dnorm,...
> 
> As I didn't find it, I modified prompt.default() and wrote a small 
> function
> to do this job, called "promptFunctions". It basically calls the helper
> ".promptFunction" for every name it gets and puts together the output
> from each function.
> 
> It would be interesting for me if such a function already existed in R
> or if something like "promptFunction" could be included in any future 
> R version.
I think it would get used, as many man pages document several functions
at once, and cutting and pasting the individual prompt() files by hand
could be boring.
> 
> regards,
> Daniel
> 
> The Code:
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] option(expressions) and --max-ppsize

2008-04-14 Thread Prof Brian Ripley

On Mon, 14 Apr 2008, Tobias Verbeke wrote:


Dear list,

Is there an exact formula / safe rule of thumb that allows
one to express the value of --max-ppsize as a function of
the value of getOption("expressions") ?


There is none: it entirely depends on what you are doing.
But you would expect pp stack usage to be proportional to the number of 
expressions, the problem being what the ratio should be.


The ratio of the defaults was empirically determined, and should be a good 
guide.
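For a concrete sense of that ratio, here is a small sketch; the figures are the defaults as I recall them for R of this vintage, so treat them as an assumption rather than a guarantee:

getOption("expressions")      # 5000 on a stock build (assumed default)
## the pointer protection stack defaults to --max-ppsize=50000,
## i.e. a ratio of roughly 10; so scaling one up suggests scaling the other:
options(expressions = 10000)
##   R --max-ppsize=100000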


However, if you find yourself increasing either then you probably want to 
rethink how you do the calculations in R.



?options tells "If you increase it [the expressions option],
you may also want to start R with a larger protection stack".

Motivation is to determine stack size of a Java vm used
to launch R with a given --max-ppsize option.


R's expression limit is only vaguely related to its C stack usage
(assuming you mean 'embed R' not 'launch R' -- it is processes that get 
launched in the OS world I inhabit).




Many thanks in advance,
Tobias

--

Tobias Verbeke - Consultant
Business & Decision Benelux
Rue de la révolution 8
1000 Brussels - BELGIUM

+32 499 36 33 15
[EMAIL PROTECTED]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Small memory leak in plot on OS X (PR#11170)

2008-04-14 Thread abergeron
Full_Name: Arnaud Bergeron
Version: 2.6.2
OS: Mac OS X 10.5.2
Submission from: (NULL) (69.157.224.197)


When I run the following loop

repeat { plot(seq(5), seq(5)) }

the memory consumed by the process goes up by a small amount each time. I tried
this with the quartz() and pdf() output devices.  Only a single output device is
created by the process and is repeatedly overwritten by plot().

Also even if I close the device with dev.off() the memory is never released,
even after a gc() pass.  In fact the gc() pass reports way less memory used than
what the system reports.  I can understand a discrepancy in the numbers here,
but 9.1Mb (reported by gc()) versus 415Mb (reported by the system) is not normal
in my eyes.

With the png() device, the results were less conclusive. I'd say there was a
slight overall increase in memory used, but maybe that's just me.  Note that
this device creates many files by default and the problem does not seem to
happen when it does this.
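A bounded variant of the reproduction that makes the comparison easier to quantify (a sketch, not part of the original report; the loop count is arbitrary):

pdf(tempfile())                       # one device, repeatedly overwritten
for (i in 1:10000) plot(seq(5), seq(5))
dev.off()
gc()   # R-level accounting; compare with the process size the OS reports,
       # e.g. Activity Monitor or `ps -o rss= -p <pid>`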

Below is the sessionInfo() output:

R version 2.6.2 (2008-02-08) 
powerpc-apple-darwin8.10.1 

locale:
fr_CA.UTF-8/fr_CA.UTF-8/fr_CA.UTF-8/C/fr_CA.UTF-8/fr_CA.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] NEWS file

2008-04-14 Thread David Scott

Is there Emacs support for creating a NEWS file for a package? If so, where 
could I find it? I had a look at the GNU coding standards on documenting 
programs. It has a bit on Emacs and Change Logs but not concerning a NEWS 
file as far as I could see.

David Scott

_
David Scott Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email:  [EMAIL PROTECTED]

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Doing the right amount of copy for large data frames.

2008-04-14 Thread Gopi Goswami
Dear All,


Thanks a lot for your helpful comments (e.g., NAMED, ExpressionSet,
DNAStringSet).


Observations and questions ::

ooo   For a data.frame dd and a list ll with the same contents to begin with,
the following operations show a significant difference in the maximum memory
usage column of the gc( ) output on R-2.6.2 (the detailed code is in the PS
section below).

ll$xx <- zz
dd$xx <- zz

My understanding is that the '$<-.data.frame' S3 method above makes a copy
of the whole dd first (using '*tmp*'). But for a list this is avoided due to
the use of SET_VECTOR_ELT at the C-level. Is this a valid explanation, or is
something deeper happening behind the scenes?
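A quick way to see the difference (a sketch; tracemem() needs an R build with memory profiling enabled):

zz <- seq_len(100)
dd <- data.frame(xx = zz)
tracemem(dd)
dd$yy <- zz     # typically reports a duplication (the '*tmp*' copy)
ll <- list(xx = zz)
tracemem(ll)
ll$yy <- zz     # typically reports no duplication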



ooo   I'll look into the read-only flag idea to avoid unhappy circumstances
that might arise while bypassing the copy-on-modify principle. Any pointers
or code snippets as to how to implement this idea?
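One possible shape for such a flag, sketched with hypothetical class and helper names (not an existing implementation): keep the data and a logical flag in the environment, and have every replacement method check the flag before modifying in place.

setClass("SafeList", representation(store = "environment"))

newSafeList <- function(data = list()) {
    e <- new.env(hash = TRUE)
    e$data <- data
    e$readonly <- FALSE
    new("SafeList", store = e)
}

setReadOnly <- function(x, value = TRUE) { x@store$readonly <- value; invisible(x) }

setMethod("$", "SafeList", function(x, name) x@store$data[[name]])

setMethod("$<-", "SafeList", function(x, name, value) {
    if (isTRUE(x@store$readonly)) stop("object is read-only")
    x@store$data[[name]] <- value   # changed in place; the data are not copied
    x
})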



ooo   The main reason I want to bypass copy-on-modify is that I want to
emulate Python-like behavior for lists (and data.frames), in the sense
that I want to take responsibility for making a deep copy if need be,
but most of the time I want to knowingly change 'things in place' using the
proposed S4 class DataFrame.


Regards,
Gopi Goswami.
PhD, Statistics, 2005
http://gopi-goswami.net/index.html



PS:

zz <- seq_len(100)
gc( )
dd <- data.frame(xx = zz)
dd$yy <- zz
gc( )
object.size(dd)

##

zz <- seq_len(100)
gc( )
ll <- list(xx = zz)
ll$yy <- zz
gc( )
object.size(ll)




On Mon, Apr 14, 2008 at 10:18 AM, Tony Plate <[EMAIL PROTECTED]> wrote:

> Gopi Goswami wrote:
>
> > Hi there,
> >
> >
> > Problem ::
> > When one tries to change one or some of the columns of a data.frame, R
> > makes
> > a copy of the whole data.frame using the '*tmp*' mechanism (this does
> > not
> > happen for components of a list, tracemem( ) on R-2.6.2 says so).
> >
> >
> > Suggested solution ::
> > Store the columns of the data.frame as a list inside of an environment
> > slot
> > of an S4 class, and define the '[', '[<-' etc. operators using
> > setMethod( )
> > and setReplaceMethod( ).
> >
> >
> > Question ::
> > This implementation will violate copy on modify principle of R (since
> > environments are not copied), but will save a lot of memory. Do you see
> > any
> > other obvious problem(s) with the idea?
> >
> Well, because it violates the copy-on-modify principle it can potentially
> break code that depends on this principle.  I don't know how much there is
> -- did you try to see if R and recommended packages will pass checks with
> this change in place?
>
> >  Have you seen a related setup
> > implemented / considered before (apart from the packages like filehash,
> > ff,
> > and database related ones for saving memory)?
> >
> >
> I've frequently used a personal package that stores array data in a file
> (like ff).  It works fine, and I partially get around the problem of
> violating the copy-on-modify principle by having a readonly flag in the
> object -- when the flag is set to allow modification I have to be careful,
> but after I set it to readonly I can use it more freely with the knowledge
> that if some function does attempt to modify the object, it will stop with
> an error.
>
> In this particular case, why not just track down why data frame
> modification is copying the entire object and suggest a change so that it
> just copies the column being changed?  (should be possible if list
> modification doesn't copy all components).
>
> -- Tony Plate
>
> >
> > Implementation code snippet ::
> > ### The S4 class.
> > setClass('DataFrame',
> >  representation(data = 'data.frame', nrow = 'numeric', ncol
> >          representation(data = 'data.frame', nrow = 'numeric',
> >                         ncol = 'numeric', store = 'environment'),
> >
> > setMethod('initialize', 'DataFrame', function(.Object) {
> >    .Object <- callNextMethod( )
> >    .Object@store <- new.env(hash = TRUE)
> >    assign('data', as.list(.Object@data), .Object@store)
> >    .Object@nrow <- nrow(.Object@data)
> >    .Object@ncol <- ncol(.Object@data)
> >    .Object@data <- data.frame( )
> >.Object
> > })
> >
> >
> > ### Usage:
> > nn  <- 10
> > ## dd1 below could possibly be created by read.table or scan and
> > data.frame
> > dd1 <- data.frame(xx = rnorm(nn), yy = rnorm(nn))
> > dd2 <- new('DataFrame', data = dd1)
> > rm(dd1)
> > ## Now work with dd2
> >
> >
> > Thanks a lot,
> > Gopi Goswami.
> > PhD, Statistics, 2005
> > http://gopi-goswami.net/index.html
> >
> >[[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> >
> >
>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] && and ||

2008-04-14 Thread Yuan Jian


Berwin A Turlach <[EMAIL PROTECTED]> wrote:   G'day Yu,

On Sun, 13 Apr 2008 23:48:33 -0700 (PDT)
Yuan Jian wrote:

> what should I do when I want && or || to operate on a sequence (a vector)?

Read `help("&&")' and then use & and |. :)
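A minimal illustration of the difference:

x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)
x & y    # element-wise: TRUE FALSE FALSE
x && y   # looks only at the first element of each (TRUE here);
         # newer versions of R warn or error for vectors of length > 1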

HTH.

Cheers,

Berwin

=== Full address =
Berwin A Turlach Tel.: +65 6515 4416 (secr)
Dept of Statistics and Applied Probability +65 6515 6650 (self)
Faculty of Science FAX : +65 6872 3919 
National University of Singapore 
6 Science Drive 2, Blk S16, Level 7 e-mail: [EMAIL PROTECTED]
Singapore 117546 http://www.stat.nus.edu.sg/~statba



[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] clean-up actions after non-local exits

2008-04-14 Thread Luke Tierney
On Mon, 14 Apr 2008, Vadim Organovich wrote:

> This is good, thanks!
>
> I'd like to be able to make sure that the resource is released in a controlled 
> fashion rather than at some arbitrary gc() call time. Will the following trick 
> achieve the goal:
>
> foo <-function(whatever) {
> on.exit(gc())
> ## arrange for an external pointer, don't know how yet
> ...
> ## actual call
> .Call(whatever)
> }
>
> The idea is to have gc() called on exit.
>
> I seem to recall that a call to gc() doesn't guarantee that all possibly 
> collectable objects are actually collected, which will defeat my solution. Is 
> that correct?

It would not be safe to rely on gc running the finalizer at precisely
this time.  Probably it does now, but that may not always remain true.

If you are prepared to use R level wrappers with on.exit code, or
tryCatch with a finally clause, then I would use those to call your
cleanup code directly.  That is nearly guaranteed to succeed.  The
exception is that it is currently possible for a user interrupt (or a
timer interrupt in R 2.7 and up) to be handled during the running of the
interpreted part of the on.exit/finally code.  At some point there
will be an R-level mechanism for selectively disabling and then
re-enabling interrupts (and maybe that will be used by default in
conjunction with tryCatch or on.exit, but that needs thinking
through).  Until then, if you need a stronger guarantee, I would add the
finalizer option as a backup and let the GC call it when it gets to
it.

luke
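A minimal sketch of the two R-level patterns described above, using a file connection as a stand-in resource (hypothetical helper names):

with_file_onexit <- function(path) {
    con <- file(path, open = "r")
    on.exit(close(con))              # runs on normal return, error, or interrupt
    readLines(con)
}

with_file_finally <- function(path) {
    con <- file(path, open = "r")
    tryCatch(readLines(con),
             finally = close(con))   # 'finally' also runs on non-local exits
}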


>
> Thanks,
> Vadim
>
> 
> From: Duncan Murdoch [EMAIL PROTECTED]
> Sent: Monday, April 14, 2008 3:53 PM
> To: Vadim Organovich
> Cc: r-devel@r-project.org
> Subject: Re: [Rd] clean-up actions after non-local exits
>
> On 14/04/2008 4:33 PM, Vadim Organovich wrote:
>> Dear R-devel,
>>
>>
>>
>> Some time ago I started a thread that boiled down to clean-up actions after 
>> non-local exits in R, see below. I wonder if there has been any progress on 
>> this? R-ext 2.6.1 doesn't say much on the subject.
>>
>>
>>
>> How, for example, do people deal with a situation where their C (C++) 
>> function opens a file and then receives a signal or  longjump-s on error(), 
>> how do they make sure the file is eventually closed?
>
> The finalizer code that Luke mentioned is more easily accessible now
> than it was in 2004.  See the section on external pointers and weak
> references in the Writing R Extensions manual.
>
> The idea would be to create an external pointer object that controls the
> resource.  If there's an error, at the next GC the external pointer will
> be finalized and that's where the cleanup can happen.
>
> Duncan Murdoch
>
>>
>>
>> Thanks,
>>
>> Vadim
>>
>>
>>
>> On Mon, 14 Jun 2004, Vadim Ogranovich wrote:
>>
>>> This is disappointing. How on Earth can mkChar know when it is safe or
>>> not to make a long jump? For example if I just opened a file how am I
>>> supposed to close it after the long jump? I am not even talking about
>>> C++ where long jumps are simply devastating... (and this is the language
>>> I am coding in :-( )
>>>
>>> Ok. A practical question: is it possible to somehow block
>>> R_CheckUserInterrupt? I am ready to put up with out-of-memory errors,
>>> but Ctrl-C is too common to be ignored.
>>
>> Interrupts are not the issue. The issue is making sure that cleanup
>> actions occur even if there is a non-local exit. A solution that
>> addresses that issue will work for any non-local exit, whether it
>> comes from an interrupt or an exception. So you don't have to put up
>> with anything if you approach this the right way.
>>
>> Currently there is no user accessible C level try/finally mechanism
>> for ensuring that cleanup code is executed during a non-local exit.
>> We should make such a mechanism available; maybe one will make it into
>> the next major release.
>>
>> For now you have two choices:
>>
>> You can create an R level object and attach a finalizer to the object
>> that will arrange for the GC to close the file at some point in the
>> future if a non-local exit occurs. Search developer.r-project.org for
>> finalization and weak references for some info on this.
>>
>> One other option is to use the R_ToplevelExec function. This has some
>> drawbacks since it effectively makes invisible all other error
>> handlers, but it is an option. It is also not officially documented
>> and subject to change.
>>
>>> And I think it makes relevant again the question I asked in another
>>> related thread: how is memory allocated by Calloc() and R_alloc() stand
>>> up against long jumps?
>>
>> R_alloc is stack-based; the stack is unwound on a non-local exit, so
>> this is released on regular exits and non-local ones. It uses R
>> allocation, so it could itself cause a non-local exit.
>>
>> Calloc is like calloc but will never return NULL. If the allocation
>> fails, then an error is signaled, which will result in a non-local
>> exit. If th

Re: [Rd] HOW TO AVOID LOOPS

2008-04-14 Thread Bill Dunlap
On Mon, 14 Apr 2008, Stephen Milborrow wrote:

> > On Sat, 12 Apr at 12:47, carlos martinez wrote:
> > Looking for a simple, effective, minimum-execution-time solution.
> >
> > For a vector as:
> >
> > c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
> >
> > To transform it to the following vector without using any loops:
> >
> > (0,0,1,0,1,2,3,0,0,1,2,0,1,0,1,2,3,4,5,6)
>
> Here is a fast solution using the Ra just-in-time compiler
> www.milbo.users.sonic.net/ra.
>
> jit(1)
> if (length(x) > 1)
> for (i in 2:length(x))
> if (x[i])
> x[i] <- x[i-1] + 1
>
> The times in seconds for various solutions mailed to r-devel are listed
> below. There is some variation between runs and with the contents of x. The
> times shown are for
>
> set.seed(1066);  x <- as.double(runif(1e6) > .5)
>
> This was tested on a WinXP 3 GHz Pentium D with Ra 1.0.7 (based on R 2.6.2).
> The code to generate these results is attached.
>
> vin   24
> greg  11
> had    3.9
> dan    1.4
> dan2   1.4
> jit    0.25   # code is shown above, 7 secs with standard R 2.6.2

Stephen's solution is certainly easy to read and write.

Another solution, if I understand the scope of the problem, is
   f7 <- function(x){ tmp<-cumsum(x);tmp-cummax((!x)*tmp)}
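Worked through on the example vector from the original post, the idea is: cumsum() counts the ones seen so far, and cummax((!x)*tmp) remembers that count as of the most recent zero, so the difference restarts at every zero:

x   <- c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
tmp <- cumsum(x)             # running count of ones
tmp - cummax((!x) * tmp)     # subtract the count at the latest zero
## [1] 0 0 1 0 1 2 3 0 0 1 2 0 1 0 1 2 3 4 5 6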

I made a script to run all the functions I noticed
(except for the library(inline) one) on various 0-1
vectors of length one million and got the following
timings on R 2.6.2 running on my Windows laptop
(Lenovo T61, Core 2 Duo at 2 GHz).   "error thrown"
means the function call died and "incorrect" means it
returned the wrong answer.

   > source("z:/dumpdata.R")
   Timing stopped at: 0.05 0 0.05 NA NA
   Error in dots[[1L]][[1L]] : subscript out of bounds
   Timing stopped at: 0.14 0.06 0.21 NA NA
   Error in x[x == 1] <- unlist(lapply(ends - starts, function(n) 1:n)) :
 incompatible types (from NULL to double) in subassignment type fix
           all.ones     all.zeros   few.long    many.short
   f1          7.02          7.03       7.04          7.03
   f2          0.13          0.13       0.13          2.52
   f3  error thrown         35.47  incorrect     incorrect
   f4          0.19  error thrown       0.21          1.20
   f5          0.28          0.09       0.29          0.18
   f6          5.40          0.78       5.42          3.14
   f7          0.06          0.05       0.06          0.06

I've attached the script so you can figure out whose
function is whose if you care to.  The lapply/mapply
solution, f3, required that there be 1's at both ends of the
input vector.  Perhaps I miscopied the code.


Bill Dunlap
Insightful Corporation
bill at insightful dot com

 "All statements in this message represent the opinions of the author and do
 not necessarily reflect Insightful Corporation policy or position."


The test script:

len <- 1e6 # length of vectors in tests

funs <- list(
`f1` = function(x)Reduce( function(x,y) x*y + y, x, accumulate=TRUE ),

`f2` = function(x)x * unlist(lapply(rle(x)$lengths, seq_len)),

`f3` = function(x){
ind <- which(x == 0)
unlist(lapply(mapply(seq, ind, c(tail(ind, -1) - 1, length(x))),
function(y) cumsum(x[y]))) },

`f4` = function(x) {
d <- diff(c(0,x,0))
starts <- which(d == 1)
ends <- which(d == -1)
x[x == 1] <- unlist(lapply(ends - starts, function(n) 1:n))
x },

`f5` = function(x) {
if (existsFunction("jit")) jit(1) else stop("no jit available")
if (length(x) > 1)
for (i in 2:length(x))
if (x[i])
x[i] <- x[i-1] + 1
x
},
`f6` = # same as f5, but not compiled with jit.
function(x) {
if (existsFunction("jit")) jit(0)
if (length(x) > 1)
for (i in 2:length(x))
if (x[i])
x[i] <- x[i-1] + 1
x
},
`f7` =
function(x) {
tmp<-cumsum(x)
tmp-cummax((!x)*tmp)
}
)

data <- list(
all.ones = rep(1, len),
all.zeros = rep(0, len),
few.long = rep( c(rep(1,len/10-1),0), len=len), # 10 long runs
many.short = rep(c(1,0), len=len) # len/2 runs of length 1
)
expected <- list(
all.ones = 1:len,
all.zeros = rep(0, len),
few.long = rep( c(1:(len/10-1), 0), len=len),
many.short = rep(c(1,0), len=len)
)

print(noquote(sapply(names(data),
   function(d) sapply(names(funs),
  function(f){
  time<-try(unix.time(gcFirst=TRUE, val<-funs[[f]](data[[d]])))
  if (is(time, "try-error"))
  "error thrown"
  else if (!isTRUE(report <- all.equal(val, expected[[d]])))
  "incorrect"
  else
  sprintf("%7.2f", unname(time[1]+time[2]))
  }
   )
)))

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] clean-up actions after non-local exits

2008-04-14 Thread Duncan Murdoch
On 14/04/2008 5:10 PM, Vadim Organovich wrote:
> This is good, thanks!
> 
> I'd like to be able to make sure that the resource is released in a controlled 
> fashion rather than at some arbitrary gc() call time. Will the following trick 
> achieve the goal:
> 
> foo <-function(whatever) {
>  on.exit(gc())
>  ## arrange for an external pointer, don't know how yet
>  ...
>  ## actual call
>  .Call(whatever)
> }

Someone else more familiar with the memory manager will have to say 
whether this will work.  One thing is that objects bound to names within 
foo are still bound at the time of the gc() call, so they won't get 
collected, but variables that are not bound to anything should.  So I 
think if your C code is something like

PROTECT( ptr = R_MakeExternalPtr(...) )
R_RegisterFinalizer(ptr, ...)


UNPROTECT(1)

your scheme should work.

Duncan Murdoch
> 
> The idea is to have gc() called on exit.
> 
> I seem to recall that a call to gc() doesn't guarantee that all possibly 
> collectable objects are actually collected, which will defeat my solution. Is 
> that correct?
> 
> Thanks,
> Vadim
> 
> 
> From: Duncan Murdoch [EMAIL PROTECTED]
> Sent: Monday, April 14, 2008 3:53 PM
> To: Vadim Organovich
> Cc: r-devel@r-project.org
> Subject: Re: [Rd] clean-up actions after non-local exits
> 
> On 14/04/2008 4:33 PM, Vadim Organovich wrote:
>> Dear R-devel,
>>
>>
>>
>> Some time ago I started a thread that boiled down to clean-up actions after 
>> non-local exits in R, see below. I wonder if there has been any progress on 
>> this? R-ext 2.6.1 doesn't say much on the subject.
>>
>>
>>
>> How, for example, do people deal with a situation where their C (C++) 
>> function opens a file and then receives a signal or  longjump-s on error(), 
>> how do they make sure the file is eventually closed?
> 
> The finalizer code that Luke mentioned is more easily accessible now
> than it was in 2004.  See the section on external pointers and weak
> references in the Writing R Extensions manual.
> 
> The idea would be to create an external pointer object that controls the
> resource.  If there's an error, at the next GC the external pointer will
> be finalized and that's where the cleanup can happen.
> 
> Duncan Murdoch
> 
>>
>> Thanks,
>>
>> Vadim
>>
>>
>>
>> On Mon, 14 Jun 2004, Vadim Ogranovich wrote:
>>
>>> This is disappointing. How on Earth can mkChar know when it is safe or
>>> not to make a long jump? For example if I just opened a file how am I
>>> supposed to close it after the long jump? I am not even talking about
>>> C++ where long jumps are simply devastating... (and this is the language
>>> I am coding in :-( )
>>>
>>> Ok. A practical question: is it possible to somehow block
>>> R_CheckUserInterrupt? I am ready to put up with out-of-memory errors,
>>> but Ctrl-C is too common to be ignored.
>> Interrupts are not the issue. The issue is making sure that cleanup
>> actions occur even if there is a non-local exit. A solution that
>> addresses that issue will work for any non-local exit, whether it
>> comes from an interrupt or an exception. So you don't have to put up
>> with anything if you approach this the right way.
>>
>> Currently there is no user accessible C level try/finally mechanism
>> for ensuring that cleanup code is executed during a non-local exit.
>> We should make such a mechanism available; maybe one will make it into
>> the next major release.
>>
>> For now you have two choices:
>>
>> You can create an R level object and attach a finalizer to the object
>> that will arrange for the GC to close the file at some point in the
>> future if a non-local exit occurs. Search developer.r-project.org for
>> finalization and weak references for some info on this.
>>
>> One other option is to use the R_ToplevelExec function. This has some
>> drawbacks since it effectively makes invisible all other error
>> handlers, but it is an option. It is also not officially documented
>> and subject to change.
>>
>>> And I think it makes relevant again the question I asked in another
>>> related thread: how is memory allocated by Calloc() and R_alloc() stand
>>> up against long jumps?
>> R_alloc is stack-based; the stack is unwound on a non-local exit, so
>> this is released on regular exits and non-local ones. It uses R
>> allocation, so it could itself cause a non-local exit.
>>
>> Calloc is like calloc but will never return NULL. If the allocation
>> fails, then an error is signaled, which will result in a non-local
>> exit. If the allocation succeeds, you are responsible for calling
>> Free.
>>
>> luke
>>
 -Original Message-
 From: Luke Tierney [mailto:[EMAIL PROTECTED]]
 Sent: Monday, June 14, 2004 5:43 PM
 To: Vadim Ogranovich
 Cc: R-Help
 Subject: RE: [R] mkChar can be interrupted

 On Mon, 14 Jun 2004, Vadim Ogranovich wrote:

>>

Re: [Rd] HOW TO AVOID LOOPS

2008-04-14 Thread Simon Urbanek

On Apr 14, 2008, at 4:22 PM, Stephen Milborrow wrote:

>> On Sat, 12 Apr at 12:47, carlos martinez wrote:
>> Looking for a simple, effective, minimum-execution-time solution.
>>
>> For a vector as:
>>
>> c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
>>
>> To transform it to the following vector without using any loops:
>>
>> (0,0,1,0,1,2,3,0,0,1,2,0,1,0,1,2,3,4,5,6)
>
> Here is a fast solution using the Ra just-in-time compiler
> www.milbo.users.sonic.net/ra.
>
> jit(1)
> if (length(x) > 1)
>   for (i in 2:length(x))
>   if (x[i])
>   x[i] <- x[i-1] + 1
>
> The times in seconds for various solutions mailed to r-devel are  
> listed
> below. There is some variation between runs and with the contents of  
> x. The
> times shown are for
>
> set.seed(1066);  x <- as.double(runif(1e6) > .5)
>
> This was tested on a WinXP 3 GHz Pentium D with Ra 1.0.7 (based on R  
> 2.6.2).
> The code to generate these results is attached.
>

Well, if you want to break the rules, you may as well do it properly ;).

library(inline)
f = cfunction(signature(n="integer", x="numeric"),
"for(int i = 1; i < *n; i++) if (x[i]) x[i] = x[i-1] + 1;",
convention=".C")

f(length(x), x)
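One small usage note (my reading of the .C convention, so treat it as an assumption rather than part of the original post): the C code works on a copy of its arguments, and the modified vector comes back in the list returned by the generated wrapper.

x   <- as.double(runif(1e6) > .5)
res <- f(length(x), x)$x   # the transformed vector; x itself is unchanged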

inline 0.03s
pure 2.7s
hadley 4.5s

(I couldn't measure Ra reliably - I was getting times around 2s which  
seems inappropriate - Stephen, how did you measure it?).

Cheers,
S




> vin   24
> greg  11
> had    3.9
> dan    1.4
> dan2   1.4
> jit    0.25   # code is shown above, 7 secs with standard R 2.6.2
>
> Stephen Milborrow
> www.milbo.users.sonic.net
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] clean-up actions after non-local exits

2008-04-14 Thread Vadim Organovich
This is good, thanks!

I'd like to be able to make sure that the resource is released in a controlled 
fashion rather than at some arbitrary gc() call time. Will the following trick 
achieve the goal:

foo <-function(whatever) {
 on.exit(gc())
 ## arrange for an external pointer, don't know how yet
 ...
 ## actual call
 .Call(whatever)
}

The idea is to have gc() called on exit.

I seem to recall that a call to gc() doesn't guarantee that all possibly 
collectable objects are actually collected, which will defeat my solution. Is 
that correct?

Thanks,
Vadim


From: Duncan Murdoch [EMAIL PROTECTED]
Sent: Monday, April 14, 2008 3:53 PM
To: Vadim Organovich
Cc: r-devel@r-project.org
Subject: Re: [Rd] clean-up actions after non-local exits

On 14/04/2008 4:33 PM, Vadim Organovich wrote:
> Dear R-devel,
>
>
>
> Some time ago I started a thread that boiled down to clean-up actions after 
> non-local exits in R, see below. I wonder if there has been any progress on 
> this? R-ext 2.6.1 doesn't say much on the subject.
>
>
>
> How, for example, do people deal with a situation where their C (C++) 
> function opens a file and then receives a signal or  longjump-s on error(), 
> how do they make sure the file is eventually closed?

The finalizer code that Luke mentioned is more easily accessible now
than it was in 2004.  See the section on external pointers and weak
references in the Writing R Extensions manual.

The idea would be to create an external pointer object that controls the
resource.  If there's an error, at the next GC the external pointer will
be finalized and that's where the cleanup can happen.

Duncan Murdoch

>
>
> Thanks,
>
> Vadim
>
>
>
> On Mon, 14 Jun 2004, Vadim Ogranovich wrote:
>
>> This is disappointing. How on Earth can mkChar know when it is safe or
>> not to make a long jump? For example if I just opened a file how am I
>> supposed to close it after the long jump? I am not even talking about
>> C++ where long jumps are simply devastating... (and this is the language
>> I am coding in :-( )
>>
>> Ok. A practical question: is it possible to somehow block
>> R_CheckUserInterrupt? I am ready to put up with out-of-memory errors,
>> but Ctrl-C is too common to be ignored.
>
> Interrupts are not the issue. The issue is making sure that cleanup
> actions occur even if there is a non-local exit. A solution that
> addresses that issue will work for any non-local exit, whether it
> comes from an interrupt or an exception. So you don't have to put up
> with anything if you approach this the right way.
>
> Currently there is no user accessible C level try/finally mechanism
> for ensuring that cleanup code is executed during a non-local exit.
> We should make such a mechanism available; maybe one will make it into
> the next major release.
>
> For now you have two choices:
>
> You can create an R level object and attach a finalizer to the object
> that will arrange for the GC to close the file at some point in the
> future if a non-local exit occurs. Search developer.r-project.org for
> finalization and weak references for some info on this.
>
> One other option is to use the R_ToplevelExec function. This has some
> drawbacks since it effectively makes invisible all other error
> handlers, but it is an option. It is also not officially documented
> and subject to change.
>
>> And I think it makes relevant again the question I asked in another
>> related thread: how is memory allocated by Calloc() and R_alloc() stand
>> up against long jumps?
>
> R_alloc is stack-based; the stack is unwound on a non-local exit, so
> this is released on regular exits and non-local ones. It uses R
> allocation, so it could itself cause a non-local exit.
>
> Calloc is like calloc but will never return NULL. If the allocation
> fails, then an error is signaled, which will result in a non-local
> exit. If the allocation succeeds, you are responsible for calling
> Free.
>
> luke
>
>>> -Original Message-
>>> From: Luke Tierney [mailto:[EMAIL PROTECTED]]
>>> Sent: Monday, June 14, 2004 5:43 PM
>>> To: Vadim Ogranovich
>>> Cc: R-Help
>>> Subject: RE: [R] mkChar can be interrupted
>>>
>>> On Mon, 14 Jun 2004, Vadim Ogranovich wrote:
>>>
 I am confused. Here is an excerpt from R-exts:

 "As from R 1.8.0 no port of R can be interrupted whilst
>>> running long
 computations in compiled code,..."

 Doesn't it imply that the primitive functions like allocVector,
 mkChar, etc., which are likely to occur in any compiled code called
 via .Call, are not supposed to handle interrupts in any way?
>>> No it does not. Read the full context. It says that if you
>>> write a piece of C code that may run a long time and you want
>>> to guarantee that users will be able to interrupt your code
>>> then you should ensure that R_CheckUserInterrupt is called
>>> periodica

Re: [Rd] promptFunctions() to handle multiple names

2008-04-14 Thread John Chambers
Daniel Sabanés Bové wrote:
> Hi John,
>
> thanks for the pointer to promptAll()! Judging by the description
> of your function, it seems to be exactly what I've been searching for.
>
> Well, this is yet another reason to be looking forward to your new book.
> (When will it be in the stores?)
According to my last mail from Springer, should be sometime in May.

John
>
> Daniel
>
>
> John Chambers schrieb:
>> Daniel,
>>
>> Check out the promptAll() function in the SoDA package on CRAN.  
>> (Because it was written as an example for my new book, it's not the 
>> fanciest imaginable, but seems to work OK.)
>>
>> John
>>
>>
>> Daniel Sabanés Bové wrote:
>> Hi all,
>>
>> I wanted to set up my first (private) R-package and wondered
>> if there was a function to prompt() for multiple aliases in one Rd-file,
>> e.g. to create something like the normal distribution manual page
>> encompassing rnorm, dnorm,...
>>
>> As I didn't find it, I modified prompt.default() and wrote a small 
>> function
>> to do this job, called "promptFunctions". It basically calls the helper
>> ".promptFunction" for every name it gets and puts together the output
>> from each function.
>>
>> It would be interesting for me if such a function already existed in R
>> or if something like "promptFunction" could be included in any future 
>> R version.
>> I think it would get used, as many man pages document several functions
>> at once, and cutting and pasting the individual prompt() files by hand
>> could be boring.
>>
>> regards,
>> Daniel
>>
>> The Code:
>>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] clean-up actions after non-local exits

2008-04-14 Thread Duncan Murdoch
On 14/04/2008 4:33 PM, Vadim Organovich wrote:
> Dear R-devel,
> 
> 
> 
> Some time ago I started a thread that boiled down to clean-up actions after 
> non-local exits in R, see below. I wonder if there has been any progress on 
> this? R-ext 2.6.1 doesn't say much on the subject.
> 
> 
> 
> How, for example, do people deal with a situation where their C (C++) 
> function opens a file and then receives a signal or  longjump-s on error(), 
> how do they make sure the file is eventually closed?

The finalizer code that Luke mentioned is more easily accessible now 
than it was in 2004.  See the section on external pointers and weak 
references in the Writing R Extensions manual.

The idea would be to create an external pointer object that controls the 
resource.  If there's an error, at the next GC the external pointer will 
be finalized and that's where the cleanup can happen.

Duncan Murdoch
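An R-level analogue of that idea, using an environment as the handle and reg.finalizer() (a sketch with hypothetical helper names; the C-level route registers the finalizer on the external pointer itself, as the manual describes):

open_resource <- function(path) {
    h <- new.env(parent = emptyenv())
    h$con <- file(path, open = "r")
    reg.finalizer(h, function(e) try(close(e$con), silent = TRUE))
    h
}
## If an error or interrupt abandons the handle, the connection is closed
## at some later garbage collection instead of leaking for the session.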

> 
> 
> Thanks,
> 
> Vadim
> 
> 
> 
> On Mon, 14 Jun 2004, Vadim Ogranovich wrote:
> 
>> This is disappointing. How on Earth can mkChar know when it is safe or
>> not to make a long jump? For example if I just opened a file how am I
>> supposed to close it after the long jump? I am not even talking about
>> C++ where long jumps are simply devastating... (and this is the language
>> I am coding in :-( )
>>
>> Ok. A practical question: is it possible to somehow block
>> R_CheckUserInterrupt? I am ready to put up with out-of-memory errors,
>> but Ctrl-C is too common to be ignored.
> 
> Interrupts are not the issue. The issue is making sure that cleanup
> actions occur even if there is a non-local exit. A solution that
> addresses that issue will work for any non-local exit, whether it
> comes from an interrupt or an exception. So you don't have to put up
> with anything if you approach this the right way.
> 
> Currently there is no user accessible C level try/finally mechanism
> for ensuring that cleanup code is executed during a non-local exit.
> We should make such a mechanism available; maybe one will make it into
> the next major release.
> 
> For now you have two choices:
> 
> You can create an R level object and attach a finalizer to the object
> that will arrange for the GC to close the file at some point in the
> future if a non-local exit occurs. Search developer.r-project.org for
> finalization and weak references for some info on this.
> 
> One other option is to use the R_ToplevelExec function. This has some
> drawbacks since it effectively makes invisible all other error
> handlers, but it is an option. It is also not officially documented
> and subject to change.
> 
>> And I think it makes relevant again the question I asked in another
>> related thread: how is memory allocated by Calloc() and R_alloc() stand
>> up against long jumps?
> 
> R_alloc is stack-based; the stack is unwound on a non-local exit, so
> this is released on regular exits and non-local ones. It uses R
> allocation, so it could itself cause a non-local exit.
> 
> Calloc is like calloc but will never return NULL. If the allocation
> fails, then an error is signaled, which will result in a non-local
> exit. If the allocation succeeds, you are responsible for calling
> Free.
> 
> luke
> 
>>> -Original Message-
>>> From: Luke Tierney [mailto:[EMAIL PROTECTED]]
>>> Sent: Monday, June 14, 2004 5:43 PM
>>> To: Vadim Ogranovich
>>> Cc: R-Help
>>> Subject: RE: [R] mkChar can be interrupted
>>>
>>> On Mon, 14 Jun 2004, Vadim Ogranovich wrote:
>>>
 I am confused. Here is an excerpt from R-exts:

 "As from R 1.8.0 no port of R can be interrupted whilst
>>> running long
 computations in compiled code,..."

 Doesn't it imply that the primitive functions like allocVector,
 mkChar, etc., which are likely to occur in any compiled code called
 via .Call, are not supposed to handle interrupts in any way?
>>> No it does not. Read the full context. It says that if you
>>> write a piece of C code that may run a long time and you want
>>> to guarantee that users will be able to interrupt your code
>>> then you should ensure that R_CheckUserInterrupt is called
>>> periodically. If your code already periodically calls other
>>> R code that checks for interrupts then you may not need to do
>>> this yourself, but in general you do.
>>>
>>> Prior to 1.8.0 on Unix-like systems the asynchronous signal
>>> handler for SIGINT would longjmp to the nearest top level or
>>> browser context, which meant that on these systems any code
>>> was interruptible at any point unless it was explicitly
>>> protected by a construct that suspended interrupts. Allowing
>>> interrupts at any point meant that inopportune interrupts
>>> could and did crash R, which is why this was changed.
>>>
>>> Unless there is explicit documentation to the contrary you
>>> should assume that every function in the R API might allocate
>>> and might cause a no

[Rd] clean-up actions after non-local exits

2008-04-14 Thread Vadim Organovich
Dear R-devel,



Some time ago I started a thread that boiled down to clean-up actions after 
non-local exits in R, see below. I wonder if there has been any progress on 
this? R-ext 2.6.1 doesn't say much on the subject.



How, for example, do people deal with a situation where their C (C++) function 
opens a file and then receives a signal or  longjump-s on error(), how do they 
make sure the file is eventually closed?



Thanks,

Vadim



On Mon, 14 Jun 2004, Vadim Ogranovich wrote:

> This is disappointing. How on Earth can mkChar know when it is safe or
> not to make a long jump? For example if I just opened a file how am I
> supposed to close it after the long jump? I am not even talking about
> C++ where long jumps are simply devastating... (and this is the language
> I am coding in :-( )
>
> Ok. A practical question: is it possible to somehow block
> R_CheckUserInterrupt? I am ready to put up with out-of-memory errors,
> but Ctrl-C is too common to be ignored.

Interrupts are not the issue. The issue is making sure that cleanup
actions occur even if there is a non-local exit. A solution that
addresses that issue will work for any non-local exit, whether it
comes from an interrupt or an exception. So you don't have to put up
with anything if you approach this the right way.

Currently there is no user accessible C level try/finally mechanism
for ensuring that cleanup code is executed during a non-local exit.
We should make such a mechanism available; maybe one will make it into
the next major release.

For now you have two choices:

You can create an R level object and attach a finalizer to the object
that will arrange for the GC to close the file at some point in the
future if a non-local exit occurs. Search developer.r-project.org for
finalization and weak references for some info on this.

One other option is to use the R_ToplevelExec function. This has some
drawbacks since it effectively makes invisible all other error
handlers, but it is an option. It is also not officially documented
and subject to change.

> And I think it makes relevant again the question I asked in another
> related thread: how is memory allocated by Calloc() and R_alloc() stand
> up against long jumps?

R_alloc is stack-based; the stack is unwound on a non-local exit, so
this is released on regular exits and non-local ones. It uses R
allocation, so it could itself cause a non-local exit.

Calloc is like calloc but will never return NULL. If the allocation
fails, then an error is signaled, which will result in a non-local
exit. If the allocation succeeds, you are responsible for calling
Free.

luke

> > -Original Message-
> > From: Luke Tierney [mailto:[EMAIL PROTECTED]]
> > Sent: Monday, June 14, 2004 5:43 PM
> > To: Vadim Ogranovich
> > Cc: R-Help
> > Subject: RE: [R] mkChar can be interrupted
> >
> > On Mon, 14 Jun 2004, Vadim Ogranovich wrote:
> >
> > > I am confused. Here is an excerpt from R-exts:
> > >
> > > "As from R 1.8.0 no port of R can be interrupted whilst
> > running long
> > > computations in compiled code,..."
> > >
> > > Doesn't it imply that the primitive functions like allocVector,
> > > mkChar, etc., which are likely to occur in any compiled code called
> > > via .Call, are not supposed to handle interrupts in any way?
> >
> > No it does not. Read the full context. It says that if you
> > write a piece of C code that may run a long time and you want
> > to guarantee that users will be able to interrupt your code
> > then you should ensure that R_CheckUserInterrupt is called
> > periodically. If your code already periodically calls other
> > R code that checks for interrupts then you may not need to do
> > this yourself, but in general you do.
> >
> > Prior to 1.8.0 on Unix-like systems the asynchronous signal
> > handler for SIGINT would longjmp to the nearest top level or
> > browser context, which meant that on these systems any code
> > was interruptible at any point unless it was explicitly
> > protected by a construct that suspended interrupts. Allowing
> > interrupts at any point meant that inopportune interrupts
> > could and did crash R, which is why this was changed.
> >
> > Unless there is explicit documentation to the contrary you
> > should assume that every function in the R API might allocate
> > and might cause a non-local exit (i.e. a longjmp) when an
> > exception is raised (and an interrupt is one of, but only one
> > of, the exceptions that might occur).
> >
> > luke
> >
> > > Thanks,
> > > Vadim
> > >
> > >
> > > > From: Luke Tierney [mailto:[EMAIL PROTECTED]]
> > > >
> > > > On Mon, 14 Jun 2004, Vadim Ogranovich wrote:
> > > >
> > > > > > From: Luke Tierney [mailto:[EMAIL PROTECTED]]
> > > ...
> > > > > >
> >

Re: [Rd] HOW TO AVOID LOOPS

2008-04-14 Thread Stephen Milborrow

On Sat, 12 Apr at 12:47, carlos martinez wrote:
Looking for a simple, effective, minimum-execution-time solution.

For a vector as:

c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)

To transform it to the following vector without using any loops:

(0,0,1,0,1,2,3,0,0,1,2,0,1,0,1,2,3,4,5,6)


Here is a fast solution using the Ra just-in-time compiler
www.milbo.users.sonic.net/ra.

jit(1)
if (length(x) > 1)
   for (i in 2:length(x))
   if (x[i])
   x[i] <- x[i-1] + 1

The times in seconds for various solutions mailed to r-devel are listed
below. There is some variation between runs and with the contents of x. The
times shown are for

set.seed(1066);  x <- as.double(runif(1e6) > .5)

This was tested on a WinXP 3 GHz Pentium D with Ra 1.0.7 (based on R 2.6.2).
The code to generate these results is attached.

vin   24
greg  11
had    3.9
dan    1.4
dan2   1.4
jit    0.25   # code is shown above, 7 secs with standard R 2.6.2

Stephen Milborrow
www.milbo.users.sonic.net

# cm-post.R: compare solutions to the following post to
#r-devel from carlos martinez 12 apr 2008:
# Looking for a simple, effective, minimum-execution-time solution.
# For a vector as:
# c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
# To transform it to the following vector without using any loops:
# c(0,0,1,0,1,2,3,0,0,1,2,0,1,0,1,2,3,4,5,6)

set.seed(1066) # for reproducibility
N <- 1e6
x <- as.double(runif(N) > .5)
x[1] <- 0  # seems to be needed for fhad (and fvin?)

fvin <- function(x) {
   ind <- which(x == 0)
   unlist(lapply(mapply(seq, ind, c(tail(ind, -1) - 1, length(x))),
 function(y) cumsum(x[y])))
}
fdan <- function(x) {
d <- diff(c(0,x,0))
starts <- which(d == 1)
ends <- which(d == -1)
x[x == 1] <- unlist(lapply(ends - starts, function(n) 1:n))
x
}
fdan2 <- function(x) {
   runs <- rle(x)
   runlengths <- runs$lengths[runs$values == 1]
   x[x == 1] <- unlist(lapply(runlengths, function(n) 1:n))
   x
}
fhad <- function(x)
   unlist(lapply(split(x, cumsum(x == 0)), seq_along)) - 1

# following requires "ra" for fast times www.milbo.users.sonic.net/ra
library(jit)
fjit <- function(x) { 
   jit(1)

   if (length(x) > 1)
   for (i in 2:length(x))
   if (x[i])
   x[i] <- x[i-1] + 1
   x
}
fgreg <- function(x)
  Reduce( function(x,y) x*y + y, x, accumulate=TRUE )

fanon <- function(x)
x * unlist(lapply(rle(x)$lengths, seq))

cat("times with N =", N, "\n")
cat("dan",  system.time(ydan  <- fdan(x))[3],  "\n")
cat("dan2", system.time(ydan2 <- fdan2(x))[3], "\n")
cat("had",  system.time(yhad  <- fhad(x))[3],  "\n")
cat("vin",  system.time(yvin  <- fvin(x))[3],  "\n")
cat("jit",  system.time(yjit  <- fjit(x))[3],  "\n")
cat("greg", system.time(ygreg <- fgreg(x))[3], "\n")
# very slow cat("anon", system.time(yanon <- fanon(x))[3], "\n")

stopifnot(identical(ydan2, ydan))
stopifnot(identical(as.numeric(yhad), ydan))
stopifnot(identical(yvin, ydan))
stopifnot(identical(yjit, ydan))
stopifnot(identical(ygreg, ydan))
# stopifnot(identical(yanon, ydan))
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] option(expressions) and --max-ppsize

2008-04-14 Thread Tobias Verbeke
Dear list,

Is there an exact formula / safe rule of thumb that allows
one to express the value of --max-ppsize as a function of
the value of getOption("expressions") ?

?options tells "If you increase it [the expressions option],
you may also want to start R with a larger protection stack".

Motivation is to determine stack size of a Java vm used
to launch R with a given --max-ppsize option.

Many thanks in advance,
Tobias

-- 

Tobias Verbeke - Consultant
Business & Decision Benelux
Rue de la révolution 8
1000 Brussels - BELGIUM

+32 499 36 33 15
[EMAIL PROTECTED]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] unix.time() scoping problem (PR#11169)

2008-04-14 Thread bill
Full_Name: Bill Dunlap
Version: 2.8.0 Under development (unstable) svn 45325
OS: Linux
Submission from: (NULL) (76.28.245.14)


It is difficult to write wrapper functions
for unix.time(expr) because it uses the idiom
expr <- substitute(expr)
eval(expr, envir=sys.parent())
to evaluate the input expression.  Here is an
example of the problem
> elapsed.time<-function(...)unix.time(...)[3]
> sapply(1:3, function(seconds.arg)elapsed.time(Sys.sleep(seconds.arg)))
Error in Sys.sleep(seconds.arg) : object "seconds.arg" not found
Timing stopped at: 0.002 0 0.002

I think that if unix.time(expr) made use of lazy evaluation of
arguments and used just
expr
at the point where it wanted expr to be evaluated,
then the evaluation would take place in the right frame.
If I make the following change
diff -c /tmp/unix.time.R /tmp/my.unix.time.R
*** /tmp/unix.time.R2008-04-14 12:22:39.0 -0700
--- /tmp/my.unix.time.R 2008-04-14 12:22:32.0 -0700
***
*** 10,23 
  }
  if (!exists("proc.time"))
  return(rep(NA_real_, 5))
- loc.frame <- parent.frame()
  if (gcFirst)
  gc(FALSE)
- expr <- substitute(expr)
  time <- proc.time()
  on.exit(cat("Timing stopped at:", ppt(proc.time() - time),
  "\n"))
! eval(expr, envir = loc.frame)
  new.time <- proc.time()
  on.exit()
  structure(new.time - time, class = "proc_time")
--- 10,21 
  }
  if (!exists("proc.time"))
  return(rep(NA_real_, 5))
  if (gcFirst)
  gc(FALSE)
  time <- proc.time()
  on.exit(cat("Timing stopped at:", ppt(proc.time() - time),
  "\n"))
! expr # evaluate in its original frame by lazy evaluation
  new.time <- proc.time()
  on.exit()
  structure(new.time - time, class = "proc_time")

then my wrapper function works as expected:
> sapply(1:3, function(seconds.arg)elapsed.time(Sys.sleep(seconds.arg)))
elapsed elapsed elapsed
  1.001   2.001   3.001

Another approach would be to find the environment that expr
came from and feed that into eval().  Is there a way to get
the environment that an unevaluated argument was created in?
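A standalone sketch of the lazy-evaluation idea, outside unix.time() itself (hypothetical helper names, not part of the proposed patch):

time_it <- function(expr) {      # 'expr' arrives as an unevaluated promise
    t0 <- proc.time()
    expr                         # forcing the promise evaluates it in the
                                 # caller's frame, wherever that was
    proc.time() - t0
}
elapsed <- function(...) time_it(...)[3]
sapply(1:3, function(s) elapsed(Sys.sleep(s)))   # no scoping error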

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] NAs and Infinitely Large POSIXct Objects

2008-04-14 Thread McGehee, Robert
Hello,
I got bungled up by the fact that an infinitely large POSIXct object is
displayed as NA, but is not, in fact, NA. While an infinitely large
POSIXct object seems strange, perhaps R should use the convention of
printing it as Inf rather than NA to avoid any confusion.

Cheers,
Robert

> x <- as.POSIXct(NA)
> z <- min(x, x, na.rm=TRUE)
Warning message:
In min.default(NA_real_, NA_real_, na.rm = TRUE) :
  no non-missing arguments to min; returning Inf
> z
[1] NA
> is.na(z)
[1] FALSE
> is.infinite(z)
[1] TRUE

> R.version
               _
platform       i686-pc-linux-gnu
arch           i686
os             linux-gnu
system         i686, linux-gnu
status
major          2
minor          6.2
year           2008
month          02
day            08
svn rev        44383
language       R
version.string R version 2.6.2 (2008-02-08)

Robert McGehee, CFA
Geode Capital Management, LLC
One Post Office Square, 28th Floor | Boston, MA | 02109
Tel: 617/392-8396    Fax: 617/476-6389
mailto:[EMAIL PROTECTED]




__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] promptFunctions() to handle multiple names

2008-04-14 Thread John Chambers
Daniel,

Check out the promptAll() function in the SoDA package on CRAN.  
(Because it was written as an example for my new book, it's not the 
fanciest imaginable, but seems to work OK.)

John


Daniel Sabanés Bové wrote:
>
> Hi all,
>
> I wanted to set up my first (private) R-package and wondered
> if there was a function to prompt() for multiple aliases in one Rd-file,
> e.g. to create something like the normal distribution manual page
> encompassing rnorm, dnorm,...
>
> As I didn't find it, I modified prompt.default() and wrote a small function
> to do this job, called "promptFunctions". It basically calls the helper
> ".promptFunction" for every name it gets and puts together the output
> from each function.
>
> It would be interesting for me if such a function already existed in R
> or if something like "promptFunction" could be included in any future R 
> version.
> I think it would get used, as many man pages document several functions at
> once, and cutting and pasting the individual prompt() files by hand could be
> boring.
>
> regards,
> Daniel
>
> The Code:
>
> ## modified prompt.default to handle multiple functions correctly
> promptFunctions <-
> function (...,               # objects to be documented
>           filename = NULL,   # file name string or NA for console
>           names = NULL,      # character vector of object names
>           rdname = NULL,     # name of the documentation
>           overwrite = FALSE  # overwrite existing Rd file?
>           )
> {
>     ## helper functions
>     paste0 <- function(...) paste(..., sep = "")
>     is.missing.arg <- function(arg) typeof(arg) == "symbol" &&
>         deparse(arg) == ""
>
>     ## generate additional names from objects
>     objects <- as.list (substitute (...[]))
>     objects <- objects[seq(from = 2, to = length(objects) - 1)]
>     objects <- sapply(objects, deparse)
>
>     ## merge with names from call and stop if there are no usable names
>     names <- unique(c(objects, names))
>     if (is.null(names))
>         stop ("cannot determine usable names")
>
>     ## determine Rd name
>     if(is.null(rdname))
>         rdname <- names[1]
>
>     ## determine file name
>     if (is.null(filename))
>         filename <- paste0(rdname, ".Rd")
>
>     ## treat each name individually
>     promptList <- lapply(names, .promptFunction)
>     names(promptList) <- names
>
>     ## construct text
>     Rdtxt <- list()
>
>     Rdtxt$name <- paste0("\\name{", rdname, "}")
>     Rdtxt$aliases <- c(paste0("\\alias{", names, "}"),
>                        paste("%- Also NEED an '\\alias' for EACH other topic",
>                              "documented here."))
>     Rdtxt$title <- "\\title{ ~~functions to do ... ~~ }"
>     Rdtxt$description <- c("\\description{",
>                            paste("  ~~ A concise (1-5 lines) description of what",
>                                  "the functions"),
>                            paste("", paste(names, collapse = ", "),
>                                  "do. ~~"),
>                            "}")
>     Rdtxt$usage <- c("\\usage{",
>                      unlist(lapply(promptList, "[[", "usage")),
>                      "}",
>                      paste("%- maybe also 'usage' for other objects",
>                            "documented here."))
>     arguments <- unique (unlist (lapply(promptList, "[[", "arg.n")))
>     Rdtxt$arguments <- if(length(arguments))
>         c("\\arguments{",
>           paste0("  \\item{", arguments, "}{",
>                  " ~~Describe \\code{", arguments, "} here~~ }"),
>           "}")
>     Rdtxt$details <- c("\\details{",
>                        paste("  ~~ If necessary, more details than the",
>                              "description above ~~"),
>                        "}")
>     Rdtxt$value <- c("\\value{",
>                      "  ~Describe the values returned",
>                      "  If it is a LIST, use",
>                      "  \\item{comp1 }{Description of 'comp1'}",
>                      "  \\item{comp2 }{Description of 'comp2'}",
>                      "  ...",
>                      "}")
>     Rdtxt$references <- paste("\\references{ ~put references to the",
>                               "literature/web site here ~ }")
>     Rdtxt$author <- "\\author{Daniel Saban\\'es Bov\\'e}"
>     Rdtxt$note <- c("\\note{ ~~further notes~~ ",
>                     "",
>                     paste(" ~Make other sections like Warning with",
>                           "\\section{Warning }{} ~"),
>                     "}")
>     Rdtxt$seealso <- paste("\\seealso{ ~~objects to See Also as",
>                            "\\code{\\link{help}}, ~~~ }")
>     Rdtxt$examples <- c("\\examples{",
>                         "## Should be DIRECTLY executable !! 

Re: [Rd] Doing the right amount of copy for large data frames.

2008-04-14 Thread Martin Morgan
Hi Gopi

"Gopi Goswami" <[EMAIL PROTECTED]> writes:

> Hi there,
>
>
> Problem ::
> When one tries to change one or some of the columns of a data.frame, R makes
> a copy of the whole data.frame using the '*tmp*' mechanism (this does not
> happen for components of a list, tracemem( ) on R-2.6.2 says so).
>
>
> Suggested solution ::
> Store the columns of the data.frame as a list inside of an environment slot
> of an S4 class, and define the '[', '[<-' etc. operators using setMethod( )
> and setReplaceMethod( ).

The Bioconductor package Biobase has a class 'ExpressionSet' with slot
assayData. By default assayData is an environment that is 'locked' so
can't be modified casually. The interface to ExpressionSet unlocks the
environment, and copies and modifies it when necessary. This is not
quite the same as you propose, but has some similar characteristics.

I've spent a lot of time with this data structure, and think this
borders on one of those ideas that 'seemed like a good idea at the
time'. You end up using R-level tools to manage memory. Copy-on-change
is better than you might naively think at not making unnecessary
copies. S4 carries significant overhead, including copies during method
dispatch, that work against you (subsetting an expression set in an
OOP way, no behind-the-scenes tricks, makes *5* copies of the S4
instance, though perhaps these are light-weight because the big data
is in an environment). And in the mean time computers have gotten
faster and bigger, and the 'big' data of ExpressionSets are now only
modestly sized or even small.

A somewhat different approach is in the Biostrings package, for
instance DNAStringSet, where the original object is 'read-only'. The
user is presented with a 'view' into the object; changing the view
(subsetting) changes the indices in the view but not the original
data. This is both fast and memory efficient. This is a read-only
solution, though.
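A toy sketch of that 'view' idea (purely illustrative names, nothing like the actual Biostrings internals): the data are stored once, and a view is just a pair of indices into them.

make_view <- function(data, start = 1L, end = length(data)) {
    list(data = data, start = start, end = end)   # indices only; data shared
}
narrow <- function(v, from, to) {                 # 'subsetting' adjusts indices
    make_view(v$data, v$start + from - 1L, v$start + to - 1L)
}
as_vector <- function(v) v$data[v$start:v$end]    # materialize only on demand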

Hope that helps, Martin

> Question ::
> This implementation will violate copy on modify principle of R (since
> environments are not copied), but will save a lot of memory. Do you see any
> other obvious problem(s) with the idea? Have you seen a related setup
> implemented / considered before (apart from the packages like filehash, ff,
> and database related ones for saving memory)?
>
>
> Implementation code snippet ::
> ### The S4 class.
> setClass('DataFrame',
>   representation(data = 'data.frame', nrow = 'numeric', ncol =
> 'numeric', store = 'environment'),
>   prototype(data = data.frame( ), nrow = 0, ncol = 0))
>
> setMethod('initialize', 'DataFrame', function(.Object) {
> .Object <- callNextMethod( )
> [EMAIL PROTECTED] <- new.env(hash = TRUE)
> assign('data', as.list([EMAIL PROTECTED]), [EMAIL PROTECTED])
> [EMAIL PROTECTED] <- nrow([EMAIL PROTECTED])
> [EMAIL PROTECTED] <- ncol([EMAIL PROTECTED])
> [EMAIL PROTECTED] <- data.frame( )
> .Object
> })
>
>
> ### Usage:
> nn  <- 10
> ## dd1 below could possibly be created by read.table or scan and data.frame
> dd1 <- data.frame(xx = rnorm(nn), yy = rnorm(nn))
> dd2 <- new('DataFrame', data = dd1)
> rm(dd1)
> ## Now work with dd2
>
>
> Thanks a lot,
> Gopi Goswami.
> PhD, Statistics, 2005
> http://gopi-goswami.net/index.html
>
>   [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] HOW TO AVOID LOOPS

2008-04-14 Thread Greg Snow
I would be interested to see how the following approach compares to the other 
suggestions:

> x <- c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
> test <- c(0,0,1,0,1,2,3,0,0,1,2,0,1,0,1,2,3,4,5,6)
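> ## each step computes acc*y + y: 0 whenever the element y is 0,
> ## and acc + 1 whenever it is 1 (acc is the accumulated value)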
> out <- Reduce( function(x,y) x*y + y, x, accumulate=TRUE )
> all.equal(out,test)
[1] TRUE

For the second question, you can do something like:

> test2 <- c(0,0,1,0,0,0,3,0,0,0,2,0,1,0,0,0,0,0,0,6)
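> ## keep each running count only where the next element is 0 (a run end);
> ## the appended 1 keeps the total of the final run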
> out2 <- out * c( out[-1]==0, 1 )
> all.equal(out2,test2)
[1] TRUE



-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of carlos martinez
> Sent: Saturday, April 12, 2008 7:33 PM
> To: r-devel@r-project.org
> Cc: [EMAIL PROTECTED]
> Subject: Re: [Rd] HOW TO AVOID LOOPS
> 
> Appreciate the ingenious and effective suggestions and feedback from:
> 
> Dan Davison
> Vincent Goulet
> Martin Morgan
> Hadley Wickham
> 
> The variety of technical approaches proposed so far is clear 
> proof of the strong and flexible capabilities of the R system, 
> and especially of the dynamism and technical understanding of the 
> R user base.
> 
> We tested all four recommendations with an input vector of 
> more than 85 components, and got response times ranging from 
> about 40 seconds down to 20 seconds.
> 
> All four approaches produced the desired vector. Hadley Wickham's 
> approach produced an extra vector, but the second vector 
> had the correct format.
> 
> Just one additional follow up, to obtain from the same input vector:
> c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
> 
> A vector of the following format:
> (0,0,1,0,0,0,3,0,0,0,2,0,1,0,0,0,0,0,0,6)
> 
> Would it be easier and more efficient to start from the original 
> input vector, or from the second vector above
> (0,0,1,0,1,2,3,0,0,1,2,0,1,0,1,2,3,4,5,6)
> 
> Thanks for your responses.
> 
> --
> ---
> Hadley Wickham Approach
> 
> How about:
> 
> unlist(lapply(split(x, cumsum(x == 0)), seq_along)) - 1
> 
> Hadley
> --
> 
> -Original Message-
> From: Martin Morgan [mailto:[EMAIL PROTECTED]
> Sent: Saturday, April 12, 2008 5:00 PM
> To: Dan Davison
> Cc: [EMAIL PROTECTED]
> Subject: Re: [Rd] HOW TO AVOID LOOPS
> 
> (anonymous 'off-list' response; some extra calcs but tidy)
> 
> > x=c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
> > x * unlist(lapply(rle(x)$lengths, seq))
>  [1] 0 0 1 0 1 2 3 0 0 1 2 0 1 0 1 2 3 4 5 6
> 
> 
> Dan Davison <[EMAIL PROTECTED]> writes:
> 
> > On Sat, Apr 12, 2008 at 06:45:00PM +0100, Dan Davison wrote:
> >> On Sat, Apr 12, 2008 at 01:30:13PM -0400, Vincent Goulet wrote:
> >> > Le sam. 12 avr. à 12:47, carlos martinez a écrit :
>> > >> Looking for a simple, effective, minimum-execution-time solution.
> >> > >>
> >> > >> For a vector as:
> >> > >>
> >> > >> c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
> >> > >>
> >> > > To transform it to the following vector without using 
> any loops:
> >> > >
> >> > >> (0,0,1,0,1,2,3,0,0,1,2,0,1,0,1,2,3,4,5,6)
> >> > >>
> >> > > Appreciate any suggetions.
> >> > 
> >> > This does it -- but it is admittedly ugly:
> >> > 
> >> >  > x <- c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)
> >> >  > ind <- which(x == 0)
> >> >  > unlist(lapply(mapply(seq, ind, c(tail(ind, -1) - 1, 
> length(x))),
> >> > function(y) cumsum(x[y])))
> >> >   [1] 0 0 1 0 1 2 3 0 0 1 2 0 1 0 1 2 3 4 5 6
> >> > 
> >> > (The mapply() part is used to create the indexes of each 
> sequence 
> >> > in x starting with a 0. The rest is then straightforward.)
> >> 
> >> 
> >> Here's my effort. Maybe a bit easier to digest? Only one *apply so
> probably more efficient.
> >> 
> >> function(x=c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)) {
> >> d <- diff(c(0,x,0))
> >> starts <- which(d == 1)
> >> ends <- which(d == -1)
> >> x[x == 1] <- unlist(lapply(ends - starts, function(n) 1:n))
> >> x
> >> }
> >> 
> >
> > Come to think of it, I suggest using the existing R function rle(), 
> > rather
> than my dodgy substitute.
> >
> > e.g.
> >
> > g <- function(x=c(0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,1,1,1,1,1)) {
> >
> > runs <- rle(x)
> > runlengths <- runs$lengths[runs$values == 1]
> > x[x == 1] <- unlist(lapply(runlengths, function(n) 1:n))
> > x
> > }
> >
> > Dan
> >
> > p.s. R-help would perhaps have been more appropriate than R-devel
> >
> >
> >> Dan
> >> 
> >> 
> >> > 
> >> > HTH
> >> > 
> >> > ---
> >> >Vincent Goulet, Associate Professor
> >> >École d'actuariat
> >> >Université Laval, Québec
> >> >[EMAIL PROTECTED]   http://vgoulet.act.ulaval.ca
> >> > 
> >> > __
> >> > R-devel@r-project.org mailing list
> >> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mai

Re: [Rd] Doing the right amount of copy for large data frames.

2008-04-14 Thread Tony Plate
Gopi Goswami wrote:
> Hi there,
>
>
> Problem ::
> When one tries to change one or some of the columns of a data.frame, R makes
> a copy of the whole data.frame using the '*tmp*' mechanism (this does not
> happen for components of a list, tracemem( ) on R-2.6.2 says so).
>
>
> Suggested solution ::
> Store the columns of the data.frame as a list inside of an environment slot
> of an S4 class, and define the '[', '[<-' etc. operators using setMethod( )
> and setReplaceMethod( ).
>
>
> Question ::
> This implementation will violate copy on modify principle of R (since
> environments are not copied), but will save a lot of memory. Do you see any
> other obvious problem(s) with the idea?
Well, because it violates the copy-on-modify principle it can 
potentially break code that depends on this principle.  I don't know how 
much such code there is -- did you try to see whether R and the recommended 
packages still pass their checks with this change in place?
>  Have you seen a related setup
> implemented / considered before (apart from the packages like filehash, ff,
> and database related ones for saving memory)?
>   
I've frequently used a personal package that stores array data in a file 
(like ff).  It works fine, and I partially get around the problem of 
violating the copy-on-modify principle by having a readonly flag in the 
object -- when the flag is set to allow modification I have to be 
careful, but after I set it to readonly I can use it more freely with 
the knowledge that if some function does attempt to modify the object, 
it will stop with an error.
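
A toy version of such a guard (hypothetical class and slot names, just to
show the shape of the idea):

   setClass("FileBacked", representation(store = "environment"))

   fileBacked <- function(x) {
     st <- new.env()
     st$data <- x
     st$readonly <- FALSE
     new("FileBacked", store = st)
   }

   setReplaceMethod("[", "FileBacked", function(x, i, j, ..., value) {
     if (isTRUE(x@store$readonly)) stop("object is read-only")
     x@store$data[i] <- value   # changed in place; the container is not copied
     x
   })

   fb <- fileBacked(rnorm(5))
   fb[1] <- 0                   # fine while the flag is off
   fb@store$readonly <- TRUE
   try(fb[2] <- 0)              # now stops with "object is read-only"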

In this particular case, why not just track down why data frame 
modification is copying the entire object and suggest a change so that 
it just copies the column being changed?  (should be possible if list 
modification doesn't copy all components).

-- Tony Plate
>
> Implementation code snippet ::
> ### The S4 class.
> setClass('DataFrame',
>   representation(data = 'data.frame', nrow = 'numeric', ncol =
> 'numeric', store = 'environment'),
>   prototype(data = data.frame( ), nrow = 0, ncol = 0))
>
> setMethod('initialize', 'DataFrame', function(.Object) {
> .Object <- callNextMethod( )
> [EMAIL PROTECTED] <- new.env(hash = TRUE)
> assign('data', as.list([EMAIL PROTECTED]), [EMAIL PROTECTED])
> [EMAIL PROTECTED] <- nrow([EMAIL PROTECTED])
> [EMAIL PROTECTED] <- ncol([EMAIL PROTECTED])
> [EMAIL PROTECTED] <- data.frame( )
> .Object
> })
>
>
> ### Usage:
> nn  <- 10
> ## dd1 below could possibly be created by read.table or scan and data.frame
> dd1 <- data.frame(xx = rnorm(nn), yy = rnorm(nn))
> dd2 <- new('DataFrame', data = dd1)
> rm(dd1)
> ## Now work with dd2
>
>
> Thanks a lot,
> Gopi Goswami.
> PhD, Statistics, 2005
> http://gopi-goswami.net/index.html
>
>   [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Doing the right amount of copy for large data frames.

2008-04-14 Thread Peter Dalgaard
Gopi Goswami wrote:
> Hi there,
>
>
> Problem ::
> When one tries to change one or some of the columns of a data.frame, R makes
> a copy of the whole data.frame using the '*tmp*' mechanism (this does not
> happen for components of a list, tracemem( ) on R-2.6.2 says so).
>
>
> Suggested solution ::
> Store the columns of the data.frame as a list inside of an environment slot
> of an S4 class, and define the '[', '[<-' etc. operators using setMethod( )
> and setReplaceMethod( ).
>
>
> Question ::
> This implementation will violate copy on modify principle of R (since
> environments are not copied), but will save a lot of memory. Do you see any
> other obvious problem(s) with the idea? Have you seen a related setup
> implemented / considered before (apart from the packages like filehash, ff,
> and database related ones for saving memory)?
>
>
>   
A short --- although crass --- reply is that you should not meddle with
this until you know _exactly_ what you are doing.

Two main points are that (a) copying of dataframes in principle only
copies pointers to each variable, until the actual contents are modified
and (b) breaking copy-on-modify (and consequently, in effect, also breaking
pass-by-value) semantics is a source of unhappiness.

R does duplicate rather more than it needs to, but the main reason
probably lies in its rudimentary reference tracking (the NAMED entry in
the object header structure). Some of us do wish we could try and fix
this at some point, but it would be a major undertaking. (There are a
zillion places where we'd need to do extra housekeeping rather than let
the garbage collector tidy up after us. Also, reference-counting
solutions from other computer languages do not apply because R can have
circular references.)
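
Point (a) is easy to check with tracemem() -- assuming a build with memory
profiling, as used in the original post; the exact messages vary between R
versions:

   d1 <- data.frame(xx = rnorm(10), yy = rnorm(10))
   tracemem(d1)     # mark d1 so duplications are reported
   d2 <- d1         # silent: plain assignment copies nothing
   d2$xx[1] <- 0    # only now is a duplication reported
   untracemem(d1)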


> Implementation code snippet ::
> ### The S4 class.
> setClass('DataFrame',
>   representation(data = 'data.frame', nrow = 'numeric', ncol =
> 'numeric', store = 'environment'),
>   prototype(data = data.frame( ), nrow = 0, ncol = 0))
>
> setMethod('initialize', 'DataFrame', function(.Object) {
> .Object <- callNextMethod( )
> [EMAIL PROTECTED] <- new.env(hash = TRUE)
> assign('data', as.list([EMAIL PROTECTED]), [EMAIL PROTECTED])
> [EMAIL PROTECTED] <- nrow([EMAIL PROTECTED])
> [EMAIL PROTECTED] <- ncol([EMAIL PROTECTED])
> [EMAIL PROTECTED] <- data.frame( )
> .Object
> })
>
>
> ### Usage:
> nn  <- 10
> ## dd1 below could possibly be created by read.table or scan and data.frame
> dd1 <- data.frame(xx = rnorm(nn), yy = rnorm(nn))
> dd2 <- new('DataFrame', data = dd1)
> rm(dd1)
> ## Now work with dd2
>
>
> Thanks a lot,
> Gopi Goswami.
> PhD, Statistics, 2005
> http://gopi-goswami.net/index.html
>
>   [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>   


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Doing the right amount of copy for large data frames.

2008-04-14 Thread Gopi Goswami
Hi there,


Problem ::
When one tries to change one or some of the columns of a data.frame, R makes
a copy of the whole data.frame using the '*tmp*' mechanism (this does not
happen for components of a list, tracemem( ) on R-2.6.2 says so).


Suggested solution ::
Store the columns of the data.frame as a list inside of an environment slot
of an S4 class, and define the '[', '[<-' etc. operators using setMethod( )
and setReplaceMethod( ).


Question ::
This implementation will violate copy on modify principle of R (since
environments are not copied), but will save a lot of memory. Do you see any
other obvious problem(s) with the idea? Have you seen a related setup
implemented / considered before (apart from the packages like filehash, ff,
and database related ones for saving memory)?


Implementation code snippet ::
### The S4 class.
setClass('DataFrame',
  representation(data = 'data.frame', nrow = 'numeric', ncol =
'numeric', store = 'environment'),
  prototype(data = data.frame( ), nrow = 0, ncol = 0))

setMethod('initialize', 'DataFrame', function(.Object) {
.Object <- callNextMethod( )
[EMAIL PROTECTED] <- new.env(hash = TRUE)
assign('data', as.list([EMAIL PROTECTED]), [EMAIL PROTECTED])
[EMAIL PROTECTED] <- nrow([EMAIL PROTECTED])
[EMAIL PROTECTED] <- ncol([EMAIL PROTECTED])
[EMAIL PROTECTED] <- data.frame( )
.Object
})


### Usage:
nn  <- 10
## dd1 below could possibly be created by read.table or scan and data.frame
dd1 <- data.frame(xx = rnorm(nn), yy = rnorm(nn))
dd2 <- new('DataFrame', data = dd1)
rm(dd1)
## Now work with dd2


Thanks a lot,
Gopi Goswami.
PhD, Statistics, 2005
http://gopi-goswami.net/index.html

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Bug in ci.plot(HH Package) (PR#11163)

2008-04-14 Thread ligges
Please report bugs in contributed packages to the package maintainer 
(CCing), not to R-bugs.

Best,
Uwe Ligges
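
For what it's worth, predict.lm() takes the argument 'level', not
'conf.level'; an unknown 'conf.level' is silently absorbed by '...', which
would explain the symptom described below.  A quick check with the women
data from the report:

   fit <- lm(height ~ weight, data = women)
   a <- predict(fit, interval = "confidence", level = 0.95)
   b <- predict(fit, interval = "confidence", level = 0.999)
   all.equal(a, b)   # differences reported: 'level' is honoured
   d <- predict(fit, interval = "confidence", conf.level = 0.999)
   all.equal(a, d)   # TRUE: 'conf.level' is silently ignored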


[EMAIL PROTECTED] wrote:
> Full_Name: Yasuhiro Nakajima
> Version: 2.6.1
> OS: WinXP SP2
> Submission from: (NULL) (202.237.255.13)
> 
> 
> Dear all,
> 
> I noticed the following behaviour of ci.plot in the HH package (ver. 2.1-9):
> 
>> library(HH)
>> data(women, package="datasets")
>> attach(women)
>> ft <- lm(height~weight)
>> windows()
>> ci.plot(ft,conf.level=0.95)
>> windows()
>> ci.plot(ft,conf.level=0.999)
> 
> I tried to change the confidence interval, but I couldn't.
> CI was "always" 0.95. 
> 
> I have found a wrong argument passed to the predict function in ci.plot.
> 
>>>   predict(., conf.level=conf.level)
> 
> I think the correct argument is "level=conf.level".
> 
> Is that right?
> 
> 
> --- ci.plot source code ---
> "ci.plot" <-
> function(lm.object, ...)
>   UseMethod("ci.plot")
> 
> "ci.plot.lm" <-
> function(lm.object,
>  xlim=range(data[, x.name]),
>  newdata,
>  conf.level=.95,
>  data=model.frame(lm.object),
>  newfit,
>  ylim=range(newfit$pi.fit),
>  pch=16,
>  main.cex=1,
>  main=list(paste(100*conf.level,
>"% confidence and prediction intervals for ",
>substitute(lm.object), sep=""), cex=main.cex), ...
>  ) {
>   formula.lm <- formula(lm.object)
>   x.name <- as.character(formula.lm[[3]])
>   missing.xlim <- missing(xlim)   ## R needs this
>   missing.newdata <- missing(newdata) ## R needs this
>   if.R(s={
> ## Save a copy of the data.frame in frame=0 to put it where
> ## model.frame.lm needs to find it when the example data is
> ## run through Splus CMD check.
> my.data.name <- as.character(lm.object$call$data)
> if (length(my.data.name)==0)
>   stop("Please provide an lm.object calculated with an explicit
> 'data=my.data.frame' argument.")
> undo.it <- (!is.na(match(my.data.name, objects(0
> if (undo.it) old.contents <- get(my.data.name, frame=0)
> my.data <- try(get(my.data.name))
> if (class(my.data)=="Error")
>   my.data <- try(get(my.data.name, frame=sys.parent()))
> if (class(my.data)=="Error")
>   stop("Please send me an email with a reproducible situation that got you
> here. ([EMAIL PROTECTED])")
> assign(my.data.name, my.data, frame=0)
>   },r={})
>   default.newdata <- data.frame(seq(xlim[1], xlim[2], length=51))
>   names(default.newdata) <- x.name
>   if (missing.xlim) xlim <- xlim + diff(xlim)*c(-.02,.02) ## needed
>   if (missing.newdata) {
> newdata <- default.newdata
> newdata.x <- numeric()
>   }
>   else {
> if (is.na(match(x.name, names(newdata
>   stop(paste("'newdata' must be a data.frame containing a column named '",
>  x.name, "'", sep=""))
> if (missing.xlim)
>   xlim=range(xlim, newdata[[x.name]])
> newdata.x <- as.data.frame(newdata)[,x.name]
> newdata <- rbind(as.data.frame(newdata)[,x.name, drop=FALSE],
>  default.newdata)
> newdata <- newdata[order(newdata[,x.name]), , drop=FALSE]
>   }
>   if (missing.xlim) xlim <- xlim + diff(xlim)*c(-.02,.02) ## repeat is needed
>   if (missing(newfit)) newfit <-
> if.R(s={
>   
>   prediction <-
> predict(lm.object, newdata=newdata,
> se.fit=TRUE, ci.fit=TRUE, pi.fi=TRUE,
> conf.level=conf.level)
>   {
> ## restore frame=0
> if (undo.it) assign(my.data.name, old.contents, frame=0)
> else remove(my.data.name, frame=0)
>   }
>   prediction
> }
>  ,r={
>new.p <-
>  predict(lm.object, newdata=newdata,
>  se.fit=TRUE, conf.level=conf.level,
>  interval = "prediction")
>new.c <-
>  predict(lm.object, newdata=newdata,
>  se.fit=TRUE, conf.level=conf.level,
>  interval = "confidence")
>tmp <- new.p
>tmp$ci.fit <- new.c$fit[,c("lwr","upr"), drop=FALSE]
>dimnames(tmp$ci.fit)[[2]] <- c("lower","upper")
>attr(tmp$ci.fit,"conf.level") <- conf.level
>tmp$pi.fit <- new.p$fit[,c("lwr","upr"), drop=FALSE]
>dimnames(tmp$pi.fit)[[2]] <- c("lower","upper")
>attr(tmp$pi.fit,"conf.level") <- conf.level
>tmp$fit <- tmp$fit[,"fit", drop=FALSE]
>tmp
>  })
>   tpgsl <- trellis.par.get("superpose.line")
>   tpgsl <- Rows(tpgsl, 1:4)
>   tpgsl$col[1] <- 0
>   xyplot(formula.lm, data=data, newdata=newdata, newfit=newfit,
>  newdata.x=newdata.x,
>  xlim=xlim, ylim=ylim, pch=pch,
>  panel=function(..., newdata.x) {
>panel.ci.plot(...)
>if (length(newdata.x) > 0)
>  panel.rug(x=newdata.x)
>  },
>  main=main,
>  key=list(border=TRUE,
>space="right",
>text=list(c("observed",

Re: [Rd] && and ||

2008-04-14 Thread Simon Blomberg
Use the vector versions: & and |
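
For the vectors in your example:

   c(TRUE, FALSE) &  c(TRUE, TRUE)   # TRUE FALSE  -- elementwise, as wanted
   c(TRUE, FALSE) && c(TRUE, TRUE)   # a single TRUE: only the first elements
                                     # are combined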

Cheers,

Simon.

On Sun, 2008-04-13 at 23:48 -0700, Yuan Jian wrote:
> Hello there,
>
>   I got a small problem about logical calculation:
>   we can get a sequence from a+b as below:
>
>   > a<-c(1,2)
> > b<-c(3,4)
> > a+b
> [1] 4 6
>
>   but when the sequences are logical (I want to get (True,False) && (True,
> True) ==> (True,False)), this is what happens when I do as below:
> > e<-c(T,T)
> > f<-c(F,T)
> > e
> [1] TRUE TRUE
> > f
> [1] FALSE  TRUE
> > g<-e && f
>   **g becomes one logical value only
> > g
> [1] FALSE
> > 
> what should I do when I want to get a sequence from the operators && or ||?
>
>   kind regards
>   Yu
>
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
-- 
Simon Blomberg, BSc (Hons), PhD, MAppStat. 
Lecturer and Consultant Statistician 
Faculty of Biological and Chemical Sciences 
The University of Queensland 
St. Lucia Queensland 4072 
Australia
Room 320 Goddard Building (8)
T: +61 7 3365 2506
http://www.uq.edu.au/~uqsblomb
email: S.Blomberg1_at_uq.edu.au

Policies:
1.  I will NOT analyse your data for you.
2.  Your deadline is your problem.

The combination of some data and an aching desire for 
an answer does not ensure that a reasonable answer can 
be extracted from a given body of data. - John Tukey.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] (PR#11161) Incorrect @INC: Rcmd SHLIB error under Windows

2008-04-14 Thread ripley
The bug is yours: you are using Cygwin Perl, not a Windows Perl.

This is specifically warned about in the manual:

   @strong{Beware}: you do need a @emph{Windows} port and not the Cygwin
   one.  Users of 64-bit Windows can use a Win64 Perl (such as that from
   ActiveState) if they prefer.

You can't expect to use tools from a different operating system (Cygwin is 
effectively a guest OS) with different conventions.

You also missed

   @emph{This section contains a lot of prescriptive comments.  They are
   here as a result of bitter experience.  Please do not report problems
   to R-help unless you have followed all the prescriptions.}

Do we need to say 'do not report problems to R-help, let alone as bugs 
...'?



On Sun, 13 Apr 2008, [EMAIL PROTECTED] wrote:

> Hi, R team.
>
> I'm trying to build a dll from a c program to be invoked within R using
> the .C() functionality.
>
> Everything works like a charm on my Linux (Centos 5) (also 2.6.2) machines---
> but under windows (Vista Ultimate) upon running (in either the windows 'Cmd'
> command window or a Bash window) the command
>
> Rcmd SHLIB myfun.c
>
> I receive the error:
>
> Can't locate R/Utils.pm in
> @INC (@INC contains: c \PROGRA~1\R\R-26~1.2\share\perl;
> /usr/lib/perl5/5.8/cygwin /usr/lib/perl5/5.8
> /usr/lib/perl5/site_perl/5.8/cygwin /usr/lib/perl5/site_perl/5.8
> /usr/lib/perl5/site_perl/5.8 /usr/lib/perl5/vendor_perl/5.8/cygwin
> /usr/lib/perl5/vendor_perl/5.8 /usr/lib/perl5/vendor_perl/5.8 .) at
> c:\PROGRA~1\R\R-26~1.2/bin/SHLIB line 22.
> BEGIN failed--compilation aborted at c:\PROGRA~1\R\R-26~1.2/bin/SHLIB line
> 22.
>
> -- -- -- -- --
>
> Notice the first entry in @INC is incorrect---  the item
>
>   c \PROGRA~1\R\R-26~1.2\share\perl;

That is two entries, not one.

> SHOULD be
>
>   c:\PROGRA~1\R\R-26~1.2\share\perl;
>
> but a space appears where the colon should be.
>
> The needed file 'R/Utils.pm' is in fact located where it should be,
> below 'c:\PROGRA~1\R\R-26~1.2\share\perl', but perl can't find it because
> INC is set incorrectly.
>
> Any suggestions?  Where IS 'INC' set?  Cheers,-R
>
> --
> Prof. Robert L. Wolpert : <[EMAIL PROTECTED]>  : +1-919-684-3275
> Duke Univ. Dept. of Statistical Science  :  211c Old Chem, Box 90251
> & Nicholas School of the Environment :   www.stat.Duke.edu/~rlw/
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Incorrect @INC: Rcmd SHLIB error under Windows (2.6.2, 44383) (PR#11161)

2008-04-14 Thread Mathieu Ribatet
Has your PATH environment variable been set correctly?

It's been a while since I last used R under Windows, but I remember having 
run into the same error. I had to add quite a few paths to the environment 
variables to build R packages properly.

Best,
Mathieu


[EMAIL PROTECTED] a écrit :
> Hi, R team.
>
> I'm trying to build a dll from a c program to be invoked within R using 
> the .C() functionality.
>
> Everything works like a charm on my Linux (Centos 5) (also 2.6.2) machines--- 
> but under windows (Vista Ultimate) upon running (in either the windows 'Cmd' 
> command window or a Bash window) the command
>
> Rcmd SHLIB myfun.c
>
> I receive the error:
>
> Can't locate R/Utils.pm in
> @INC (@INC contains: c \PROGRA~1\R\R-26~1.2\share\perl;
> /usr/lib/perl5/5.8/cygwin /usr/lib/perl5/5.8
> /usr/lib/perl5/site_perl/5.8/cygwin /usr/lib/perl5/site_perl/5.8
> /usr/lib/perl5/site_perl/5.8 /usr/lib/perl5/vendor_perl/5.8/cygwin
> /usr/lib/perl5/vendor_perl/5.8 /usr/lib/perl5/vendor_perl/5.8 .) at
> c:\PROGRA~1\R\R-26~1.2/bin/SHLIB line 22.
> BEGIN failed--compilation aborted at c:\PROGRA~1\R\R-26~1.2/bin/SHLIB line
> 22.
>
> -- -- -- -- --
>
> Notice the first entry in @INC is incorrect---  the item
>
>c \PROGRA~1\R\R-26~1.2\share\perl;
>
> SHOULD be
>
>c:\PROGRA~1\R\R-26~1.2\share\perl;
>
> but a space appears where the colon should be.
>
> The needed file 'R/Utils.pm' is in fact located where it should be,
> below 'c:\PROGRA~1\R\R-26~1.2\share\perl', but perl can't find it because
> INC is set incorrectly.
>
> Any suggestions?  Where IS 'INC' set?  Cheers,-R
>
> --
>  Prof. Robert L. Wolpert : <[EMAIL PROTECTED]>  : +1-919-684-3275
>  Duke Univ. Dept. of Statistical Science  :  211c Old Chem, Box 90251
>  & Nicholas School of the Environment :   www.stat.Duke.edu/~rlw/
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>   

-- 
Institute of Mathematics
Ecole Polytechnique Fédérale de Lausanne
STAT-IMA-FSB-EPFL, Station 8
CH-1015 Lausanne   Switzerland
http://stat.epfl.ch/
Tel: + 41 (0)21 693 7907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] && and ||

2008-04-14 Thread Jared O'Connell
Use just "&" or "|", these perform element wise,

> c(1,2) && c(0,1)
[1] FALSE
> c(1,2) & c(0,1)
[1] FALSE  TRUE

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] && and ||

2008-04-14 Thread Mathieu Ribatet
Well, you just have to use "&", "|" instead of "&&", "||".

Best,
Mathieu

Yuan Jian a écrit :
> Hello there,
>
>   I got a small problem about logical calculation:
>   we can get a sequence from a+b as below:
>
>   > a<-c(1,2)
>   
>> b<-c(3,4)
>> a+b
>> 
> [1] 4 6
>
>   but when the sequences are logical (I want to get (True,False) && (True,
> True) ==> (True,False)), this is what happens when I do as below:
>   
>> e<-c(T,T)
>> f<-c(F,T)
>> e
>> 
> [1] TRUE TRUE
>   
>> f
>> 
> [1] FALSE  TRUE
>   
>> g<-e && f
>> 
>   **g becomes one logical value only
>   
>> g
>> 
> [1] FALSE
>   
> what should I do when I want to get a sequence from the operators && or ||?
>
>   kind regards
>   Yu
>
>
> 
>   [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>   

-- 
Institute of Mathematics
Ecole Polytechnique Fédérale de Lausanne
STAT-IMA-FSB-EPFL, Station 8
CH-1015 Lausanne   Switzerland
http://stat.epfl.ch/
Tel: + 41 (0)21 693 7907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel