[Rd] Defining an alias for a generic function and callNextMethod() strange behaviour

2008-09-09 Thread Herve Pages

Hi,

My package contains the following foo() generic and methods (simplified
version):

  setGeneric("foo", signature="x", function(x, y=NA) standardGeneric("foo"))
  setMethod("foo", "ANY", function(x, y=NA) list(x, y))
  setMethod("foo", "character", function(x, y=NA) unlist(callNextMethod()))

  > foo(5)
  [[1]]
  [1] 5

  [[2]]
  [1] NA

  > foo("a")
  [1] "a" NA

And I want to define a temporary alias for foo() for backward
compatibility with some existing code:

  oldfoo <- function(...) { .Deprecated("foo"); foo(...) }

  > oldfoo(5)
  [[1]]
  [1] 5

  [[2]]
  [1] NA

  Warning message:
  'oldfoo' is deprecated.
  Use 'foo' instead.
  See help("Deprecated")

  > oldfoo("a")
  Error in .nextMethod() : argument "x" is missing, with no default
  In addition: Warning message:
  'oldfoo' is deprecated.
  Use 'foo' instead.
  See help("Deprecated")

Why isn't this working?

One way to make this work is to specify the arguments in the call
to callNextMethod(), or in the definition of the oldfoo alias.
But wouldn't that be nice to be able to just use
fooalias <- function(...) foo(...) for aliasing?

Thanks!
H.


R version 2.7.0 (2008-04-22)
i686-pc-linux-gnu

locale:
LC_CTYPE=en_CA.UTF-8;LC_NUMERIC=C;LC_TIME=en_CA.UTF-8;LC_COLLATE=en_CA.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_CA.UTF-8;LC_PAPER=en_CA.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_CA.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Cannot link mypackage to 2 other packages

2008-08-08 Thread Herve Pages

Prof Brian Ripley wrote:
Thanks for the examples. The specific problems was a typo, but there was 
another on Windows (missing quotes).


This should work in R-patched and R-devel on Linux and Windows now.


Yes this works now. Thanks!

H.



On Thu, 7 Aug 2008, Herve Pages wrote:


Prof Brian Ripley wrote:
[...]


Somewhere on your website would be best.


Here they are (stub versions):

 http://bioconductor.org/packages/misc/

mypackage tries to link to packageA and packageB but this is what I get:

[EMAIL PROTECTED]:~> R-2.8 CMD INSTALL mypackage_0.0.0.tar.gz
* Installing to library '/home/hpages/R-2.8/library'
* Installing *source* package 'mypackage' ...
** libs
gcc -std=gnu99 -I/home/hpages/R-2.8/include  -I/usr/local/include 
-I"/home/hpages/R-2.8/library/packageA/include"   -fpic  -g -O2 -Wall 
-c R_init_mypackage.c -o R_init_mypackage.o

In file included from R_init_mypackage.c:1:
mypackage.h:2:30: error: packageB_defines.h: No such file or directory
make: *** [R_init_mypackage.o] Error 1
ERROR: compilation failed for package 'mypackage'
** Removing '/home/hpages/R-2.8/library/mypackage'

[EMAIL PROTECTED]:~> gcc --version
gcc (GCC) 4.1.0 (SUSE Linux)
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There 
is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR 
PURPOSE.


[EMAIL PROTECTED]:~> cat /etc/SuSE-release
SUSE LINUX 10.1 (X86-64)
VERSION = 10.1

I get the same on 64-bit openSUSE 10.3

Thanks!

H.





__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] rowSums()/colSums() don't preserve the 'integer' storage mode

2008-07-17 Thread Herve Pages

Hi Bill,

[EMAIL PROTECTED] wrote:

I don't see the cost of doing so paying off.


The cost seems rather small. For the "if" block that is tagged /* columns */
in the do_colsum() function (src/main/array.c), that would give something
like:

if (OP == 0 || OP == 1) { /* columns */
int ans_type = (type == REALSXP || OP == 1) ? REALSXP : INTSXP;
int ina = type == INTSXP ? NA_INTEGER : NA_LOGICAL;
PROTECT(ans = allocVector(ans_type, p));
for (j = 0; j < p; j++) {
double rsum = 0.0;
int isum = 0;
switch (type) {
case REALSXP:
rx = REAL(x) + n*j;
for (i = cnt = 0; i < n; i++, rx++) {
if (!ISNAN(*rx)) {cnt++; rsum += *rx;}
else if (keepNA) {rsum = NA_REAL; break;}
}
if (OP == 1) {
if (cnt == 0 || ISNAN(rsum))
REAL(ans)[j] = NA_REAL;
else
REAL(ans)[j] = rsum / cnt;
} else
REAL(ans)[j] = rsum;
break;
case INTSXP: case LGLSXP:
ix = (type == INTSXP ? INTEGER(x) : LOGICAL(x)) + n*j;
for (i = cnt = 0; i < n; i++, ix++) {
if (*ix != ina) {cnt++; isum += *ix;}
else if (keepNA) {isum = NA_INTEGER; break;}
}
if (OP == 1) {
if (cnt == 0 || isum == NA_INTEGER)
REAL(ans)[j] = NA_REAL;
else
REAL(ans)[j] = ((double) isum) / cnt;
} else
INTEGER(ans)[j] = isum;
break;
default:
/* we checked the type above, but be sure */
UNIMPLEMENTED_TYPEt("do_colsum", type);
}
}
}

So now you have 42 lines instead of 37 in the current code.
Then do something similar for the "if" block tagged /* rows */.
Basically the new code would not be more complicated than the
current code.

In the end rowSums()/colSums() will do the _right_ thing i.e.
they'll preserve the 'integer' storage mode, and, by doing so, will
behave consistently with the sum() function.

BTW since the LGLSXP type is supported then the "'x' must be numeric"
error msg at the beginning of the do_colsum() function is inappropriate.

Cheers,
H.




storage.mode is really only important if you are passing arguments to
compiled code.

If you are passing to compiled code, you really need to ensure the
storage mode is what you think it is, anyway.

Bill Venables.  




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Herve Pages
Sent: Thursday, 17 July 2008 3:48 PM
To: R-devel@r-project.org
Subject: [Rd] rowSums()/colSums() don't preserve the 'integer' storage
mode

Hi,

Wouldn't that make sense to have rowSums()/colSums() to preserve the
storage mode?

m <- matrix(1:15, nrow=5)

 > storage.mode(m)
[1] "integer"

 > storage.mode(sum(m))
[1] "integer"

 > storage.mode(rowSums(m))
[1] "double" <--- surprising!

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel





__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] rowSums()/colSums() don't preserve the 'integer' storage mode

2008-07-16 Thread Herve Pages

Hi,

Wouldn't that make sense to have rowSums()/colSums() to preserve the
storage mode?

m <- matrix(1:15, nrow=5)

> storage.mode(m)
[1] "integer"

> storage.mode(sum(m))
[1] "integer"

> storage.mode(rowSums(m))
[1] "double" <--- surprising!

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R 2.7.0, match() and strings containing \0 - bug?

2008-04-28 Thread Herve Pages

Hi Jon,

Jon Clayden wrote:

Hi,

A piece of my code that uses readBin() to read a certain file type is
behaving strangely with R 2.7.0. This seems to be because of a failure
to match() strings after using rawToChar() when the original was
terminated with a "\0" character. Direct equality testing with ==
still works as expected. I can reproduce this as follows:


x <- "foo"
y <- c(charToRaw("foo"),as.raw(0))
z <- rawToChar(y)
z==x

[1] TRUE

z=="foo"

[1] TRUE

z %in% c("foo","bar")

[1] FALSE

z %in% c("foo","bar","foo\0")

[1] FALSE


But this gives TRUE:

  > z %in% c("foo","bar", z)
  [1] TRUE

An additional problem you have here is that when the "foo\0" string literal
is converted into a character string, then the string data that are after the
first embedded nul are dropped:

  > identical("foo\0a\0b", "foo")
  [1] TRUE

And to add to the endless source of surprises that come with embedded nuls:

  > dump("z", file="")
  z <-
  "foo\0"

but of course sourcing the above dump into an R session will not restore 'z'.

Dropping support for embedded nuls in R 2.8.0 sounds like good news to me.

Cheers,
H.




But without the nul character it works fine:


zz <- rawToChar(charToRaw("foo"))
zz %in% c("foo","bar")

[1] TRUE

I don't see anything about this in the latest NEWS, but is this
expected behaviour? Or is it, as I suspect, a bug? This seems to be
new to R 2.7.0, as I said.

Regards,
Jon

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Pb with package::foo(x) <- value

2008-04-18 Thread Herve Pages
Hi,

The parser doesn't seem to like this:

   somePackage::foo(x) <- value
   somePackage:::foo(x) <- value

where foo() is a replacement function or method defined in package somePackage.

For example:

   > x <- integer(4)
   > base::length(x) <- 7
   Error in base::length(x) <- 7 : invalid function in complex assignment

I've tried to put some back quotes on the left side of the assignment in
different ways but was not successful. So finally I had to use the
non-replacement form:

   > x <- base::`length<-`(x, 7)
   > x
   [1]  0  0  0  0 NA NA NA

Is there a way to avoid this?

Thanks!
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Couldn't (and shouldn't) is.unsorted() be faster?

2008-04-18 Thread Herve Pages
Prof Brian Ripley wrote:
> On Thu, 17 Apr 2008, Herve Pages wrote:
[...]
>> BTW, why not make is.unsorted() a little bit more prepared to silly user
>> input:
> 
> Because R is a volunteer project and resources spent on trapping misuse 
> are resources not available to be spent on other things.  (Same as for 
> bug reports on fixed issues )

I know that, thanks! Hope that you noticed that I'm not requesting a new feature
or anything fancy. Neither am I suggesting that I'm stuck because is.unsorted()
doesn't check its arguments (like any other function directly exposed to the 
user
is expected to do, especially for things that are used in an if statement).

H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Couldn't (and shouldn't) is.unsorted() be faster?

2008-04-17 Thread Herve Pages
Hi,

Thanks for your answers!

No need to change anything. In my case, 'x' is guaranteed to be an integer
vector with no NAs so I can call .Internal(is.unsorted(x)) directly.

BTW, why not make is.unsorted() a little bit more prepared to silly user
input:

   > is.unsorted(c(2:5, NA), na.rm=NA)
   Error in if (!is.atomic(x) || (!na.rm && any(is.na(x return(NA) :
 missing value where TRUE/FALSE needed

(or at least silently coerce those silly values to TRUE or FALSE like
max()/min() do, following some obscure logic though).

Also it's arguable that a length-1 vector cannot be considered sorted:
   > is.unsorted(NA)
   [1] NA

Cheers,
H.


Prof Brian Ripley wrote:
> I wouldn't say 'easy', and so I think we need a business case for this 
> change.  (One of the issues is that the internals are used elsewhere and 
> optimized for inputs without NAs.  So we would need to write separate 
> code if we pass NAs down to C level.  As I recall, is.unsorted was a 
> cheap R interface to existing C code.)
> 
> What real-world problems are being affected by this, and what would be 
> proportional speedup of the whole analysis be from this change?
> 
> On Thu, 17 Apr 2008, Bill Dunlap wrote:
> 
>> On Thu, 17 Apr 2008, Herve Pages wrote:
>>
>>> Couldn't is.unsorted() bail out immediately here (after comparing
>>> the first 2 elements):
>>>
>>> > x <- 2000:1
>>> > system.time(is.unsorted(x), gcFirst=TRUE)
>>> user  system elapsed
>>>0.084   0.040   0.124
>>>
>>> > x <- 2:1
>>> > system.time(is.unsorted(x), gcFirst=TRUE)
>>> user  system elapsed
>>>0.772   0.440   1.214
>>
>> The C code does bail out upon seeing the first out- of-order pair, but
>> before calling the C code, the S code does any(is.na(x)), forcing a
>> scan of the entire data.  If you remove the is.na calls from
>> is.unsorted's S code you will see the timings improve in your example.
>> (It looks easy to do the NA checks in the C code.)
>>
>>   is.unsorted.no.nacheck <- function (x, na.rm = FALSE) {
>>   if (is.null(x))
>>   return(FALSE)
>>   if (!is.atomic(x))
>>   return(NA)
>>   .Internal(is.unsorted(x))
>>   }
>>   > x <- 2000:1
>>   > system.time(is.unsorted(x), gcFirst=TRUE)
>>   user  system elapsed
>> 0.356   0.157   0.514
>>   > system.time(is.unsorted.no.nacheck(x), gcFirst=TRUE)
>>   user  system elapsed
>>  0   0   0
>>   > revx <- rev(x)
>>   > system.time(is.unsorted(revx), gcFirst=TRUE)
>>  user  system elapsed
>> 0.500   0.170   0.672
>>   > system.time(is.unsorted.no.nacheck(revx),gcFirst=TRUE)
>>  user  system elapsed
>> 0.131   0.000   0.132
>>
>>  
>>
>> Bill Dunlap
>> Insightful Corporation
>> bill at insightful dot com
>> 360-428-8146
>>
>> "All statements in this message represent the opinions of the author 
>> and do
>> not necessarily reflect Insightful Corporation policy or position."
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Couldn't (and shouldn't) is.unsorted() be faster?

2008-04-17 Thread Herve Pages
Hi,

Couldn't is.unsorted() bail out immediately here (after comparing
the first 2 elements):

 > x <- 2000:1
 > system.time(is.unsorted(x), gcFirst=TRUE)
user  system elapsed
   0.084   0.040   0.124

 > x <- 2:1
 > system.time(is.unsorted(x), gcFirst=TRUE)
user  system elapsed
   0.772   0.440   1.214

Thanks!
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Pb with validObject(..., complete=TRUE)

2008-04-16 Thread Herve Pages
Hi John,

John Chambers wrote:
> The "infelicity" arises because validObject() is not a generic function; 
> validity "method" is a bit of a misnomer.

Indeed. And I guess referring to "method dispatch" like I did in my previous
email is not appropriate either.
So yes I learned that thinking of validity "methods" as regular methods doesn't
really work. For example the man page for callNextMethod is stating that "a
call to 'callNextMethod' can only appear inside a method definition" but it
won't work inside a validity "method". Why would I need to do that? Well the
problem I reported in this thread can be worked around by redefining the
the same validity method for "PosInts2" objects as for "PosInts" objects.
So while I was looking at different ways to achieve this, I tried this
(I really want to avoid code duplication):

setValidity("PosInts2", function(object) callNextMethod())

which of course doesn't work, so I finally came up with something like this:

setValidity("PosInts2", getValidity(getClassDef(extends("PosInt2")[2])))

which tries to mimic what callNextMethod would do but in an ugly and easy to
break way. This is a temporary fix anyway, until validObject(..., complete=TRUE)
is fixed in R 2.8.0.

>  The functions are attached to 
> the class definition and validObject looks for them directly--in the 
> process it catches all methods from superclasses, but not from 
> superclasses of the slots' classes.
> 
> The fix is to call validObject recursively on each slot when 
> complete=TRUE.  This is a moderately large efficiency hit, but if you're 
> using complete=TRUE, it's reasonable to assume you really want the whole 
> truth, even if it takes a bit longer.

Absolutely.

Thanks for looking into this!

Cheers,
H.

> 
> Unless there are counter-arguments, we'll make this change (not, 
> however, for 2.7.0)
> 
> John
> 
> Herve Pages wrote:
>> Hi,
>>
>> When called with complete=TRUE, validObject() is supposed to work in a
>> recursive manner. But here is a situation where it doesn't seem to be
>> the case.
>>
>> Let's define a class with a validity method:
>>
>>setClass("PosInts", representation(ii="integer"))
>>
>>setValidity("PosInts",
>>  function(object)
>>  {
>>if (!all([EMAIL PROTECTED] > 0))
>>  return("'ii' slot contains non-positive values")
>>NULL
>>  }
>>)
>>
>> Let's extend this class (no need to add new slots for illustrating the 
>> pb):
>>
>>setClass("PosInts2", contains="PosInts")
>>
>>broken <- new("PosInts2")
>>[EMAIL PROTECTED] <- 3:0
>>
>> If "PosInts2" objects don't need to satisfy additional constraints in 
>> order to
>> be considered valid, then I don't need to define a validity method for 
>> them.
>> I can just rely on method dispatch, which works as expected with 
>> validity methods:
>>
>>> validObject(broken)
>>Error in validObject(broken) :
>>  invalid class "PosInts" object: 'ii' slot contains non-positive 
>> values
>>
>> Unfortunately, this will cause problems later when I try to validate 
>> objects
>> that have slots of type "PosInts2":
>>
>>setClass("A", representation(aa="PosInts2"))
>>a <- new("A", aa=broken)
>>
>> This works as expected:
>>
>>> validObject(a)
>>[1] TRUE
>>
>> But this is not what I would expect:
>>
>>> validObject(a, complete=TRUE)
>>[1] TRUE
>>
>> ... given that 'a' has a slot that contains an invalid "PosInts2" 
>> instance:
>>
>>> validObject([EMAIL PROTECTED])
>>Error in validObject([EMAIL PROTECTED]) :
>>  invalid class "PosInts2" object: 'ii' slot contains non-positive 
>> values
>>
>> So clearly 'a' is broken and I would expect validObject(a, 
>> complete=TRUE) to
>> tell me so...
>>
>> Now if I define the same validity method for "PosInts2" objects as for 
>> "PosInts"
>> objects, then things work as expected (validObject(a, complete=TRUE) 
>> will fail)
>> but it's not clear to me why I should be forced to do this?
>>
>> Thanks!
>>
>> H.
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>   
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Pb with validObject(..., complete=TRUE)

2008-04-15 Thread Herve Pages
Hi,

When called with complete=TRUE, validObject() is supposed to work in a
recursive manner. But here is a situation where it doesn't seem to be
the case.

Let's define a class with a validity method:

   setClass("PosInts", representation(ii="integer"))

   setValidity("PosInts",
 function(object)
 {
   if (!all([EMAIL PROTECTED] > 0))
 return("'ii' slot contains non-positive values")
   NULL
 }
   )

Let's extend this class (no need to add new slots for illustrating the pb):

   setClass("PosInts2", contains="PosInts")

   broken <- new("PosInts2")
   [EMAIL PROTECTED] <- 3:0

If "PosInts2" objects don't need to satisfy additional constraints in order to
be considered valid, then I don't need to define a validity method for them.
I can just rely on method dispatch, which works as expected with validity 
methods:

   > validObject(broken)
   Error in validObject(broken) :
 invalid class "PosInts" object: 'ii' slot contains non-positive values

Unfortunately, this will cause problems later when I try to validate objects
that have slots of type "PosInts2":

   setClass("A", representation(aa="PosInts2"))
   a <- new("A", aa=broken)

This works as expected:

   > validObject(a)
   [1] TRUE

But this is not what I would expect:

   > validObject(a, complete=TRUE)
   [1] TRUE

... given that 'a' has a slot that contains an invalid "PosInts2" instance:

   > validObject([EMAIL PROTECTED])
   Error in validObject([EMAIL PROTECTED]) :
 invalid class "PosInts2" object: 'ii' slot contains non-positive values

So clearly 'a' is broken and I would expect validObject(a, complete=TRUE) to
tell me so...

Now if I define the same validity method for "PosInts2" objects as for "PosInts"
objects, then things work as expected (validObject(a, complete=TRUE) will fail)
but it's not clear to me why I should be forced to do this?

Thanks!

H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] coerce methods and inheritance

2008-04-09 Thread Herve Pages
Hi John,

John Chambers wrote:
> Herve Pages wrote:
>> Hi,
>>
>> It doesn't seem that the dispatching algo is finding my coerce method 
>> under
>> some circumstances.
>> Let's say I have 2 classes, A and AA and that AA is just a direct 
>> extension
>> of A with no additional slots:
>>
>>setClass("A", representation(ii="integer"))
>>setClass("AA", contains="A")
>>
>> I can define a method for coercing my objects to an integer vector with:
>>
>>setAs("A", "integer", function(from) {cat("I'm the A->integer 
>> coerce method\n"); [EMAIL PROTECTED])
>>
>> and this works as expected when my object is an AA instance:
>>
>>> aa <- new("AA", ii=sample(10, 5))
>>> as(aa, "integer")
>>I'm the A->integer coerce method
>>[1] 10  1  6  4  7
>>
>> But things don't behave that way anymore if I introduce a direct 
>> extension of AA:
>>
>>setClass("OrderedAA",
>>  contains="AA",
>>  validity=function(object)
>>  {
>>  if (!all(diff([EMAIL PROTECTED]) >= 0))
>>  return("slot 'ii' is not ordered")
>>  TRUE
>>  }
>>)
>>
>> and a method for coercing an A object to an OrderedAA object:
>>
>>setAs("A", "OrderedAA",
>>  function(from)
>>  {
>>  cat("I'm the A->OrderedAA coerce method\n")
>>  new("OrderedAA", ii=sort([EMAIL PROTECTED]))
>>  }
>>)
>>
>> My A->OrderedAA coerce method is not called anymore:
>>
>>> oaa <- as(aa, "OrderedAA")
>>> oaa
>>> validObject(oaa)
>>Error in validObject(oaa) :
>>  invalid class "OrderedAA" object: slot 'ii' is not ordered
>>
>> This looks like a bug to me.
>>   
> Well, obscure perhaps, and not as well documented as it might be.
> 
> Defining a subclass of "AA" creates implicit coerce methods in both 
> directions.  The method from "AA" to its subclass creates a new object 
> from the subclass, then inserts the inherited slots.
> 
>  > selectMethod("coerce", c("AA", "OrderedAA"))
> Method Definition:
> 
> function (from, to)
> {
>obj <- new("OrderedAA")
>as(obj, "AA") <- from
>obj
> }

The problem is that this implicit method doesn't seem to check the validity
of the new object. So now my users have an easy way to create broken objects
without being told that they are doing something wrong... unless I have
redefined a lot of coerce methods in my software (and there can be a lot of
them).
Unfortunately for me and my project, it looks like most of these "implicit"
methods are doing the wrong thing. So if the purpose of having them was to
make the developer's life easier, it doesn't work for me.

> 
> Signatures:
>from totarget  "AA" "OrderedAA"
> defined "AA" "OrderedAA"
> 
> The situation is made more confusing because these methods are only 
> explicitly inserted in the coerce() function the first time they're used 
> (for obvious efficiency reasons).

Even worse, after I define my coerce method for A->OrderedAA, and _before_
I try to coerce my first AA object to an OrderedAA object, I get this:

   > selectMethod("coerce", c("AA", "OrderedAA"))
   Method Definition:

   function (from, to = "OrderedAA", strict = TRUE)
   {
 cat("I'm the A->OrderedAA coerce method\n")
 new("OrderedAA", ii = sort([EMAIL PROTECTED]))
   }

   Signatures:
   from to
   target  "AA" "OrderedAA"
   defined "A"  "OrderedAA"

which is not reporting the truth (the method that will actually be selected
will be the implicit one, not mine).

> 
> Notice that this is a direct method, not an inherited one.  It will be 
> chosen by the method selection from as().
> 
> So it is true that if you want to override the implicit methods, you 
> have to do that for each new subclass, presumably when defining the 
> subclass.
> 
>  >setAs("AA", "OrderedAA",
> +  function(from)
> +  {
> +  cat("I'm the A->OrderedAA coerce method\n")
> +  new("OrderedAA", ii=sor  [TRUNCATED]
>  > as(aa, "

Re: [Rd] autocompletion problem

2008-04-09 Thread Herve Pages
Hi Deepayan,

Deepayan Sarkar wrote:
> On 4/9/08, Herve Pages <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>>  Let's create the xxx object just to avoid confusion even if it's not 
>> necessary
>>  for reproducing the problem below:
>>
>>xxx <- 8:3
>>
>>  If I start typing this:
>>
>>max(xxx[
>>
>>  and now try to autocomplete with , then I get the following error (and 
>> a warning):
>>
>>> max(xxx[Error in grep(sprintf("^%s", makeRegexpSafe(text)), allArgs, 
>> value = TRUE) :
>>  invalid regular expression '^xxx['
>>In addition: Warning message:
>>In grep(sprintf("^%s", makeRegexpSafe(text)), allArgs, value = TRUE) :
>>  regcomp error:  'Invalid regular expression'
> 
> Thanks for the report, makeRegexpSafe was not escaping "[". Does the
> following workaround fix the problem (without introducing additional
> ones)?
> 
> assignInNamespace("makeRegexpSafe", ns = "utils", value = function(s)
> {
> s <- gsub(".", "\\.", s, fixed = TRUE)
> s <- gsub("?", "\\?", s, fixed = TRUE)
> s <- gsub("[", "\\[", s, fixed = TRUE)
> s <- gsub("]", "\\]", s, fixed = TRUE) # necessary?
> s
> })

Yes this fixes the problem. Thanks!

BTW are there any plans to deal with backquoted symbols/names?
There are currently 2 problems with this:

   1. Completion will work and expand symbols or names that contain special
  characters but without backquoting them:

xxx <- as.list(11:13)
names(xxx) <- letters[1:3]
names(xxx)[2] <- "2b"

> xxx$#  stands for a hit on the Tab key
xxx$2b  xxx$a   xxx$c
> xxx$2b# I hit 2, then  gives me the b

 Now if I hit , of course I get:

Error: unexpected numeric constant in "xxx$2"

   2. Completion of names after $ will not work if I've already backquoted
  the partial name:

> xxx$`2...  # nothing happens

Thanks!
H.


> 
> (with 2.6.x, replace "utils" with "rcompgen")
> 
> -Deepayan
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] autocompletion problem

2008-04-09 Thread Herve Pages
Hi,

Let's create the xxx object just to avoid confusion even if it's not necessary
for reproducing the problem below:

   xxx <- 8:3

If I start typing this:

   max(xxx[

and now try to autocomplete with , then I get the following error (and a 
warning):

   > max(xxx[Error in grep(sprintf("^%s", makeRegexpSafe(text)), allArgs, value 
= TRUE) :
 invalid regular expression '^xxx['
   In addition: Warning message:
   In grep(sprintf("^%s", makeRegexpSafe(text)), allArgs, value = TRUE) :
 regcomp error:  'Invalid regular expression'

Now it seems that this problem in R has managed to screw up something out of R 
(and
this is probably OS dependent, I'm running 64-bit openSUSE 10.3) because when I 
quit
R and try to do something at the shell level, what I type is not echoed anymore 
in
the terminal window. For example if I type 'whoami' followed by , here 
is what
I get on the screen (only the output of the command is displayed but not the 
command
itself):

   [EMAIL PROTECTED]:~> hpages

I've tried this with other R versions (2.6.1, 2.6.2) on other Linux flavors 
(64-bit
openSUSE 10.1, 32-bit openSUSE 10.3, 32-bit Ubuntu 6.06) and got almost the 
same thing,
the only difference being that with older versions of R (e.g. with 2.7.0 from 
2007-12-09
r43632), I get the error message but no additional warning.

Cheers,
H.


 > sessionInfo()
R version 2.7.0 beta (2008-04-07 r45159)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] coerce methods and inheritance

2008-04-09 Thread Herve Pages
Hi,

It doesn't seem that the dispatching algo is finding my coerce method under
some circumstances.
Let's say I have 2 classes, A and AA and that AA is just a direct extension
of A with no additional slots:

   setClass("A", representation(ii="integer"))
   setClass("AA", contains="A")

I can define a method for coercing my objects to an integer vector with:

   setAs("A", "integer", function(from) {cat("I'm the A->integer coerce 
method\n"); [EMAIL PROTECTED])

and this works as expected when my object is an AA instance:

   > aa <- new("AA", ii=sample(10, 5))
   > as(aa, "integer")
   I'm the A->integer coerce method
   [1] 10  1  6  4  7

But things don't behave that way anymore if I introduce a direct extension of 
AA:

   setClass("OrderedAA",
 contains="AA",
 validity=function(object)
 {
 if (!all(diff([EMAIL PROTECTED]) >= 0))
 return("slot 'ii' is not ordered")
 TRUE
 }
   )

and a method for coercing an A object to an OrderedAA object:

   setAs("A", "OrderedAA",
 function(from)
 {
 cat("I'm the A->OrderedAA coerce method\n")
 new("OrderedAA", ii=sort([EMAIL PROTECTED]))
 }
   )

My A->OrderedAA coerce method is not called anymore:

   > oaa <- as(aa, "OrderedAA")
   > oaa
   > validObject(oaa)
   Error in validObject(oaa) :
 invalid class "OrderedAA" object: slot 'ii' is not ordered

This looks like a bug to me.

Thanks,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Pb with lapply()

2008-02-07 Thread Herve Pages
Thanks Martin for your patient and detailed explanations!

So with the base4 solution, base would import base4 so lapply() could
call base4::as.list()? Sounds good.
Making as.list() a *primitive* function with C-internal S3 and S4
method dispatch: like for length(), etc...? Sounds good too.

I'm wondering why can't as.list() just be made an S4 generic in base.
Or in methods, which seems to be a more appropriate place for S4 generics.
But then I guess lapply() would have to be moved to methods too and that
would break code importing base but not methods...

Sounds like there is no simple solution indeed.

Cheers,
H.


Martin Maechler wrote:
>>>>>> "HP" == Herve Pages <[EMAIL PROTECTED]>
>>>>>> on Thu, 31 Jan 2008 10:26:31 -0800 writes:
> 
> HP> Hi, If needed, lapply() tries to convert its first
> HP> argument into a list before it starts doing something
> HP> with it:
> 
> >> lapply
> HP> function (X, FUN, ...) 
> HP> {
> HP>   FUN <- match.fun(FUN)
> HP>   if (!is.vector(X) || is.object(X)) 
> HP>   X <- as.list(X)
> HP>   .Internal(lapply(X, FUN))
> HP> }
> 
> HP> But in practice, things don't always seem to "work" as suggested by
> HP> this code (at least to the eyes of a naive user).
> 
> Yes.  That is the infamous problem of what I'd call a mental conflict
> between namespaces and function-centered OOP (as in S3 or S4),
> or put differently, the problem that not all R functions are
> S4 generics right from the start :
>  
> Both lapply() and as.list() are in the base namespace.
> Consequently, lapply() will always call base::as.list() and
> unfortunately  base::as.list() is not becoming an S4 generic by your
> 
> >> setClass("A", representation(data="list"))
> HP> [1] "A"
> >> setMethod("as.list", "A", function(x, ...) [EMAIL PROTECTED])
> HP> Creating a new generic function for "as.list" in ".GlobalEnv"
> HP> [1] "as.list"
> 
> See: it's an S4 generic only in .GlobalEnv, but to really work
> it should be an S4 generic in base.
> 
> 
> HP> Seems like using force() inside lapply() would solve the problem:
> 
> Well, it's not force() that makes it work,
> it's the fact that you define a version of lapply outside "base"
> and that of course does see your as.list() generic in .GlobalEnv ..
> 
> HP> lapply2 <- function(X, FUN, ...)
> HP> {
> HP> FUN <- match.fun(FUN)
> HP> if (!is.vector(X) || is.object(X))
> HP> X <- force(as.list(X))
> HP> .Internal(lapply(X, FUN))
> HP> }
> 
> HP> It works now:
> 
> [...]
> 
> 
> Now one "solution" to the problem is to redefine  base::as.list()
> to be your S4 generic.  Most smart useRs I know would call this
> a terrible hack though...
> and yes, I'm guilty of committing that hack -- inside the Matrix package, 
> not for as.list() but for as.matrix():
> 
> The consequence of that is that e.g.  eigen() *does* work for
> all our matrices, because eigen starts with as.matrix() and that
> needs to work as a proper (i.e. S4) generic in order to work as
> it should. 
> 
> A much better solution to the underlying deeper problem would be
> to find a way where ___ conceptually ___ part (or all of)
> 'methods' would be inside of 'base', and  as.list(), as.matrix()
> etc all S4 (and S3 simultaneously) generics. 
> [and we have pondered of using  'base4' for that,
>  which would contain part of current base and part of current methods;
>  but there have been different ideas].
> 
> One workaround/solution used recently (notably for the group generics)
> was to make more of these functions into
> *primitive* functions with C-internal S3 and S4 method dispatch.
> That maybe a desideratum for "now".
> 
> Martin Maechler, ETH Zurich (and R core team)
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] segfault in gregexpr()

2008-01-31 Thread Herve Pages
Hi Seth,

Seth Falcon wrote:
> Hi again,
> 
>> Herve wrote:
>>   > gregexpr("", "abc", fixed=TRUE)
>>
>>*** caught segfault ***
>>   address 0x1c09000, cause 'memory not mapped'
> 
> This should be fixed in latest svn.  Thanks for the report.

That's great. Thanks!

H.

> 
> + seth
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] segfault in gregexpr()

2008-01-30 Thread Herve Pages
Hi,

Tried with R 2.6 and R 2.7:

  > gregexpr("", "abc", fixed=TRUE)

   *** caught segfault ***
  address 0x1c09000, cause 'memory not mapped'

  Traceback:
   1: gregexpr("", "abc", fixed = TRUE)

  Possible actions:
  1: abort (with core dump, if enabled)
  2: normal R exit
  3: exit R without saving workspace
  4: exit R saving workspace
  Selection:


> sessionInfo()
R version 2.7.0 Under development (unstable) (2008-01-29 r44238)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base


Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Problem with new("externalptr")

2008-01-29 Thread Herve Pages
Luke Tierney wrote:
> On Tue, 29 Jan 2008, Herve Pages wrote:
> 
>> Hi again,
>>
>> Here is an example of an annoyance that I think is directly related to
>> the
>> problem with new("externalptr"). When you try to extend the
>> "externalptr" class:
> 
> You don't wnat to do that for the same reason you don't want to do it
> with environments: Like environment external pointers are reference
> objects, so things like unclass and other attribute changes end up
> being destructive.  Create the thing you want as a list wrapper around
> the external pointer.  You'll be a lot happier with the result in the
> long run.

Thanks Luke!

As you've probably seen further below in my post, wrapping instead of extending
is indeed what I'm doing. And I'm explaining why I don't really have the choice
right now.
I understand why, generally speaking, you would recommend wrapping instead of
extending, but, given the context where I'm making use of those external 
pointers
(the "ExternalInteger" class itself is not exported, only used internally as 
slots
of higher level objects, never accessed directly, etc..., etc...), I don't think
extending would actually be a problem.

Unfortunately, because of the problem with new("externalptr"), I can't even
experiment the "extending" approach, test it, compare it with the "wrapping"
approach, etc... and make my own opinion.
And even if I stick with the "wrapping" approach, it would help to have
new("externalptr") doing what I think is the right thing to do.

Cheers,
H.


> 
> luke
> 
> 
>>
>>  > setClass("ExternalInteger", contains="externalptr")
>>  [1] "ExternalInteger"
>>
>> then every call to new("ExternalInteger") will return the same
>> instance too.
>>
>> I've tried to define an "initialize" method for "ExternalInteger"
>> objects, but,
>> whatever I do, I end up with the same "ExternalInteger" instance. So
>> in the end
>> I had to define the "ExternalInteger" class this way:
>>
>>  > setClass("ExternalInteger", representation(xp="externalptr"))
>>
>> even if I'd really like to be able to use the "is a" semantic and not
>> the "has a"
>> semantic.
>>
>> Then I use my xp_new() C routine (see previous post) for initializing
>> the xp slot:
>>
>>  setMethod("initialize", "ExternalInteger",
>>function(.Object, ...)
>>{
>>[EMAIL PROTECTED] <- .Call("xp_new")
>>...
>>.Object
>>}
>>  )
>>
>> Then everytime I need to pass an "ExternalInteger" instance x to a C
>> routine,
>> I need to perform one extra step to reach the externalptr (need to
>> pass [EMAIL PROTECTED] to
>> the routine instead of x itself).
>>
>> So unfortunately, things are quite ugly and more painful than necessary.
>>
>> Thanks,
>> H.
>>
>>
>> Herve Pages wrote:
>>> Hi,
>>>
>>> It seems that new("externalptr") is always returning the same
>>> instance, and
>>> not a new one as one would expect from a call to new(). Of course
>>> this is hard
>>> to observe:
>>>
>>>  > new("externalptr")
>>>   
>>>  > new("externalptr")
>>>   
>>>
>>> since not a lot of details are displayed.
>>>
>>> For example, it's easy to see that 2 consecutive calls to
>>> new("environment")
>>> create different instances:
>>>
>>>  > new("environment")
>>>   
>>>  > new("environment")
>>>   
>>>
>>> But for new("externalptr"), I had to use the following C routine:
>>>
>>>   SEXP sexp_address(SEXP s)
>>>   {
>>> SEXP ans;
>>> char buf[40];
>>>
>>> snprintf(buf, sizeof(buf), "%p", s);
>>> PROTECT(ans = NEW_CHARACTER(1));
>>> SET_STRING_ELT(ans, 0, mkChar(buf));
>>> UNPROTECT(1);
>>> return ans;
>>>   }
>>>
>>> Then I get:
>>>
>>>  > .Call("sexp_address", new("externalptr"))
>>>   [1] "0xde2ce0"
>>>  > .Call("sexp_address", new("externalptr"))
>>>   [1] "0xde2ce0"
>>>
>>> Isn't that wrong?
>>>
>>> I worked around this problem by writing the following C routine:
>>>
>>>   SEXP xp_new()
>>>   {
>>> return R_MakeExternalPtr(NULL, R_NilValue, R_NilValue);
>>>   }
>>>
>>> so I can create new "externalptr" instances from R with:
>>>
>>>   .Call("xp_new")
>>>
>>> I understand that there is not much you can do from R with an
>>> "externalptr"
>>> instance and that you will have to manipulate them at the C level
>>> anyway.
>>> But since new("externalptr") exists and seems to work, wouldn't that be
>>> better if it was really creating a new instance at each call?
>>>
>>> Thanks!
>>> H.
>>>
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Problem with new("externalptr")

2008-01-29 Thread Herve Pages
Hi again,

Here is an example of an annoyance that I think is directly related to the
problem with new("externalptr"). When you try to extend the "externalptr" class:

  > setClass("ExternalInteger", contains="externalptr")
  [1] "ExternalInteger"

then every call to new("ExternalInteger") will return the same instance too.

I've tried to define an "initialize" method for "ExternalInteger" objects, but,
whatever I do, I end up with the same "ExternalInteger" instance. So in the end
I had to define the "ExternalInteger" class this way:

  > setClass("ExternalInteger", representation(xp="externalptr"))

even if I'd really like to be able to use the "is a" semantic and not the "has 
a"
semantic.

Then I use my xp_new() C routine (see previous post) for initializing the xp 
slot:

  setMethod("initialize", "ExternalInteger",
function(.Object, ...)
{
[EMAIL PROTECTED] <- .Call("xp_new")
...
.Object
}
  )

Then everytime I need to pass an "ExternalInteger" instance x to a C routine,
I need to perform one extra step to reach the externalptr (need to pass [EMAIL 
PROTECTED] to
the routine instead of x itself).

So unfortunately, things are quite ugly and more painful than necessary.

Thanks,
H.


Herve Pages wrote:
> Hi,
> 
> It seems that new("externalptr") is always returning the same instance, and
> not a new one as one would expect from a call to new(). Of course this is hard
> to observe:
> 
>   > new("externalptr")
>   
>   > new("externalptr")
>   
> 
> since not a lot of details are displayed.
> 
> For example, it's easy to see that 2 consecutive calls to new("environment")
> create different instances:
> 
>   > new("environment")
>   
>   > new("environment")
>   
> 
> But for new("externalptr"), I had to use the following C routine:
> 
>   SEXP sexp_address(SEXP s)
>   {
> SEXP ans;
> char buf[40];
> 
> snprintf(buf, sizeof(buf), "%p", s);
> PROTECT(ans = NEW_CHARACTER(1));
> SET_STRING_ELT(ans, 0, mkChar(buf));
> UNPROTECT(1);
> return ans;
>   }
> 
> Then I get:
> 
>   > .Call("sexp_address", new("externalptr"))
>   [1] "0xde2ce0"
>   > .Call("sexp_address", new("externalptr"))
>   [1] "0xde2ce0"
> 
> Isn't that wrong?
> 
> I worked around this problem by writing the following C routine:
> 
>   SEXP xp_new()
>   {
> return R_MakeExternalPtr(NULL, R_NilValue, R_NilValue);
>   }
> 
> so I can create new "externalptr" instances from R with:
> 
>   .Call("xp_new")
> 
> I understand that there is not much you can do from R with an "externalptr"
> instance and that you will have to manipulate them at the C level anyway.
> But since new("externalptr") exists and seems to work, wouldn't that be
> better if it was really creating a new instance at each call?
> 
> Thanks!
> H.
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Problem with new("externalptr")

2008-01-29 Thread Herve Pages
Hi,

It seems that new("externalptr") is always returning the same instance, and
not a new one as one would expect from a call to new(). Of course this is hard
to observe:

  > new("externalptr")
  
  > new("externalptr")
  

since not a lot of details are displayed.

For example, it's easy to see that 2 consecutive calls to new("environment")
create different instances:

  > new("environment")
  
  > new("environment")
  

But for new("externalptr"), I had to use the following C routine:

  SEXP sexp_address(SEXP s)
  {
SEXP ans;
char buf[40];

snprintf(buf, sizeof(buf), "%p", s);
PROTECT(ans = NEW_CHARACTER(1));
SET_STRING_ELT(ans, 0, mkChar(buf));
UNPROTECT(1);
return ans;
  }

Then I get:

  > .Call("sexp_address", new("externalptr"))
  [1] "0xde2ce0"
  > .Call("sexp_address", new("externalptr"))
  [1] "0xde2ce0"

Isn't that wrong?

I worked around this problem by writing the following C routine:

  SEXP xp_new()
  {
return R_MakeExternalPtr(NULL, R_NilValue, R_NilValue);
  }

so I can create new "externalptr" instances from R with:

  .Call("xp_new")

I understand that there is not much you can do from R with an "externalptr"
instance and that you will have to manipulate them at the C level anyway.
But since new("externalptr") exists and seems to work, wouldn't that be
better if it was really creating a new instance at each call?

Thanks!
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Pb with defineVar() example in the "Writing R Extensions" manual

2008-01-16 Thread Herve Pages
Peter Dalgaard wrote:
> Herve Pages wrote:
[...]
>> So everybody seems to assume that SET_ELEMENT(), SET_STRING_ELT(),
>> SET_NAMES(), etc... can't (and will never) trigger garbage collection.
>> But what about defineVar()? More generally, how do I know this for the
>> functions/macros listed in Rdefines.h and Rinternals.h?
>>  
> Yes, that point was well taken. You can't know unless we write it
> somewhere. You can read the sources, but that is no guarantee that it
> won't change in future versions. We should be better at documenting that
> stuff, if for no other reason then to be able to distinguish between
> YOUR bugs and OUR bugs.
> If you grep through the sources, you'll find a number of instances of
> things like
> 
> defineVar(name, mkPromise(...), ...)
> 
> which would break along with your case, but that is of course only an
> indication, not proof that you can actually leave the value unprotected.
> 

Agreed. Thanks for your reply!

H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Pb with defineVar() example in the "Writing R Extensions" manual

2008-01-16 Thread Herve Pages
Peter Dalgaard wrote:
> Herve Pages wrote:
>> Hi Peter,
>>
>> Peter Dalgaard wrote:
>>   
>>> Herve Pages wrote:
>>> 
>>>> Hi,
>>>>
>>>> I'm wondering if this code from the "Writing R Extensions" manual
>>>> is really safe:
>>>>
>>>>  SEXP mkans(double x)
>>>>  {
>>>>  SEXP ans;
>>>>  PROTECT(ans = allocVector(REALSXP, 1));
>>>>  REAL(ans)[0] = x;
>>>>  UNPROTECT(1);
>>>>  return ans;
>>>>  }
>>>>
>>>>  double feval(double x, SEXP f, SEXP rho)
>>>>  {
>>>>  defineVar(install("x"), mkans(x), rho);
>>>>  return(REAL(eval(f, rho))[0]);
>>>>  }
>>>>
>>>> In C, the order in which function arguments are evaluated before the
>>>> function itself is called is undefined. Hence there is no guarantee
>>>> that install("x") will be evaluated before mkans(x). What happens if
>>>> mkans(x) is evaluated first? Then install("x") will be called and
>>>> eventually trigger garbage collection while the SEXP returned by
>>>> mkans(x) is still unprotected.
>>>>
>>>> I'm asking because I'm getting all sorts of problems with
>>>>
>>>>   defineVar(install(somekey), mkans(x), rho);
>>>>
>>>> In my code this line is inside a big loop (hundred of thousands of
>>>> iterations) so I end up with a lot of symbols in the rho environment.
>>>>
>>>> The problems I've seen are hard to reproduce: sometimes it's a segfault,
>>>> sometimes a "cons memory exhausted" error, or sometimes everything looks
>>>> fine except that, later, when I retrieve values from the rho environment
>>>> with findVar(), some of them are altered!
>>>>
>>>> But if I replace the above line by:
>>>>
>>>>   PROTECT(ans = mkans(x));
>>>>   defineVar(install(somekey), ans, rho);
>>>>   UNPROTECT(1);
>>>>
>>>> then everything works fine :-)
>>>>
>>>>   
>>>>   
>>> Sounds like you are right. You don't really have the "smoking gun", but
>>> it doesn't seem to be worth trying to catch the actual bug in action
>>> with hardware watchpoints and whatnot.
>>>
>>> The opposite fix should work too (does it?):
>>>
>>> { SEXP sym = install(somekey) ; defineVar(sym, mkans(x), rho);}
>>> 
>> So now you are protected against install(somekey) eventually triggering
>> garbage collection but you are still not protected against defineVar() itself
>> triggering garbage collection. Maybe defineVar() does not do that, and will
>> never do it, but isn't it risky to rely on this kind of assumption?
>>
>> Thanks!
>> H.
>>   
> That's not the problem you raised (argument evaluation order), but
> there's a CONS inside defineVar, and as far as I can see, it doesn't
> protect its arguments, so you could well be right.

This problem is related to my original problem since it would cause the same
disaster: garbage collection on my unprotected SEXP.
The more general problem I'm facing is to know whether or not it is safe to
use a function like mkans() (that returns an unprotected SEXP) like this:

  SET_ELEMENT(ans, 0, mkans(x));

In the case of SET_ELEMENT() or SET_STRING_ELT() it seems to be safe. For
example I've seen this

  SET_STRING_ELT(ans, 0, mkChar(buf));

in many places. So I'm using it too, even if the SEXP returned by mkChar()
is not protected.
Same here:

  SET_ELEMENT(ans, 0, duplicate(x));

The SEXP returned by duplicate() is not protected.

So everybody seems to assume that SET_ELEMENT(), SET_STRING_ELT(),
SET_NAMES(), etc... can't (and will never) trigger garbage collection.
But what about defineVar()? More generally, how do I know this for the
functions/macros listed in Rdefines.h and Rinternals.h?

Thanks!
H.

> 
>>   
>>> (I don't think you need to  PROTECT elements in the symbol table)
>>>
>>> 
>>   
> 
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Pb with defineVar() example in the "Writing R Extensions" manual

2008-01-16 Thread Herve Pages
Hi Peter,

Peter Dalgaard wrote:
> Herve Pages wrote:
>> Hi,
>>
>> I'm wondering if this code from the "Writing R Extensions" manual
>> is really safe:
>>
>>  SEXP mkans(double x)
>>  {
>>  SEXP ans;
>>  PROTECT(ans = allocVector(REALSXP, 1));
>>  REAL(ans)[0] = x;
>>  UNPROTECT(1);
>>  return ans;
>>  }
>>
>>  double feval(double x, SEXP f, SEXP rho)
>>  {
>>  defineVar(install("x"), mkans(x), rho);
>>  return(REAL(eval(f, rho))[0]);
>>  }
>>
>> In C, the order in which function arguments are evaluated before the
>> function itself is called is undefined. Hence there is no guarantee
>> that install("x") will be evaluated before mkans(x). What happens if
>> mkans(x) is evaluated first? Then install("x") will be called and
>> eventually trigger garbage collection while the SEXP returned by
>> mkans(x) is still unprotected.
>>
>> I'm asking because I'm getting all sorts of problems with
>>
>>   defineVar(install(somekey), mkans(x), rho);
>>
>> In my code this line is inside a big loop (hundred of thousands of
>> iterations) so I end up with a lot of symbols in the rho environment.
>>
>> The problems I've seen are hard to reproduce: sometimes it's a segfault,
>> sometimes a "cons memory exhausted" error, or sometimes everything looks
>> fine except that, later, when I retrieve values from the rho environment
>> with findVar(), some of them are altered!
>>
>> But if I replace the above line by:
>>
>>   PROTECT(ans = mkans(x));
>>   defineVar(install(somekey), ans, rho);
>>   UNPROTECT(1);
>>
>> then everything works fine :-)
>>
>>   
> Sounds like you are right. You don't really have the "smoking gun", but
> it doesn't seem to be worth trying to catch the actual bug in action
> with hardware watchpoints and whatnot.
> 
> The opposite fix should work too (does it?):
> 
> { SEXP sym = install(somekey) ; defineVar(sym, mkans(x), rho);}

So now you are protected against install(somekey) eventually triggering
garbage collection but you are still not protected against defineVar() itself
triggering garbage collection. Maybe defineVar() does not do that, and will
never do it, but isn't it risky to rely on this kind of assumption?

Thanks!
H.

> 
> (I don't think you need to  PROTECT elements in the symbol table)
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Pb with defineVar() example in the "Writing R Extensions" manual

2008-01-15 Thread Herve Pages
Hi,

I'm wondering if this code from the "Writing R Extensions" manual
is really safe:

 SEXP mkans(double x)
 {
 SEXP ans;
 PROTECT(ans = allocVector(REALSXP, 1));
 REAL(ans)[0] = x;
 UNPROTECT(1);
 return ans;
 }

 double feval(double x, SEXP f, SEXP rho)
 {
 defineVar(install("x"), mkans(x), rho);
 return(REAL(eval(f, rho))[0]);
 }

In C, the order in which function arguments are evaluated before the
function itself is called is undefined. Hence there is no guarantee
that install("x") will be evaluated before mkans(x). What happens if
mkans(x) is evaluated first? Then install("x") will be called and
eventually trigger garbage collection while the SEXP returned by
mkans(x) is still unprotected.

I'm asking because I'm getting all sorts of problems with

  defineVar(install(somekey), mkans(x), rho);

In my code this line is inside a big loop (hundred of thousands of
iterations) so I end up with a lot of symbols in the rho environment.

The problems I've seen are hard to reproduce: sometimes it's a segfault,
sometimes a "cons memory exhausted" error, or sometimes everything looks
fine except that, later, when I retrieve values from the rho environment
with findVar(), some of them are altered!

But if I replace the above line by:

  PROTECT(ans = mkans(x));
  defineVar(install(somekey), ans, rho);
  UNPROTECT(1);

then everything works fine :-)

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Change in write.dcf() that breaks existing code

2007-11-26 Thread Herve Pages
Hi,

When writing a DCF file one record at a time, the records used
to be separated by an empty line in R-2.5, but not in R-2.6:

  x <- data.frame(aa=letters[1:3], ii=1:3)

With R-2.5:

  > x <- data.frame(aa=letters[1:3], ii=1:3)
  > for (i in seq_len(nrow(x))) write.dcf(x[i, ,drop=FALSE])
  aa: a
  ii: 1

  aa: b
  ii: 2

  aa: c
  ii: 3

With R-2.6:

  > for (i in seq_len(nrow(x))) write.dcf(x[i, ,drop=FALSE])
  aa: a
  ii: 1
  aa: b
  ii: 2
  aa: c
  ii: 3

Unfortunately, because of this change, code that used to be OK
now produces broken DCF files.
To add to the frustration, nothing is mentioned in the man page
or in the NEWS file:

  CHANGES IN R VERSION 2.6.0

o   write.dcf() no longer writes NA fields (PR#9796), and works
correctly on empty descriptions.

Although it seems that it was a deliberate choice. Found in base/R/dcf.R:
...
if(length(eor)) {
## Newline for end of record.
## Note that we do not write a trailing blank line.
eor[ diff(c(col(out))[is_not_empty]) >= 1 ] <- "\n"
}
...

What's the benefit of this change?

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] install.packages() and configure.args

2007-10-19 Thread Herve Pages
Hi,

In the case where install.packages("packageA") also needs to install
required package "packageB", then what is passed thru the 'configure.args'
argument seems to be lost when it's the turn of packageA to be installed
(the last package to get installed).

This is not easy to reproduce but let's say you have the graphviz libraries
installed on your system, but you don't have the graph package installed yet.
Then this

  install.packages("Rgraphviz",
   rep="http://bioconductor.org/packages/2.1/bioc";,
   configure.args="--with-graphviz=/some/non/standard/place")

will fail because --with-graphviz=/some/non/standard/place doesn't seem to be
passed to Rgraphviz's configure script. But if you already have the graph 
package,
then it will work.

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] 'R CMD build' and file permissions

2007-10-05 Thread Herve Pages
Hi,

When building a source package on Linux with 'R CMD build',
the files in the resulting tarball don't have the original
permissions.
The problem is that the packages I want to build include an SQLite
data base (an .sqlite file) and, before I run 'R CMD build', I've
made this file read-only (chmod 444) because I want this DB to be
read-only. Then if I install by running 'R CMD INSTALL' directly on
the source directory, everything works as expected (the permissions
of the installed .sqlite file are conserved). But if I run 'R CMD build'
in order to produce the tarballs (I need to distribute those packages),
then, when the user will install them, the SQLite DBs will not be
read-only anymore.

Is there a way to prevent 'R CMD build' from changing the permissions
of the source files?
Thanks in advance!

Cheers,
H

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] breaking my "length" method also breaks str()

2007-09-26 Thread Herve Pages
Hi,

This works fine and gives 26:

  setClass("A", representation(bidule="character"))
  setMethod("length", "A", function(x) length([EMAIL PROTECTED]))
  a <- new("A", bidule=letters)
  str(a)
  length(a)

But if my "length" method is broken, then str() stops working:

  > setMethod("length", "A", function(x) length([EMAIL PROTECTED]))
  > str(a)
  Error in length(object) :
no slot of name "bidulle" for this object of class "A"

Doesn't seem like a big deal on such a simple example but this can be
really problematic. I already had the situation where the "length" method
is in fact very complicated, with a lot of nested calls and one access to
a database and an error occurring at a deep level. I noticed that my program
had a problem but I was not yet aware that "length" was broken. So I started
the debug session and tried to use str() to look inside my objects... but
str() was not working! Which makes a debugging session with a lot of S4 stuff
all around the place quite painful :-/

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Virtual class with no slots

2007-09-26 Thread Herve Pages
Hi,

It seems there are 2 slightly different ways to create a virtual
class with no slots (a kind of Java "interface"): with or without
specifying 'representation("VIRTUAL")'.

  > setClass("A", representation("VIRTUAL"))
  [1] "A"

  > showClass("A")
  Virtual Class

  No Slots, prototype of class "S4"

  > setClass("B")
  [1] "B"

  > showClass("B")
  Virtual Class

  No Slots, prototype of class "NULL"

Note that:

  - R automatically decided to make B virtual, a strange design decision
(IMO) that could have been motivated (I'm trying to guess here) by the
fact that it is probably not very useful to create objects with no slots.
But I would argue that it could also be left to the user to decide whether
or not it is a good idea to create such objects (maybe s/he has a use case).

  - Surprisingly, A and B descriptions are slightly different: A's prototype
is of class "S4" and B's prototype is of class "NULL".

So my question is (just curiosity): why do we have 2 different forms
(leading to different results) to create a virtual class with no slots.
And which one is better?

Also note that, if I try to extend "A" without adding any new slot:

  > setClass("AA", contains="A")
  [1] "AA"

  > showClass("AA")

  No Slots, prototype of class "S4"

  Extends: "A"

now it seems that, yes, it is possible to create a non-virtual class with
no slots! ( it's not easy though :-b ) So I'm wondering why this was not
possible with 'setClass("B")' or 'setClass("B", representation())' in the
first place.

Finally, if I try to instance "AA":

  > new("AA")
  An object of class "AA"
  Error in getClass(class(object)) : "S4" is not a defined class

Oops! Seems that non-virtual classes with no slots are broken anyway :-/

Cheers,
H.


> sessionInfo()
R version 2.6.0 beta (2007-09-19 r42914)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] rcompgen_0.1-15

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] rggobi not in bin/macosx/universal/contrib/2.6/PACKAGES on CRAN

2007-09-26 Thread Herve Pages
hadley wickham wrote:
[...]
> I've fixed the problem with GGobi, and Simon has updated the version
> on CRAN, so you should be able to get a binary version again v. soon.

Thanks!

H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] rggobi not in bin/macosx/universal/contrib/2.6/PACKAGES on CRAN

2007-09-26 Thread Herve Pages
Hi Simon,

Simon Urbanek wrote:
> Actually, the fact that the tar ball is there must be a mirroring
> problem, because it's not on the master CRAN server. You should fix your
> mirror - objects may appear closer ... ;)

Yesterday, before I reported the problem, I checked and found the same problem 
on
other mirrors too (i.e. the package was here but it was not listed in PACKAGES).
Today the problem seems to be gone (again, I've checked a few mirrors) so it's
hard to know what happened exactly.

When I say that the problem is gone, I mean that PACKAGES now seems to be in 
sync with
the real content of the bin/macosx/universal/contrib/2.6/ folder. But the 
problem
with ggobi itself is still here (now, as expected, the ggobi tarball is not in
the CRAN folder for Mac OS X packages anymore).

Thanks!

H.


> 
> Cheers,
> Simon
> 
> On Sep 25, 2007, at 4:28 PM, Simon Urbanek wrote:
> 
>>
>> On Sep 25, 2007, at 12:58 PM, Herve Pages wrote:
>>
>>> Hi,
>>>
>>> R-2.6 + install.packages() doesn't find rggobi on Mac OS X.
>>> The .tgz file is here:
>>>
>>>   http://cran.fhcrc.org/bin/macosx/universal/contrib/2.6/
>>>
>>> but it is not listed in the PACKAGES file:
>>>
>>>   http://cran.fhcrc.org/bin/macosx/universal/contrib/2.6/PACKAGES
>>>
>>> Any idea why?
>>>
>>
>> Yes, because it doesn't work - please see the check results:
>> http://cran.r-project.org/src/contrib/checkSummary.html
>>
>> I have verified it today and the ggobi binary from the ggobi pages is
>> still not fixed.
>>
>> Cheers,
>> Simon
>>
> 
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] rggobi not in bin/macosx/universal/contrib/2.6/PACKAGES on CRAN

2007-09-25 Thread Herve Pages
Hi,

R-2.6 + install.packages() doesn't find rggobi on Mac OS X.
The .tgz file is here:

  http://cran.fhcrc.org/bin/macosx/universal/contrib/2.6/

but it is not listed in the PACKAGES file:

  http://cran.fhcrc.org/bin/macosx/universal/contrib/2.6/PACKAGES

Any idea why?

Thanks!
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Compilation error with R-devel_2007-08-12.tar.gz snapshot

2007-08-13 Thread Herve Pages
Problem gone with new snapshot (2007-08-13, r42496). Thanks!

H.

Herve Pages wrote:
> Hi,
> 
> I get a compilation error with last available R devel
> snapshot (R-devel_2007-08-12.tar.gz, r42483):
> 
> 
>  CONFIGURE 
> 
> [EMAIL PROTECTED]:~/R-2.6.broken> ~/src/R-2.6.r42483/configure
> checking build system type... x86_64-unknown-linux-gnu
> checking host system type... x86_64-unknown-linux-gnu
> [...]
> R is now configured for x86_64-unknown-linux-gnu
> 
>   Source directory:  /home/hpages/src/R-2.6.r42483
>   Installation directory:/usr/local
> 
>   C compiler:gcc -std=gnu99  -g -O2
>   Fortran 77 compiler:   gfortran  -g -O2
> 
>   C++ compiler:  g++  -g -O2
>   Fortran 90/95 compiler:gfortran -g -O2
>   Obj-C compiler:gcc -g -O2
> 
>   Interfaces supported:  X11, tcltk
>   External libraries:readline
>   Additional capabilities:   PNG, JPEG, iconv, MBCS, NLS
>   Options enabled:   shared BLAS, R profiling, Java
> 
>   Recommended packages:  yes
> 
> 
> == MAKE ===
> 
> [EMAIL PROTECTED]:~/R-2.6.broken> make
> [...]
> make[1]: Entering directory 
> `/home/hpages/R-2.6.broken/src/library/Recommended'
> make[2]: Entering directory 
> `/home/hpages/R-2.6.broken/src/library/Recommended'
> make[2]: *** No rule to make target `VR.ts', needed by `stamp-recommended'.  
> Stop.
> make[2]: Leaving directory `/home/hpages/R-2.6.broken/src/library/Recommended'
> make[1]: *** [recommended-packages] Error 2
> make[1]: Leaving directory `/home/hpages/R-2.6.broken/src/library/Recommended'
> make: *** [stamp-recommended] Error 2
> 
> 
> I have no problems with older tarballs e.g. tarball from 2007-08-06 (r42439)
> compiles fine.
> 
> Cheers,
> H.
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Compilation error with R-devel_2007-08-12.tar.gz snapshot

2007-08-13 Thread Herve Pages
Hi,

I get a compilation error with last available R devel
snapshot (R-devel_2007-08-12.tar.gz, r42483):


 CONFIGURE 

[EMAIL PROTECTED]:~/R-2.6.broken> ~/src/R-2.6.r42483/configure
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
[...]
R is now configured for x86_64-unknown-linux-gnu

  Source directory:  /home/hpages/src/R-2.6.r42483
  Installation directory:/usr/local

  C compiler:gcc -std=gnu99  -g -O2
  Fortran 77 compiler:   gfortran  -g -O2

  C++ compiler:  g++  -g -O2
  Fortran 90/95 compiler:gfortran -g -O2
  Obj-C compiler:gcc -g -O2

  Interfaces supported:  X11, tcltk
  External libraries:readline
  Additional capabilities:   PNG, JPEG, iconv, MBCS, NLS
  Options enabled:   shared BLAS, R profiling, Java

  Recommended packages:  yes


== MAKE ===

[EMAIL PROTECTED]:~/R-2.6.broken> make
[...]
make[1]: Entering directory `/home/hpages/R-2.6.broken/src/library/Recommended'
make[2]: Entering directory `/home/hpages/R-2.6.broken/src/library/Recommended'
make[2]: *** No rule to make target `VR.ts', needed by `stamp-recommended'.  
Stop.
make[2]: Leaving directory `/home/hpages/R-2.6.broken/src/library/Recommended'
make[1]: *** [recommended-packages] Error 2
make[1]: Leaving directory `/home/hpages/R-2.6.broken/src/library/Recommended'
make: *** [stamp-recommended] Error 2


I have no problems with older tarballs e.g. tarball from 2007-08-06 (r42439)
compiles fine.

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Embedded nuls in strings

2007-08-07 Thread Herve Pages
Duncan Murdoch wrote:
> On 07/08/2007 6:29 PM, Herve Pages wrote:
[...]
>> Same for serialization:
>>
>>> save(string0, file="string0.rda")
>>> load("string0.rda")
>>> string0
>> [1] "ABCD"
> 
> Of these, I'd say the serialization is the only case where it would be
> reasonable to fix the behaviour.  R depends on C run-time functions for
> most of the string operations, and they'll stop at a null.  So if this
> isn't documented behaviour, it should be, but it's not reasonable to
> rewrite the C run-time string functions just to handle such weird
> objects.  Functions like "grep" require thousands of lines of code, not
> written by us, and in my opinion maintaining changes to it is not
> something the R project should take on.

I was not (of course) suggesting to fix all the string manipulation functions.
I'm just wondering why R would try to support embedded nuls in the first
place given that they can only be a source of troubles.

What about this:

  > string0
  [1] "ABCD\0F"
  > string0 == "ABCD"
  [1] TRUE

string0 is obviously different from "ABCD"!

Maybe it's easier to change the semantic of rawToChar() so it doesn't return
a string with embedded nuls. More generally speaking, base functions should
always return "clean" strings.

> 
> As to serialization:  there's a comment in the source that embedded
> nulls are handled by it, and that's true up to R-patched, but not in
> R-devel.  Looks like someone has introduced a bug.
> 
> Duncan Murdoch
>>
>> One comment about the nchar man page:
>>   'chars' The number of human-readable characters.
>>
>> "human-readable" seems to be used for "everything but a nul" here
>> which can be confusing.
>> For example one would generally think of ascii codes 1 to 31 as non
>> "human-readable" but
>> nchar() seems to disagree:
>>
>>> string1 <- rawToChar(as.raw(1:31))
>>> string1
>> [1]
>> "\001\002\003\004\005\006\a\b\t\n\v\f\r\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037"
>>
>>> nchar(string1, type="chars")
>> [1] 31
> 
> No, "human-readable" also has other meanings in multi-byte encodings. If
> an e-acute is encoded in two bytes in your locale, it still only counts
> as one human-readable character.
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Embedded nuls in strings

2007-08-07 Thread Herve Pages
Duncan Murdoch wrote:
> On 07/08/2007 5:06 PM, Herve Pages wrote:
>> Hi,
>>
>> ?rawToChar
>>  'rawToChar' converts raw bytes either to a single character string
>>  or a character vector of single bytes.  (Note that a single
>>  character string could contain embedded nuls.)
>>
>> Allowing embedded nuls in a string might be an interesting experiment
>> but it
>> seems to cause some troubles to most of the string manipulation
>> functions.
>>
>> A string with an embedded 0:
>>
>>   raw0 <- as.raw(c(65:68, 0 , 70))
>>   string0 <- rawToChar(raw0)
>>
>>> string0
>> [1] "ABCD\0F"
>>
>> nchar() should return 6:
>>> nchar(string0)
>> [1] 4
> 
> You don't state your R version.  The default type of counting in nchar()
> has recently changed from "bytes" (where 6 is correct) to "chars" (where
> 4 is correct).


Oops, sorry:

> sessionInfo()
R version 2.6.0 Under development (unstable) (2007-07-02 r42107)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] rcompgen_0.1-15


And indeed:
  raw0 <- as.raw(c(65:68, 0 , 70))
  string0 <- rawToChar(raw0)

> nchar(string0, type="chars")
[1] 4
> nchar(string0, type="bytes")
[1] 6


In addition to the string functions already mentioned before, it's worth noting 
that
'paste' doesn't seem to be "embedded nul aware" neither:

> paste(string0, "G", sep="")
[1] "ABCDG"

Same for serialization:

> save(string0, file="string0.rda")
> load("string0.rda")
> string0
[1] "ABCD"

One comment about the nchar man page:
  'chars' The number of human-readable characters.

"human-readable" seems to be used for "everything but a nul" here which can be 
confusing.
For example one would generally think of ascii codes 1 to 31 as non 
"human-readable" but
nchar() seems to disagree:

> string1 <- rawToChar(as.raw(1:31))
> string1
[1]
"\001\002\003\004\005\006\a\b\t\n\v\f\r\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037"
> nchar(string1, type="chars")
[1] 31


Cheers,
H.


> 
> Duncan Murdoch
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Embedded nuls in strings

2007-08-07 Thread Herve Pages
Hi,

?rawToChar
 'rawToChar' converts raw bytes either to a single character string
 or a character vector of single bytes.  (Note that a single
 character string could contain embedded nuls.)

Allowing embedded nuls in a string might be an interesting experiment but it
seems to cause some troubles to most of the string manipulation functions.

A string with an embedded 0:

  raw0 <- as.raw(c(65:68, 0 , 70))
  string0 <- rawToChar(raw0)

> string0
[1] "ABCD\0F"

nchar() should return 6:
> nchar(string0)
[1] 4

In addition this embedded nul seems to break almost all string 
manipulation/searching
functions:
  grep("F", string0)
  strsplit(string0, split=NULL, fixed=TRUE)[[1]]
  tolower(string0)
  chartr("F", "x", string0)
  substr(string0, 6, 6)
  ...
  etc...

Not very surprisingly, they all seem to treat string0 as if it was "ABCD"!

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] seq_along() doesn't use _my_ length

2007-07-12 Thread Herve Pages
Hi,

According to seq_along man page, 'seq_along(x)' is equivalent to
'seq_len(length(x))' but apparently not if 'x' is an S4 object with
a defined "length" method:

  > seq_along(letters[11:15])
  [1] 1 2 3 4 5

  > setClass("A", representation(titi="character"))
  [1] "A"

  > setMethod("length", "A", function(x) length([EMAIL PROTECTED]))
  [1] "length"

  > a <- new("A", titi=letters[11:15])

  > length(a)
  [1] 5

  > seq_along(a)
  [1] 1

Thanks!
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Generics in base with no ...

2007-07-12 Thread Herve Pages
Duncan Murdoch wrote:
> On 12/07/2007 6:17 PM, Herve Pages wrote:
>> Hi,
>>
>> Some generics in base that don't have the ... extra argument:
>> rev(), t(), scale() and unlist(). Is there any plan to make these
>> more reusable? I used to be interested in having a rev() method for
>> my objects, but since I needed an extra argument for it, then I was
>> forced to create my own generic instead. And because I didn't want
>> to mask base::rev(), I chose another name too. The only advantage
>> of doing this is that the man page for myrev() was more accessible
>> than if I had made a rev() method. But it would be nice to make the
>> rev() generic more reusable anyway, just because the name "rev"
>> itself is good and easy to remember.
> 
> It would be helpful to give specific examples of use cases.  As far as I
> know, adding the ... makes every call slower; this is acceptable if
> there's a good reason to do it, but I think we should think about it.
> 
> rev() in particular seems as though it should always take an ordered
> thing and return the same kind of thing but in the reversed order:  why
> would you want to add optional args, and still call it rev?  If you're
> using some user-specified ordering, wouldn't it be better to use sort(),
> instead?

My objects have a name which is stored in a slot. When I revert it, I need
to rename it too because I don't want to end up with 2 different objects
having the same name. I could do y <- rev(x); [EMAIL PROTECTED] <- "new name" 
but
this is not conceptually very good: the action of reversing _and_ renaming
should be atomic. So I wanted to add an 'objName' arg to my rev() method.

You could also imagine that the individual elements of the ordered thing
you want to revert are themselves ordered things. So maybe you just want
to reverse the top level elements or you want this to be recursive ("deep"
reversing). Then you need an extra argument like the 'recursive' arg of
the unlist() generic to control this.

Anyway, I was not aware of the performance penalty introduced by adding ...
so I understand now why these generics have not been modified so far.

Thanks!
H.


> 
> Duncan Murdoch
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Generics in base with no ...

2007-07-12 Thread Herve Pages
Hi,

Some generics in base that don't have the ... extra argument:
rev(), t(), scale() and unlist(). Is there any plan to make these
more reusable? I used to be interested in having a rev() method for
my objects, but since I needed an extra argument for it, then I was
forced to create my own generic instead. And because I didn't want
to mask base::rev(), I chose another name too. The only advantage
of doing this is that the man page for myrev() was more accessible
than if I had made a rev() method. But it would be nice to make the
rev() generic more reusable anyway, just because the name "rev"
itself is good and easy to remember.

Thanks!

H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] [[.data frame and row names

2007-07-11 Thread Herve Pages
Hi,

I'm wondering why indexing a data frame by row name doesn't work
with [[. It works with [:

  > sw <- swiss[1:5,1:2]
  > sw["Moutier", "Agriculture"]
  [1] 36.5

but not with [[:

  > sw[["Moutier", "Agriculture"]]
  Error in .subset2(.subset2(x, ..2), ..1) : subscript out of bounds

The problem is really with the row name (and not the col name) since
this works:

  > sw[[4, "Agriculture"]]
  [1] 36.5

but not this:

  > sw[["Moutier", 2]]
  Error in .subset2(.subset2(x, ..2), ..1) : subscript out of bounds

No such problems with a matrix where everything works as expected:

  > msw <- as.matrix(sw)
  > msw[["Moutier", "Agriculture"]]
  [1] 36.5

Thanks!

H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Missing args with a default value in S4 methods

2007-06-15 Thread Herve Pages
John Chambers wrote:
> This has essentially nothing to do with methods, but rather with the
> treatment of missing arguments.
> 
> Consider:
>> foo <- function(x,...)bar(x,...)
>> bar <- function(x, y=12, z, ...) {cat(missing(y), "\n"); cat(y, "\n")}
> 
> This is the same argument-matching as your example, since the generic
> and method have different formal arguments.  And indeed,
> 
>> foo("a",,z=99)
> TRUE
> Error in cat(y, "\n") : argument is missing, with no default
> 
> The error message is correct, but the argument in question is not "y"
> but "..1".  This is constructed and passed down as a special R object
> representing "missing-argument-with-no-default".   (Splus would have
> worked as you expected, because missingness there is a property of the
> function call, not of the object corresponding to the formal argument.)

Thanks John for the clarification! I can see why, _technically speaking_, things
behave how they behave.

Note that what happens with default arguments in methods is not always the same
as with normal functions so it's not always possible to predict what is actually
going to happen... Here is an example:

1) Default arg in the method:

   o generic + method:
 > bar <- function(x, y=12, z) {cat(missing(y), "\n"); cat(x, y, z, "\n")}
 > setGeneric("mygen", signature=c("x", "z"), function(x, y, z) 
standardGeneric("mygen"))
 > setMethod("mygen", c("ANY", "ANY"), bar)
 > mygen("aa", , "bb")
 TRUE
 Error in cat(x, y, z, "\n") : argument "y" is missing, with no default

   o normal functions:
 > foo <- function(x, y, z) bar(x, y, z)
 > foo("aa", ,"bb")
 TRUE
 Error in cat(x, y, z, "\n") : argument "y" is missing, with no default

   Behaviour is the same.

2) Default arg in the generic:

   o generic + method: example 1) shows that if I want a default value
 for y, it should be put in the generic rather than in the method:
 > bar <- function(x, y, z) {cat(missing(y), "\n"); cat(x, y, z, "\n")}
 > setGeneric("mygen", signature=c("x", "z"), function(x, y=12, z) 
standardGeneric("mygen"))
 > setMethod("mygen", c("ANY", "ANY"), bar)
 > mygen("aa", , "bb")
 TRUE
 aa 12 bb

   o normal functions:
 > foo <- function(x, y=12, z) bar(x, y, z)
 > foo("aa", ,"bb")
 FALSE
 aa 12 bb

   Behaviour is _almost_ the same!

3) Default arg in the generic _and_ in the method:

   o generic + method:
 > bar <- function(x, y=999, z) {cat(missing(y), "\n"); cat(x, y, z, "\n")}
 > setMethod("mygen", c("ANY", "ANY"), bar)
 > mygen("aa", , "bb")
 TRUE
 aa 999 bb

 Not what I would expect!

   o normal functions:
 > foo("aa", ,"bb")
 FALSE
 aa 12 bb

 Much better.

I'm sure there is a _technical_ explanation for this (with probably some lazy 
evaluation
involved) but I find the current behaviour confusing and very hard to predict.

Cheers,
H.


> 
> 
> Herve Pages wrote:
>> Hi,
>>
>>
>> Strange things happen with missing args in S4 methods:
>>
>>   > setGeneric("mygen", signature="x", function(x, ...)
>> standardGeneric("mygen"))
>>   [1] "mygen"
>>
>>   > setMethod("mygen", "character", function(x, y=12, z, ...)
>> {cat(missing(y), "\n"); cat(y, "\n")})
>>   [1] "mygen"
>>
>>   > mygen("aa", z=99)
>>   TRUE
>>   12
>>
>>   > mygen("aa", , 99)
>>   TRUE
>>   Error in cat(y, "\n") : argument is missing, with no default
>>   ^^^   ^^
>>TRUE  NOT TRUE!
>>
>>
>> For "normal" functions, things work as expected:
>>
>>   > myfun <- function(x, y=12, z, ...) {cat(missing(y), "\n"); cat(y,
>> "\n")}
>>
>>   > myfun("aa", z=99)
>>   TRUE
>>   12
>>
>>   > myfun("aa", , 99)
>>   TRUE
>>   12
>>
>> And with S3 generics too:
>>
>>   > dd <- data.frame(aa=letters[1:9], ii=9:1)
>>   > head(dd, z="ignored")
>> aa ii
>>   1  a  9
>>   2  b  8
>>   3  c  7
>>   4  d  6
>>   5  e  5
>>   6  f  4
>>
>>   > head(dd, , "ignored")
>> aa ii
>>   1  a  9
>>   2  b  8
>>   3  c  7
>>   4  d  6
>>   5  e  5
>>   6  f  4
>>
>> Cheers,
>> H.
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>   
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Missing args with a default value in S4 methods

2007-06-14 Thread Herve Pages
Hi,


Strange things happen with missing args in S4 methods:

  > setGeneric("mygen", signature="x", function(x, ...) 
standardGeneric("mygen"))
  [1] "mygen"

  > setMethod("mygen", "character", function(x, y=12, z, ...) {cat(missing(y), 
"\n"); cat(y, "\n")})
  [1] "mygen"

  > mygen("aa", z=99)
  TRUE
  12

  > mygen("aa", , 99)
  TRUE
  Error in cat(y, "\n") : argument is missing, with no default
  ^^^   ^^
   TRUE  NOT TRUE!


For "normal" functions, things work as expected:

  > myfun <- function(x, y=12, z, ...) {cat(missing(y), "\n"); cat(y, "\n")}

  > myfun("aa", z=99)
  TRUE
  12

  > myfun("aa", , 99)
  TRUE
  12

And with S3 generics too:

  > dd <- data.frame(aa=letters[1:9], ii=9:1)
  > head(dd, z="ignored")
aa ii
  1  a  9
  2  b  8
  3  c  7
  4  d  6
  5  e  5
  6  f  4

  > head(dd, , "ignored")
aa ii
  1  a  9
  2  b  8
  3  c  7
  4  d  6
  5  e  5
  6  f  4

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R.INSTALL on Windows

2007-06-13 Thread Herve Pages
Hi,

This has been addressed in R-2.5.0 and R-2.6.0. Thanks!

H.

Herve Pages wrote:
> Hi,
> 
> I'd like to point out a potential problem with the current R.INSTALL
> directory created by 'R CMD INSTALL' when applied to a source package.
> 
> On Windows (and, AFAIK, only on Windows), 'R CMD INSTALL 
> mypackage_1.0.0.tar.gz'
> creates a temporary R.INSTALL directory in the current directory. This
> directory is removed before the command terminates but only if it was
> successful.
> 
> One inconvenient of this behaviour is that you can get a clash when you try to
> build 2 Windows binary packages (with 'R CMD INSTALL --build') at the same 
> time
> in the same directory. I can circumvent the 00LOCK mechanism by providing
> a different --library option to the 2 commands but, if I'm running the 2 
> commands
> in the same directory, then they will be creating and using the same 
> R.INSTALL.
> Then, the first command who terminates will remove it and break the other 
> command.
> 
> Is there an easy way to prevent this?
> 
> Is there any reason why R.INSTALL is not kept at the end of 'R CMD INSTALL',
> or why 'R CMD INSTALL' doesn't use a package specific directory name
> (e.g. R.INSTALL.mypackage) like 'R CMD check' does, or why it doesn't
> use one of the "standard" temp places (defined by env. var. TMP or TMPDIR).
> 
> Thanks in advance!
> 
> Cheers,
> H.
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] C function with unknown output length

2007-06-06 Thread Herve Pages
Vincent Goulet wrote:
> Hi all,
> 
> Could anyone point me to one or more examples in the R sources of a C  
> function that is called without knowing in advance what will be the  
> length (say) of the output vector?
> 
> To make myself clearer, we have a C function that computes  
> probabilities until their sum gets "close enough" to 1. Hence, the  
> number of probabilities is not known in advance.
> 

Hi Vincent,

Let's say you want to write a function get_matches(const char * pattern, const 
char * x)
that will find all the occurrences of string 'pattern' in string 'x' and 
"return"
their positions in the form of an array of integers.
Of course you don't know in advance how many occurrences you're going to find.

One possible strategy is to:

  - Add an extra arg to 'get_matches' for storing the positions and make
'get_matches' return the number of matches (i.e. the length of *pos):

  int get_matches(int **pos_ptr, const char * pattern, const char * x)

Note that pos_ptr is a pointer to an int pointer.

  - In get_matches(): use a local array of ints and start with an arbitrary
initial size for it:

  int get_matches(...)
  {
int *tmp_pos, tmp_size, npos = 0;

tmp_size = some initial guess of the number of matches
tmp_pos = (int *) S_alloc((long) tmp_size, sizeof(int));
...

Then start searching for matches and every time you find one, store its
position in tmp_pos[npos] and increase npos.
When tmp_pos is full (npos == tmp_size), realloc with:

...
old_size = tmp_size;
tmp_size = 2 * old_size; /* there are many different strategies for 
this */
tmp_pos = (int *) S_realloc((char *) tmp_pos, (long) tmp_size,
(long) old_tmp_size, sizeof(int));
...

Note that there is no need to check that the call to S_alloc() or 
S_realloc()
were successful because these functions will raise an error and end the call
to .Call if they fail. In this case they will free the memory currently 
allocated
(and so will do on any error or user interrupt).

When you are done, just return with:

...
*pos_ptr = tmp_pos;
return npos;
  }

  - Call get_matches with:

  int *pos, npos;

  npos = get_matches(&pos, pattern, x);

Note that memory allocation took place in 'get_matches' but now you need
to decide how and when the memory pointed by 'pos' will be freed.
In the R environment, this can be addressed by using exclusively transient
storage allocation 
(http://cran.r-project.org/doc/manuals/R-exts.html#Transient)
as we did in get_matches() so the allocated memory will be automatically
reclaimed at the end of the call to .C or .Call.
Of course, the integers stored in pos have to be moved to a "safe" place
before .Call returns. Typically this will be done with something like:

  SEXP Call_get_matches(...)
  {
...
npos = get_matches(&pos, pattern, x);
PROTECT(pos_sxp = NEW_INTEGER(npos));
memcpy(INTEGER(pos_sxp), pos, npos * sizeof(int));
UNPROTECT(1);
return pos_sxp; /* end of call to .Call */
  }

There are many variations around this. One of them is to "share" pos and npos 
between
get_matches and its caller by making them global variables (in this case it is
recommended to use 'static' in their declarations but this requires that 
get_matches
and its caller are in the same .c file).

Hope this helps.

H.

> I would like to have an idea what is the best way to handle this  
> situation in R.
> 
> Thanks in advance!
> 
> ---
>Vincent Goulet, Associate Professor
>École d'actuariat
>Université Laval, Québec
>[EMAIL PROTECTED]   http://vgoulet.act.ulaval.ca
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] 'R CMD INSTALL mypkg' doesn't always update help pages

2007-06-05 Thread Herve Pages
Hi,

'R CMD INSTALL mypkg' and 'install.packages(mypkg, repos=NULL)' don't
update mypkg help pages when mypkg is a source directory. They only
install new help pages if there are some but they leave the already
installed pages untouched. So you end up with mixed man pages from
different versions of the package :-/

I found no mention of this in 'R CMD INSTALL --help' or in
'?install.packages' so it looks like a bug (or at least an
undocumented feature). Isn't it something that the user should be
able to control via the --no-docs option?

This problem can be worked around by first building the source
package with 'R CMD build mypkg' but...

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Unexpected alteration of data frame column names

2007-05-15 Thread Herve Pages
Hi,

I'm using data.frame(..., check.names=FALSE), because I want to create
a data frame with duplicated column names (in the real life you can get such
data frame as the result of an SQL query):

  > df <- data.frame(aa=1:5, aa=9:5, check.names=FALSE)
  > df
aa aa
  1  1  9
  2  2  8
  3  3  7
  4  4  6
  5  5  5

Why is [.data.frame changing my column names?

  > df[1:3, ]
aa aa.1
  1  19
  2  28
  3  37

How can this be avoided? Thanks!

H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R on Solaris 10 x64

2007-04-13 Thread Herve Pages
Hi David,

Tai-Wei (David) Lin wrote:
> Hi R Developers,
> 
> Greg is helping me with debugging R on Solaris 10 x64. Please let us
> know if you have any thoughts or tips that can help us debug this.
> 
> Thanks,
> 
> David
> 
> 
> 
> 
> Using default transfer plist
> in vector_io: permuting
> About to write
> 
>  *** caught segfault ***
> address e8554000, cause 'memory not mapped'
> 
> Traceback:
>  1: .External("do_hdf5save", call, sys.frame(sys.parent()), fileout,
>  ..., PACKAGE = "hdf5")
>  2: hdf5save(hdf5_Fstat, "Fstat", "geneNames", "genotype")
> aborting ...
> 
> 
> We've tried many things to debug it:
> 
> * dbx Runtime Checking (RTC) is not detecting any (meaningful) memory
> access problems that I can see.
> 
> * The same on Solaris/SPARC.
> 
> * Neither does Valgrind on Linux.
> 
> * I've tried increasing the C stack size, assuming R could be running
> out of stack size. Didn't help.
> 
> Running R under dbx (without RTC) until the crash shows this:
> 
> ...
> About to write
> [EMAIL PROTECTED] ([EMAIL PROTECTED]) signal SEGV (no mapping at the fault 
> address) in _memcpy at
> 0xfe90444b
> 0xfe90444b: _memcpy+0x006b: movaps   0x(%esi),%xmm0
> Current function is H5D_select_mgath
>   379   HDmemcpy(tgath_buf,buf+off[curr_seq],curr_len);
> (dbx) where
> current thread: [EMAIL PROTECTED]
>   [1] _memcpy(0x0, 0xfdebc707, 0x9f5c4f0), at 0xfe90444b
> =>[2] H5D_select_mgath(_buf = 0x9f79580, space = 0x8966770, iter =
> 0x8045980, nelmts = 3120U, dxpl_cache = 0xfe170078, _tgath_buf =
> 0x9f5c4f0), line 379 in "H5Dselect.c"
>   [3] H5D_contig_write(io_info = 0x804620c, nelmts = 3120ULL, mem_type =
> 0x97b05c8, mem_space = 0x8966770, file_space = 0x8966770, tpath =
> 0x8ee7078, src_id = 201326906, dst_id = 201326904, buf = 0x9f79580),
> line 1418 in "H5Dio.c"
>   [4] H5D_write(dataset = 0x8f169c0, mem_type_id = 201326906, mem_space
> = 0x8966770, file_space = 0x8966770, dxpl_id = 671088643, buf =
> 0x9f79580), line 952 in "H5Dio.c"
>   [5] H5Dwrite(dset_id = 335544330, mem_type_id = 201326906,
> mem_space_id = 0, file_space_id = 0, plist_id = 671088643, buf =
> 0x9f79580), line 586 in "H5Dio.c"
>   [6] vector_io(call = 0x97234ec, writeflag = 1, dataset = 335544330,
> space = 268435472, obj = 0x98386a0), line 535 in "hdf5.c"
>   [7] hdf5_write_vector(call = 0x97234ec, id = 67108867, symname =
> 0x9cf35d0 "geneNames", val = 0x98386a0), line 693 in "hdf5.c"
>   [8] hdf5_save_object(call = 0x97234ec, fid = 67108867, symname =
> 0x9cf35d0 "geneNames", val = 0x98386a0), line 957 in "hdf5.c"
>   [9] do_hdf5save(args = 0x9723284), line 1104 in "hdf5.c"
>   [10] do_External(call = 0x86d62bc, op = 0x8371cd8, args = 0x972340c,
> env = 0x9723594), line 832 in "dotcode.c"
>   [11] Rf_eval(e = 0x86d62bc, rho = 0x9723594), line 445 in "eval.c"
>   [12] Rf_evalList(el = 0x86d6230, rho = 0x9723594, op = 0x837226c),
> line 1463 in "eval.c"
>   [13] Rf_eval(e = 0x86d6214, rho = 0x9723594), line 438 in "eval.c"
>   [14] do_begin(call = 0x86d56bc, op = 0x836709c, args = 0x86d61dc, rho
> = 0x9723594), line 1107 in "eval.c"
>   [15] Rf_eval(e = 0x86d56bc, rho = 0x9723594), line 431 in "eval.c"
>   [16] Rf_applyClosure(call = 0x9723738, op = 0x83c0328, arglist =
> 0x97236e4, rho = 0x8379b1c, suppliedenv = 0x8379b38), line 614 in "eval.c"
>   [17] Rf_eval(e = 0x9723738, rho = 0x8379b1c), line 455 in "eval.c"
>   [18] Rf_ReplIteration(rho = 0x8379b1c, savestack = 0, browselevel = 0,
> state = 0x8047328), line 256 in "main.c"
>   [19] R_ReplConsole(rho = 0x8379b1c, savestack = 0, browselevel = 0),
> line 305 in "main.c"
>   [20] run_Rmainloop(), line 944 in "main.c"
>   [21] Rf_mainloop(), line 951 in "main.c"
>   [22] main(ac = 4, av = 0x80477ac), line 33 in "Rmain.c"
> (dbx) p curr_len
> curr_len = 24960U
> (dbx) p curr_seq
> curr_seq = 0
> (dbx) p of
> dbx: "of" is not defined in the scope
> `libhdf5.so.0.0.0`H5Dselect.c`H5D_select_mgath:347`
> dbx: see `help scope' for details
> (dbx) p off
> off = 0x8042960
> (dbx) p tgath_buf
> tgath_buf = 0x9f5c4f0
> "\xd87\x83^H\xa8\xf3\x82^H0^X\x82^H^X\xd4\x81^H^P\x90\x81^H\xb8m\x80^H^H'\x80^H\x88^?^?^H\x908^?^H\xb0\xf7~^H\xd8\xad~^H\xf8\xb2~^H\xb8\x8e~^H\xe8]~^H\xe8\xcb\xed^HP\xe3}^Hh\xdd\xbb^H\x98\xc4}^H\xf0\xa0}^H\xa8r}^HH}\xc3^HpO|^HH^V|^H^X\xd8|^H\xc0\xb1|^H8=}^H\x90\xcd{^H^Pm{^H\xb8#{^Hx'{^H\x90\xf8x^HpKx^H^POx^H\xa8~w^H^H>w^H\xf0\xb2w^H\xc8^Ew^HX'x^H\xf8\xdbv^H"
> (dbx) p buf
> buf = 0x9f79580
> "\xd87\x83^H\xa8\xf3\x82^H0^X\x82^H^X\xd4\x81^H^P\x90\x81^H\xb8m\x80^H^H'\x80^H\x88^?^?^H\x908^?^H\xb0\xf7~^H\xd8\xad~^H\xf8\xb2~^H\xb8\x8e~^H\xe8]~^H\xe8\xcb\xed^HP\xe3}^Hh\xdd\xbb^H\x98\xc4}^H\xf0\xa0}^H\xa8r}^HH}\xc3^HpO|^HH^V|^H^X\xd8|^H\xc0\xb1|^H8=}^H\x90\xcd{^H^Pm{^H\xb8#{^Hx'{^H\x90\xf8x^HpKx^H^POx^H\xa8~w^H^H>w^H\xf0\xb2w^H\xc8^Ew^HX'x^H\xf8\xdbv^H"
> (dbx) p nseq
> nseq = 1U
> (dbx) p len
> len = 0x804195c
> (dbx) p len[0..2]
> len[0..2] =
> [0] = 24960U
> [1] = 140025512U
> [2] = 140013048U
> (dbx)
> 
> 
> The R code in question 

Re: [Rd] 'R CMD check' fails when suggested package is not available

2007-04-11 Thread Herve Pages
Prof Brian Ripley wrote:
> This is a configurable option 'R_check_force_suggests' documented in
> 'Writing R Extensions'.

Thanks for pointing this to me. I tried "R-2.5 CMD check --help" but was not
very successful with it...

> 
> This package should be using Enhances: Rmpi, it seems.

Yep this sounds like what we need here (never heard about this field before,
will be the first time a Bioconductor package uses it).

Is there any reason why the "Enhances" field is not supported by 
install.packages?

  > install.packages("XML", dep="Enhances")
  Error in as.vector(available[p1, dependencies]) :
subscript out of bounds

I know this is in sync with the man page for 'install.packages' (which only
mentions Depends, Imports and Suggests) but there are for sure people that would
like to be able to install a package with _all_ its capabilities (especially 
those
that enhance it) by just doing
  > install.packages(..., dependencies=TRUE)
and not having to look at its DESCRIPTION file in order to figure out what great
enhancements there are missing and then install them separately ;-)

BTW, the name of this field ("Enhances") and the documentation does not help
to understand the "direction" of the enhancing relationship (who enhances who?):
both (the name "Enhances" and the 'Writing R Extensions' manual) tend to say 
that
the packages listed in the field are enhanced by the package at hand. That's a
point of view. But IMO it's rather the package at hand that is enhanced by the
packages listed in the field (at least this is the case with Rmpi).

Cheers,
H.

> 
> On Wed, 4 Apr 2007, Herve Pages wrote:
> 
>> Hi there,
>>
>> I was wondering why I get the following error message:
>>
>>  * checking package dependencies ... ERROR
>>  Packages required but not available:
>>Rmpi
>>
>> when I run 'R CMD check' on a package that _suggests_ Rmpi?
>> Why isn't it OK to not have all the suggested packages installed?
>>
>> Maybe one of the 3 following behaviours would be more appropriate:
>>
>>  a) Having the error saying something like:
>>
>>Package suggested but not available:
>>  Rmpi
>>
>>  b) Make this a warning instead of an error.
>>
>>  c) Don't do anything at all for suggested packages.
>>
>> This issue showed up today while I was checking a new Bioconductor
>> package:
>> the package suggests Rmpi but the vignette and the examples don't use
>> it. If I remove
>> Rmpi from the Suggests field then 'R CMD check' runs all the examples
>> and re-create
>> the vignette with no problem. Most users will not have Rmpi on their
>> machine neither
>> will they be interested in getting into the trouble of installing it.
> 
> 'Most users' will not be running 'R CMD check', of course.
> 
>> The package I was checking suggests Rmpi only because it contains 1
>> function that tries
>> to use it if it's installed but will work perfectly fine otherwise.
>> In this case it seems reasonable to have Rmpi in the Suggests field
>> but this will
>> make 'R CMD check' to fail which is problematic in the context of
>> automated builds :-/
>> If 'R CMD check' can't be a little bit more relaxed about this, then I
>> guess we will
>> need to remove Rmpi from the Suggests field, but then 'R CMD check'
>> will complain that:
>>
>>  * checking for unstated dependencies in R code ... WARNING
>>  'library' or 'require' calls not declared from:
>>Rmpi
>>
>> which is always better than getting an ERROR.
>>
>> Thanks!
>>
>> H.
> 
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Problem with S4 inheritance: unexpected re-initialization?

2007-04-06 Thread Herve Pages
Martin,

Martin Morgan wrote:
> The funny effect where class(object) seems to trigger construction of
> a new object is lazy evaluation -- the 'object' argument to
> setValidity is not evaluated until needed, i.e., until class(object)
> (anything would trigger this, including force(object)); only then do
> you see the attempt to create the new object of the previous
> paragraph.

The fact that you can't predict the number of times the "initialize"
method will be called is problematic. Here is a simple example
where "initialize-A" increments a global counter to keep track of the
number of A-objects:

  A.GLOBALS <- new.env(parent=emptyenv())
  A.GLOBALS[["nobjs"]] <- 0L

  setClass("A",
representation(a="character"),
prototype(a="a0")
  )
  setMethod("initialize", "A",
function(.Object, ...) {
cat("--initialize:A--\n")
A.GLOBALS[["nobjs"]] <- A.GLOBALS[["nobjs"]] + 1
cat("A-object #", A.GLOBALS[["nobjs"]], "\n", sep="")
callNextMethod()
}
  )
  setValidity("A",
function(object) {
cat("--setValidity:A--\n")
tmp <- class(object)
TRUE
}
  )

  setClass("B",
contains="A",
representation(b = "character")
  )
  setMethod("initialize", "B",
function(.Object, ...) {
cat("--initialize:B--\n")
callNextMethod()
}
  )
  setValidity("B",
function(object) {
cat("--setValidity:B--\n")
TRUE
}
  )

Let's try it:

  > b1 <- new("B")
  --initialize:B--
  --initialize:A--
  A-object #1
  > A.GLOBALS[["nobjs"]]
  [1] 1

  > b1 <- new("B", b="hello")
  --initialize:B--
  --initialize:A--
  A-object #2
  --setValidity:A--
  --initialize:A--
  A-object #3
  --setValidity:B--
  > A.GLOBALS[["nobjs"]]
  [1] 3

Shouldn't the "initializing" ("constructor" in other languages) code be executed
exactly 1 time per object instanciation? Something that the programmer can 
simply
rely on, whatever this code contains and whatever the internal subtilities of 
the
programming lamguage are?

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Problem with S4 inheritance: unexpected re-initialization?

2007-04-05 Thread Herve Pages
Hi Cristian,

cstrato wrote:
[...]
> Although SubSubClassB1 and SubSubClassB2 differ only slightly, the results
> for "subsubB1" are correct, while "subsubB2" gives a wrong result, see:
>> subsubB2 <- new("SubSubClassB2", filename="MyFileNameB2",
> nameB="MyNameB")
>> subsubB2
> An object of class "SubSubClassB2"
> Slot "nameB2":
> [1] "MyNameB"
> 
> Slot "nameB":
> [1] ""
> 
> Slot "filename":
> [1] "MyFileNameB2"

I think that the problem has to do with the way you use callNextMethod() in your
"initialize" method for SubSubClassB2:

  setMethod("initialize", "SubSubClassB2",
function(.Object, nameB2="MyNameB2", ...) {
   cat("--initialize:SubSubClassB2--\n")
   cat("SubSubClassB2:init:class(.Object) = ", class(.Object), "\n", sep="")
   if (nameB2 == "") nameB2 <- "DefaultNameB2";
   cat("SubSubClassB2:init:nameB2 = ", nameB2, "\n", sep="")
   .Object <- callNextMethod(.Object, nameB2=nameB2, ...)
   [EMAIL PROTECTED] <- nameB2
   .Object
}
  )

As a side note: I don't understand why you want to define a default value for
the 'nameB2' argument since you have defined a prototype for the SubSubClassB2
class (with a different default value for the 'nameB2' slot).

Try this instead:

  setMethod("initialize", "SubSubClassB2",
function(.Object, ...) {
   cat("--initialize:SubSubClassB2--\n")
   cat("SubSubClassB2:init:class(.Object) = ", class(.Object), "\n", sep="")
   .Object <- callNextMethod()
   if ([EMAIL PROTECTED] == "") [EMAIL PROTECTED] <- "DefaultNameB2"
   .Object
}
  )

It gives you what you want:

  > new("SubSubClassB2", filename="MyFileNameB2", nameB="MyNameB")
  --initialize:SubSubClassB2--
  SubSubClassB2:init:class(.Object) = SubSubClassB2
  --initialize:SubClassB--
  SubClassB:init:class(.Object) = SubSubClassB2
  --initialize:BaseClass--
  BaseClass:init:class(.Object) = SubSubClassB2
  BaseClass:init:filename = MyFileNameB2
  An object of class "SubSubClassB2"
  Slot "nameB2":
  [1] "NameB2"

  Slot "nameB":
  [1] "MyNameB"

  Slot "filename":
  [1] "MyFileNameB2"


[...]
> 
> In contrast, for "SubSubClassB2" I get the following sequence of events:
>> subsubB2 <- new("SubSubClassB2", filename="MyFileNameB2",
> nameB="MyNameB")
> --initialize:SubSubClassB2
> --initialize:SubClassB
> --initialize:BaseClass
> --setValidity:BaseClass
> --initialize:SubClassB
> --initialize:BaseClass
> --setValidity:BaseClass
> --setValidity:SubClassB
> --setValidity:SubClassB
> --initialize:SubClassB
> --initialize:BaseClass
> --setValidity:BaseClass
> --setValidity:SubClassB
> --setValidity:SubSubClassB2

I can reproduce this behaviour with a _much_ simple and shorter code:

  setClass("A",
representation(a="character"),
prototype(a="a0")
  )
  setValidity("A",
function(object) {
cat("--setValidity:A--\n")
tmp <- class(object)
TRUE
}
  )
  setMethod("initialize", "A",
function(.Object, ...) {
cat("--initialize:A--\n")
callNextMethod()
}
  )

  setClass("B",
contains="A",
representation(b="character")
  )
  setValidity("B",
function(object) {
cat("--setValidity:B--\n")
TRUE
}
  )
  setMethod("initialize", "B",
function(.Object, ...) {
cat("--initialize:B--\n")
callNextMethod()
}
  )

Then I get this:

  > b <- new("B", b="hello")
  --initialize:B--
  --initialize:A--
  --setValidity:A--
  --initialize:A--
  --setValidity:B--

Why is initialize:A called twice? I have no idea (bug?) but I agree with you 
that
this is unexpected. Note that this line

  tmp <- class(object)

in setValidity:A is what triggers the extra call to initialize:A. If you remove
it, things work as expected. Very strange!

Cheers,
H.


> sessionInfo()
R version 2.5.0 alpha (2007-03-30 r40957)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C

attached base packages:
[1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods"
[7] "base"



> 
> Furthermore, the slot "filename" is first initialized correctly to
> "filename=MyFileNameB2", but then it is twice initialized incorrectly
> to "filename=ERROR_FileName", indicating that it may not have been
> initialized at all. I do not understand this behavior, why is this so?
> 
> I really hope that this time the code below is acceptable to you.
> Thank you for your help.
> Best regards
> Christian
> 
> # - - - - - - - - - - - - - - - - BEGIN - - - - - - - - - - - - - - - -
> - - - -
> setClass("BaseClass",
>   representation(filename = "character", "VIRTUAL"),
>   prototype(filename = "DefaultFileName")
> )
> 
> setClass("SubClassB",
>   representation(nameB = "char

[Rd] 'R CMD check' fails when suggested package is not available

2007-04-04 Thread Herve Pages
Hi there,

I was wondering why I get the following error message:

  * checking package dependencies ... ERROR
  Packages required but not available:
Rmpi

when I run 'R CMD check' on a package that _suggests_ Rmpi?
Why isn't it OK to not have all the suggested packages installed?

Maybe one of the 3 following behaviours would be more appropriate:

  a) Having the error saying something like:

Package suggested but not available:
  Rmpi

  b) Make this a warning instead of an error.

  c) Don't do anything at all for suggested packages.

This issue showed up today while I was checking a new Bioconductor package:
the package suggests Rmpi but the vignette and the examples don't use it. If I 
remove
Rmpi from the Suggests field then 'R CMD check' runs all the examples and 
re-create
the vignette with no problem. Most users will not have Rmpi on their machine 
neither
will they be interested in getting into the trouble of installing it.
The package I was checking suggests Rmpi only because it contains 1 function 
that tries
to use it if it's installed but will work perfectly fine otherwise.
In this case it seems reasonable to have Rmpi in the Suggests field but this 
will
make 'R CMD check' to fail which is problematic in the context of automated 
builds :-/
If 'R CMD check' can't be a little bit more relaxed about this, then I guess we 
will
need to remove Rmpi from the Suggests field, but then 'R CMD check' will 
complain that:

  * checking for unstated dependencies in R code ... WARNING
  'library' or 'require' calls not declared from:
Rmpi

which is always better than getting an ERROR.

Thanks!

H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Problem with S4 inheritance: unexpected re-initialization?

2007-04-04 Thread Herve Pages
Hi Christian,

cstrato wrote:
> Dear Herve
> 
> Thank you for your helpful comments, and I especially appreciate that
> you tried to run my package. I will try to answer each point separately.
> 
> Herve Pages wrote:
>> Hi Christian,
>>
>> I can only give you a few reasons why IMO it is very unlikely that
>> anybody
>> will be able to help you on this, with the current form of your post.
>>
>> 1) Unless you have a really good reason to do so, don't attach a package
>>to your post. Do your best to provide a few lines of code that anybody
>>can easily copy and paste into their session.
>>   
> Sorrowly, sometimes, a few lines of code are not sufficient to show
> the problem. Furthermore, most of the time there are complaints that
> people do not provide enough information, an issue I wanted to avoid.

The code you provide below is still too long and overcluttered with stuff that 
is
probably unrelated with the issue you want to discuss. Your class definitions
still have slots that we don't care about. Basically if you want to
discuss an S4 issue, you should get rid of all this file system related stuff
(the 'dirfile', 'filedir', 'filename' slots, the 'pathFile' function, the dozens
of calls to 'basename', 'dirname', 'getwd', 'file.dir' etc...)

Also your code is dirty and hard to read. Take this for example:

  "initialize.BaseClass" <-
  function(.Object, filename=character(), filedir=as.character(getwd()), ...) {
print("--initialize:BaseClass--")
  print(paste("BaseClass:init:class(.Object) = ", class(.Object)))

  #   .Object <- callNextMethod(.Object, ...);

dirfile <- pathFile(filename, filedir);
  print(paste("BaseClass:init:dirfile = ", dirfile))

.Object <- callNextMethod(.Object, filename=filename, filedir=filedir, ...);

[EMAIL PROTECTED] <- filename;
[EMAIL PROTECTED]  <- filedir;
.Object;
  }#initialize.BaseClass

  setMethod("initialize", "BaseClass", initialize.BaseClass);

o It's not properly indented.
o Why those empty lines in the middle of such a short function?
o Why those semi-columns at the end of lines?
o Why put the implementation of the initialize method into a separate function?
o Why use this construct 'print(paste())' instead of just 'cat()'?
o Why leave the first call to callNextMethod() commented?
o What do you do with 'dirfile', except that you print it? Do you need this for
  the purpose of the S4 discussion?

What about adopting this style:

setMethod("initialize", "BaseClass",
function(.Object, filename=character(), filedir=as.character(getwd()), 
...)
{
cat("--initialize:BaseClass--\n")
cat("BaseClass:init:class(.Object) = ", class(.Object))
.Object <- callNextMethod(.Object, filename=filename, 
filedir=filedir, ...)
[EMAIL PROTECTED] <- filename
[EMAIL PROTECTED]  <- filedir
.Object
}
)

That's for the style. Now let's talk about the semantic. Here is the definition 
of
your BaseClass class:

setClass("BaseClass",
representation(
"VIRTUAL",
filename="character",
filedir="character"
),
prototype(
filename = character(),
filedir  = as.character(getwd())
)
)

(Note that this definition is what _I_ get after I cleaned it and indented it 
which
is something that _you_ are expected to do.)

o Initializing the 'filename' slot to character() is useless since this is the 
default.
o Wrapping getwd() inside as.character() is useless since getwd() returns a 
character vector.
o BUT MOST IMPORTANTLY THAN ANYTHING ELSE: given the fact that you've provided 
a prototype
  for class BaseClass, your initialize method is useless since it does 
_nothing_ more
  than what the default initialize method does! If you want to define your own 
"initialize"
  method for the only purpose of printing messages, then you could do it in a 
much simpler
  way:

setMethod("initialize", "BaseClass",
function(.Object, ...)
{
cat("--initialize:BaseClass--\n")
cat("BaseClass:init:class(.Object) = ", class(.Object), "\n", 
sep="")
callNextMethod()
}
)

But as I said initially, the file system related stuff is useless to illustrate 
the
S4 issue you want to show us so why didn't you just provide:

setClass("BaseClass",
representation("VIRTUAL", slot1="character"),
proto

Re: [Rd] source(..., echo=TRUE) broken in R-2.5.0 alpha and in R-2.6.0 devel

2007-04-04 Thread Herve Pages
Duncan Murdoch wrote:
> 
> Thanks for reporting this.
> 
> This is now fixed and committed to svn.  It will take a day or two for
> the change to make it onto CRAN.

OK. Thanks Martin and Duncan!

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Problem with S4 inheritance: unexpected re-initialization?

2007-04-03 Thread Herve Pages
Hi Christian,

I can only give you a few reasons why IMO it is very unlikely that anybody
will be able to help you on this, with the current form of your post.

1) Unless you have a really good reason to do so, don't attach a package
   to your post. Do your best to provide a few lines of code that anybody
   can easily copy and paste into their session.

2) Your package is messy (yes I looked at it). You have far too many classes
   with far too many slots that we don't care about. If you could provide
   the smallest possible set of classes with the smallest possible set of slots
   with the smallest possible set of generics and methods that still allow to
   reproduce the issue you want to show us, that would help!

3) You show us an example where BaseClass is VIRTUAL (which is indeed how it is
   defined in your package) and then an example where BaseClass is NOT VIRTUAL.
   How can we reproduce the latter? Don't expect people to go into your package,
   change the class definition, reinstall, restart R and then run your example!

4) Note that for clarity and conformance to almost universal convention, it's
   better to use arrows pointing from derived classes to base classes
   in your inheritance tree.

5) It's good to provide the inheritance tree, but it's better when it's 
complete.
   I've looked at what you actually have in your package and the complete
   inheritance tree is something like this:

 BaseClass <- SubClassA
   <- SubClassB <- SubSubClassA
<- SubSubClassB

   Where is the SubClassA class in the inheritance tree that you included in 
your
   post below?

6) Another thing I note is that you have a naming problem: any reason why you 
name
   "SubSubClassA" a subclass of SubClassB? Given that you also have defined 
SubClassA,
   this can only lead to confusion!

7) You need to use proper terminology if you expect people to follow you. In 
your post
   below, every time you instanciate a class you say that your are creating it:
 o "First, I need to create SubClassA..."
 o "I create both subclasses, SubSubClassA and SubSubClassB..."
 o etc...
   Creating a class is not the same as instanciating it!

8) You start your examples by "First, I need to create SubClassA..." so you are
   introducing us a class that doesn't show up in your inheritance tree so we 
don't
   know how it is related to the other classes. Also you say that you "need" to
   create SubClassA but we have no idea why!

9) You have a slot in SubClassB that is of class SubClassA! This means "a 
SubClassB
   object _is_ a BaseClass object and it _has_ a slot that is itself a BaseClass
   object (since a SubClassA object is a BaseClass object too)". I hope that 
this
   is really what you want... but maybe this could be related to the fact that 
you
   see 2 instanciations of BaseClass when you instanciate SubSubClassA or 
SubSubClassB.

10) You have several different issues (initialize called multiple times when 
you expect
only 1 time, setValidity not called, etc..). May be they are related, maybe 
they
are not. If you can isolate those problems and make a separate post for 
each of them,
that would help too.

You'll be surprised, but once you've made the effort to follow those 
recommendations,
it's most likely that you will have a better understanding of what's going on. 
And you
might even be able to sort out these issues by yourself!

Cheers,
H.


cstrato wrote:
> Dear S4 experts,
> 
> Since I was reminded that I posted a similar question some time ago,
> I am attaching a modified version of my demo package, which allows me
> to track what happens during initialization of the following similar
> subclasses:
>   SubSubClassA <- SubClassB <- BaseClass
>   SubSubClassB <- SubClassB <- BaseClass
> 
> First, I need to create SubClassA:
>> library(myclasspkg)
>> subA <-
> new("SubClassA",filename="OutSubA",filedir="/Volumes/CoreData/CRAN/Workspaces/rclasspkg",mytitle="TitleSubA")
> 
> 
> Then, I create both subclasses, SubSubClassA and SubSubClassB, either
> with a virtual BaseClass or with a non-virtual BaseClass:
> 
> 1. new("SubSubClassA"): BaseClass is VIRTUAL
>> subsubA <-
> new("SubSubClassA",filename="MyFileName",filedir="/Volumes/CoreData/CRAN/Workspaces/rclasspkg",subA=subA)
> 
> [1] "--initialize:SubSubClassA--"
> [1] "--initialize:SubClassB--"
> [1] "--initialize:BaseClass--"
> [1] "--setValidity:BaseClass--"
> [1] "--setValidity:SubClassB--"
> [1] "--setValidity:SubSubClassA--"
> 
> As far as I understand the S4 classes, this is the correct initialization
> workflow that I expect. The resulting object "subsubA" is correct.
> 
> However, when BaseClass is not virtual, I get the following unexpected
> initialization workflow:
> 
> 2. new("SubSubClassA"): BaseClass is NOT VIRTUAL
>> subsubA <-
> new("SubSubClassA",filename="MyFileName",filedir="/Volumes/CoreData/CRAN/Workspaces/rclasspkg",subA=subA)
> 
> 

[Rd] source(..., echo=TRUE) broken in R-2.5.0 alpha and in R-2.6.0 devel

2007-04-03 Thread Herve Pages
Hi,

I get this error with R-2.5.0 alpha and R-2.6.0 devel:

  > source("http://bioconductor.org/biocLite.R";, echo=TRUE)
  Error in if (timestamp != srcfile$timestamp) warning("Timestamp of '",  :
  missing value where TRUE/FALSE needed

Same with 'verbose=TRUE':

  > source("http://bioconductor.org/biocLite.R";, verbose=TRUE)
  'envir' chosen:
  encoding = "native.enc" chosen
  --> parsed 2 expressions; now eval(.)ing them:

   eval(expression_nr. 1 )
   =
  Error in if (timestamp != srcfile$timestamp) warning("Timestamp of '",  :
  missing value where TRUE/FALSE needed

The above works fine with R-2.4.1.

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Limitation of dirname() and basename()

2007-03-27 Thread Herve Pages
Hi,

Simon Urbanek wrote:
> Your proposed behavior is inconsistent, anyway. The purpose of  
> dirname is to return parent directory of the entity represented by  
> the pathname.

Mmmm, I don't think this is true:

  > dirname("aaa/..")
  [1] "aaa"

"aaa" is not the parent directory of "aaa/.."

Same here:

  > dirname("/usr/./.")
  [1] "/usr/."


> "/my/path" and "/my/path/" are equivalent as they both  
> represent the directory "path" whose parent is "/my", therefore  
> returning "/my/path" in either case is inconsistent with the purpose  
> of this function. As of trailing slashes (independently of dirname),  
> sadly, some programs exploit the equivalence of both representations  
> by encoding meta-information in the representation, but this behavior  
> is quite confusing and error-prone. You're free to add such special  
> cases to your application, but there is no reason to add such  
> confusion to R.

Note that Python's designers were not afraid to emancipate from Unix for
this particular case:

  >>> import os.path
  >>> os.path.dirname("aaa/..")
  'aaa'
  >>> os.path.dirname("aaa/../")
  'aaa/..'


Also note that, if the goal was to mimic Unix behaviour, then why not
fully go for it, even for edge-cases:

  R
  
  > dirname("/")
  [1] "/"
  > basename("/")
  [1] ""

  Unix
  
  [EMAIL PROTECTED]:~> dirname "/"
  /
  [EMAIL PROTECTED]:~> basename "/"
  /

Just my 2 cents...

Cheers,
H.


> 
> Cheers,
> Simon
> 
> 
> 
 Prof Brian Ripley wrote:
> These functions work as they should: did you not read the help page
> which explicitly tells you what happens in this case?
>
> The Unix originals work in the same way:
>
> gannet% dirname /my/path/
> /my
>
> Please DO study the R posting guide and do the homework requesting
> of you before posting.
>
> On Mon, 26 Mar 2007, cstrato wrote:
>
>> Dear all,
>>
>> I have already twice encountered a case which I consider a
>> limitation of
>> dirname() and basename().
>>
>> In my functions I have a parameter "outfile" which e.g. tells  
>> where
>> a file
>> should be stored. Usually "outfile" is of the form:
>> oufile = "/my/path/myname.txt"
>>
>>> outfile <- "/my/path/myname.txt"
>>> dirname(outfile)
>> [1] "/my/path"
>>> basename(outfile)
>> [1] "myname.txt"
>>
>> However, in addition I want to be able to define the path only,  
>> while
>> creating the name "myname.txt" automatically.
>> Sorrowly, I get the following:
>>
>>> outfile <- "/my/path/"
>>> dirname(outfile)
>> [1] "/my"
>>> basename(outfile)
>> [1] "path"
>>
>> It would be great if dirname() and basename() could recognize:
>> dirname("/my/path/")  = /my/path/
>> basename(""/my/path/")  = ""
>> i.e. they should be able to recognize a trailing "/".
> Not according to the documentation.
>
>> Best regards
>> Christian
>> _._._._._._._._._._._._._._._._
>> C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
>> V.i.e.n.n.a   A.u.s.t.r.i.a
>> _._._._._._._._._._._._._._._._
 __
 R-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Unexpected result of as.character() and unlist() applied to a data frame

2007-03-27 Thread Herve Pages
Liaw, Andy wrote:
> Given that the behavior is exactly as I expected it be, I would call
> that "feature" (and IMHO not a very special one).  The two data frames
> are just different (try str() on them:  A in dd is factor, while A in
> dd2 is character), so I don't know why you'd expect unlist() on them to
> give you the same answer.

Not sure what I would expect for unlist(). I was just curious and decided
to try unlist() on a data frame and found this funny behaviour.

But for as.character(), I definetly wouldn't expect this:
  > as.character(dd)
  [1] "c(2, 3, 1)" "c(3, 2, 1)"
  > as.character(dd$A)
  [1] "b" "c" "a"

And generally speaking, it's a bad idea to have the semantic of standard objects
controlled by some obscure global option.  Here 2 people with a different
"stringsAsFactors" option will not get the same results when manipulating 'dd'.
It makes writing reproducible code harder because then you must remember to 
specify
the stringsAsFactors option everytime you create a data frame.

Cheers,
H.


> 
> Andy
> 
> From: [EMAIL PROTECTED]
>> Hi,
>>
>>> dd <- data.frame(A=c("b","c","a"), B=3:1) dd
>>   A B
>> 1 b 3
>> 2 c 2
>> 3 a 1
>>> unlist(dd)
>> A1 A2 A3 B1 B2 B3
>>  2  3  1  3  2  1
>>
>> Someone else might get something different. It all depends on 
>> the values of its 'stringsAsFactors' option:
>>
>>> dd2 <- data.frame(A=c("b","c","a"), B=3:1, stringsAsFactors=FALSE)
>>> dd2
>>   A B
>> 1 b 3
>> 2 c 2
>> 3 a 1
>>> unlist(dd2)
>>  A1  A2  A3  B1  B2  B3
>> "b" "c" "a" "3" "2" "1"
>>
>> Same thing with as.character:
>>
>>> as.character(dd)
>> [1] "c(2, 3, 1)" "c(3, 2, 1)"
>>> as.character(dd2)
>> [1] "c(\"b\", \"c\", \"a\")" "c(3, 2, 1)"
>>
>> Bug or "feature"?
>>
>> Note that as.character applied directly on dd$A doesn't have 
>> this "feature":
>>
>>> as.character(dd$A)
>> [1] "b" "c" "a"
>>> as.character(dd2$A)
>> [1] "b" "c" "a"
>>
>> Cheers,
>> H.
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>>
> 
> 
> --
> Notice:  This e-mail message, together with any attachment...{{dropped}}

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] extracting rows from a data frame by looping over the row names: performance issues

2007-03-02 Thread Herve Pages
Hi Greg,

Greg Snow wrote:
> Your 2 examples have 2 differences and they are therefore confounded in
> their effects.
> 
> What are your results for:
> 
> system.time(for (i in 1:100) {row <-  dat[i, ] })
> 
> 
> 

Right. What you suggest is even faster (and more simple):

  > mat <- matrix(rep(paste(letters, collapse=""), 5*30), ncol=5)
  > dat <- as.data.frame(mat)

  > system.time(for (key in row.names(dat)[1:100]) { row <- dat[key, ] })
 user  system elapsed
   13.241   0.460  13.702

  > system.time(for (i in 1:100) { row <- sapply(dat, function(col) col[i]) })
 user  system elapsed
0.280   0.372   0.650

  > system.time(for (i in 1:100) {row <-  dat[i, ] })
 user  system elapsed
0.044   0.088   0.130

So apparently here extracting with dat[i, ] is 300 times faster than
extracting with dat[key, ] !

> system.time(for (i in 1:100) dat["1", ])
   user  system elapsed
 12.680   0.396  13.075

> system.time(for (i in 1:100) dat[1, ])
   user  system elapsed
  0.060   0.076   0.137

Good to know!

Thanks a lot,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] extracting rows from a data frame by looping over the row names: performance issues

2007-03-02 Thread Herve Pages
Hi Wolfgang,


Wolfgang Huber wrote:
> 
> Hi Hervé
> 
> depending on your problem, using "mapply" might help, as in the code
> example below:
> 
> a = data.frame(matrix(1:3e4, ncol=3))
> 
> print(system.time({
> r1 = numeric(nrow(a))
> for(i in seq_len(nrow(a))) {
>   g = a[i,]
>   r1[i] = mean(c(g$X1, g$X2, g$X3))
> }}))
> 
> print(system.time({
> f = function(X1,X2,X3) mean(c(X1, X2, X3))
> r2 = do.call("mapply", args=append(f, a))
> }))
> 
> print(identical(r1, r2))
> 
> #   user  system elapsed
>   6.049   0.200   6.987
>user  system elapsed
>   0.508   0.000   0.509
> [1] TRUE

Thanks for the tip! It's good to know about the mapply function (which I just
realize is mentioned in the "See Also" section of the lapply man page).

Cheers,
H.


> 
>  Best wishes
>   Wolfgang
> 
> Roger D. Peng wrote:
>> Extracting rows from data frames is tricky, since each of the columns
>> could be of a different class.  For your toy example, it seems a
>> matrix would be a more reasonable option.
>>
>> R-devel has some improvements to row extraction, if I remember
>> correctly.  You might want to try your example there.
>>
>> -roger
>>
>> Herve Pages wrote:
>>> Hi,
>>>
>>>
>>> I have a big data frame:
>>>
>>>   > mat <- matrix(rep(paste(letters, collapse=""), 5*30), ncol=5)
>>>   > dat <- as.data.frame(mat)
>>>
>>> and I need to do some computation on each row. Currently I'm doing this:
>>>
>>>   > for (key in row.names(dat)) { row <- dat[key, ]; ... do some
>>> computation on row... }
>>>
>>> which could probably considered a very natural (and R'ish) way of
>>> doing it
>>> (but maybe I'm wrong and the real idiom for doing this is something
>>> different).
>>>
>>> The problem with this "idiomatic form" is that it is _very_ slow. The
>>> loop
>>> itself + the simple extraction of the rows (no computation on the
>>> rows) takes
>>> 10 hours on a powerful server (quad core Linux with 8G of RAM)!
>>>
>>> Looping over the first 100 rows takes 12 seconds:
>>>
>>>   > system.time(for (key in row.names(dat)[1:100]) { row <- dat[key,
>>> ] })
>>>  user  system elapsed
>>>12.637   0.120  12.756
>>>
>>> But if, instead of the above, I do this:
>>>
>>>   > for (i in nrow(dat)) { row <- sapply(dat, function(col) col[i]) }
>>>
>>> then it's 20 times faster!!
>>>
>>>   > system.time(for (i in 1:100) { row <- sapply(dat, function(col)
>>> col[i]) })
>>>  user  system elapsed
>>> 0.576   0.096   0.673
>>>
>>> I hope you will agree that this second form is much less natural.
>>>
>>> So I was wondering why the "idiomatic form" is so slow? Shouldn't the
>>> idiomatic
>>> form be, not only elegant and easy to read, but also efficient?
>>>
>>>
>>> Thanks,
>>> H.
>>>
>>>
>>>> sessionInfo()
>>> R version 2.5.0 Under development (unstable) (2007-01-05 r40386)
>>> x86_64-unknown-linux-gnu
>>>
>>> locale:
>>> LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C
>>>
>>>
>>> attached base packages:
>>> [1] "stats" "graphics"  "grDevices" "utils" "datasets" 
>>> "methods"
>>> [7] "base"
>>>
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>
> 
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] extracting rows from a data frame by looping over the row names: performance issues

2007-03-02 Thread Herve Pages
Ulf Martin wrote:
> Here is an even faster one; the general point is to create a properly
> vectorized custom function/expression:
> 
> mymean <- function(x, y, z) (x+y+z)/3
> 
> a = data.frame(matrix(1:3e4, ncol=3))
> attach(a)
> print(system.time({r3 = mymean(X1,X2,X3)}))
> detach(a)
> 
> # Yields:
> # [1] 0.000 0.010 0.005 0.000 0.000
> 

Very fast indeed! And you don't need the attach/detach trick to make your point
since it is (almost) as fast without it:

  a = data.frame(matrix(1:3e4, ncol=3))
  print(system.time({r3 = mymean(a$X1,a$X2,a$X3)}))

However, you are lucky here because in this example (the "mean" example), you 
can
use vectorized arithmetic which is of course very fast.
What about the general case? Unfortunately situations where you can "properly 
vectorize"
tend to be much more frequent in tutorials and demos than in the real world.
Maybe the "mean" example is a little bit too specific to answer the
general question of "what's the best way to _efficiently_ step on a data
frame row by row".

Cheers,
H.



> print(identical(r2, r3))
> # [1] TRUE
> 
> # May values for version 1 and 2 resp. were
> # time for r1:
> [1] 29.420 23.090 60.093  0.000  0.000
> 
> # time for r2:
> [1] 1.400 0.050 1.505 0.000 0.000
> 
> Best wishes
> Ulf
> 
> 
> P.S. A somewhat more meaningful comparison of version 2 and 3:
> 
> a = data.frame(matrix(1:3e5, ncol=3))
> # time r2e5:
> [1] 12.04  0.15 12.92  0.00  0.00
> 
> # time r3e5:
> [1] 0.030 0.020 0.051 0.000 0.000
> 
>> depending on your problem, using "mapply" might help, as in the code 
>> example below:
>>
>> a = data.frame(matrix(1:3e4, ncol=3))
>>
>> print(system.time({
>> r1 = numeric(nrow(a))
>> for(i in seq_len(nrow(a))) {
>>g = a[i,]
>>r1[i] = mean(c(g$X1, g$X2, g$X3))
>> }}))
>>
>> print(system.time({
>> f = function(X1,X2,X3) mean(c(X1, X2, X3))
>> r2 = do.call("mapply", args=append(f, a))
>> }))
>>
>> print(identical(r1, r2))
>>
>> #   user  system elapsed
>>6.049   0.200   6.987
>> user  system elapsed
>>0.508   0.000   0.509
>> [1] TRUE
>>
>>   Best wishes
>>Wolfgang
>>
>> Roger D. Peng wrote:
>>> Extracting rows from data frames is tricky, since each of the columns could 
>>> be 
>>> of a different class.  For your toy example, it seems a matrix would be a 
>>> more 
>>> reasonable option.
>>>
>>> R-devel has some improvements to row extraction, if I remember correctly.  
>>> You 
>>> might want to try your example there.
>>>
>>> -roger
>>>
>>> Herve Pages wrote:
>>>> Hi,
>>>>
>>>>
>>>> I have a big data frame:
>>>>
>>>>   > mat <- matrix(rep(paste(letters, collapse=""), 5*30), ncol=5)
>>>>   > dat <- as.data.frame(mat)
>>>>
>>>> and I need to do some computation on each row. Currently I'm doing this:
>>>>
>>>>   > for (key in row.names(dat)) { row <- dat[key, ]; ... do some 
>>>> computation on row... }
>>>>
>>>> which could probably considered a very natural (and R'ish) way of doing it
>>>> (but maybe I'm wrong and the real idiom for doing this is something 
>>>> different).
>>>>
>>>> The problem with this "idiomatic form" is that it is _very_ slow. The loop
>>>> itself + the simple extraction of the rows (no computation on the rows) 
>>>> takes
>>>> 10 hours on a powerful server (quad core Linux with 8G of RAM)!
>>>>
>>>> Looping over the first 100 rows takes 12 seconds:
>>>>
>>>>   > system.time(for (key in row.names(dat)[1:100]) { row <- dat[key, ] })
>>>>  user  system elapsed
>>>>12.637   0.120  12.756
>>>>
>>>> But if, instead of the above, I do this:
>>>>
>>>>   > for (i in nrow(dat)) { row <- sapply(dat, function(col) col[i]) }
>>>>
>>>> then it's 20 times faster!!
>>>>
>>>>   > system.time(for (i in 1:100) { row <- sapply(dat, function(col) 
>>>> col[i]) })
>>>>  user  system elapsed
>>>> 0.576   0.096   0.673
>>>>
>>>> I hope you will agree that this second form is much less natural.
>>>>
>>>> So I was wondering why the "idiomatic form" is so slow? Shouldn't the 
>>>> idiomatic
>>>> form be, not only elegant and easy to read, but also efficient?
>>>>
>>>>
>>>> Thanks,
>>>> H.
>>>>
>>>>
>>>>> sessionInfo()
>>>> R version 2.5.0 Under development (unstable) (2007-01-05 r40386)
>>>> x86_64-unknown-linux-gnu
>>>>
>>>> locale:
>>>> LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>> [1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods"
>>>> [7] "base"
>>>>
>>>> __
>>>> R-devel@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
>>
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] extracting rows from a data frame by looping over the row names: performance issues

2007-03-02 Thread Herve Pages
Roger D. Peng wrote:
> Extracting rows from data frames is tricky, since each of the columns
> could be of a different class.  For your toy example, it seems a matrix
> would be a more reasonable option.

There is no doubt about this ;-)

  > mat <- matrix(rep(paste(letters, collapse=""), 5*30), ncol=5)
  > dat <- as.data.frame(mat)

With the matrix:

  > system.time(for (i in 1:100) { row <- mat[i, ] })
 user  system elapsed
0   0   0

With the data frame:

  > system.time(for (key in row.names(dat)[1:100]) { row <- dat[key, ] })
 user  system elapsed
   12.565   0.296  12.859


And even with a mixed-type data frame, it's very tempting to convert it
to a matrix before to do any loop on it:

  > dat2 <- as.data.frame(mat, stringsAsFactors=FALSE)
  > dat2 <- cbind(dat2, ii=1:30)
  > sapply(dat2, typeof)
   V1  V2  V3  V4  V5  ii
  "character" "character" "character" "character" "character"   "integer"

  > system.time(for (key in row.names(dat2)[1:100]) { row <- dat2[key, ] })
 user  system elapsed
   13.201   0.144  13.360

  > system.time({mat2 <- as.matrix(dat2); for (i in 1:100) { row <- mat2[i, ] 
}})
 user  system elapsed
0.128   0.036   0.163

Big win isn't it? (only if you have enough memory for it though...)

Cheers,
H.



> 
> R-devel has some improvements to row extraction, if I remember
> correctly.  You might want to try your example there.
> 
> -roger
> 
> Herve Pages wrote:
>> Hi,
>>
>>
>> I have a big data frame:
>>
>>   > mat <- matrix(rep(paste(letters, collapse=""), 5*30), ncol=5)
>>   > dat <- as.data.frame(mat)
>>
>> and I need to do some computation on each row. Currently I'm doing this:
>>
>>   > for (key in row.names(dat)) { row <- dat[key, ]; ... do some
>> computation on row... }
>>
>> which could probably considered a very natural (and R'ish) way of
>> doing it
>> (but maybe I'm wrong and the real idiom for doing this is something
>> different).
>>
>> The problem with this "idiomatic form" is that it is _very_ slow. The
>> loop
>> itself + the simple extraction of the rows (no computation on the
>> rows) takes
>> 10 hours on a powerful server (quad core Linux with 8G of RAM)!
>>
>> Looping over the first 100 rows takes 12 seconds:
>>
>>   > system.time(for (key in row.names(dat)[1:100]) { row <- dat[key, ] })
>>  user  system elapsed
>>12.637   0.120  12.756
>>
>> But if, instead of the above, I do this:
>>
>>   > for (i in nrow(dat)) { row <- sapply(dat, function(col) col[i]) }
>>
>> then it's 20 times faster!!
>>
>>   > system.time(for (i in 1:100) { row <- sapply(dat, function(col)
>> col[i]) })
>>  user  system elapsed
>> 0.576   0.096   0.673
>>
>> I hope you will agree that this second form is much less natural.
>>
>> So I was wondering why the "idiomatic form" is so slow? Shouldn't the
>> idiomatic
>> form be, not only elegant and easy to read, but also efficient?
>>
>>
>> Thanks,
>> H.
>>
>>
>>> sessionInfo()
>> R version 2.5.0 Under development (unstable) (2007-01-05 r40386)
>> x86_64-unknown-linux-gnu
>>
>> locale:
>> LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C
>>
>>
>> attached base packages:
>> [1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods"
>> [7] "base"
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] extracting rows from a data frame by looping over the row names: performance issues

2007-03-02 Thread Herve Pages
Herve Pages wrote:
...
> But if, instead of the above, I do this:
> 
>   > for (i in nrow(dat)) { row <- sapply(dat, function(col) col[i]) }

Should have been:

  > for (i in 1:nrow(dat)) { row <- sapply(dat, function(col) col[i]) }

> 
> then it's 20 times faster!!
> 
>   > system.time(for (i in 1:100) { row <- sapply(dat, function(col) col[i]) })
>  user  system elapsed
> 0.576   0.096   0.673

...

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] extracting rows from a data frame by looping over the row names: performance issues

2007-03-02 Thread Herve Pages
Hi,


I have a big data frame:

  > mat <- matrix(rep(paste(letters, collapse=""), 5*30), ncol=5)
  > dat <- as.data.frame(mat)

and I need to do some computation on each row. Currently I'm doing this:

  > for (key in row.names(dat)) { row <- dat[key, ]; ... do some computation on 
row... }

which could probably considered a very natural (and R'ish) way of doing it
(but maybe I'm wrong and the real idiom for doing this is something different).

The problem with this "idiomatic form" is that it is _very_ slow. The loop
itself + the simple extraction of the rows (no computation on the rows) takes
10 hours on a powerful server (quad core Linux with 8G of RAM)!

Looping over the first 100 rows takes 12 seconds:

  > system.time(for (key in row.names(dat)[1:100]) { row <- dat[key, ] })
 user  system elapsed
   12.637   0.120  12.756

But if, instead of the above, I do this:

  > for (i in nrow(dat)) { row <- sapply(dat, function(col) col[i]) }

then it's 20 times faster!!

  > system.time(for (i in 1:100) { row <- sapply(dat, function(col) col[i]) })
 user  system elapsed
0.576   0.096   0.673

I hope you will agree that this second form is much less natural.

So I was wondering why the "idiomatic form" is so slow? Shouldn't the idiomatic
form be, not only elegant and easy to read, but also efficient?


Thanks,
H.


> sessionInfo()
R version 2.5.0 Under development (unstable) (2007-01-05 r40386)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C

attached base packages:
[1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods"
[7] "base"

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Wrong vector size reported by error message

2007-02-05 Thread Herve Pages
Herve Pages wrote:
> Hi Oleg,
> 
> Oleg Sklyar wrote:
>> my R-SVN revision is 40458 compared to 40386 yours, could it be
>> corrected already?
> 
> No I don't think so. Maybe an architecture specific problem?
> You are on a 64-bit system, I'm on a 32-bit system.
> 
> I was able to reproduce on 3 systems so far (with any version of R):
>   - on my Pentium M laptop with Ubuntu 6.06
>   - on a Linux SUSE 9.2 32-bit system
>   - on a Core 1 Duo (32-bit) Mac Mini

Same problem on a 64-bit Solaris 2.9 machine but with a 32-bit R executable
(compiled with a 32-bit gcc 4.1.1).

H.

> 
> All those systems are 32-bits.
> 
> I've tried on a couple of 64-bit systems and I don't get this problem.
> 
> Cheers,
> H.
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Wrong vector size reported by error message

2007-02-05 Thread Herve Pages
Hi Oleg,

Oleg Sklyar wrote:
> my R-SVN revision is 40458 compared to 40386 yours, could it be
> corrected already?

No I don't think so. Maybe an architecture specific problem?
You are on a 64-bit system, I'm on a 32-bit system.

I was able to reproduce on 3 systems so far (with any version of R):
  - on my Pentium M laptop with Ubuntu 6.06
  - on a Linux SUSE 9.2 32-bit system
  - on a Core 1 Duo (32-bit) Mac Mini

All those systems are 32-bits.

I've tried on a couple of 64-bit systems and I don't get this problem.

Cheers,
H.

> 
> * ~: R
>> 2**30
> [1] 1073741824
>> a<-2:1073741824
> Error: cannot allocate vector of size 4194304 Kb
>> sessionInfo()
> R version 2.5.0 Under development (unstable) (2007-01-22 r40548)
> x86_64-unknown-linux-gnu
> 
> locale:
> LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=en_GB.UTF-8;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C
> 
> 
> attached base packages:
> [1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods"
> [7] "base"
>>
> 
> -- 
> Dr Oleg Sklyar | EBI-EMBL, Cambridge CB10 1SD, UK | +44-1223-494466
> 
> 
> [EMAIL PROTECTED] wrote:
>> Hi,
>>
>> On my system, I get the following error message:
>>
>>   > big <- 2:(2**30)
>>   Error: cannot allocate vector of size 0 Kb
>>
>> Note the wrong "size 0 Kb" in the message!
>>
>> Cheers,
>> H.
>>
>>
>>> sessionInfo()
>> R version 2.5.0 Under development (unstable) (2007-01-05 r40386)
>> i686-pc-linux-gnu
>>
>> locale:
>> LC_CTYPE=en_CA.UTF-8;LC_NUMERIC=C;LC_TIME=en_CA.UTF-8;LC_COLLATE=en_CA.UTF-8;LC_MONETARY=en_CA.UTF-8;LC_MESSAGES=en_CA.UTF-8;LC_PAPER=en_CA.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_CA.UTF-8;LC_IDENTIFICATION=C
>>
>>
>> attached base packages:
>> [1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods"
>> [7] "base"
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Wrong vector size reported by error message

2007-02-05 Thread Herve Pages
Oleg Sklyar wrote:
> Herve, it looks like 64-bit. I just tried another 32 bit version:
> Windows build R-2.4.1 under VMWare on the same 64 bit Ubuntu as in my
> post above and Windows version shows the same bug as you report:
> 
> R2.4.1 on Windows 2000 as guest system in VMWare Ubuntu 6.10 64bit
> ---
>> big <- 2:(2**30)
> Fehler: kann Vektor der Größe 0 Kb nicht allozieren

Right, I just tried on our Windows Server 2003 build machine (mingw32) and
got it too. Thanks Oleg!

H.

>> sessionInfo()
> R version 2.4.1 (2006-12-18)
> i386-pc-mingw32
> 
> locale:
> LC_COLLATE=German_Germany.1252;LC_CTYPE=German_Germany.1252;LC_MONETARY=German_Germany.1252;LC_NUMERIC=C;LC_TIME=German_Germany.1252
> 
> 
> attached base packages:
> [1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods"
> [7] "base"
> 
> -- 
> Dr Oleg Sklyar | EBI-EMBL, Cambridge CB10 1SD, UK | +44-1223-494466
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] convolve: request for "usual" behaviour + some improvements + some fixes

2007-02-02 Thread Herve Pages
Last but not least: convolve2 can be made 100 times or 1000 times faster
than convolve by choosing a power of 2 for the length of the fft-buffer
(a length of 2^n is the best case for the fft, the worst case being when
the length is a prime number):

> x <- 1:13
> y <- 1:1
> system.time(cc <- convolve(x, y, type="o")) # uses buffer length of 13
   user  system elapsed
 76.428   0.016  76.445
> system.time(cc <- convolve2(x, y, type="o")) # uses buffer length of 2^17
   user  system elapsed
  0.164   0.012   0.179

Here is the modified 'convolve2':

convolve2 <- function(x, y, type = c("circular", "open", "filter"))
{
type <- match.arg(type)
nx <- length(x)
ny <- length(y)
if (type == "circular") {
nz <- max(nx, ny)
} else {
nz0 <- nx + ny - 1
nz <- 2^ceiling(log2(nz0))
}
if (nz > nx)
x[(nx+1):nz] <- as.integer(0)
if (nz > ny)
y[(ny+1):nz] <- as.integer(0)
fz <- fft(x) * fft(y)
z <- fft(fz, inverse=TRUE) / nz
if (type == "open") {
z <- z[1:nz0]
} else {
if (type == "filter")
z <- z[1:nx]
}
if (is.numeric(x) && is.numeric(y))
z <- Re(z)
if (is.integer(x) && is.integer(y))
z <- as.integer(round(z))
z
}

In fact, it should try to be smarter than that and not use the fft at all
when one of the 2 input sequences is very short (less than 3 or 4) or
e.g. when one is 1 times shorter than the other one.

Cheers,
H.


Herve Pages wrote:
> Hi again,
> 
> There are many problems with current 'convolve' function.
> The author of the man page seems to be aware that 'convolve' does _not_ the
> usual thing:
> 
>   Note that the usual definition of convolution of two sequences 'x'
>   and 'y' is given by 'convolve(x, rev(y), type = "o")'.
> 
> and indeed, it does not:
> 
>   > x <- 1:3
>   > y <- 3:1
>   > convolve(x, y, type="o")
>   [1]  1  4 10 12  9
> 
> The "usual" convolution would rather give:
> 
>   > convolve(x, rev(y), type="o")
>   [1]  3  8 14  8  3
> 
> Also the "usual" convolution is commutative:
> 
>   > convolve(y, rev(x), type="o")
>   [1]  3  8 14  8  3
> 
> but convolve is not:
> 
>   > convolve(y, x, type="o")
>   [1]  9 12 10  4  1
> 
> Of course I could write the following wrapper:
> 
>   usual_convolve <- function(x, y, ...) convolve(x, rev(y))
> 
> to work around those issues but 'convolve' has other problems:
> 
>   (1) The input sequences shouldn't need to have the same length when
>   type = "circular" (the shortest can be right-padded with 0s up
>   to the length of the longest).
>   (2) If the input sequences are both integer vectors, then the result
>   should be an integer vector too.
>   (3) The "filter" feature seems to be broken (it's not even clear
>   what it should do or why we need it though):
> > x <- 1:9
> > y <- 1
> > convolve(x, y, type="f")
> Error in convolve(x, y, type = "f") : subscript out of bounds
> > convolve(y, x, type="f")
> numeric(0)
>   (4) If you look at the source code, you'll see that 'x' is first left-padded
>   with '0's. The "usual" convolution doesn't do that: it always padd
>   sequences on the _same_ side (generally on the right).
>   (5) It's not clear why we need a 'conj' arg. All what it does is
>   take the conjugate of fft(y) before it does the product with fft(x).
>   But this has the "non-usual" effect of reverting the expected result:
> > round(convolve(as.integer(c(0,0,0,1)), 1:7, type="o"))
> [1] 0 0 0 7 6 5 4 3 2 1
> 
> Here below is my version of 'convolve' just in case. It does the "usual"
> convolution plus:
>   - no need to have 'x' and 'y' of the same length when 'type' is "circular",
>   - when 'x' and 'y' are integer vectors, the output is an integer vector,
>   - no more 'conj' arg (not needed, only leads to confusion),
>   - when type is "filter", the output sequence is the same as with
> type="open" but is truncated to the length of 'x' (the original signal)
> It can be seen has the result of 'x' filtered by 'y' (the filt

[Rd] convolve: request for "usual" behaviour + some improvements + some fixes

2007-02-02 Thread Herve Pages
Hi again,

There are many problems with current 'convolve' function.
The author of the man page seems to be aware that 'convolve' does _not_ the
usual thing:

  Note that the usual definition of convolution of two sequences 'x'
  and 'y' is given by 'convolve(x, rev(y), type = "o")'.

and indeed, it does not:

  > x <- 1:3
  > y <- 3:1
  > convolve(x, y, type="o")
  [1]  1  4 10 12  9

The "usual" convolution would rather give:

  > convolve(x, rev(y), type="o")
  [1]  3  8 14  8  3

Also the "usual" convolution is commutative:

  > convolve(y, rev(x), type="o")
  [1]  3  8 14  8  3

but convolve is not:

  > convolve(y, x, type="o")
  [1]  9 12 10  4  1

Of course I could write the following wrapper:

  usual_convolve <- function(x, y, ...) convolve(x, rev(y))

to work around those issues but 'convolve' has other problems:

  (1) The input sequences shouldn't need to have the same length when
  type = "circular" (the shortest can be right-padded with 0s up
  to the length of the longest).
  (2) If the input sequences are both integer vectors, then the result
  should be an integer vector too.
  (3) The "filter" feature seems to be broken (it's not even clear
  what it should do or why we need it though):
> x <- 1:9
> y <- 1
> convolve(x, y, type="f")
Error in convolve(x, y, type = "f") : subscript out of bounds
> convolve(y, x, type="f")
numeric(0)
  (4) If you look at the source code, you'll see that 'x' is first left-padded
  with '0's. The "usual" convolution doesn't do that: it always padd
  sequences on the _same_ side (generally on the right).
  (5) It's not clear why we need a 'conj' arg. All what it does is
  take the conjugate of fft(y) before it does the product with fft(x).
  But this has the "non-usual" effect of reverting the expected result:
> round(convolve(as.integer(c(0,0,0,1)), 1:7, type="o"))
[1] 0 0 0 7 6 5 4 3 2 1

Here below is my version of 'convolve' just in case. It does the "usual"
convolution plus:
  - no need to have 'x' and 'y' of the same length when 'type' is "circular",
  - when 'x' and 'y' are integer vectors, the output is an integer vector,
  - no more 'conj' arg (not needed, only leads to confusion),
  - when type is "filter", the output sequence is the same as with
type="open" but is truncated to the length of 'x' (the original signal)
It can be seen has the result of 'x' filtered by 'y' (the filter).

convolve2 <- function(x, y, type = c("circular", "open", "filter"))
{
type <- match.arg(type)
nx <- length(x)
ny <- length(y)
if (type == "circular")
nz <- max(nx, ny)
else
nz <- nx + ny - 1
if (nz > nx)
x[(nx+1):nz] <- as.integer(0)
if (nz > ny)
y[(ny+1):nz] <- as.integer(0)
fx <- fft(x)
fy <- fft(y)
fz <- fx * fy
z <- fft(fz, inverse=TRUE) / nz
if (is.numeric(x) && is.numeric(y))
z <- Re(z)
if (is.integer(x) && is.integer(y))
z <- as.integer(round(z))
if (type == "filter")
z[1:nx]
else
z
}

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Speed of for loops

2007-01-30 Thread Herve Pages
Hi,

Byron Ellis wrote:
> IIRC a for loop has more per-iteration overhead that lapply, but the
> real answer is "it depends on what you're doing exactly." I've seen it
> be a faster, slower and equal approach.

gen.iter = function(y=NA) {
 function(x) {
   y <<- if(is.na(y)) x else x+y
 }
}

sapply + gen.iter is slithly faster on small vectors:

  > x <- rep(1, 5000)
  > system.time(tt <- sapply(x,gen.iter()))
 user  system elapsed
0.012   0.000   0.012
  > x <- rep(1, 5000)
  > system.time(tt <- for(i in 2:length(x)) {x[i] <- x[i-1]+x[i]})
 user  system elapsed
0.016   0.000   0.016

but much slower on big vectors:

  > x <- rep(1, 1000)
  > system.time(tt <- sapply(x,gen.iter()))
 user  system elapsed
  138.589   0.964 139.633
  > x <- rep(1, 1000)
  > system.time(tt <- for(i in 2:length(x)) {x[i] <- x[i-1]+x[i]})
 user  system elapsed
   29.978   0.480  30.454


Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Speed of for loops

2007-01-30 Thread Herve Pages
Tom McCallum wrote:
> Hi Everyone,
> 
> I have a question about for loops.  If you have something like:
> 
> f <- function(x) {
>   y <- rep(NA,10);
>   for( i in 1:10 ) {
>   if ( i > 3 ) {
>   if ( is.na(y[i-3]) == FALSE ) {
>   # some calculation F which depends on one or 
> more of the previously  
> generated values in the series
>   y[i] = y[i-1]+x[i];
>   } else {
>   y[i] <- x[i];
>   }
>   }
>   }
>   y
> }
> 
> e.g.
> 
>> f(c(1,2,3,4,5,6,7,8,9,10,11,12));
>   [1] NA NA NA  4  5  6 13 21 30 40
> 
> is there a faster way to process this than with a 'for' loop?  I have  
> looked at lapply as well but I have read that lapply is no faster than a  
> for loop and for my particular application it is easier to use a for loop.  
> Also I have seen 'rle' which I think may help me but am not sure as I have  
> only just come across it, any ideas?

Hi Tom,

In the general case, you need a loop in order to propagate calculations
and their results across a vector.

In _your_ particular case however, it seems that all you are doing is a
cumulative sum on x (at least this is what's happening for i >= 6).
So you could do:

f2 <- function(x)
{
offset <- 3
start_propagate_at <- 6
y_length <- 10
init_range <- (offset+1):start_propagate_at
y <- rep(NA, offset)
y[init_range] <- x[init_range]
y[start_propagate_at:y_length] <- cumsum(x[start_propagate_at:y_length])
y
}

and it will return the same thing as your function 'f' (at least when 'x' 
doesn't
contain NAs) but it's not faster :-/

IMO, using sapply for propagating calculations across a vector is not 
appropriate
because:

  (1) It requires special care. For example, this:

> x <- 1:10
> sapply(2:length(x), function(i) {x[i] <- x[i-1]+x[i]})

  doesn't work because the 'x' symbol on the left side of the <- in the
  anonymous function doesn't refer to the 'x' symbol defined in the global
  environment. So you need to use tricks like this:

> sapply(2:length(x),
 function(i) {x[i] <- x[i-1]+x[i]; assign("x", x, 
envir=.GlobalEnv); x[i]})

  (2) Because of this kind of tricks, then it is _very_ slow (about 10 times
  slower or more than a 'for' loop).

Cheers,
H.


> 
> Many thanks
> 
> Tom
> 
> 
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] update.packages() doesn't pick up the highest version of packages

2007-01-29 Thread Herve Pages
Hi,

Try to install limma 2.8.1 with:

  > biocRep <- "http://bioconductor.org/packages/1.9/bioc";
  > install.packages("limma", repos=biocRep)

then try to update it with

  > repos <- c(biocRep, "http://cran.fhcrc.org";)
  > update.packages(repos=repos)

--> it will not get updated (even if there is a 2.9.8 version on CRAN).

This is (1) surprising by itself, (2) inconsistent with install.packages
behaviour.

When given more than 1 repository, install.packages() will go thru
all of them and pick up the higher version of each package to install.
For example, this:

  > repos <- c(biocRep, "http://cran.fhcrc.org";)
  > install.packages("limma", repos=repos)

will install limma from CRAN (2.9.8) instead of limma from Bioconductor (2.8.1).

But if a new version of limma shows up on CRAN, this:

  > update.packages(repos=repos)

will not update it...

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] file.copy

2007-01-16 Thread Herve Pages
Hi,

Copying a non-existing file with file.copy creates an empty file

  > r <- file.copy("non-existing-file", ".")
  > r
  [1] TRUE

... and returns TRUE!

Now, when used in "vectorized" mode

  > r <- file.copy(c("toto1", "toto2"), c("dest1", "dest2"))
  [1] FALSE FALSE

file.copy looks much more reasonable, except that files "dest1"
and "dest2" are still created (despite the fact that "toto1" and
"toto2" don't exist).

If the 'to' argument is not a dir:

  > r <- file.copy(c("toto3", "toto4"), ".")
  > r
  [1] TRUE TRUE

!!?!?

May be those are (undocumented) features, but I bet 99.9% of the users would
expect something different...

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] R.INSTALL on Windows

2007-01-12 Thread Herve Pages
Hi,

I'd like to point out a potential problem with the current R.INSTALL
directory created by 'R CMD INSTALL' when applied to a source package.

On Windows (and, AFAIK, only on Windows), 'R CMD INSTALL mypackage_1.0.0.tar.gz'
creates a temporary R.INSTALL directory in the current directory. This
directory is removed before the command terminates but only if it was
successful.

One inconvenient of this behaviour is that you can get a clash when you try to
build 2 Windows binary packages (with 'R CMD INSTALL --build') at the same time
in the same directory. I can circumvent the 00LOCK mechanism by providing
a different --library option to the 2 commands but, if I'm running the 2 
commands
in the same directory, then they will be creating and using the same R.INSTALL.
Then, the first command who terminates will remove it and break the other 
command.

Is there an easy way to prevent this?

Is there any reason why R.INSTALL is not kept at the end of 'R CMD INSTALL',
or why 'R CMD INSTALL' doesn't use a package specific directory name
(e.g. R.INSTALL.mypackage) like 'R CMD check' does, or why it doesn't
use one of the "standard" temp places (defined by env. var. TMP or TMPDIR).

Thanks in advance!

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] pb in regular expression with the character "-" (PR#9437)

2007-01-05 Thread Herve Pages
Hi all,

[EMAIL PROTECTED] wrote:
> 
> Consider my guesstimate:
> For 99% of all R users, the amount of time they need working
> pretty intensely with R before they find a bug in it, 
> is nowadays more than three years, and maybe even much more
> -- such as their lifetime :-)

Perhaps I belong to the 1% of unlucky users that don't have to
wait that long ;-)

  > nchar("éA", type = "bytes")
  [1] 3
  > nchar("éA", type = "chars")
  [1] 2

  OK

Now:

  > regexpr("A", "éA")
  [1] 2
  attr(,"match.length")
  [1] 1

  still OK

But:

  > regexpr("A", "éA", useBytes=TRUE)
  [1] 2
  attr(,"match.length")
  [1] 1

  not OK anymore (3 expected, not 2)

Let's try with fixed=TRUE:

  > regexpr("A", "éA", useBytes=TRUE, fixed=TRUE)
  [1] 3
  attr(,"match.length")
  [1] 1

  much better!

H.


> sessionInfo()
R version 2.5.0 Under development (unstable) (2007-01-05 r40386)
i686-pc-linux-gnu

locale:
LC_CTYPE=en_CA.UTF-8;LC_NUMERIC=C;LC_TIME=en_CA.UTF-8;LC_COLLATE=en_CA.UTF-8;LC_MONETARY=en_CA.UTF-8;LC_MESSAGES=en_CA.UTF-8;LC_PAPER=en_CA.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_CA.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods"
[7] "base"

but this happens also in 2.4.0 and 2.4.1.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Can't load XML_1.4-0.zip in last R devel

2007-01-05 Thread Herve Pages
Hi,

I can't load XML_1.4-0.zip in last R devel (Windows):

  R version 2.5.0 Under development (unstable) (2007-01-05 r40386)
  Copyright (C) 2007 The R Foundation for Statistical Computing
  ISBN 3-900051-07-0

  R is free software and comes with ABSOLUTELY NO WARRANTY.
  You are welcome to redistribute it under certain conditions.
  Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

  R is a collaborative project with many contributors.
  Type 'contributors()' for more information and
  'citation()' on how to cite R or R packages in publications.

  Type 'demo()' for some demos, 'help()' for on-line help, or
  'help.start()' for an HTML browser interface to help.
  Type 'q()' to quit R.

  > install.packages('XML')
  --- Please select a CRAN mirror for use in this session ---
  trying URL 'http://cran.fhcrc.org/bin/windows/contrib/2.5/XML_1.4-0.zip'
  Content type 'application/zip' length 1647612 bytes
  opened URL
  downloaded 1608Kb

  package 'XML' successfully unpacked and MD5 sums checked

  The downloaded packages are in
  C:\Documents and Settings\biocbuild\Local
  Settings\tmpdir\RtmpXybdrb\downloaded_packages
  updating HTML package descriptions
  > library(XML)
  Error in Sys.putenv(names(x), as.character(unlist(x))) :
  wrong type for argument
  Error: .onLoad failed in 'loadNamespace' for 'XML'
  Error: package/namespace load failed for 'XML'
  >


I don't get this with R devel from 2006-12-05:

  R version 2.5.0 Under development (unstable) (2006-12-05 r40126)
  Copyright (C) 2006 The R Foundation for Statistical Computing
  ISBN 3-900051-07-0

  R is free software and comes with ABSOLUTELY NO WARRANTY.
  You are welcome to redistribute it under certain conditions.
  Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

  R is a collaborative project with many contributors.
  Type 'contributors()' for more information and
  'citation()' on how to cite R or R packages in publications.

  Type 'demo()' for some demos, 'help()' for on-line help, or
  'help.start()' for an HTML browser interface to help.
  Type 'q()' to quit R.

  > install.packages('XML')
  --- Please select a CRAN mirror for use in this session ---
  trying URL 'http://cran.fhcrc.org/bin/windows/contrib/2.5/XML_1.4-0.zip'
  Content type 'application/zip' length 1647612 bytes
  opened URL
  downloaded 1608Kb

  package 'XML' successfully unpacked and MD5 sums checked

  The downloaded packages are in
  C:\Documents and Settings\biocbuild\Local
  Settings\tmpdir\RtmpaIcdrb\downloaded_packages
  updating HTML package descriptions
  > library(XML)
  >


Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] man page for as.matrix for data frames outdated?

2006-11-03 Thread Herve Pages
Hi Martin,

Thanks for the answer!
OK I can use data.matrix to convert a data frame to a numeric
matrix but that's another story. Basically I'm reporting 2
problems with 'as.matrix' when applied to a data frame:

1) A documentation problem:

"The method for data frames will convert any
 non-numeric/complex column into a character vector
 using 'format'"

> df5 <- data.frame(toto=c("a","bb"), titi=c(9,999))
> as.matrix(df5)
  toto titi
1 "a"  "  9"
2 "bb" "999"

 As I said, it seems to be the other way around: it's not the
 "non-numeric" column that is converted to a character vector,
 it's the "numeric" column.

2) the questionable decision to do this conversion using 'format'
   (leading to the addition of unnecessary white space) and not
   simply 'as.character'

BTW your mailer seems to do some strange reformatting to the output
of my code snippets making it hard to see the "formatting" problem
that I'm trying to show.

Cheers,

H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] man page for as.matrix for data frames outdated?

2006-11-02 Thread Herve Pages
Hi again,


The man page for 'as.matrix' says:

 'as.matrix' is a generic function. The method for data frames will
 convert any non-numeric/complex column into a character vector
 using 'format' and so return a character matrix, except that
 all-logical data frames will be coerced to a logical matrix.

It's true that "all-logical data frames will be coerced to a logical
matrix":

> fourLogicals <- 2:5>3
> df1 <- data.frame(a=fourLogicals)
> storage.mode(as.matrix(df1))
[1] "logical"

Otherwise it's not true that 'as.matrix' will return a character matrix:

> fourInts <- 2:-1
> df2 <- data.frame(a=fourLogicals, b=fourInts)
> storage.mode(as.matrix(df2))
[1] "integer"


> fourDoubles <- rep(pi,4)
> df3 <- data.frame(c=fourDoubles, a=fourLogicals, b=fourInts)
> storage.mode(as.matrix(df3))
[1] "double"


> fourComplexes <- (-1:2)+3i
> df4 <- data.frame(a=fourLogicals, d=fourComplexes, b=fourInts,
c=fourDoubles)
> storage.mode(as.matrix(df4))
[1] "complex"

If one column is of mode character, then 'as.matrix' will effectively
return a character matrix:

> df5 <- data.frame(toto=c("a","bb"), titi=c(9,999))
> storage.mode(as.matrix(df5))
[1] "character"

Note that the doc says that "any non-numeric/complex column" will
be passed thru 'format' which seems to be exactly the other way
around:

> as.matrix(df5)
  toto titi
1 "a"  "  9"
2 "bb" "999"

Anyway why one would like to have the numeric values passed
thru 'format' to start with?

This is in R-2.4.0 and recent R-devel.

Best,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Display problem with named complex vectors

2006-11-02 Thread Herve Pages
Hi,


> z <- (-1:3)+2i
> names(z) <- LETTERS[1:5]
> z
A B C D E
-1+2i  0+2i  1+2i  2+2i  3+2i

Nice :-)

> names(z)[2] <- "long name"
> z
A long name C D E
-1+2i  0+2i  1+2i  2+2i  3+2i

Not nice :-(

This happens with R-2.4.0 and current R-devel.
It also happens with raw vectors as reported one month ago.


Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Pb with .findInheritedMethods

2006-10-27 Thread Herve Pages
Hi John,


John Chambers wrote:
> A problem with callNextMethod, which is caching an inherited method as
> if it was not inherited, causing confusion on the next search.  Should
> be fairly easy to fix, but may be a while before I get time to do so.
>
> By the way, I hope your simplified example does not reflect what
> happens in the actual one.
>callNextMethod(.Object)
> throws away all the ... arguments to new(), which rather defeats the
> purpose of having initialize() methods.  Generally, callNextMethod()
> should get no arguments or all the arguments it needs, including ...
> See ?callNextMethod

Thanks for looking at this!

Yes it is a simplified version of a real case and
here .Object is all what callNextMethod() needs because
the initialize method for an "A" object takes no argument
other than .Object

More generally I don't see what's wrong with not passing
to callNextMethod all the arguments coming from the call
to new:

setClass("A", representation(toto="integer"))
setMethod("initialize", "A", function(.Object, toto0) [EMAIL PROTECTED]
<- as.integer(toto0); .Object})
new("A", 45.1)

setClass("Ab", contains="A")
setMethod("initialize", "Ab", function(.Object, x, y)
callNextMethod(.Object, x*y+1))
new("Ab", 5, 2)


Regards,

H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Pb with .findInheritedMethods

2006-10-26 Thread Herve Pages
Hi again,

This happens with R-2.4.0 and R-devel.

Cheers,
H.

Herve Pages wrote:
> Hi again,
>
>
> Here is a very simplified version of a class hierarchy
> defined in the Biobase package (Bioconductor). I post
> here because this seems to be an S4 related problem:
>
> setClass("A", representation(name="character"))
> setMethod("initialize", "A", function(.Object) [EMAIL PROTECTED] <- "I'm
> an A"; .Object})
>
> setClass("Ab", contains="A")
> setMethod("initialize", "Ab", function(.Object) callNextMethod(.Object))
>
> setClass("Abc", contains="Ab")
>
> setClass("Abcd", contains = c("Abc"))
>
> Now if I do:
>
> tmp1 <- new("Abc")
> tmp2 <- new("Abcd")
>
> I get the following warning:
>
> Warning message:
> Ambiguous method selection for "initialize", target "Abcd" (the
> first of the signatures shown will be used)
> Abc
> Ab
>  in: .findInheritedMethods(classes, fdef, mtable)
>
> I don't really understand why .findInheritedMethods is
> complaining here...
> And if I don't do 'tmp1 <- new("Abc")' before I
> do 'tmp2 <- new("Abcd")', then I don't get the warning
> anymore!
>
> Does anybody have an explanation for this?
>
>
> Thanks,
> H.
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] S4 pb in R 2.5.0

2006-10-26 Thread Herve Pages
Herve Pages wrote:
> ...
>
> > sessionInfo()
> R version 2.4.0 (2006-10-03)
> x86_64-unknown-linux-gnu
>  
> locale:
> 
> LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C
>  
> attached base packages:
> [1] "methods"   "stats" "graphics"  "grDevices" "utils"
> "datasets"
> [7] "base"
>   

oops, please replace by

> sessionInfo()
R version 2.5.0 Under development (unstable) (2006-10-20 r39686)
x86_64-unknown-linux-gnu
 
locale:
LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C
 
attached base packages:
[1] "methods"   "stats" "graphics"  "grDevices" "utils" "datasets"
[7] "base"

sorry,

> No problem with R-2.4.0.
>   

This is still true.


H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Pb with .findInheritedMethods

2006-10-26 Thread Herve Pages
Hi again,


Here is a very simplified version of a class hierarchy
defined in the Biobase package (Bioconductor). I post
here because this seems to be an S4 related problem:

setClass("A", representation(name="character"))
setMethod("initialize", "A", function(.Object) [EMAIL PROTECTED] <- "I'm
an A"; .Object})

setClass("Ab", contains="A")
setMethod("initialize", "Ab", function(.Object) callNextMethod(.Object))

setClass("Abc", contains="Ab")

setClass("Abcd", contains = c("Abc"))

Now if I do:

tmp1 <- new("Abc")
tmp2 <- new("Abcd")

I get the following warning:

Warning message:
Ambiguous method selection for "initialize", target "Abcd" (the
first of the signatures shown will be used)
Abc
Ab
 in: .findInheritedMethods(classes, fdef, mtable)

I don't really understand why .findInheritedMethods is
complaining here...
And if I don't do 'tmp1 <- new("Abc")' before I
do 'tmp2 <- new("Abcd")', then I don't get the warning
anymore!

Does anybody have an explanation for this?


Thanks,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] S4 pb in R 2.5.0

2006-10-26 Thread Herve Pages
Hi,

When playing interactively with the S4 system, I've tried
to define the following class:

> setClass("A", representation("integer"))
[1] "A"
> showClass("A")
 
Slots:
  
Name:.Data
Class: integer
 
Extends:
Class "integer", from data part
Class "vector", by class "integer", distance 2
Class "numeric", by class "integer", distance 2

then I realized that I made a typo (I don't want to extend
the "integer" type) so I redefined class A:

> setClass("A", representation(toto="integer"))
> showClass("A")
Slots:
  
Name: toto
Class: integer

Now if I try to extend A:

> setClass("Aa", representation("A"))
Error in reconcilePropertiesAndPrototype(name, slots, prototype,
superClasses,  :
"A" is not eligible to be the data part of another class
(must be a basic class or a virtual class with no slots)

Surprising. And even more surprising: I don't get this if I don't
try to define class A twice or if I invert the order of the 2 calls
to setClass("A", ...)!

> sessionInfo()
R version 2.4.0 (2006-10-03)
x86_64-unknown-linux-gnu
 
locale:

LC_CTYPE=en_US;LC_NUMERIC=C;LC_TIME=en_US;LC_COLLATE=en_US;LC_MONETARY=en_US;LC_MESSAGES=en_US;LC_PAPER=en_US;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US;LC_IDENTIFICATION=C
 
attached base packages:
[1] "methods"   "stats" "graphics"  "grDevices" "utils"
"datasets"
[7] "base"

No problem with R-2.4.0.

Thanks,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Last R-devel snapshot is an empty tarball

2006-10-05 Thread Herve Pages
Martin Maechler wrote:
> Hi Herve,
>
> Herve> Dear list,
> Herve> The last R-devel snapshot (2006-10-03) is an empty tarball:
> Herve> ftp://ftp.stat.math.ethz.ch/Software/R/
> Herve> Thanks,
>
> Peter already replied.
> The snapshot is now back to normal.
>
> The real problem was that the snapshot was broken for about 10
> days till someone noticed.
> So indeed, we would have been grateful if your e-mail had
> arrived a week earlier ;-) ;-)
>   

Hi Martin, Peter,

Thanks for fixing the problem! I've just downloaded the last
R-devel snapshot tarball (2006-10-04) and will give it a try today...

I did notice the 10-day gap but I suspected you R guys already knew
about it and that you were too busy with the R 2.4.0 pre-release stuff
to care about it _at that moment_. So I decided to wait until after the
release. Then yesterday I saw this R-devel snapshot tarball (the
first one in 12 days) and decided to give it a try, because, like Kurt,
I need it for our check procedure too (Bioconductor) ;-)

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Last R-devel snapshot is an empty tarball

2006-10-04 Thread Herve Pages
Dear list,

The last R-devel snapshot (2006-10-03) is an empty tarball:
  ftp://ftp.stat.math.ethz.ch/Software/R/
Thanks,

H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Display problem with named raw vectors

2006-10-03 Thread Herve Pages
Hi,


I found that displaying a raw vector with long names is not
as pretty as for other types of named vectors:

> r <- charToRaw("Mz")
> r
[1] 4d 7a
> names(r) <- c("M", "zz")
> r
 M zz
4d 7a

The names and the values are not aligned :-(

> i <- as.integer(r)
> i
[1]  77 122
> names(i) <- c("M", "zz")
> i
 M zz
77122

Much better :-)
(Note that the names were lost during coercion... surprising
but documented.)

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] More strange [[ behaviour

2006-10-02 Thread Herve Pages
Duncan Murdoch wrote:
> On 10/2/2006 3:21 PM, Herve Pages wrote:
>>> a[[x=1]]
>> Error in a[[x = 1]] : subscript out of bounds
>>
>
> Indexing is a function call, with arguments x, i, j, ... .  If you use
> y=1, you're setting something in the "..." part of the arg list.  If
> you say x=1, you're setting the first arg, but because a[[x=1]] is the
> same as "[["(a, x=1), it is going to evaluate it as "[["(x=1, i=a) and
> try to index 1 by a instead of a by 1.
Hi Duncan,

Right, thanks for the explanation and sorry for not paying more
attention.

Best,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] More strange [[ behaviour

2006-10-02 Thread Herve Pages
Hi,

> setClass("MyList", "list")
[1] "MyList"
> a <- new("MyList")
> a
An object of class "MyList"
list()
> setMethod("[[", "MyList", function(x, i, j, ...) cat("Just testing\n"))
> a[[]]
Just testing
> a[[1]]
Just testing
> a[[a=1]]
Just testing
> a[[b=1]]
Just testing
...
> a[[v=1]]
Just testing
> a[[w=1]]
Just testing
> a[[x=1]]
Error in a[[x = 1]] : subscript out of bounds
> a[[y=1]]
Just testing
> a[[z=1]]
Just testing

Can someone explain me why the "[[x=1]]" case is treated differently?

Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Strange behaviour of the [[ operator

2006-09-29 Thread Herve Pages
Hi,

This looks like a bug:

> a <- list(b=5)
> a[['b']]
[1] 5
> a[[t<-'b']]

Nothing gets printed!

I need to use parenthesis to see the expected result:

> a[[(t<-'b')]]
[1] 5


Cheers,
H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] install.packages(,type="source") broken on Windows

2006-09-08 Thread Herve Pages
OK thanks. Just FYI the same thing happens with R version 2.5.0
devel (2006-09-07 r39185).
Best,

H.

Peter Dalgaard wrote:
> [EMAIL PROTECTED] writes:
>
>   
>> Hi,
>>
>> On Windows, 'install.packages(,type="source")' displays the same output
>> than 'R CMD INSTALL --help' and doesn't install anything...
>> 
>
> Yes. The culprit is
>
> 
> r39127 | urbaneks | 2006-09-05 20:45:00 +0200 (Tue, 05 Sep 2006) | 1 line
>
> Bugfix: install.packages ignored unnamed configure.args
> 
>
> which forgets to check whether there are any args to name. (There is a
> generic issue in that available.packages has a configure.args argument
> which is passed on to --configure-args in the INSTALL script: The
> Windows INSTALL doesn't understand the argument since configure
> doesn't work on Windows. However, things should work when no such
> arguments are passed.)
>
>   


-- 

Hervé Pagès
E-mail: [EMAIL PROTECTED]
 Phone: (206) 667-5791
   Fax: (206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] download.packages() ignores too recent packages

2006-08-11 Thread Herve Pages
Hi list,

I don't know if it's a bug or a feature but I can't download a package
that requires a more recent version of R than the one I'm currently
using for the download:

 > rep <- "http://bioconductor.org/packages/1.9/bioc";
 > download.packages("Biobase", destdir=".", repos=rep)
Warning in download.packages("Biobase", destdir = ".", repos = rep) :
 no package 'Biobase' at the repositories
 [,1] [,2]

There _is_ a 'Biobase' package here but since it requires R >= 2.3.0
and I'm doing this from R 2.2.1 then it's ignored.
I realize that this is consistent with the behaviour of available.packages()
but... I guess the most confusing part here (at least for me) is the
warning message. May be it could say something like "no package 'Biobase'
at the repositories for your version of R"?

Best,

H.


-- 

Hervé Pagès
E-mail: [EMAIL PROTECTED]
Phone: (206) 667-5791
Fax: (206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] install.packages() does not warn when 'lib' arg is missing on Linux or Windows

2006-06-09 Thread Herve Pages
Simon Urbanek wrote:
> On Jun 8, 2006, at 9:08 PM, Herve Pages wrote:
>
>> Man page for 'install.packages' says that if the 'lib' arg is missing,
>> then it "defaults to '.libPaths()[1]' with a warning".
>
> Where are you quoting from? I read (R 2.3.1):

Hi Simon,

I was quoting from the description of the 'lib' argument (among all 
functions
described on this man page, the 'install.packages' function is the only 
one to
have a 'lib' arg):

lib: character vector giving the library directories where to
 install the packages.  Recycled as needed.  If missing,
 defaults to '.libPaths()[1]' with a warning.

I forgot to check the "Details" section for the 'install.packages' function
where the description is indeed correct. Sorry!

>> On my Mac OS X system:
>>
>>> .libPaths()
>> [1] "/Users/biocbuild/Library/R/Library"
>
> ^^ - this is your custom setting, it is not the default
>
>> [2] "/Library/Frameworks/R.Framework/Versions/2.3/Resources/ 
>> library"
>>
>
> The default on OS X is
>
> > .libPaths()
> [1] "/Library/Frameworks/R.framework/Resources/library"
> >
>
> in the shell and
>
> > .libPaths()
> [1] "/Library/Frameworks/R.framework/Versions/2.3/Resources/library"
> >
Yep right. Since it was the first time I was installing R-2.3.1 (from 
the CRAN
.dmg file, I choosed to install all components, including the R GUI) I 
was assuming
that my settings would have been the defaults but they were not :-/
I didn;t know about the Library/Preferences/org.R-project.R.plist file but a
real Mac OS X user showed it to me and reset my settings to the defaults.
Sorry for the trouble...

Best,

H.

-- 

Hervé Pagès
E-mail: [EMAIL PROTECTED]
 Phone: (206) 667-5791
   Fax: (206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] install.packages("C", dep=TRUE) does not always install indirect dependencies

2006-06-09 Thread Herve Pages
Duncan Murdoch wrote:
> On 6/8/2006 3:31 PM, Herve Pages wrote:
>> May be this is the desired behavior, I don't know. Personally, I would
>> think of 'install.packages("C", dep=TRUE)' as a reliable way to get 
>> every
>> packages that C directly or indirectly relies on installed.
>
> This seems rather unreasonable.  If a user asked not to install B's 
> dependencies, and A is not listed as a dependency of C, then I don't 
> think a request to install C and its dependencies should install A.

Hi Duncan,

Maybe the user didn't _ask_ not to install B's deps. Maybe he simply 
made a mistake
(which is easy with 'dep=FALSE' being the default), or maybe this is a 
shared system and
someone else installed B without 'dep=TRUE' or removed A or...
But I realize that there would be a significant extra cost if 
'install.packages("C", dep=TRUE)'
had to check the whole deps tree instead of just checking whether only 
the packages listed
in C's Depends field are already installed (especially when those 
packages _are_ already
installed).
So if checking the whole deps tree can't be done everytime 
'install.packages(..., dep=TRUE)'
is used then may be it could be done on demand e.g. with something 
similar to what
"rpm -Va --nofiles" does on a Linux system (this checks the entire 
system for missing deps).

> Perhaps the installation of B without dependencies was a mistake?  I 
> think an argument could be made that dependencies=TRUE should be the 
> default (as it is when using the corresponding menu item in the 
> Windows GUI).
Yes having 'dependencies=TRUE' as the default would definetly help the
user to keep a system where all deps are satisfied.

Best,

H.

-- 

Hervé Pagès
E-mail: [EMAIL PROTECTED]
 Phone: (206) 667-5791
   Fax: (206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] install.packages() does not warn when 'lib' arg is missing on Linux or Windows

2006-06-08 Thread Herve Pages
Prof Brian Ripley wrote:
> On Thu, 8 Jun 2006, Herve Pages wrote:
>
>> Man page for 'install.packages' says that if the 'lib' arg is missing,
>> then it "defaults to '.libPaths()[1]' with a warning".
>> But, given the 'install.packages' source code, it seems that this 
>> warning
>> is issued only when 'length(.libPaths()) > 1'.
>
> Yes, and what is the problem with that?
Since current behaviour is not in sync with documentation then
IMHO it's either a documentation bug or an implementation bug.
Best,

H.

-- 

Hervé Pagès
E-mail: [EMAIL PROTECTED]
 Phone: (206) 667-5791
   Fax: (206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] install.packages("C", dep=TRUE) does not always install indirect dependencies

2006-06-08 Thread Herve Pages
Hello again,

I've found another issue with 'install.packages'.
'install.packages("C", dep=TRUE)' will "fail" when the 2 following
conditions are satisfied:
  - Package C depends on B which in turns depends on A but
the Depends field in C doesn't list A.
  - Package B is already installed but not package A (e.g.
the user did 'install.packages("B", dep=FALSE)').
Then 'install.packages("C", dep=TRUE)' will not install A.
May be this is the desired behavior, I don't know. Personally, I would
think of 'install.packages("C", dep=TRUE)' as a reliable way to get every
packages that C directly or indirectly relies on installed.

Best,

H.

-- 

Hervé Pagès
E-mail: [EMAIL PROTECTED]
 Phone: (206) 667-5791
   Fax: (206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] install.packages() does not warn when 'lib' arg is missing on Linux or Windows

2006-06-08 Thread Herve Pages
Hello,

Man page for 'install.packages' says that if the 'lib' arg is missing,
then it "defaults to '.libPaths()[1]' with a warning".
But, given the 'install.packages' source code, it seems that this warning
is issued only when 'length(.libPaths()) > 1'.
So typically, this warning will appear on Mac OS X but not on a Linux
or Windows systems with default settings.
On my Mac OS X system:

 > .libPaths()
[1] "/Users/biocbuild/Library/R/Library"
[2] "/Library/Frameworks/R.Framework/Versions/2.3/Resources/library"

but on my Linux system:

 > .libPaths()
[1] "/home/hpages/arch/x86_64/R-2.3.1/library"

I'm using R-2.3.1 on both systems.
Best,


H.

-- 

Hervé Pagès
E-mail: [EMAIL PROTECTED]
 Phone: (206) 667-5791
   Fax: (206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] 'R CMD build' ignoring segfaults occuring during the vignettes creation

2006-05-17 Thread Herve Pages
Sorry for erroneous subject of my previous post. Here it goes again.


Hi,

Something else I'd like to report.
If a segmentation fault occurs during the "creating vignettes" step,
then 'R CMD build' ignores the problem and end up building the source
package anyway:

   /loc/biocbuild/1.9d/R/bin/R CMD build RMAGEML
   * checking for file 'RMAGEML/DESCRIPTION' ... OK
   * preparing 'RMAGEML':

   ...

   * DONE (RMAGEML)
   * creating vignettes ...sh: line 1:  8070 Segmentation fault  
'/loc/biocbuild/1.9d/R/bin/R' --vanilla --no-save
--quiet /tmp/Rout656233925 2>&1
OK
   * cleaning src
   * removing junk files
   * checking for LF line-endings in source files
   * checking for empty or unneeded directories
   * building 'RMAGEML_2.7.0.tar.gz'

'R CMD check' behaves the same way during the "checking package vignettes" step.
I have observed this problem with R 2.3.0 and R 2.4.0 devel (r37925).

Best,

H.

-- 

Hervé Pagès
E-mail: [EMAIL PROTECTED]
Phone: (206) 667-5791
   Fax: (206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] 'R CMD ' doesn't work on Windows

2006-05-17 Thread Herve Pages
Hi,

Something else I'd like to report.
If a segmentation fault occurs during the "creating vignettes" step,
then 'R CMD build' ignores the problem and end up building the source
package anyway:

   /loc/biocbuild/1.9d/R/bin/R CMD build RMAGEML
   * checking for file 'RMAGEML/DESCRIPTION' ... OK
   * preparing 'RMAGEML':

   ...

   * DONE (RMAGEML)
   * creating vignettes ...sh: line 1:  8070 Segmentation fault  
'/loc/biocbuild/1.9d/R/bin/R' --vanilla --no-save 
--quiet /tmp/Rout656233925 2>&1
OK
   * cleaning src
   * removing junk files
   * checking for LF line-endings in source files
   * checking for empty or unneeded directories
   * building 'RMAGEML_2.7.0.tar.gz'

'R CMD check' behaves the same way during the "checking package vignettes" step.
I have observed this problem with R 2.3.0 and R 2.4.0 devel (r37925).

Best,

H.

-- 

Hervé Pagès
E-mail: [EMAIL PROTECTED]
Phone: (206) 667-5791
   Fax: (206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


  1   2   >