Re: [Rd] Assignment to string

2009-04-01 Thread Stavros Macrakis
On Wed, Apr 1, 2009 at 5:11 PM, Wacek Kusnierczyk <
waclaw.marcin.kusnierc...@idi.ntnu.no> wrote:

> Stavros Macrakis wrote:
> ...
> i think this concords with the documentation in the sense that in an
> assignment a string can work as a name.  note that
>
> `foo bar` = 1
> is.name(`foo`)
> # FALSE
>
> the issue is different here in that in is.name("foo") "foo" evaluates to
> a string (it works as a string literal), while in is.name(`foo`) `foo`
> evaluates to the value of the variable named 'foo' (with the quotes
> *not* belonging to the name).
>

Wacek, surely you are joking here.  The object written `foo` (a name)
*evaluates to* its value.  The object written "foo" (a string) evaluates to
itself.  This has nothing to do with the case at hand, since the left-hand
side of an assignment statement is not evaluated in the normal way.


> ...with only a quick look at the sources (src/main/envir.c:1511), i guess
> the first element to an assignment operator (i mean the left-assignment
> operators) is converted to a name


Yes, clearly when the LHS of an assignment is a string it is being coerced
to a name.  I was simply pointing out that that is not consistent with the
documentation, which requires a name on the LHS.

> - maclisp was designed by computer scientists in a research project,
> - r is being implemented by statisticians for practical purposes.
>

Well, I think it is overstating things to say that Maclisp was designed at
all.  Maclisp grew out of PDP-6 Lisp, with new features being added
regularly. Maclisp itself wasn't a research project -- there are vanishingly
few papers about it in the academic literature, unlike contemporary research
languages like Planner, EL/1, CLU, etc. In fact, there are many parallels
with R -- it was in some sense a service project supporting AI and symbolic
algebra research, with ad hoc features (a.k.a. hacks) being added regularly
to support some new idea in AI or algebra.  To circle back to the current
discussion, Maclisp didn't even have strings as a data type until the
mid-70's -- before that, atoms ('symbols' in more modern terminology) were
the only way to represent strings. (And that lived on in Maxima for many
decades...)  See http://www.softwarepreservation.org/projects/LISP/ for
documentation on the history of many different Lisps.

We learned many lessons with Maclisp.  Well, actually two different sets of
lessons were learned by two different communities.  The Scheme community
learned the importance of minimalist, clean, principled design.  The Common
Lisp community learned the importance of large, well-designed libraries.
Both learned the importance of standardization and clear specification.
There is much to learn.

   -s

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] RGoogleDocs so close but errors with spreadsheet reading

2009-04-01 Thread Duncan Temple Lang


You might try my development version which I put at

   http://www.omegahat.org/Prerelease/RGoogleDocs_0.2-0.tar.gz

I am not certain if there are any substantive differences with the
one in the Omegahat repository, but it works for me with a document
and a spreadsheet in a Google Docs account.

  D.

Farrel Buchinsky wrote:

I got RGoogleDocs to work. It works on documents but not on spreadsheets.
I get the following error.

getDocs(con, what =
  "http://docs.google.com/feeds/documents/private/full/-/spreadsheet")

assignment of an object of class "NULL" is not valid for slot "access"
in an object of class "GoogleSpreadsheet"; is(value, "character") is
not TRUE

How can I troubleshoot this?

Farrel Buchinsky
Google Voice Tel: (412) 567-7870

Sent from Pittsburgh, Pennsylvania, United States



Re: [Rd] RGoogleDocs so close but errors with spreadsheet reading

2009-04-01 Thread Farrel Buchinsky
Does it have anything to do with the following message when I load RGoogleDocs?

The following object(s) are masked from package:methods :

 getAccess


Farrel Buchinsky
Google Voice Tel: (412) 567-7870




On Wed, Apr 1, 2009 at 18:36, Farrel Buchinsky  wrote:
> I got RGoogleDocs to work. It works on documents but not on spreadsheets.
> I get the following error.
>
> getDocs(con, what =
>   "http://docs.google.com/feeds/documents/private/full/-/spreadsheet")
>
> assignment of an object of class "NULL" is not valid for slot "access"
> in an object of class "GoogleSpreadsheet"; is(value, "character") is
> not TRUE
>
> How can I troubleshoot this?
>
> Farrel Buchinsky
> Google Voice Tel: (412) 567-7870
>
> Sent from Pittsburgh, Pennsylvania, United States
>



[Rd] RGoogleDocs so close but errors with spreadsheet reading

2009-04-01 Thread Farrel Buchinsky
I got RGoogleDocs to work. It works on documents but not on spreadsheets.
I get the following error.

getDocs(con, what =
  "http://docs.google.com/feeds/documents/private/full/-/spreadsheet")

assignment of an object of class "NULL" is not valid for slot "access"
in an object of class "GoogleSpreadsheet"; is(value, "character") is
not TRUE

How can I troubleshoot this?

Farrel Buchinsky
Google Voice Tel: (412) 567-7870

Sent from Pittsburgh, Pennsylvania, United States



Re: [Rd] Assignment to string

2009-04-01 Thread Wacek Kusnierczyk
Stavros Macrakis wrote:
> The documentation for assignment says:
>
>  In all the assignment operator expressions, 'x' can be a name or
>  an expression defining a part of an object to be replaced (e.g.,
>  'z[[1]]').  A syntactic name does not need to be quoted, though it
>  can be (preferably by backticks).
>
> But the implementation allows assignment to a character string (i.e. not a
> name), which it coerces to a name:
>
>  "foo" <- 23; foo
>  # returns 23
>  > is.name("foo")
>  [1] FALSE
>
> Is this a documentation error or an implementation error?
>   

i think this concords with the documentation in the sense that in an
assignment a string can work as a name.  note that

`foo bar` = 1
is.name(`foo`)
# FALSE

the issue is different here in that in is.name("foo") "foo" evaluates to
a string (it works as a string literal), while in is.name(`foo`) `foo`
evaluates to the value of the variable named 'foo' (with the quotes
*not* belonging to the name).

with only a quick look at the sources (src/main/envir.c:1511), i guess
the first element to an assignment operator (i mean the left-assignment
operators) is converted to a name, so that in

"foo" <- 1

"foo" evaluates to a string and not a name (hence is.name("foo") is
false), but internally it is sort of 'coerced' to a name, as in

as.name("foo")
# `foo`
is.name(as.name("foo"))
# TRUE

> The coercion is not happening at parse time:
>
> class(quote("foo"<-3)[[2]])
> [1] "character"
>   

i think the internal assignment op really receives a string in a case
like "foo" <- 1, it knows it has to treat it as a name without the
parser classifying the string as a name.  (pure guesswork, again.)

the documentation might avoid calling a plain string a 'quoted name',
though, it is confusing.  a quoted name is something like quote(name) or
quote(`name`):

is(quote(name))
# "name" "language"

is(quote(`name`))
# "name" "language"

but *not* something like "name":
   
is("name")
# "character" "vector" "data.frameRowLabels"

and *not* like quote("name"):
   
is(quote("name"))
# "character" "vector" "data.frameRowLabels"


> In fact, bizarrely, not only does it coerce to a name, it actually
> *modifies* the parse tree:
>
> > gg <- quote("hij" <- 4)
> > gg
> "hij" <- 4
> > eval(gg)
> > gg
> hij <- 4
>   

wow!  that's called 'functional programming' ;)
you're right:

gg = quote({"a" = 1})
is(gg[[2]][[2]])
# "character" ...
eval(gg)
is(gg[[2]][[2]])
# "name" ...
  

> *** The cases below only come up with expression trees generated
> programmatically as far as I know, so are much more marginal cases. ***
>
> The <- operator even allows the left-hand-side to be of length > 1, though
> it just ignores the other elements, with the same side effect as before:
>   

that's clear from the sources;  see src/main/envir.c:1521.  it should be
documented (maybe it is, i haven't investigated this issue).

> > gg <- quote(x<-44)
> > gg[[2]] <- c("x","y")
> > gg
> c("x", "y") <- 44
>   
> > eval(gg)

but also this:

rm(list=ls())
do.call('=', list(letters, 1))
# just fine
a
# 1
b
# error


weird these work.  i think it deserves a warning, at the very least, as in

c('x', 'y') = 4
# error: assignment to non-language object
c(x, y) = 4
# error: could not find function c<-

(provided that x and y are already there)

btw., that's what you can do with rvalues (using the otherwise
semantically void operator `:=`).

these could seem equivalent, but they're (obviously) not:

'x' = 1
c('x') = 1

x = 1
c(x) = 1

> > x
> [1] 44
> > y
> Error: object "y" not found
> > gg
> x <- 44
>
> None of this is documented in ? <-, and it is rather a surprise that
> evaluating an expression tree can modify it.  I admit we had a feature
> (performance hack) like this in MacLisp years ago, where expanded syntax
> macros replaced the source code of the macro, but it was a documented,
> general, and optional part of the macro mechanism.
>   

but

- maclisp was designed by computer scientists in a research project,
- r is being implemented by statisticians for practical purposes.

almost every part differs here (and almost no pun intended).

> Another little glitch:
>
> gg <- quote(x<-44); gg[[2]] <- character(0); eval(gg)
> Error in eval(expr, envir, enclos) :
>   'getEncChar' must be called on a CHARSXP
>
> This looks like an internal error that users shouldn't see.
>   

by no means the only example that the interface is no blood-brain barrier.

vQ



Re: [Rd] Assignment to string

2009-04-01 Thread Stavros Macrakis
On Wed, Apr 1, 2009 at 4:21 PM, Simon Urbanek
wrote:

>
> On Apr 1, 2009, at 15:49 , Stavros Macrakis wrote:
>
>> The documentation for assignment says:
>>
>>     In all the assignment operator expressions, 'x' can be a name or
>>     an expression defining a part of an object to be replaced (e.g.,
>>     'z[[1]]').  A syntactic name does not need to be quoted, though it
>>     can be (preferably by backticks).
>>
>> But the implementation allows assignment to a character string (i.e. not a
>> name), which it coerces to a name:
>>
>>     "foo" <- 23; foo
>>     # returns 23
>>     > is.name("foo")
>>     [1] FALSE
>>
>> Is this a documentation error or an implementation error?
>
> Neither - what you're missing is that you are actually quoting foo
> namely with double-quotes. Hence both the documentation and the
> implementations are correct. (Technically "name" as referred above can be
> either a symbol or a character string).
>

In R, "name" is a technical term, a synonym for "symbol".  Names and
character strings are functionally distinct: eval("foo") is not the same as
eval(quote(foo)), though of course there are cases where R does an implicit
coercion, e.g. list(a=3) / list("a"=3).
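
A minimal sketch of that distinction, assuming a fresh session; the variable
names on the left are illustrative only:

```r
foo <- 23

# A character string evaluates to itself; a name evaluates to its binding.
from_string <- eval("foo")          # the string "foo"
from_name   <- eval(quote(foo))     # the value bound to foo, i.e. 23

# The two objects are of different types ...
name_is_name   <- is.name(quote(foo))   # TRUE
string_is_name <- is.name("foo")        # FALSE

# ... although some constructs treat a string tag and a name tag alike,
# and a string can be converted to a name explicitly.
same_list  <- identical(list(a = 3), list("a" = 3))
same_as_nm <- identical(as.name("foo"), quote(foo))
```

The last two comparisons show the implicit and the explicit conversion from
string to name; neither makes a string and a name the same kind of object.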

I don't see how it makes sense to say "Technically "name" as referred above
can be either a symbol or a character string." except perhaps in the
Humpty-Dumpty sense where a word "means just what I choose it to mean",
which rather defeats the purpose of documentation.

-s



Re: [Rd] License question (RUnit)

2009-04-01 Thread Christophe Dutang


Hi,

Try to contact the other authors... I spent 5 min on google and found  
this http://www.uni-konstanz.de/FuF/Verwiss/koenig/


Regards

Christophe

On 1 Apr 2009, at 10:58, Pierre-Yves wrote:


Dear list,

Sorry for the noise but I have a question regarding the license used in
RUnit [1], I contacted the maintainer( burgerm -at- users -dot-
sourceforge -dot- net ) on March 20th but I have received no answer.

Could anyone help to solve this question ?

Basically, my problem is that the website and the DESCRIPTION file say
that the license is GPLv2 while the header in the code says it is GPLv2
or any later version.

Thanks in advance for your help,

Best regards,

Pierre

[1] http://cran.r-project.org/web/packages/RUnit/index.html



--
Christophe Dutang
Ph. D. student at ISFA, Lyon, France
website: http://dutangc.free.fr



Re: [Rd] Assignment to string

2009-04-01 Thread Simon Urbanek


On Apr 1, 2009, at 15:49 , Stavros Macrakis wrote:


The documentation for assignment says:

In all the assignment operator expressions, 'x' can be a name or
an expression defining a part of an object to be replaced (e.g.,
'z[[1]]').  A syntactic name does not need to be quoted, though it
can be (preferably by backticks).

But the implementation allows assignment to a character string (i.e. not a
name), which it coerces to a name:

"foo" <- 23; foo
# returns 23

is.name("foo")

[1] FALSE

Is this a documentation error or an implementation error?



Neither - what you're missing is that you are actually quoting foo  
namely with double-quotes. Hence both the documentation and the  
implementations are correct. (Technically "name" as referred above can  
be either a symbol or a character string).


Cheers,
Simon



The coercion is not happening at parse time:

   class(quote("foo"<-3)[[2]])
   [1] "character"

In fact, bizarrely, not only does it coerce to a name, it actually
*modifies* the parse tree:


gg <- quote("hij" <- 4)
gg

   "hij" <- 4

eval(gg)
gg

   hij <- 4

*** The cases below only come up with expression trees generated
programmatically as far as I know, so are much more marginal cases. ***

The <- operator even allows the left-hand-side to be of length > 1, though
it just ignores the other elements, with the same side effect as before:



gg <- quote(x<-44)
gg[[2]] <- c("x","y")
gg

   c("x", "y") <- 44

eval(gg)
x

   [1] 44

y

   Error: object "y" not found

gg

   x <- 44

None of this is documented in ? <-, and it is rather a surprise that
evaluating an expression tree can modify it.  I admit we had a feature
(performance hack) like this in MacLisp years ago, where expanded syntax
macros replaced the source code of the macro, but it was a documented,
general, and optional part of the macro mechanism.

Another little glitch:

   gg <- quote(x<-44); gg[[2]] <- character(0); eval(gg)
   Error in eval(expr, envir, enclos) :
 'getEncChar' must be called on a CHARSXP

This looks like an internal error that users shouldn't see.

  -s



[Rd] Assignment to string

2009-04-01 Thread Stavros Macrakis
The documentation for assignment says:

 In all the assignment operator expressions, 'x' can be a name or
 an expression defining a part of an object to be replaced (e.g.,
 'z[[1]]').  A syntactic name does not need to be quoted, though it
 can be (preferably by backticks).

But the implementation allows assignment to a character string (i.e. not a
name), which it coerces to a name:

 "foo" <- 23; foo
 # returns 23
 > is.name("foo")
 [1] FALSE

Is this a documentation error or an implementation error?

The coercion is not happening at parse time:

class(quote("foo"<-3)[[2]])
[1] "character"

In fact, bizarrely, not only does it coerce to a name, it actually
*modifies* the parse tree:

> gg <- quote("hij" <- 4)
> gg
"hij" <- 4
> eval(gg)
> gg
hij <- 4
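
A small check along the same lines, for anyone who wants to see what their own
R build does. This is only a sketch: the names lhs_before and lhs_after are
made up for the illustration, and whether the two classes differ may depend on
the R version in use.

```r
gg <- quote("hij" <- 4)

# Class of the left-hand side as stored in the unevaluated call.
lhs_before <- class(gg[[2]])

# Evaluating the call performs the assignment to hij.
eval(gg)

# Class of the left-hand side of the same call object after evaluation.
lhs_after <- class(gg[[2]])

c(before = lhs_before, after = lhs_after)
```

If the two entries differ, evaluation has rewritten the stored call, as in the
transcript above.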

*** The cases below only come up with expression trees generated
programmatically as far as I know, so are much more marginal cases. ***

The <- operator even allows the left-hand-side to be of length > 1, though
it just ignores the other elements, with the same side effect as before:

> gg <- quote(x<-44)
> gg[[2]] <- c("x","y")
> gg
c("x", "y") <- 44
> eval(gg)
> x
[1] 44
> y
Error: object "y" not found
> gg
x <- 44

None of this is documented in ? <-, and it is rather a surprise that
evaluating an expression tree can modify it.  I admit we had a feature
(performance hack) like this in MacLisp years ago, where expanded syntax
macros replaced the source code of the macro, but it was a documented,
general, and optional part of the macro mechanism.

Another little glitch:

gg <- quote(x<-44); gg[[2]] <- character(0); eval(gg)
Error in eval(expr, envir, enclos) :
  'getEncChar' must be called on a CHARSXP

This looks like an internal error that users shouldn't see.

   -s



Re: [Rd] R in standalone application

2009-04-01 Thread Simon Urbanek

Eric,

On Mar 31, 2009, at 10:50 PM, Eric wrote:

I am trying to build a C application where I need to compute some
statistics to take decisions about the direction to give to a user,
knowing his/her habits. Because I used R back at school, I thought I
could use some of its functions in my application, as a shared
library. I reviewed the "Rinternals" and "R extensions" documents,
and decided to give a try to the REmbedded.c file. Compilation and
linking went well. But the execution failed with a "Fatal error: R
home directory is not defined". Does it mean that R has to be
distributed with my application, or did I miss something in my
readings?


Yes - you didn't read the 8.1 section of R-ext attentively enough. It  
tells you that you need to setup the environment properly and the  
easiest way to do that on unix is to run

R CMD yourApp

Clearly, R must be installed in order to use your application, since  
your application is using R ;). It is common for the embedding  
application to determine the correct settings before starting R (see  
the R.app GUI on how to setup the environment on a Mac in a GUI  
application [you can still use R CMD though], see for example Rserve  
on how to find the R settings from the registry on Windows).
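
As a rough illustration of the setting involved, the following can be run in
an ordinary R session to see the home directory that an embedding application
would have to provide. It is a sketch for inspection only, not part of the
embedding API; r_home and env_home are illustrative variable names.

```r
# The R home directory as reported by the running R itself.
r_home <- R.home()

# The R_HOME environment variable as seen by this process; it is normally
# set when R is started through its usual front end, and may be empty
# otherwise.
env_home <- Sys.getenv("R_HOME")

print(r_home)
print(env_home)
```

An embedding application has to arrange for the equivalent of this value to be
in place before it initializes R.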


Cheers,
Simon


If that is of any importance, I am working on unix but aim for full  
portability (i.e Windows too)


Thanks for any assistance.



Re: [Rd] Possible bug in summary.survfit - 'scale' argument ignored?

2009-04-01 Thread Thomas Lumley


I've sent a fixed version 2.35-4 to CRAN.  It turned out to be a fairly simple 
change.

-thomas


On Tue, 31 Mar 2009, Marc Schwartz wrote:


On Mar 30, 2009, at 5:55 PM, Marc Schwartz wrote:


Hi all,

Using:

 R version 2.8.1 Patched (2009-03-07 r48068)

on OSX (10.5.6) with survival version:

 Version: 2.35-3
 Date:    2009-02-10


I get the following using the first example in ?summary.survfit:

> summary( survfit( Surv(futime, fustat)~1, data=ovarian))
Call: survfit(formula = Surv(futime, fustat) ~ 1, data = ovarian)

 time n.risk n.event survival std.err lower 95% CI upper 95% CI
   59     26       1    0.962  0.0377        0.890        1.000
  115     25       1    0.923  0.0523        0.826        1.000
  156     24       1    0.885  0.0627        0.770        1.000
  268     23       1    0.846  0.0708        0.718        0.997
  329     22       1    0.808  0.0773        0.670        0.974
  353     21       1    0.769  0.0826        0.623        0.949
  365     20       1    0.731  0.0870        0.579        0.923
  431     17       1    0.688  0.0919        0.529        0.894
  464     15       1    0.642  0.0965        0.478        0.862
  475     14       1    0.596  0.0999        0.429        0.828
  563     12       1    0.546  0.1032        0.377        0.791
  638     11       1    0.497  0.1051        0.328        0.752


> summary( survfit( Surv(futime, fustat)~1, data=ovarian), scale = 365.25)
Call: survfit(formula = Surv(futime, fustat) ~ 1, data = ovarian)

 time n.risk n.event survival std.err lower 95% CI upper 95% CI
   59     26       1    0.962  0.0377        0.890        1.000
  115     25       1    0.923  0.0523        0.826        1.000
  156     24       1    0.885  0.0627        0.770        1.000
  268     23       1    0.846  0.0708        0.718        0.997
  329     22       1    0.808  0.0773        0.670        0.974
  353     21       1    0.769  0.0826        0.623        0.949
  365     20       1    0.731  0.0870        0.579        0.923
  431     17       1    0.688  0.0919        0.529        0.894
  464     15       1    0.642  0.0965        0.478        0.862
  475     14       1    0.596  0.0999        0.429        0.828
  563     12       1    0.546  0.1032        0.377        0.791
  638     11       1    0.497  0.1051        0.328        0.752

Of course the time periods in the second output should be scaled to years, 
that is (time / 365.25).


I noted this today running some Sweave code, but not sure when the actual 
change in behavior occurred.  I can replicate the same behavior on a Windows 
machine here as well, so this is not OSX specific.



A quick follow up here. I reverted to:

 R version 2.8.1 (2008-12-22)

which includes survival version:

Version:   2.34-1
Date:  2008-03-31


In that version, I get:


summary( survfit( Surv(futime, fustat)~1, data=ovarian), scale = 365.25)

Call: survfit(formula = Surv(futime, fustat) ~ 1, data = ovarian)

 time n.risk n.event survival std.err lower 95% CI upper 95% CI
0.162     26       1    0.962  0.0377        0.890        1.000
0.315     25       1    0.923  0.0523        0.826        1.000
0.427     24       1    0.885  0.0627        0.770        1.000
0.734     23       1    0.846  0.0708        0.718        0.997
0.901     22       1    0.808  0.0773        0.670        0.974
0.966     21       1    0.769  0.0826        0.623        0.949
0.999     20       1    0.731  0.0870        0.579        0.923
1.180     17       1    0.688  0.0919        0.529        0.894
1.270     15       1    0.642  0.0965        0.478        0.862
1.300     14       1    0.596  0.0999        0.429        0.828
1.541     12       1    0.546  0.1032        0.377        0.791
1.747     11       1    0.497  0.1051        0.328        0.752


So the functional loss of the 'scale' argument took place subsequent to that 
release. From a review of the code in both versions, it would appear that 
substantive changes took place to the function in the intervening time frame, 
including the addition of the 'rmean' and 'extend' arguments. One of the 
changes appears to be the setting of:


 stime <- fit$time/scale

in the old version and I do not see a parallel adjustment in the time scale in 
the new version and the subsequent use of fit$time later in the new function.
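
For reference, the rescaling that the old code performed can be reproduced by
hand. The sketch below uses only the event times printed in the unscaled
output, so it does not depend on the survival package; time_days and
time_years are illustrative names.

```r
# Event times in days, copied from the unscaled summary output.
time_days <- c(59, 115, 156, 268, 329, 353, 365, 431, 464, 475, 563, 638)

# The 'scale' value passed in the second call: days per year.
scale <- 365.25

# Same operation as stime <- fit$time/scale in the old version.
time_years <- time_days / scale

round(time_years, 3)
```

Rounded to three decimals, these values agree with the time column printed by
survival 2.34-1 when scale = 365.25 is given.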


Given the substantive changes to the function code, I am hesitant to propose 
patches for fear of introducing breakage elsewhere. I also need to get some 
work done for a client today, before I leave for vacation tomorrow for a week, 
otherwise I would spend more time evaluating possible patches.


I hope that the above is enough to give Terry and Thomas some narrowed focus.

Regards,

Marc




Thomas Lumley                   Assoc. Professor, Biostatistics
tlum...@u.washington.edu        University of Washington, Seattle


[Rd] Typo in the documentation of model.extract()?

2009-04-01 Thread Christian Ritz
Hi,

it seems that there is a minor typo in the last line of the "Details" section.

Shouldn't "model.frame" be "model.extract" in the sentence


"model.weights is slightly different from model.frame(, "weights") in not
naming the vector it returns."


?


Christian



[Rd] License question (RUnit)

2009-04-01 Thread Pierre-Yves
Dear list,

Sorry for the noise but I have a question regarding the license used in
RUnit [1], I contacted the maintainer( burgerm -at- users -dot-
sourceforge -dot- net ) on March 20th but I have received no answer.

Could anyone help to solve this question ?

Basically, my problem is that the website and the DESCRIPTION file say
that the license is GPLv2 while the header in the code says it is GPLv2
or any later version.

Thanks in advance for your help,

Best regards,

Pierre

[1] http://cran.r-project.org/web/packages/RUnit/index.html



Re: [Rd] Gamma funtion(s) bug

2009-04-01 Thread Wacek Kusnierczyk
Martin Maechler wrote:
>
> Using 'bug' (without any qualifying "?" or "possible" ..) 
> in the subject line is still a bit unfriendly...
>   


is suggesting that a poster includes 'excel bug' in the subject line [1]
friendly??

vQ



[1] https://stat.ethz.ch/pipermail/r-help/2009-March/190119.html



Re: [Rd] [R] variance/mean

2009-04-01 Thread Wacek Kusnierczyk
Martin Maechler wrote:
>
> Your patch is basically only affecting the default  
> method = "pearson". For (most) other cases, 'y = NULL' would
> still remain  *the* way to save computations, unless we'd start
> to use an R-level equivalent [which I think does not exist] of
> your C  trick   (DATAPTR(x) == DATAPTR(y)).
>
>   

yes, my patch was constrained to the c code, but i don't think it would
be particularly difficult to fix the relevant r-level code as well.  i
did think about it, but didn't want to invest more time in this until
(or unless) someone would respond.  (thanks for the response.)

> Also, for S- and R- backcompatibility reasons, we'd need to
> continue allowing  y = NULL (as your patch would, too), 

only in its current form -- indeed, the (unimplemented) intention was to
detach from the old misdesign, and fix everything so that y=x by default
anywhere.

> so
> currently I think this whole idea -- as slick as it is, I
> learned something!  --  
> does not make sense applying here.
>   

i think it does, because the current state is somewhat funny, including
both the difference in performance between var(x) and var(x,x) (with x
being a matrix), and the respective comment in ?var.

> > the attached patch suggests modifications to src/main/cov.c and
> > src/library/stats/man/cor.Rd.
>
> BTW: since you didn't (and shouldn't , because of method != "pearson" !) 
>  change the R code, 

i would suggest it be done, though.

> the docs  \usage{.} part should not have been
>  changed either ! 
>   

indeed, the change in the docs didn't match what i *have* actually fixed
in the code.

>  and as I mentioned: using 'y = NULL' in the function call must
>   

*MUST* ?

>  continue to work, hence should also be documented as
>  possibility
>  ==>  the docs would not really become more clear, I think 
>   

no, of course, without the change in r code having the docs say y=x by
default would be a nonsense.  but again, this was a start, not a
complete modification (and i admit i failed to acknowledge this).

vQ



Re: [Rd] Wishlist: optional svn-revision number tag in package DESCRIPTION file

2009-04-01 Thread Peter Ruckdeschel
Thanks Gabor, Duncan, and Dirk,

for your replies.

Gabor Grothendieck wrote:
> We need to make sure we understand the implications
> for packages developed under the other major version
> control systems like git, bzr and hg.

Ok for this --- of course it would even be "greater" to have
a universal replacement scheme for general version control
systems in R for DESCRIPTION files, but actually for the moment
I would already be content with some R tools (possibly a collection
of them) for each version control system individually.

> On 31 March 2009 at 12:58, Duncan Murdoch wrote:
> | On 3/31/2009 10:41 AM, Peter Ruckdeschel wrote:
> | > Could we have one (or maybe more) standardized optional tag(s)
> | > for package DESCRIPTION files to cover svn revision info?
> | > This would be very useful for bug reporting...
> 
> Indeed. I am doing something similar with local packages at work.
> 
> | > I know that any developer is already free to append corresponding lines
> | > to DESCRIPTION files to do something of this sort --- e.g. lines like
> | > 
> | > LastChangedDate: {$LastChangedDate: 2009-03-31 $}
> | > LastChangedRevision: {$LastChangedRevision: 447 $}
> | 
> | That will give you the last change to the DESCRIPTION file, not the last 
> | change to the package, so it could be misleading.  Last time I looked, 
> | there wasn't a way in svn to auto update a file that wasn't involved in 
> | a changeset.  

Ouch. I stand corrected; and I have to say: this even is an FAQ
in the SVN documentation... So using svn properties will not work indeed.

Still, my wish for a better integration of version control
information into R persists...

So if I understand correctly, under linux / cygwin (Mac I don't know)
you would use some scripting to read out the output of svnversion;
let me add that under Windows + Tortoise SVN you would have
SubWCRev (http://tortoisesvn.tigris.org/faq.html#subwcrev) to help you.

> (You could put something into your build script to call 
> | svnversion, but I don't know anything simpler.)
> 
> Yes, I have been using configure for that (which can be really any type of
> executable script rather than something from autoconf). One can then either
> update a placeholder in DESCRIPTION.in to substitute the revision number
> and/or create a package-local function reporting svn revision, build time,
> etc.  
> 
> It may make sense to think about a more general scheme. A common problem is
> of course once again portability and the set of required tools.

If we are talking about R functions for reporting version control
information --- what about the following scheme:
- have some version control system individual functions (one for svn, one
  for git and so on)
- have some S4 control class for each of these version control systems
- have an S4 generic VCinfo() which dispatches according to an argument
  VCsystem of this control class

This would give some additional flexibility to integrate infra-structure
for new version control systems ---even by other programmers--- without
interfering with the generic.
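
To make the scheme a bit more concrete, here is a minimal sketch. Everything
in it is hypothetical: the class names VCControl, SVNControl and GitControl
and the slot path are invented for the illustration, and the methods only
return the external command one might call instead of actually running it.

```r
library(methods)

# Hypothetical virtual parent class with one slot for the working-copy path.
setClass("VCControl", representation("VIRTUAL", path = "character"))

# One control class per version control system.
setClass("SVNControl", contains = "VCControl")
setClass("GitControl", contains = "VCControl")

# The generic dispatches on the class of its VCsystem argument.
setGeneric("VCinfo", function(VCsystem, ...) standardGeneric("VCinfo"))

# svn: svnversion reports the revision of a working copy.
setMethod("VCinfo", "SVNControl", function(VCsystem, ...) {
  c(tool = "svnversion", args = VCsystem@path)
})

# git: rev-parse HEAD reports the current commit.
setMethod("VCinfo", "GitControl", function(VCsystem, ...) {
  c(tool = "git", args = paste("-C", VCsystem@path, "rev-parse HEAD"))
})

svn_cmd <- VCinfo(new("SVNControl", path = "."))
git_cmd <- VCinfo(new("GitControl", path = "."))
```

Support for a further system would then only need a new control class and a
new method, leaving the generic untouched.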

For the scripting approach --- what about some extra options for
   R CMD build
for instance --withSVN or --withTortoiseSVN ?

Thanks again for your comments --- and apologies for my wrong idea
using svn properties.

Best, Peter



Re: [Rd] duplicated.data.frame {was "[R] which rows are duplicates?"}

2009-04-01 Thread Wacek Kusnierczyk
Martin Maechler wrote:
>
> WK> i attach the patch post for reference.  note that you need to fix
> WK> all of the functions in duplicated.R that share the buggy code.
> WK> (yes, this was another thread;  i submitted a bug report, and then
> WK> sent a follow-up post with a patch).
>
> Thank you; yes, in the mean time I have also seen your bug
> report and patch.  
> Interestingly (or not), I have myself patched identically to
> what you propose, withOUT even having known about your bug report + patch.
>   

this means, the solution has greater chances to be correct.

> 
>
> { hmmm, it seems your thinking can be very close to mine, so why
>   can't you like R properly  ;-b }
>   

actually, i think i *do* like r properly.

vQ



Re: [Rd] [R] variance/mean

2009-04-01 Thread Martin Maechler
> Wacek Kusnierczyk 
> on Tue, 24 Mar 2009 00:39:58 +0100 writes:

> (this post suggests a patch to the sources, so i allow myself to divert
> it to r-devel)

> Bert Gunter wrote:
>> x a numeric vector, matrix or data frame. 
>> y NULL (default) or a vector, matrix or data frame with compatible
>> dimensions to x. The default is equivalent to y = x (but more efficient).
>> 
>> 
> bert points to an interesting fragment of ?var:  it suggests that
> computing var(x) is more efficient than computing var(x,x), for any x
> valid as input to var.  indeed:

> set.seed(0)
> x = matrix(rnorm(10000), 100, 100)

> library(rbenchmark)
> benchmark(replications=1000, columns=c('test', 'elapsed'),
> var(x),
> var(x, x))
> #        test elapsed
> # 1    var(x)   1.091
> # 2 var(x, x)   2.051

> that's of course, so to speak, unreasonable:  for what var(x) does is
> actually computing the covariance of x and x, which should be the same
> as var(x,x). 

> the hack is that if y is given, there's an overhead of memory allocation
> for *both* x and y, as seen in src/main/cov.c:720+.
> incidentally, it seems that the problem can be solved with a trivial fix
> (see the attached patch), so that

> set.seed(0)
> x = matrix(rnorm(10000), 100, 100)

> library(rbenchmark)
> benchmark(replications=1000, columns=c('test', 'elapsed'),
> var(x),
> var(x, x))
> #        test elapsed
> # 1    var(x)   1.121
> # 2 var(x, x)   1.107

> with the quick checks

> all.equal(var(x), var(x, x))
> # TRUE
   
> all(var(x) == var(x, x))
> # TRUE

> and for cor it seems to make cor(x,x) slightly faster than cor(x), while
> originally it was twice as slow:

> # original
> benchmark(replications=1000, columns=c('test', 'elapsed'),
> cor(x),
> cor(x, x))
> #        test elapsed
> # 1    cor(x)   1.196
> # 2 cor(x, x)   2.253
   
> # patched
> benchmark(replications=1000, columns=c('test', 'elapsed'),
> cor(x),
> cor(x, x))
> #        test elapsed
> # 1    cor(x)   1.207
> # 2 cor(x, x)   1.204

> (there is a visible penalty due to an additional pointer test, but it's
> 10ms on 1000 replications with 10000 data points, which i think is
> negligible.)

>> This is as clear as I would know how to state. 

> i believe bert is right.

> however, with the above fix, this can now be rewritten as:

> "
> x: a numeric vector, matrix or data frame. 
> y: a vector, matrix or data frame with dimensions compatible to those of x.
> By default, y = x. 
> "

> which, to my simple mind, is even more clear than what bert would know
> how to state, and less likely to cause the sort of confusion that
> originated this thread.

Your patch basically affects only the default
method = "pearson". For (most) other cases, 'y = NULL' would
still remain  *the* way to save computations, unless we'd start
to use an R-level equivalent [which I think does not exist] of
your C  trick   (DATAPTR(x) == DATAPTR(y)).

Also, for backward compatibility with S and R, we'd need to
continue allowing  y = NULL (as your patch would, too), so
currently I think that applying this whole idea here -- as slick
as it is, I learned something!  --
does not make sense.
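
As a rough editorial illustration of the 'y = NULL' point (not part of the
patch; the small matrix and the choice of method = "spearman" are
assumptions made only for this sketch), the default call and the explicit
y = x call return the same values for a rank-based method as well as for
the default one:

```r
set.seed(0)
x <- matrix(rnorm(100), 20, 5)   # small illustrative matrix (assumed size)

# Default method = "pearson": y = NULL and y = x give the same covariance
pearson_same <- isTRUE(all.equal(cov(x), cov(x, x)))

# A rank-based method: the values still agree, so the two calls differ
# only in how much work is done, not in the result
spearman_same <- isTRUE(all.equal(cor(x, method = "spearman"),
                                  cor(x, x, method = "spearman")))

c(pearson = pearson_same, spearman = spearman_same)
```

This only checks that the results agree; it says nothing about timing,
which would need a benchmark like the one quoted above.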

> the attached patch suggests modifications to src/main/cov.c and
> src/library/stats/man/cor.Rd.

BTW: since you didn't (and shouldn't, because of method != "pearson"!)
 change the R code, the \usage{.} part of the docs should not have been
 changed either!
 And as I mentioned: using 'y = NULL' in the function call must
 continue to work, hence should also be documented as a
 possibility
 ==>  the docs would not really become clearer, I think.

Martin Maechler, ETH Zurich



> it has been prepared and checked as follows:

> svn co https://svn.r-project.org/R/trunk trunk
> cd trunk
> # edited the sources
> svn diff > cov.diff
> svn revert -R src
> patch -p0 < cov.diff

> tools/rsync-recommended
> ./configure
> make
> make check
> bin/R
> # subsequent testing within R

> if you happen to consider this patch for a commit, please be sure to
> examine and test it carefully first.

> vQ

> [Deleted text/x-diff]



Re: [Rd] 'sep' argument in reshape()

2009-04-01 Thread Thomas Lumley

On Tue, 31 Mar 2009, Stephen Weigand wrote:


I wonder if the 'sep' argument in reshape() is being ignored
unintentionally:

## From example(reshape)
df <- data.frame(id=rep(1:4,rep(2,4)),
 visit=I(rep(c("Before","After"),4)),
 x=rnorm(4), y=runif(4))

reshape(df, timevar="visit", idvar="id", direction="wide", sep = "_")

  id x.Before y.Before x.After y.After
1  1    0.773    0.293  -0.021   0.658
3  2   -0.518    0.351  -0.623   0.946
5  3    0.773    0.293  -0.021   0.658
7  4   -0.518    0.351  -0.623   0.946

Is this more of the intended result when 'sep = "_"'?


No. sep= is designed for going the other way.  If you have wide-format data with variable 
names x.Before y.Before x.After y.After, using sep="." will let reshape() work 
out that the long-format variable names are x and y and the conditions to be put in the 
time variable are Before and After.
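
A minimal sketch of that wide-to-long direction, for illustration only: the
data frame `wide` and its values are made up for the example, and the column
names simply follow the x.Before / x.After pattern described above.

```r
# Wide-format data whose column names encode "<variable>.<time>"
wide <- data.frame(id = 1:2,
                   x.Before = c(0.5, 1.5), x.After = c(0.7, 1.7),
                   y.Before = c(10, 20),   y.After = c(11, 21))

# With 'varying' given as a plain vector of names, reshape() splits each
# name at 'sep' to work out the long-format variables (x, y) and the times
long <- reshape(wide, direction = "long", idvar = "id",
                varying = c("x.Before", "x.After", "y.Before", "y.After"),
                sep = ".")
long
```

The result should have one row per id and time, with columns id, time, x
and y, where time holds "Before" or "After".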

   -thomas


Thomas Lumley   Assoc. Professor, Biostatistics
tlum...@u.washington.eduUniversity of Washington, Seattle



Re: [Rd] Gamma funtion(s) bug

2009-04-01 Thread Martin Maechler
> "BB" == Ben Bolker 
> on Tue, 31 Mar 2009 15:08:45 +0000 (UTC) writes:

BB> Martin Maechler writes:
>> >> But lgamma(x) is log(abs(gamma(x))), so it looks okay to me.
>> >> 
>> >> Duncan Murdoch
>> 
TH> Oops, yes! That's what comes of talking off the top of my head
TH> (I don't think I've ever had occasion to evaluate lgamma(x)
TH> for negative x, so never consciously checked in ?lgamma).
>> 
TH> Thanks, Duncan!
>> 
>> Indeed as we all know, a picture can be worth a thousand words,
>> and a simple R call such as
>> plot(lgamma, -7, 0, n=1000)
>> would have saved many words, and notably spared us from
>> yet-another erroneous non-bug report.
>> 
>> Martin
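
For readers who prefer numbers to a plot, the identity quoted above can be
checked directly. This is a small editorial sketch; the sample points in
`x` are arbitrary negative non-integers chosen for the example.

```r
x <- c(-0.5, -1.5, -2.5, -6.3)   # arbitrary negative non-integer points

g <- gamma(x)                    # changes sign from one unit interval to the next
lg <- lgamma(x)                  # log of the absolute value, so always defined here

identity_holds <- isTRUE(all.equal(lg, log(abs(g))))
identity_holds
```

Because gamma(x) is negative on some of these intervals, log(gamma(x)) would
give NaN there, while lgamma(x) stays finite.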

BB> In Kjetil's defense, he didn't submit an actual bug report --
BB> and although his subject line does contain the word "bug",
BB> I read his "bug report" as asking a question.  

BB> People are allowed to make mistakes ...

definitely! We all are.

Using 'bug' (without any qualifying "?" or "possible" ..) 
in the subject line is still a bit unfriendly...

BB> While I was reading ?lgamma I noticed that the "See Also"
BB> section refers to gammaCody(), which is now defunct.  Perhaps
BB> remove the sentence?

Yes. Thank you, Ben!
Regards, Martin Maechler

BB> Ben Bolker



Re: [Rd] 'sep' argument in reshape()

2009-04-01 Thread Martin Maechler
> "SW" == Stephen Weigand 
> on Tue, 31 Mar 2009 18:33:05 -0500 writes:

SW> I wonder if the 'sep' argument in reshape() is being ignored
SW> unintentionally:

No.  It is used much differently than you *assume* it's used.

As always,   ?reshape   contains the answer.


SW> ## From example(reshape)
SW> df <- data.frame(id=rep(1:4,rep(2,4)),
SW> visit=I(rep(c("Before","After"),4)),
SW> x=rnorm(4), y=runif(4))

SW> reshape(df, timevar="visit", idvar="id", direction="wide", sep = "_")

SW>   id x.Before y.Before x.After y.After
SW> 1  1    0.773    0.293  -0.021   0.658
SW> 3  2   -0.518    0.351  -0.623   0.946
SW> 5  3    0.773    0.293  -0.021   0.658
SW> 7  4   -0.518    0.351  -0.623   0.946

SW> Is this more of the intended result when 'sep = "_"'?

SW>   id x_Before y_Before x_After y_After
SW> 1  1    0.773    0.293  -0.021   0.658
SW> 3  2   -0.518    0.351  -0.623   0.946
SW> 5  3    0.773    0.293  -0.021   0.658
SW> 7  4   -0.518    0.351  -0.623   0.946

no it is not.

I tend to agree that I would have preferred a different argument
name than 'sep' for the current 'sep',
and then a *further* argument 'sep' with the functionality that
you'd like would be straightforward.
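
Until such an argument exists, one possible workaround (an editorial sketch,
not something proposed in this thread) is to reshape with the default
separator and rename the columns afterwards. The data below are made up,
modelled on example(reshape).

```r
set.seed(1)
df <- data.frame(id = rep(1:4, rep(2, 4)),
                 visit = rep(c("Before", "After"), 4),
                 x = rnorm(8), y = runif(8))

# Reshape with the default separator, giving names such as x.Before
wide <- reshape(df, timevar = "visit", idvar = "id", direction = "wide")

# Replace the first "." in each column name with "_" (fixed = TRUE treats
# "." literally rather than as a regular expression)
names(wide) <- sub(".", "_", names(wide), fixed = TRUE)
names(wide)
```

Note that this renames every column containing a ".", so an id column with a
dot in its name would need to be excluded first.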

Martin Maechler, ETH Zurich



Re: [Rd] duplicated.data.frame {was "[R] which rows are duplicates?"}

2009-04-01 Thread Martin Maechler
> "WK" == Wacek Kusnierczyk 
> on Tue, 31 Mar 2009 21:58:52 +0200 writes:

WK> Martin Maechler wrote:
>> 
>> >> 
>> >> and then be helpful to the R community and send a bug report
>> >> *with* a patch if {as in this case} you are able to...
>> >> 
>> >> Well, that's no longer needed here,
>> >> I'll fix that easily myself.
>> >> 
>> 
WK> but i *have* sent a patch already!
>> 
>> Ok, I believe you.  But I think you did not mention that during
>> this thread, ... and/or I must have overlooked your patch.
>> 
>> In any case the problem is now solved
>> [well, a better solution of course would add the "not-yet"
>> functionality..]; 
>> thank you for the contribution.
>> 

WK> i attach the patch post for reference.  note that you need to fix all of
WK> the functions in duplicated.R that share the buggy code.  (yes, this was
WK> another thread;  i submitted a bug report, and then sent a follow-up
WK> post with a patch).

Thank you; yes, in the meantime I have also seen your bug
report and patch.  
Interestingly (or not), I have myself patched identically to
what you propose, withOUT even having known about your bug report + patch.


{ hmmm, it seems your thinking can be very close to mine, so why
  can't you like R properly  ;-b }

Martin
