Re: [Rd] capture "->"

2024-03-02 Thread Gabor Grothendieck

Would it be good enough to pass it as a formula?  Using your definition of foo

  foo(~ A -> result)
  ## result <- ~A

  foo(~ result <- A)
  ## ~result <- A
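
A minimal sketch of how the two captured calls above could be told apart. The helper name used_right_arrow is hypothetical (not from this thread), and it assumes the caller always wraps the expression in a one-sided formula as shown:

```r
foo <- function(x) substitute(x)

# Hypothetical helper: e is the unevaluated call captured by foo().
# After parsing, both forms are calls to `<-`; with `->` the formula
# ends up on the value side, with `<-` it ends up on the target side.
used_right_arrow <- function(e) {
  value <- e[[3]]
  is.call(value) && identical(value[[1]], as.name("~"))
}

used_right_arrow(foo(~ A -> result))   # formula is the value: `->` was used
used_right_arrow(foo(~ result <- A))   # formula is the target: `<-` was used
```

Nothing is evaluated here; the helper only inspects the structure of the parsed call.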

On Fri, Mar 1, 2024 at 4:18 AM Dmitri Popavenko
 wrote:
>
> Hi everyone,
>
> I am aware this is a parser issue, but is there any possibility to capture
> the use of the inverse assignment operator into a formula?
>
> Something like:
>
> > foo <- function(x) substitute(x)
>
> gives:
>
> > foo(A -> B)
> B <- A
>
> I wonder if there is any possibility whatsoever to signal the use of ->
> instead of <-
>
> Thank you,
> Dmitri
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] eval(parse()) within mutate() returning same value for all rows

2023-12-30 Thread Gabor Grothendieck

Here is a solution that does not hard code 3:

  library(dplyr)
  library(purrr)
  library(tidyr)

  df %>%
    separate_wider_delim(args, ",", names_sep = "") %>%
    mutate(combined = exec(sprintf, .[[1]], !!!.[-1]))
  ## # A tibble: 3 × 5
  ##   words                args1 args2 args3 combined
  ##   <chr>                <chr> <chr> <chr> <chr>
  ## 1 %s plus %s equals %s 1     1     2     1 plus 1 equals 2
  ## 2 %s plus %s equals %s 2     2     4     2 plus 2 equals 4
  ## 3 %s plus %s equals %s 3     3     6     3 plus 3 equals 6
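
For comparison, the same idea (splicing however many pieces each row has into sprintf) can be sketched in base R. This is only an illustration with the thread's data retyped as plain vectors; the variable names are my own:

```r
words <- rep("%s plus %s equals %s", 3)
args  <- c("1,1,2", "2,2,4", "3,3,6")

# Split each args string into its pieces, then splice the pieces into
# sprintf() with do.call(), one row at a time.
pieces <- strsplit(args, ",", fixed = TRUE)
combined <- mapply(
  function(fmt, p) do.call(sprintf, c(list(fmt), as.list(p))),
  words, pieces, USE.NAMES = FALSE
)
combined
```

Like the exec() version, this does not hard code the number of %s placeholders.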

On Fri, Dec 29, 2023 at 1:45 PM Gabor Grothendieck
 wrote:
>
> If the question is how to accomplish this as opposed to how to use eval
> then we can do it without eval like this provided we can assume that words
> contains three %s .
>
>   library(dplyr)
>   library(tidyr)
>   df <- tibble(words=c("%s plus %s equals %s"),args=c("1,1,2","2,2,4","3,3,6"))
>
>   df |>
>     separate_wider_delim(args, ",", names = c("a", "b", "c")) |>
>     mutate(combined = sprintf(words, a, b, c))
>   ## # A tibble: 3 × 5
>   ##   words                a     b     c     combined
>   ##   <chr>                <chr> <chr> <chr> <chr>
>   ## 1 %s plus %s equals %s 1     1     2     1 plus 1 equals 2
>   ## 2 %s plus %s equals %s 2     2     4     2 plus 2 equals 4
>   ## 3 %s plus %s equals %s 3     3     6     3 plus 3 equals 6
>
> On Fri, Dec 29, 2023 at 9:14 AM Mateo Obregón  wrote:
> >
> > Hi all-
> >
> > Looking through stackoverflow for R string combining examples, I found the
> > following from 3 years ago:
> >
> > <https://stackoverflow.com/questions/63881854/how-to-format-strings-using-values-from-other-column-in-r>
> >
> > The top answer suggests using eval(parse(sprintf())). I tried the suggestion
> > and it did not return the expected combined strings. I thought that this might
> > be an issue with some leftover values being reused, so I explicitly eval()
> > with a new.env():
> >
> > > library(dplyr)
> > > df <- tibble(words=c("%s plus %s equals %s"),
> > args=c("1,1,2","2,2,4","3,3,6"))
> > > df |> mutate(combined = eval(parse(text=sprintf("sprintf('%s', %s)", words,
> > args)), envir=new.env()))
> >
> > # A tibble: 3 × 3
> >   words                args  combined
> >   <chr>                <chr> <chr>
> > 1 %s plus %s equals %s 1,1,2 3 plus 3 equals 6
> > 2 %s plus %s equals %s 2,2,4 3 plus 3 equals 6
> > 3 %s plus %s equals %s 3,3,6 3 plus 3 equals 6
> >
> > The `combined`  is not what I was expecting, as the same last eval() is
> > returned for all three rows.
> >
> > Am I missing something? What has changed in the past three years?
> >
> > Mateo.
> > --
> > Mateo Obregón
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com




Re: [Rd] eval(parse()) within mutate() returning same value for all rows

2023-12-29 Thread Gabor Grothendieck

If the question is how to accomplish this as opposed to how to use eval
then we can do it without eval like this provided we can assume that words
contains three %s .

  library(dplyr)
  library(tidyr)
  df <- tibble(words=c("%s plus %s equals %s"),args=c("1,1,2","2,2,4","3,3,6"))

  df |>
    separate_wider_delim(args, ",", names = c("a", "b", "c")) |>
    mutate(combined = sprintf(words, a, b, c))
  ## # A tibble: 3 × 5
  ##   words                a     b     c     combined
  ##   <chr>                <chr> <chr> <chr> <chr>
  ## 1 %s plus %s equals %s 1     1     2     1 plus 1 equals 2
  ## 2 %s plus %s equals %s 2     2     4     2 plus 2 equals 4
  ## 3 %s plus %s equals %s 3     3     6     3 plus 3 equals 6
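
As background to the quoted question, here is a small sketch of why eval(parse(text = ...)) on a vector of strings can give one value for every row: parse() returns an expression vector with one element per string, and eval() on such a vector evaluates every element but returns only the last result. The toy strings are made up for illustration:

```r
exprs <- parse(text = c("1 + 1", "2 + 2", "3 + 3"))
length(exprs)            # one expression per input string

last_only <- eval(exprs) # evaluates all three, keeps only the last value

# Evaluating element by element gives one result per string.
each <- vapply(exprs, eval, numeric(1))
```

Inside mutate() that single last value would then be recycled to all rows, which matches the output in the quoted message.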

On Fri, Dec 29, 2023 at 9:14 AM Mateo Obregón  wrote:
>
> Hi all-
>
> Looking through stackoverflow for R string combining examples, I found the
> following from 3 years ago:
>
> <https://stackoverflow.com/questions/63881854/how-to-format-strings-using-values-from-other-column-in-r>
>
> The top answer suggests using eval(parse(sprintf())). I tried the suggestion
> and it did not return the expected combined strings. I thought that this might
> be an issue with some leftover values being reused, so I explicitly eval()
> with a new.env():
>
> > library(dplyr)
> > df <- tibble(words=c("%s plus %s equals %s"),
> args=c("1,1,2","2,2,4","3,3,6"))
> > df |> mutate(combined = eval(parse(text=sprintf("sprintf('%s', %s)", words,
> args)), envir=new.env()))
>
> # A tibble: 3 × 3
>   words                args  combined
>   <chr>                <chr> <chr>
> 1 %s plus %s equals %s 1,1,2 3 plus 3 equals 6
> 2 %s plus %s equals %s 2,2,4 3 plus 3 equals 6
> 3 %s plus %s equals %s 3,3,6 3 plus 3 equals 6
>
> The `combined`  is not what I was expecting, as the same last eval() is
> returned for all three rows.
>
> Am I missing something? What has changed in the past three years?
>
> Mateo.
> --
> Mateo Obregón
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel




Re: [Rd] data.frame weirdness

2023-11-14 Thread Gabor Grothendieck

Seems like a leaky abstraction.  If both representations are supposed
to be outwardly the same to the user then they should act the same and
if not then identical should not be TRUE.
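
A short sketch, using the objects from the original post, of how the two row-name representations discussed in this thread can be inspected side by side:

```r
a1 <- `rownames<-`(anscombe[1:3, ], NULL)
a2 <- anscombe[1:3, ]

# Negative count: "automatic" (compact) row names; positive: stored ones.
.row_names_info(a1)
.row_names_info(a2)

# identical() does not see the difference between the two representations.
identical(a1, a2)
```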

On Tue, Nov 14, 2023 at 9:56 AM Deepayan Sarkar
 wrote:
>
> On Tue, 14 Nov 2023 at 09:41, Gabor Grothendieck
>  wrote:
> >
> > Also why should that difference result in different behavior?
>
> That's justifiable, I think; consider:
>
> > d1 = data.frame(a = 1:4)
> > d2 = d3 = data.frame(b = 1:2)
> > row.names(d3) = c("a", "b")
> > data.frame(d1, d2)
>   a b
> 1 1 1
> 2 2 2
> 3 3 1
> 4 4 2
> > data.frame(d1, d2)
>   a b
> 1 1 1
> 2 2 2
> 3 3 1
> 4 4 2
> > data.frame(d1, d3)
>   a b
> 1 1 1
> 2 2 2
> 3 3 1
> 4 4 2
> Warning message:
> In data.frame(d1, d3) :
>   row names were found from a short variable and have been discarded
> > data.frame(d2, d3)
>   b b.1
> a 1   1
> b 2   2
>
>
> > On Tue, Nov 14, 2023 at 9:38 AM Gabor Grothendieck
> >  wrote:
> > >
> > > In that case identical should be FALSE but  it is TRUE
>
> Yes, or at least both cases should warn (or not warn). Certainly not
> ideal, but one of the inevitable side effects of having two different
> ways of storing row names that R tries to pretend should be
> exchangeable, but are not (and some code not having caught up).
>
> Part of the problem, I think, is that it's not clear what the ideal
> behaviour should be in such cases (to warn or not to warn).
>
> Best,
> -Deepayan
>
> > > identical(a1, a2)
> > > ## [1] TRUE
> > >
> > >
> > > On Tue, Nov 14, 2023 at 8:58 AM Deepayan Sarkar
> > >  wrote:
> > > >
> > > > They differ in whether the row names are "automatic":
> > > >
> > > > > .row_names_info(a1)
> > > > [1] -3
> > > > > .row_names_info(a2)
> > > > [1] 3
> > > >
> > > > Best,
> > > > -Deepayan
> > > >
> > > > On Tue, 14 Nov 2023 at 08:23, Gabor Grothendieck
> > > >  wrote:
> > > > >
> > > > > > What is going on here?  In the two data.frame() calls below (res1
> > > > > > and res2) the inputs and outputs are identical, yet one gives a
> > > > > > warning and the other does not.
> > > > > >
> > > > > > a1 <- `rownames<-`(anscombe[1:3, ],  NULL)
> > > > > > a2 <- anscombe[1:3, ]
> > > > > >
> > > > > > ix <- 5:8
> > > > > >
> > > > > > # input arguments to data.frame() are identical in both cases
> > > > >
> > > > > identical(stack(a1[ix]), stack(a2[ix]))
> > > > > ## [1] TRUE
> > > > > identical(a1[-ix], a2[-ix])
> > > > > ## [1] TRUE
> > > > >
> > > > >
> > > > > res1 <- data.frame(stack(a1[ix]), a1[-ix]) 
> > > > > res2 <- data.frame(stack(a2[ix]), a2[-ix]) 
> > > > > ## Warning message:
> > > > > ## In data.frame(stack(a2[ix]), a2[-ix]) :
> > > > > > ##   row names were found from a short variable and have been discarded
> > > > >
> > > > > # results are identical
> > > > > identical(res1, res2)
> > > > > ## [1] TRUE
> > > > >
> > > > >
> > > > > --
> > > > > Statistics & Software Consulting
> > > > > GKX Group, GKX Associates Inc.
> > > > > tel: 1-877-GKX-GROUP
> > > > > email: ggrothendieck at gmail.com
> > > > >
> > > > > __
> > > > > R-devel@r-project.org mailing list
> > > > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > >
> > >
> > >
> > > --
> > > Statistics & Software Consulting
> > > GKX Group, GKX Associates Inc.
> > > tel: 1-877-GKX-GROUP
> > > email: ggrothendieck at gmail.com
> >
> >
> >
> > --
> > Statistics & Software Consulting
> > GKX Group, GKX Associates Inc.
> > tel: 1-877-GKX-GROUP
> > email: ggrothendieck at gmail.com




Re: [Rd] data.frame weirdness

2023-11-14 Thread Gabor Grothendieck

Also why should that difference result in different behavior?

On Tue, Nov 14, 2023 at 9:38 AM Gabor Grothendieck
 wrote:
>
> In that case identical should be FALSE but  it is TRUE
>
> identical(a1, a2)
> ## [1] TRUE
>
>
> On Tue, Nov 14, 2023 at 8:58 AM Deepayan Sarkar
>  wrote:
> >
> > They differ in whether the row names are "automatic":
> >
> > > .row_names_info(a1)
> > [1] -3
> > > .row_names_info(a2)
> > [1] 3
> >
> > Best,
> > -Deepayan
> >
> > On Tue, 14 Nov 2023 at 08:23, Gabor Grothendieck
> >  wrote:
> > >
> > > > What is going on here?  In the two data.frame() calls below (res1 and
> > > > res2) the inputs and outputs are identical, yet one gives a warning
> > > > and the other does not.
> > > >
> > > > a1 <- `rownames<-`(anscombe[1:3, ],  NULL)
> > > > a2 <- anscombe[1:3, ]
> > > >
> > > > ix <- 5:8
> > > >
> > > > # input arguments to data.frame() are identical in both cases
> > >
> > > identical(stack(a1[ix]), stack(a2[ix]))
> > > ## [1] TRUE
> > > identical(a1[-ix], a2[-ix])
> > > ## [1] TRUE
> > >
> > >
> > > res1 <- data.frame(stack(a1[ix]), a1[-ix]) 
> > > res2 <- data.frame(stack(a2[ix]), a2[-ix]) 
> > > ## Warning message:
> > > ## In data.frame(stack(a2[ix]), a2[-ix]) :
> > > ##   row names were found from a short variable and have been discarded
> > >
> > > # results are identical
> > > identical(res1, res2)
> > > ## [1] TRUE
> > >
> > >
> > > --
> > > Statistics & Software Consulting
> > > GKX Group, GKX Associates Inc.
> > > tel: 1-877-GKX-GROUP
> > > email: ggrothendieck at gmail.com
> > >
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com




Re: [Rd] data.frame weirdness

2023-11-14 Thread Gabor Grothendieck

In that case identical should be FALSE but  it is TRUE

identical(a1, a2)
## [1] TRUE


On Tue, Nov 14, 2023 at 8:58 AM Deepayan Sarkar
 wrote:
>
> They differ in whether the row names are "automatic":
>
> > .row_names_info(a1)
> [1] -3
> > .row_names_info(a2)
> [1] 3
>
> Best,
> -Deepayan
>
> On Tue, 14 Nov 2023 at 08:23, Gabor Grothendieck
>  wrote:
> >
> > > What is going on here?  In the two data.frame() calls below (res1 and
> > > res2) the inputs and outputs are identical, yet one gives a warning
> > > and the other does not.
> > >
> > > a1 <- `rownames<-`(anscombe[1:3, ],  NULL)
> > > a2 <- anscombe[1:3, ]
> > >
> > > ix <- 5:8
> > >
> > > # input arguments to data.frame() are identical in both cases
> >
> > identical(stack(a1[ix]), stack(a2[ix]))
> > ## [1] TRUE
> > identical(a1[-ix], a2[-ix])
> > ## [1] TRUE
> >
> >
> > res1 <- data.frame(stack(a1[ix]), a1[-ix]) 
> > res2 <- data.frame(stack(a2[ix]), a2[-ix]) 
> > ## Warning message:
> > ## In data.frame(stack(a2[ix]), a2[-ix]) :
> > ##   row names were found from a short variable and have been discarded
> >
> > # results are identical
> > identical(res1, res2)
> > ## [1] TRUE
> >
> >
> > --
> > Statistics & Software Consulting
> > GKX Group, GKX Associates Inc.
> > tel: 1-877-GKX-GROUP
> > email: ggrothendieck at gmail.com
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel




[Rd] data.frame weirdness

2023-11-14 Thread Gabor Grothendieck

What is going on here?  In the two data.frame() calls below (res1 and
res2) the inputs and outputs are identical, yet one gives a warning and
the other does not.

a1 <- `rownames<-`(anscombe[1:3, ],  NULL)
a2 <- anscombe[1:3, ]

ix <- 5:8

# input arguments to data.frame() are identical in both cases

identical(stack(a1[ix]), stack(a2[ix]))
## [1] TRUE
identical(a1[-ix], a2[-ix])
## [1] TRUE


res1 <- data.frame(stack(a1[ix]), a1[-ix]) 
res2 <- data.frame(stack(a2[ix]), a2[-ix]) 
## Warning message:
## In data.frame(stack(a2[ix]), a2[-ix]) :
##   row names were found from a short variable and have been discarded

# results are identical
identical(res1, res2)
## [1] TRUE



Re: [Rd] Multiple Assignment built into the R Interpreter?

2023-03-13 Thread Gabor Grothendieck

The gsubfn package can do that.

library(gsubfn)

# swap a and b without explicitly creating a temporary
 a <- 1; b <- 2
 list[a,b] <- list(b,a)

 # get eigenvectors and eigenvalues
 list[eval, evec] <- eigen(cbind(1,1:3,3:1))

 # get today's month, day, year
 require(chron)
 list[Month, Day, Year] <- month.day.year(unclass(Sys.Date()))

 # get first two components of linear model ignoring rest
 list[Coef, Resid] <- lm(rnorm(10) ~ seq(10))

 # assign Green and Blue (but not Red) components
 list[,Green,Blue]  <- col2rgb("aquamarine")

 # Assign QR and QRaux but not other components
 list[QR,,QRaux]  <- qr(c(1,1:3,3:1))


On Sat, Mar 11, 2023 at 7:47 AM Sebastian Martin Krantz
 wrote:
>
> Dear R Core,
>
> working on my dynamic factor modelling package, which requires several
> subroutines to create and update several system matrices, I come back to
> the issue of being annoyed by R not supporting multiple assignment out of
> the box like Matlab, Python and julia. e.g. something like
>
> A, C, Q, R = init_matrices(X, Y, Z)
>
> would be a great addition to the language. I know there are several
> workarounds such as the %<-% operator in the zeallot package or my own %=%
> operator in collapse, but these don't work well for package development as
> R CMD Check warns about missing global bindings for the created variables,
> e.g. I would have to use
>
> A <- C <- Q <- R <- NULL
> .c(A, C, Q, R) %=% init_matrices(X, Y, Z)
>
> in a package, which is simply annoying. Of course the standard way of
>
> init <- init_matrices(X, Y, Z)
>  A <- init$A; C <- init$C; Q <- init$Q; R <- init$R
> rm(init)
>
> is also super cumbersome compared to Python or Julia. Another reason is of
> course performance, even my %=% operator written in C has a non-negligible
> performance cost for very tight loops, compared to a solution at the
> interpretor level or in a primitive function such as `=`.
>
> So my conclusion at this point is that it is just significantly easier to
> implement such codes in Julia, in addition to the greater performance it
> offers. There are obvious reasons why I am still coding in R and C, thanks
> to the robust API and great ecosystem of packages, but adding this could be
> a presumably low-hanging fruit to make my life a bit easier. Several issues
> for this have been filed on Stackoverflow, the most popular one (
> https://stackoverflow.com/questions/7519790/assign-multiple-new-variables-on-lhs-in-a-single-line)
> has been viewed 77 thousand times.
>
> But maybe this has already been discussed here and already decided against.
> In that case, a way to browse R-devel archives to find out would be nice.
>
> Best regards,
>
> Sebastian
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel




Re: [Rd] Augment base::replace(x, list, value) to allow list= to be a predicate?

2023-03-08 Thread Gabor Grothendieck

This is getting way off topic. I wasn't suggesting gsubfn, which does
a lot more than this simple example, as the implementation.

I was pointing out that the replace function idea can be extended to
sub and gsub, and showing what it would do.

On Tue, Mar 7, 2023 at 9:41 PM Steve Martin  wrote:
>
> That's an interesting example, as it's conceptually similar to what
> Pavel is proposing, but structurally different. gsubfn() is more
> complicated than a simple switch in the body of the function, and
> wouldn't work well as an anonymous function.
>
> Multiple dispatch can nicely encompass both of these cases. For replace(),
>
> library(S7)
>
> replace <- new_generic("replace", c("x", "list"), function(x, list,
> values, ...) {
>   S7_dispatch()
> })
>
> method(replace, list(class_any, class_any)) <- base::replace
>
> method(replace, list(class_any, class_function)) <- function(x, list,
> values, ...) {
>   replace(x, list(x, ...), values)
> }
>
> x <- c(1 ,2, NA, 3)
> replace(x, is.na(x), 0)
> [1] 1 2 0 3
>
> replace(x, is.na, 0)
> [1] 1 2 0 3
>
> And for gsub(),
>
> gsub <- new_generic("gsub", c("pattern", "replacement"),
> function(pattern, replacement, x, ...) {
>   S7_dispatch()
> })
>
> method(gsub, list(class_character, class_character)) <- base::gsub
>
> # My quick-and-dirty implementation as an example
> method(gsub, list(class_character, class_function)) <-
> function(pattern, replacement, x) {
>   m <- regexpr(pattern, x)
>   res <- replacement(regmatches(x, m))
>   mapply(gsub, pattern, as.character(res), x, USE.NAMES = FALSE)
> }
>
> gsub("^..", toupper, c("abc", "xyz"))
> [1] "ABc" "XYz"
>
> But this isn't a simple change to replace() anymore, and I may just be
> spending too much time tinkering with Julia.
>
> Steve
>
> On Tue, 7 Mar 2023 at 07:34, Gabor Grothendieck  
> wrote:
> >
> > > This could be extended to sub and gsub as well, which gsubfn in the
> > > gsubfn package already does:
> >
> >   library(gsubfn)
> >   gsubfn("^..", toupper, c("abc", "xyz"))
> >   ## [1] "ABc" "XYz"
> >
> > On Fri, Mar 3, 2023 at 7:22 PM Pavel Krivitsky  
> > wrote:
> > >
> > > Dear All,
> > >
> > > Currently, list= in base::replace(x, list, value) has to be an index
> > > vector. For me, at least, the most common use case is for list= to be
> > > some simple property of elements of x, e.g.,
> > >
> > > x <- c(1,2,NA,3)
> > > replace(x, is.na(x), 0)
> > >
> > > Particularly when using R pipes, which don't allow multiple
> > > substitutions, it would simplify many of such cases if list= could be a
> > > function that returns an index, e.g.,
> > >
> > > replace <- function (x, list, values, ...) {
> > >   # Here, list() refers to the argument, not the built-in.
> > >   if(is.function(list)) list <- list(x, ...)
> > >   x[list] <- values
> > >   x
> > > }
> > >
> > > Then, the following is possible:
> > >
> > > c(1,2,NA,3) |> replace(is.na, 0)
> > >
> > > Any thoughts?
> > > Pavel
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> >
> >
> > --
> > Statistics & Software Consulting
> > GKX Group, GKX Associates Inc.
> > tel: 1-877-GKX-GROUP
> > email: ggrothendieck at gmail.com
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel




Re: [Rd] Augment base::replace(x, list, value) to allow list= to be a predicate?

2023-03-07 Thread Gabor Grothendieck

This could be extended to sub and gsub as well, which gsubfn in the
gsubfn package already does:

  library(gsubfn)
  gsubfn("^..", toupper, c("abc", "xyz"))
  ## [1] "ABc" "XYz"
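
For readers without gsubfn installed, roughly the same effect for this particular example can be sketched in base R with regmatches(). This is only an illustration of the idea of replacing each match by a function of the match, not how gsubfn is implemented:

```r
x <- c("abc", "xyz")
m <- regexpr("^..", x)                          # first match in each string
regmatches(x, m) <- toupper(regmatches(x, m))   # replace matches by f(match)
x
```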

On Fri, Mar 3, 2023 at 7:22 PM Pavel Krivitsky  wrote:
>
> Dear All,
>
> Currently, list= in base::replace(x, list, value) has to be an index
> vector. For me, at least, the most common use case is for list= to be
> some simple property of elements of x, e.g.,
>
> x <- c(1,2,NA,3)
> replace(x, is.na(x), 0)
>
> Particularly when using R pipes, which don't allow multiple
> substitutions, it would simplify many of such cases if list= could be a
> function that returns an index, e.g.,
>
> replace <- function (x, list, values, ...) {
>   # Here, list() refers to the argument, not the built-in.
>   if(is.function(list)) list <- list(x, ...)
>   x[list] <- values
>   x
> }
>
> Then, the following is possible:
>
> c(1,2,NA,3) |> replace(is.na, 0)
>
> Any thoughts?
> Pavel
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel




[Rd] summary.lm fails for difftime objects

2023-02-18 Thread Gabor Grothendieck

lm works with difftime objects but then if you try to get the summary
it fails with an error:

  fit <- lm(as.difftime(Time, units = "mins") ~ demand, BOD)
  summary(fit)
  ## Error in Ops.difftime((f - mean(f)), 2) :
  ##  '^' not defined for "difftime" objects

A number of other lm methods also fail, e.g. plot(fit), but others
work, e.g. coef(fit), resid(fit)
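
A possible workaround, sketched under the assumption that the response is already stored as a difftime column (BOD2 is a made-up copy of BOD for illustration): convert to plain numbers inside the formula so that the lm methods only ever see a numeric response.

```r
BOD2 <- BOD
BOD2$Time <- as.difftime(BOD2$Time, units = "mins")

# Convert the difftime response to plain numbers (in minutes) inside the
# formula, so the lm methods only ever see a numeric response.
fit <- lm(as.numeric(Time, units = "mins") ~ demand, BOD2)
s <- summary(fit)
coef(s)
```

This sidesteps the error rather than fixing it; the units are then no longer carried by the fitted object.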


[Rd] package.skeleton hello* files

2023-02-14 Thread Gabor Grothendieck

Is there some way to avoid the automatic generation of hello* files in
package.skeleton?  I found that the following does it on Windows, but
then it does not create an R directory, which I still want, and it also
gives warnings, which I don't want.

  package.skeleton(code_files = "NUL")


Re: [Rd] anova and intercept

2022-12-27 Thread Gabor Grothendieck

Good idea.

On Mon, Dec 26, 2022 at 12:59 PM peter dalgaard  wrote:
>
> My usual advice on getting nonstandard F tests out of anova() is to fit the 
> models explicitly and compare.
>
> So how about this?
>
> fit1 <- lm(diff(extra,10) ~ 1, sleep)
> fit0 <- update(fit1, ~ -1)
> anova(fit0, fit1)
>
> -pd
>
> > On 26 Dec 2022, at 13:49 , Gabor Grothendieck  
> > wrote:
> >
> > Suppose we want to perform a paired test using the sleep data frame
> > with anova in R.  Then this works and gives the same p value as
> > t.test(extra ~ group, sleep, paired = TRUE, var.equal = TRUE)
> >
> >   ones <- rep(1, 10)
> >   anova(lm(diff(extra, 10) ~ ones + 0, sleep))
> >
> > This gives output but does not give an F test at all.
> >
> >   ones <- rep(1, 10)
> >   anova(lm(diff(extra, 10) ~ 1, sleep))
> >
> > Maybe there should be some way to force an F test to be produced for
> > the intercept in anova for consistency with t.test so that the second
> > code can be used.
> >
> >
> > --
> > Statistics & Software Consulting
> > GKX Group, GKX Associates Inc.
> > tel: 1-877-GKX-GROUP
> > email: ggrothendieck at gmail.com
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>



[Rd] anova and intercept

2022-12-26 Thread Gabor Grothendieck

Suppose we want to perform a paired test using the sleep data frame
with anova in R.  Then this works and gives the same p value as
t.test(extra ~ group, sleep, paired = TRUE, var.equal = TRUE)

   ones <- rep(1, 10)
   anova(lm(diff(extra, 10) ~ ones + 0, sleep))

This gives output but does not give an F test at all.

   ones <- rep(1, 10)
   anova(lm(diff(extra, 10) ~ 1, sleep))

Maybe there should be some way to force an F test to be produced for
the intercept in anova for consistency with t.test so that the second
code can be used.



[Rd] pipes and setNames

2022-04-17 Thread Gabor Grothendieck

When trying to transform names in a pipeline one can do the following
where for this example we are making names upper case.

  BOD |> (\(x) setNames(x, toupper(names(x))))()

but that seems a bit ugly and verbose.

1. One possibility is to enhance setNames to allow a function as a
second argument.  In that case one could write:

  BOD |> setNames(toupper)

2. One can already do the following with the existing `with`, but it is
quite verbose:

  BOD |> list() |> setNames(".") |> with(setNames(., toupper(names(.))))

but it could be made simpler with a utility function.

This utility function is not as good for setNames but would still
result in shorter code than the anonymous function in the example at
the top of this email and is more general so it would also apply in
other situations too.  Here R would define a function with. (note dot
at end) which would be defined and used as follows.

  with. <- function(data, expr, ...) {
    eval(substitute(expr), list(. = data), enclos = parent.frame())
  }

  BOD |> with.(setNames(., toupper(names(.))))

with. is not as efficient as straight pipes but in many cases such as
this it does not really matter and one just wants to get it done
without the parenthesis laden anonymous function.

Having both of these two would be nice to make it easier to use R pipes.
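
To make suggestion 1 concrete, here is a minimal sketch of the proposed behaviour written as a separate wrapper. The name set_names_fn is hypothetical; base setNames does not accept a function in this way:

```r
# Hypothetical wrapper (not in base R): accepts either a character
# vector of names or a function that is applied to the existing names.
set_names_fn <- function(object, nm) {
  if (is.function(nm)) nm <- nm(names(object))
  stats::setNames(object, nm)
}

BOD |> set_names_fn(toupper) |> names()
```

Passing a character vector still behaves like plain setNames, so existing calls would not change under this sketch.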


[Rd] aggregate.formula and pipes

2022-01-26 Thread Gabor Grothendieck

Because aggregate.formula has a formula argument but the generic
has an x argument neither of these work:

  mtcars |> aggregate(x = mpg ~ cyl, FUN = mean)
  mtcars |> aggregate(formula = mpg ~ cyl, FUN = mean)

This does work:

  mtcars |> stats:::aggregate.formula(formula = mpg ~ cyl, FUN = mean)

Suggest that aggregate.formula be exported.
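
Until then, one workaround that avoids the unexported method is to wrap the call in an anonymous function so that the piped value is passed as data=. A sketch:

```r
# Wrap the call in an anonymous function so the piped value can be
# passed as data= while the formula stays the first argument.
res <- mtcars |> (\(d) aggregate(mpg ~ cyl, data = d, FUN = mean))()
res
```

This is more verbose than the direct pipe, which is the point of the suggestion above.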


[Rd] assignment

2021-12-27 Thread Gabor Grothendieck

In a recent SO post this came up (changed example to simplify it
here).  It seems that `test` still computes sin after the assignment.

  test <- function(x) sin(x)
  environment(test)$test <- cos
  test(0)
  ## [1] 0

It appears to be related to the double use of `test` in `$<-` since if
we break it up it works as expected:

  test <- function(x) sin(x)
  e <- environment(test)
  e$test <- cos
  test(0)
  ## [1] 1

`assign` also works:

  test <- function(x) sin(x)
  assign("test", cos, environment(test))
  test(0)
  ## [1] 1

Can anyone shed some light on this?
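
One possible reading, based on my understanding of how the R Language Definition describes complex assignment (an interpretation, not a confirmed answer from this thread): the nested form first copies `test`, modifies the environment, and then assigns the copy back to `test`, which overwrites the new binding. The sketch below spells out that expansion by hand inside local() so that it does not touch the workspace; the names tmp and env are my own.

```r
res <- local({
  test <- function(x) sin(x)

  # Roughly what `environment(test)$test <- cos` expands to:
  tmp <- test                                  # copy of the original function
  env <- environment(tmp)
  env$test <- cos                              # the binding is cos for a moment ...
  test <- `environment<-`(tmp, value = env)    # ... then overwritten by the copy
  test(0)
})
res
```

In the two-step and assign() versions there is no final assignment back to `test`, which would explain why those behave as expected.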



Re: [R-pkg-devel] socketConnection, delay when reading from

2021-11-27 Thread Gabor Grothendieck

Whether the length is variable or not isn't relevant. The point is
whether the message is prefaced by a length or command from which the
length can be derived.  Maybe it is not and you will have to rely on
inefficient methods but in many cases protocols are designed to avoid
such problems.

On Sat, Nov 27, 2021 at 9:40 AM Ben Engbers  wrote:
>
> No, according to the specification the minimal number of bytes that is
> returned is 2. There is no maximum. (When querying a database you never
> know on forehand how many records match the query so by definition you
> can't calculate the size of the message).
>
> In some C, C++ or Java-code I found on internet, it was possible to
> change the timeout settings so that there would be no delay. Of course
> this would have as consequence that in your code you have to deal with
> the possibility that the message has not been completely returned.
>
> In R you can set the timeout to 0 but that results in errors (at least
> on Windows)
>
> Op 27-11-2021 om 14:57 schreef Gabor Grothendieck:
> > Does the message start with a length or a command whose argument length
> > is known depending on the particular command?  If so, first read the
> > length or command, and from that the length of the remainder of the
> > message can be determined.
> >
> > On Sat, Nov 27, 2021 at 4:09 AM Ben Engbers  
> > wrote:
> >>
> >>
> >> Hi,
> >>
> >> I have been working on a R-client for the BaseX XML-database and version
> >> 0.9.2 is nearly finished (submitting version 0.9.0 was rejected by CRAN).
> >> Version 0.3 of RBaseX can be found here
> >> (https://cran.microsoft.com/web/packages/RBaseX/index.html).
> >>
> >> The client-server protocol specifies that the communication between the
> >> client and the database is based on a socket. The code (below) shows how
> >> I create that socket.
> >>
> >> Writing to the socket works perfect. Reading from the sockets (see
> >> second codeblock) also produces correct results. The problem however is
> >> that the timeout, as specified when initializing the socket, causes a 1
> >> second delay for every read-operation.
> >> I have experimented a lot with different settings and have been
> >> searching a lot on internet, but I can't find any method to get rid of
> >> that delay. (In C or C++ that should be easier but I have never before
> >> had any need to use those languages).
> >> The very first version of my client used a block-size of 1 when reading.
> >> That gave acceptable response times for small query-results but reading
> >> large responses from the database took very long time.
> >>
> >> Do you have any suggestions on how to cope with this problem?
> >>
> >> Ben Engbers
> >>
> >> -
> >>   CreateSocket = function(host, port = 1984L, username, password) {
> >> tryCatch(
> >>   {conn <- private$conn <- socketConnection(
> >> host = "localhost", port,
> >> open = "w+b", server = FALSE, blocking = TRUE, encoding =
> >> "UTF-8", timeout = 1)
> >>   }, error = function(e) {
> >> stop("Cannot open the connection")
> >>   }
> >> )
> >>
> >> -
> >>
> >> readBin_ <- function(conn) {
> >> chars_read <- raw(0)
> >> rd <- readBin(conn, what = "raw", 1024)
> >> while(length(rd) == 1024) {
> >>   chars_read <- c(chars_read, rd)
> >>   rd <- readBin(conn, "raw", 1024)
> >>   }
> >> if (length(rd) > 0) chars_read <- c(chars_read, rd)
> >> return(chars_read)
> >> }
> >>
> >> __
> >> R-package-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-package-devel
> >
> >
> >
>



Re: [R-pkg-devel] socketConnection, delay when reading from

2021-11-27 Thread Gabor Grothendieck

Does the message start with a length or a command whose argument length is
known depending on the particular command? If so first read the length or
command and from that the length of the remainder of the message can be
determined.
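
To illustrate the length-first idea, here is a minimal sketch. It assumes a
hypothetical framing in which every message begins with a 4-byte big-endian
length; the real BaseX protocol may frame its messages differently, and
read_framed is a name invented for this example. A rawConnection stands in
for the socket so the code runs without a server.

```r
# Hypothetical framing: 4-byte big-endian length, then that many payload bytes.
read_framed <- function(conn) {
  n <- readBin(conn, what = "integer", n = 1L, size = 4L, endian = "big")
  if (length(n) == 0L) return(raw(0))   # no header available
  readBin(conn, what = "raw", n = n)    # request the announced number of bytes
}

# Simulate the server side with an in-memory connection instead of a socket.
payload <- charToRaw("<result/>")
header  <- writeBin(length(payload), raw(), size = 4L, endian = "big")
con <- rawConnection(c(header, payload))
msg <- read_framed(con)
close(con)
rawToChar(msg)
```

Because the reader knows in advance how many bytes to ask for, it never has
to issue a read that can only end by running into the timeout.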

On Sat, Nov 27, 2021 at 4:09 AM Ben Engbers  wrote:
>
>
> Hi,
>
> I have been working on a R-client for the BaseX XML-database and version
> 0.9.2 is nearly finished (submitting version 0.9.0 was rejected by CRAN).
> Version 0.3 of RBaseX can be found here
> (https://cran.microsoft.com/web/packages/RBaseX/index.html).
>
> The client-server protocol specifies that the communication between the
> client and the database is based on a socket. The code (below) shows how
> I create that socket.
>
> Writing to the socket works perfect. Reading from the sockets (see
> second codeblock) also produces correct results. The problem however is
> that the timeout, as specified when initializing the socket, causes a 1
> second delay for every read-operation.
> I have experimented a lot with different settings and have been
> searching a lot on internet, but I can't find any method to get rid of
> that delay. (In C or C++ that should be easier but I have never before
> had any need to use those languages).
> The very first version of my client used a block-size of 1 when reading.
> That gave acceptable response times for small query-results but reading
> large responses from the database took very long time.
>
> Do you have any suggestions on how to cope with this problem?
>
> Ben Engbers
>
> -
>  CreateSocket = function(host, port = 1984L, username, password) {
>tryCatch(
>  {conn <- private$conn <- socketConnection(
>host = "localhost", port,
>open = "w+b", server = FALSE, blocking = TRUE, encoding =
> "UTF-8", timeout = 1)
>  }, error = function(e) {
>stop("Cannot open the connection")
>  }
>)
>
> -
>
> readBin_ <- function(conn) {
>chars_read <- raw(0)
>rd <- readBin(conn, what = "raw", 1024)
>while(length(rd) == 1024) {
>  chars_read <- c(chars_read, rd)
>  rd <- readBin(conn, "raw", 1024)
>  }
>if (length(rd) > 0) chars_read <- c(chars_read, rd)
>return(chars_read)
> }
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel




Re: [Rd] order of operations

2021-08-27 Thread Gabor Grothendieck

It could be that the two sides of * are run in parallel in the future and maybe
not having a guarantee would simplify implementation?


On Fri, Aug 27, 2021 at 12:35 PM Avi Gross via R-devel
 wrote:
>
> Does anyone have a case where this construct has a valid use?
>
> Didn't Python  add a := operator recently that might be intended more for
> such uses as compared to using the standard assignment operators? I wonder
> if that has explicit guarantees of what happens in such cases, but that is
> outside what this forum cares about. Just for the heck of it, I tried the
> example there:
>
> >>> (x := 1) * (x := 2)
> 2
> >>> x
> 2
>
> Back to R, ...
>
> The constructs can get arbitrarily complex as in:
>
> (x <- (x <- 0) + 1) * (x <- (x <-2) + 1)
>
> My impression is that when evaluation is left to right and also innermost
> parentheses before outer ones, then something like the above goes in stages.
> The first of two parenthetical expressions is evaluated first.
>
> (x <- (x <- 0) + 1)
>
> The inner parenthesis set x to zero then the outer one increments x to 1.
> The full sub-expression evaluates to 1 and that value is set aside for a
> later multiplication.
>
> But then the second parenthesis evaluates similarly, from inside out:
>
> (x <- (x <-2) + 1)
>
> It clearly resets x to 2 then increments it by 1 to 3 and returns a value of
> 3. That is multiplied by the first sub-expression to result in 3.
>
> So for simple addition, even though it is commutative, is there any reason
> any compiler or interpreter should not follow rules like the above?
> Obviously with something like matrices, some operations are not abelian and
> require more strict interpretation in the right order.
>
> And note the expressions like the above can run into more complex quandaries
> such as when you have a conditional with OR or AND parts that may be
> short-circuited and in some cases, a variable you expected to be set, may
> remain unset or ...
>
> This reminds me a bit of languages that allow pre/post increment/decrement
> operators like ++ and -- and questions about what order things happen.
> Ideally, anything in which a deterministic order is not guaranteed should be
> flagged by the language at compile time (or when interpreted) and refuse to
> go on.
>
> All I can say with computer languages and adding ever more features,
> with greater power comes greater responsibility and often greater
> confusion.
>
>
> -Original Message-
> From: R-devel  On Behalf Of Gabor
> Grothendieck
> Sent: Friday, August 27, 2021 11:32 AM
> To: Thierry Onkelinx 
> Cc: r-devel@r-project.org
> Subject: Re: [Rd] order of operations
>
> I agree and personally never do this but I would still like to know if it is
> guaranteed behavior or not.
>
> On Fri, Aug 27, 2021 at 11:28 AM Thierry Onkelinx 
> wrote:
>
> > IMHO this is just bad practice. Whether the result is guaranteed or
> > not, doesn't matter.
> >
> > ir. Thierry Onkelinx
> > Statisticus / Statistician
> >
> > Vlaamse Overheid / Government of Flanders INSTITUUT VOOR NATUUR- EN
> > BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST Team Biometrie
> > & Kwaliteitszorg / Team Biometrics & Quality Assurance
> > thierry.onkel...@inbo.be Havenlaan 88 bus 73, 1000 Brussel www.inbo.be
> >
> >
> > //
> > / To call in the statistician after the experiment
> > is done may be no more than asking him to perform a post-mortem
> > examination: he may be able to say what the experiment died of. ~ Sir
> > Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger
> > Brinner The combination of some data and an aching desire for an
> > answer does not ensure that a reasonable answer can be extracted from
> > a given body of data.
> > ~ John Tukey
> >
> > //
> > /
> >
> > <https://www.inbo.be>
> >
> >
> > Op vr 27 aug. 2021 om 17:18 schreef Gabor Grothendieck <
> > ggrothendi...@gmail.com>:
> >
> >> Are there any guarantees of whether x will equal 1 or 2 after this is
> run?
> >>
> >> (x <- 1) * (x <- 2)
> >> ## [1] 2
> >> x
> >> ## [1] 2
> >>
> >> --
> >> Statistics & Software Consulting
> >> GKX Group, GKX Associates Inc.
> >> tel: 1-877-GKX-GROUP
> >> email: ggrothendieck at gmail.com
> >>
>

Re: [Rd] order of operations

2021-08-27 Thread Gabor Grothendieck

I agree and personally never do this but I would still like to know if it
is guaranteed behavior or not.

On Fri, Aug 27, 2021 at 11:28 AM Thierry Onkelinx 
wrote:

> IMHO this is just bad practice. Whether the result is guaranteed or not,
> doesn't matter.
>
> ir. Thierry Onkelinx
> Statisticus / Statistician
>
> Vlaamse Overheid / Government of Flanders
> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
> FOREST
> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> thierry.onkel...@inbo.be
> Havenlaan 88 bus 73, 1000 Brussel
> www.inbo.be
>
>
> ///
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
>
> ///
>
> <https://www.inbo.be>
>
>
> Op vr 27 aug. 2021 om 17:18 schreef Gabor Grothendieck <
> ggrothendi...@gmail.com>:
>
>> Are there any guarantees of whether x will equal 1 or 2 after this is run?
>>
>> (x <- 1) * (x <- 2)
>> ## [1] 2
>> x
>> ## [1] 2
>>
>> --
>> Statistics & Software Consulting
>> GKX Group, GKX Associates Inc.
>> tel: 1-877-GKX-GROUP
>> email: ggrothendieck at gmail.com
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>


[Rd] order of operations

2021-08-27 Thread Gabor Grothendieck

Are there any guarantees of whether x will equal 1 or 2 after this is run?

(x <- 1) * (x <- 2)
## [1] 2
x
## [1] 2
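
The order that a given R build actually uses can be observed with a small
probe. This is only a sketch of observed behaviour and says nothing about
whether the order is guaranteed; seen and mark are names made up for the
illustration.

```r
seen <- character()
mark <- function(tag, value) {
  seen <<- c(seen, tag)   # record the moment this operand is evaluated
  value
}

res <- mark("left", 1) * mark("right", 2)
seen   # order in which the two operands were evaluated
res
```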


[Rd] problem with pipes, textConnection and read.dcf

2021-08-10 Thread Gabor Grothendieck

This gives an error, but if the first gsub line is commented out then there is
no error even though it is equivalent code.

  L <- c("Variable:id", "Length:112630 ")

  L |>
gsub(pattern = " ", replacement = "") |>
gsub(pattern = " ", replacement = "") |>
textConnection() |>
read.dcf()
  ## Error in textConnection(gsub(gsub(L, pattern = " ", replacement = ""),  :
  ##  argument 'object' must deparse to a single character string

That is this works:

  L |>
# gsub(pattern = " ", replacement = "") |>
gsub(pattern = " ", replacement = "") |>
textConnection() |>
read.dcf()
  ##  Variable Length
  ## [1,] "id" "112630"

  R.version.string
  ## [1] "R version 4.1.0 RC (2021-05-16 r80303)"
  win.version()
  ## [1] "Windows 10 x64 (build 19042)"
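
A possible workaround, sketched under the assumption that the error comes
from textConnection() deparsing its unevaluated argument: route the lines
through a small helper so that the argument is a plain variable name.
read_dcf_lines is a hypothetical helper, not part of base R.

```r
read_dcf_lines <- function(lines) {
  con <- textConnection(lines)   # the argument is now just the symbol 'lines'
  on.exit(close(con))            # read.dcf does not close a connection it is given
  read.dcf(con)
}

L <- c("Variable:id", "Length:112630 ")

res <- L |>
  gsub(pattern = " ", replacement = "") |>
  gsub(pattern = " ", replacement = "") |>
  read_dcf_lines()
res
```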


Re: [Rd] Feature request: Change default library path on Windows

2021-07-25 Thread Gabor Grothendieck

At the very least it would be nice if there were a function that displays all
the locations/paths currently being used in R.
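
As a rough sketch of what such a function might collect, the following
gathers a few locations that base R already exposes. r_paths is a
hypothetical name and the selection of entries is an assumption, not a
complete inventory of everything R touches.

```r
r_paths <- function() {
  list(
    R_HOME         = R.home(),                      # installation directory
    lib_paths      = .libPaths(),                   # library search path
    R_LIBS_USER    = Sys.getenv("R_LIBS_USER"),     # user library setting
    R_ENVIRON_USER = Sys.getenv("R_ENVIRON_USER"),  # empty if not set
    R_PROFILE_USER = Sys.getenv("R_PROFILE_USER"),  # empty if not set
    home           = path.expand("~"),
    working_dir    = getwd(),
    temp_dir       = tempdir()
  )
}

str(r_paths())
```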

On Sat, Jul 24, 2021 at 6:15 PM Steve Haroz  wrote:
>
> Hello,
>
> I'd like to propose moving the default library install location on Windows 
> from:
> %USERPROFILE%/Documents/R
> to some other location such as:
> %USERPROFILE%/R
>
> For many users the Documents folder is backed up or synchronized.
> Installing libraries thrashes Documents, and it causes synchronization
> issues with Dropbox (I confirm this one), OneDrive, and users with
> Network IT policies.
>
> The vast majority of R users won't touch that folder and don't need it
> backed up. And, its contents are not really "documents".
>
> There are many blog posts and websites with people complaining about
> it or offering workarounds that involve hand editing setting and
> environment files, which reduces R's usability and accessibility.
> * 
> https://community.rstudio.com/t/help-regarding-package-installation-renviron-rprofile-r-libs-r-libs-site-and-r-libs-user-oh-my/13888/5
> * https://accelebrate.com/library/how-to-articles/r-rstudio-library
> * 
> https://community.rstudio.com/t/r-studio-library-installation-directory/30725/2
> * https://twitter.com/sharoz/status/1418712098444546057
> * https://twitter.com/JoeHilgard/status/1419025358070878210
>
> This change should not interfere with any project environment managers
> like renv. It should just change the global default for Windows R
> users.
> Also, I believe that on Mac it is not in Documents, but it's in the
> equivalent of %USERPROFILE%/R.
>
> Thanks,
> Steve Haroz
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel




Re: [Rd] S3 weirdness

2021-06-24 Thread Gabor Grothendieck

The fact that zoo:: in one part of the code has a side effect in
another seems not to be in the spirit of functional programming or
modularity.
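
The side effect can be reproduced with a package that ships with R. A minimal
sketch, using splines purely as a stand-in for zoo: referring to one object
with :: loads the whole namespace, which registers its S3 methods, without
attaching the package.

```r
before <- isNamespaceLoaded("splines")   # state before any use of ::
f <- splines::bs                         # only refers to a single function
after <- isNamespaceLoaded("splines")    # the namespace is now loaded

attached <- "package:splines" %in% search()   # loaded is not the same as attached
method_found <- !is.null(getS3method("predict", "bs", optional = TRUE))

c(before = before, after = after, attached = attached,
  method_found = method_found)
```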

On Thu, Jun 24, 2021 at 6:51 PM Simon Urbanek
 wrote:
>
> Gabor,
>
> just by using zoo::read.zoo() you *do* load the namespace:
>
> > args(zoo::read.zoo)
> function (file, format = "", tz = "", FUN = NULL, regular = FALSE,
> index.column = 1, drop = TRUE, FUN2 = NULL, split = NULL,
> aggregate = FALSE, ..., text, read = read.table)
> NULL
> > sessionInfo()
> R Under development (unstable) (2021-06-23 r80548)
> Platform: x86_64-apple-darwin19.6.0 (64-bit)
> Running under: macOS Catalina 10.15.7
>
> Matrix products: default
> BLAS:   /Volumes/Builds/R/build/lib/libRblas.dylib
> LAPACK: /Volumes/Builds/R/build/lib/libRlapack.dylib
>
> locale:
> [1] en_NZ.UTF-8/en_NZ.UTF-8/en_NZ.UTF-8/C/en_NZ.UTF-8/en_NZ.UTF-8
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] zoo_1.8-9   compiler_4.2.0  grid_4.2.0  lattice_0.20-44
>
> which includes S3 method dispatch tables:
>
> > methods(as.ts)
> [1] as.ts.default* as.ts.zoo* as.ts.zooreg*
> see '?methods' for accessing help and source code
>
> so the behavior is as expected.
>
> Cheers,
> Simon
>
>
> > On 25/06/2021, at 9:56 AM, Gabor Grothendieck  
> > wrote:
> >
> > If we start up a vanilla session of R with no packages loaded and
> > type the single line of code below as the first line entered then
> > we get the output shown below.  The NA in the output and the length
> > of 7 indicate that as.ts dispatched as.ts.zoo since as.ts.default
> > would have resulted in a length of 6 with no NA's. It should not have
> > known about as.ts.zoo since we never  explicitly loaded the zoo
> > package using library or require.
> > zoo:: was only used to refer to read.zoo.  This seems to be a bug in
> > the way R is currently working.
> >
> >  as.ts(zoo::read.zoo(BOD))
> >  ## Time Series:
> >  ## Start = 1
> >  ## End = 7
> >  ## Frequency = 1
> >  ## [1]  8.3 10.3 19.0 16.0 15.6   NA 19.8
> >
> >  R.version.string
> >  ## [1] "R version 4.1.0 RC (2021-05-16 r80303)"
> >
> > --
> > Statistics & Software Consulting
> > GKX Group, GKX Associates Inc.
> > tel: 1-877-GKX-GROUP
> > email: ggrothendieck at gmail.com
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>



[Rd] S3 weirdness

2021-06-24 Thread Gabor Grothendieck

If we start up a vanilla session of R with no packages loaded and
type the single line of code below as the first line entered then
we get the output shown below.  The NA in the output and the length
of 7 indicate that as.ts dispatched as.ts.zoo since as.ts.default
would have resulted in a length of 6 with no NA's. It should not have
known about as.ts.zoo since we never  explicitly loaded the zoo
package using library or require.
zoo:: was only used to refer to read.zoo.  This seems to be a bug in
the way R is currently working.

  as.ts(zoo::read.zoo(BOD))
  ## Time Series:
  ## Start = 1
  ## End = 7
  ## Frequency = 1
  ## [1]  8.3 10.3 19.0 16.0 15.6   NA 19.8

  R.version.string
  ## [1] "R version 4.1.0 RC (2021-05-16 r80303)"


Re: [Rd] Additional an example for the forward pipe operator's documentation

2021-06-19 Thread Gabor Grothendieck

These also work in this particular case although not in general and the Call:
line in the output differs:

  mtcars |> subset(cyl == 4) |> with(lm(mpg ~ disp))
  mtcars |> with(lm(mpg ~ disp, subset = cyl == 4))
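
A short sketch comparing the variants on the built-in mtcars data. It fits
the same model three ways and prints the recorded calls so the difference in
the Call: line can be inspected; fit1, fit2 and fit3 are names chosen for the
example.

```r
fit1 <- mtcars |> subset(cyl == 4) |> with(lm(mpg ~ disp))
fit2 <- mtcars |> with(lm(mpg ~ disp, subset = cyl == 4))
fit3 <- mtcars |> subset(cyl == 4) |> lm(formula = mpg ~ disp)

# Same coefficients in all three cases
all.equal(coef(fit1), coef(fit2))
all.equal(coef(fit1), coef(fit3))

# The stored calls are what differ
fit1$call
fit2$call
fit3$call
```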

On Sat, Jun 19, 2021 at 7:23 AM Erez Shomron  wrote:
>
> Hello,
>
>
> While playing around with the new forward pipe operator I've noticed
> there's a possibly overlooked usage for the operator, which would be
> very beneficial to document.
>
> Whenever you want the LHS to be passed to an argument other than the
> first, the documented example demonstrates how to do that with an
> anonymous function.
>
>
> However the syntax is less than ideal (aesthetically):
>
>
> mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = d))()
>
>
> Fortunately there's a better, undocumented option using named arguments:
>
>
> mtcars |> subset(cyl == 4) |> lm(formula = mpg ~ disp)
>
>
> The reason this works, is because of how R matches arguments. As the
> language definition states, first named arguments are matched, then
> partial matching, and only afterwards positional arguments are matched.
>
>
> I think people that are frustrated with former syntax would be happy to
> know the latter option exists.
>
>
> That's just my opinion.
>
>
> Thank you for reading,
>
> Erez
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel




Re: [Rd] reshape documentation

2021-04-11 Thread Gabor Grothendieck

One thing about varying is that reshape ignores the names on the varying
list and makes you specify them all over again even though it could know
what they are. Note that we had to specify that names(varying) is the
v.names.

  DF <- structure(list(A1 = 10L, A2 = 5L, B1 = 11L, B2 = 5L, C1 = 21L,
  C2 = 10L), class = "data.frame", row.names = c(NA, -1L))

  let <- gsub("\\d", "", names(DF))
  num <- gsub("\\D", "", names(DF))

  varying <- split(names(DF), num)
  reshape(DF, dir = "long", varying = varying, v.names = names(varying),
times = unique(let), timevar = "let")[-4]

On Sun, Apr 11, 2021 at 6:01 AM Deepayan Sarkar
 wrote:
>
> On Wed, Mar 17, 2021 at 7:55 PM Michael Dewey  wrote:
> >
> > Comments in line
> >
> > On 13/03/2021 09:50, SOEIRO Thomas wrote:
> > > Dear list,
> > >
> > > I have some questions/suggestions about reshape.
> > >
> > > 1) I think a good amount of the popularity of base::reshape alternative 
> > > is due to the complexity of reshape documentation. It is quite hard (at 
> > > least it is for me) to figure out what argument is needed for 
> > > respectively "long to wide" and "wide to long", because reshapeWide and 
> > > reshapeLong are documented together.
> > > - Do you agree with this?
> > > - Would you consider a proposal to modify the documentation?
> > > - If yes, what approach do you suggest? e.g. split in two pages?
> >
> > The current documentation is much clearer than it was when I first
> > started using R but we should always strive for more.
> >
> > I would suggest leaving the documentation in one place but it might be
> > helpful to add which direction is relevant for each parameter by placing
> > (to wide) or (to long) as appropriate. I think having completely
> > separate lists is not needed
>
> I have just checked in some updates to the documentation (in R-devel)
> which hopefully makes usage clearer. Any further suggestions are
> welcome. We are planning to add a short vignette as well, hopefully in
> time for R 4.1.0.
>
> > > 2) I do not think the documentation indicates that we can use varying 
> > > argument to rename variables in reshapeWide.
> > > - Is this worth documenting?
> > > - Is the construct list(c()) really needed?
> >
> > Yes, because you may have more than one set of variables which need to
> > correspond to a single variable in long format. So in your example if
> > you also had 11 variables for the temperature as well as the
> > concentration each would need specifying as a separate vector in the list.
>
> That's a valid point, but on the other hand, direction="long" already
> supports specifying 'varying' as a vector, and it does simplify the
> single variable case. So we decided to be consistent and allow it for
> direction="wide" too, hopefully with loud enough warnings in the
> documentation about using the feature carelessly.
>
> Best,
> -Deepayan
>
> > Michael
> >
> > >
> > > reshape(Indometh,
> > >  v.names = "conc",
> > >  idvar = "Subject",
> > >  timevar = "time",
> > >  direction = "wide",
> > >  varying = list(c("conc_0.25hr",
> > >   "conc_0.5hr",
> > >   "conc.0.75hr",
> > >   "conc_1hr",
> > >   "conc_1.25hr",
> > >   "conc_2hr",
> > >   "conc_3hr",
> > >   "conc_4hr",
> > >   "conc_5hr",
> > >   "conc_6hr",
> > >   "conc_8hr")))
> > >
> > > Thanks,
> > >
> > > Thomas
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > >
> >
> > --
> > Michael
> > http://www.dewey.myzen.co.uk/home.html
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel




[Rd] replicate evaluates its second argument in wrong environment

2021-02-13 Thread Gabor Grothendieck

Currently replicate used within sapply within a function can fail
because it gets the environment for its second argument, which is
currently hard coded to be the parent frame, wrong.  See this link for
a full example of how it goes wrong and how it could be made to work
if it were possible to pass an envir argument to it.

https://stackoverflow.com/questions/66184446/sapplya-replicate-b-expression-no-longer-works-inside-a-function/66185079#66185079
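
A minimal sketch of the situation, not taken from the linked question. It
assumes the problem arises when replicate is handed directly to sapply, and
contrasts that with wrapping the call in an anonymous function so that
replicate's parent frame can see the local variable. f_direct and f_wrapped
are names invented for the example; the first call is wrapped in tryCatch so
that whatever it does can be inspected without stopping the script.

```r
f_direct <- function() {
  y <- 10
  # replicate is passed straight to sapply, so its parent frame is not this function
  tryCatch(sapply(1:3, replicate, expr = y),
           error = function(e) conditionMessage(e))
}

f_wrapped <- function() {
  y <- 10
  # the anonymous function is defined here, so y is visible from replicate's caller
  sapply(1:3, function(n) sum(replicate(n, y)))
}

f_direct()
f_wrapped()
```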


Re: [Rd] brief update on the pipe operator in R-devel

2021-01-15 Thread Gabor Grothendieck

These are documented but still seem like serious deficiencies:

> f <- function(x, y) x + 10*y
> 3 |> x => f(x, x)
Error in f(x, x) : pipe placeholder may only appear once

> 3 |> x => f(1+x, 1)
Error in f(1 + x, 1) :
  pipe placeholder must only appear as a top-level argument in the RHS call

Also note:

 ?"=>"
No documentation for ‘=>’ in specified packages and libraries:
you could try ‘??=>’
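
For comparison, both cases can be written with an explicit anonymous
function, which needs only the |> and \(x) syntax of R 4.1 and later rather
than the => binding. A minimal sketch reusing the definition of f from above:

```r
f <- function(x, y) x + 10 * y

a <- 3 |> (\(x) f(x, x))()       # piped value used twice
b <- 3 |> (\(x) f(1 + x, 1))()   # piped value nested inside an expression

c(a, b)
```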

On Tue, Dec 22, 2020 at 5:28 PM  wrote:
>
> It turns out that allowing a bare function expression on the
> right-hand side (RHS) of a pipe creates opportunities for confusion
> and mistakes that are too risky. So we will be dropping support for
> this from the pipe operator.
>
> The case of a RHS call that wants to receive the LHS result in an
> argument other than the first can be handled with just implicit first
> argument passing along the lines of
>
>  mtcars |> subset(cyl == 4) |> (\(d) lm(mpg ~ disp, data = d))()
>
> It was hoped that allowing a bare function expression would make this
> more convenient, but it has issues as outlined below. We are exploring
> some alternatives, and will hopefully settle on one soon after the
> holidays.
>
> The basic problem, pointed out in a comment on Twitter, is that in
> expressions of the form
>
>  1 |> \(x) x + 1 -> y
>  1 |> \(x) x + 1 |> \(y) x + y
>
> everything after the \(x) is parsed as part of the body of the
> function.  So these are parsed along the lines of
>
>  1 |> \(x) { x + 1 -> y }
>  1 |> \(x) { x + 1 |> \(y) x + y }
>
> In the first case the result is assigned to a (useless) local
> variable.  Someone writing this is more likely to have intended to
> assign the result to a global variable, as this would:
>
>  (1 |> \(x) x + 1) -> y
>
> In the second case the 'x' in 'x + y' refers to the local variable 'x'
> in the first RHS function. Someone writing this is more likely to have
> meant
>
>  (1 |> \(x) x + 1) |> \(y) x + y
>
> with 'x' in 'x + y' now referring to a global variable:
>
>  > x <- 2
>  > 1 |> \(x) x + 1 |> \(y) x + y
>  [1] 3
>  > (1 |> \(x) x + 1) |> \(y) x + y
>  [1] 4
>
> These issues arise with any approach in R that allows a bare function
> expression on the RHS of a pipe operation. It also arises in other
> languages with pipe operators. For example, here is the last example
> in Julia:
>
>  julia> x = 2
>  2
>  julia> 1 |> x -> x + 1 |> y -> x + y
>  3
>  julia> ( 1 |> x -> x + 1 ) |> y -> x + y
>  4
>
> Even though proper use of parentheses can work around these issues,
> the likelihood of making mistakes that are hard to track down is too
> high. So we will disallow the use of bare function expressions on the
> right hand side of a pipe.
>
> Best,
>
> luke
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
> Actuarial Science
> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel




Re: [Rd] [External] Re: New pipe operator

2020-12-09 Thread Gabor Grothendieck

On Wed, Dec 9, 2020 at 12:36 PM Gabriel Becker  wrote:
> I mean, I think the bizarro pipe was a pretty clever piece of work. I was 
> impressed by what John did there, but I don't really know what you're 
> suggesting here. As you say, the bizarro pipe works now without any changes 
> and you're welcome to use it if you prefer it to base's (proposed/likely) |> 
> and magrittr's %>%.
>

If |> exists then it will be impossible to avoid it unless the only
software you ever use is your own. It's about the entire R ecosystem
and what gets used because it is in the base.

It would still be possible to implement \(x)... without |> so I would
go with that and rethink the pipe situation.


Re: [Rd] [External] Re: New pipe operator

2020-12-09 Thread Gabor Grothendieck

On Wed, Dec 9, 2020 at 10:08 AM Duncan Murdoch  wrote:
>
> You might be interested in this blog post by Michael Barrowman:
>
> https://michaelbarrowman.co.uk/post/the-new-base-pipe/
>
> He does some timing comparisons, and the current R-devel implementations
> of |> and \() do quite well.

It does bring out that the requirement of using functions to get around the
lack of placeholders is not free but exacts a small penalty in
terms of performance (in addition to verbosity).

The bizarro pipe supports placeholders and so doesn't require functions
as a workaround and thus would presumably be even faster.  It is also
perfectly consistent with the rest of R and requires no new syntax.
You have to explicitly add a dot as the first argument but this seems a better
compromise to me than those involved with |> .
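
For readers who have not seen it, a minimal sketch of the bizarro pipe on toy
data. Each step assigns its result to the variable named . with the ordinary
right assignment ->, and the next step reads . explicitly; total is a name
chosen for the example.

```r
c(1, 4, 9) ->.;
  sqrt(.) ->.;
  sum(.) -> total

total
```

Since . is an ordinary variable, it can appear in any argument position and
any number of times in a step.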


Re: [Rd] the pipe |> and line breaks in pipelines

2020-12-09 Thread Gabor Grothendieck

On Wed, Dec 9, 2020 at 4:03 AM Timothy Goodman  wrote:
> But the bigger issue happens when I want to re-run just *part* of the
> pipeline.

Insert one of the following into the pipeline. It does not require that you
edit any lines.   It only involves inserting a line.

  print %>%
  { str(.); . } %>%
  { . ->> .save } %>%
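
The lines above are written for magrittr's %>%. The same idea can be sketched
with the base pipe by using small pass-through helpers; peek, stash and store
are hypothetical names introduced only for this illustration.

```r
store <- new.env()

# Print a summary of the intermediate value and pass it on unchanged
peek <- function(d) {
  str(d)
  d
}

# Keep a copy of the intermediate value and pass it on unchanged
stash <- function(d, name = "saved") {
  assign(name, d, envir = store)
  d
}

n <- mtcars |>
  subset(cyl == 4) |>
  peek() |>
  stash() |>
  nrow()

n
nrow(store$saved)
```

Either helper occupies a line of its own, so it can be added or removed
without editing the neighbouring steps.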


Re: [Rd] [External] Re: New pipe operator

2020-12-08 Thread Gabor Grothendieck

On Mon, Dec 7, 2020 at 9:09 AM Gabor Grothendieck
 wrote:
>
> On Sat, Dec 5, 2020 at 1:19 PM  wrote:
> > Let's get some experience
>
> Here is my last SO post using dplyr rewritten to use R 4.1 devel.  Seems

It occurred to me it would also be interesting to show this example
rewritten using John Mount's bizarro pipe (which is a clever use of
existing syntax to get the effect of a pipe) with the new \(x) ...
This can be done entirely in base R 4.1.  It does not use |>, just \(x)...

  "myfile.csv" ->.;
readLines(.) ->.;
gsub(r'{(c\(.*?\)|integer\(0\))}', r'{"\1"}', .) ->.;
read.csv(text = .) ->.;
replace(., 2:3, lapply(.[2:3], \(col) lapply(col, \(x)
eval(parse(text = x))))) ->.;
. -> DF


Re: [Rd] [External] anonymous functions

2020-12-08 Thread Gabor Grothendieck

On Mon, Dec 7, 2020 at 12:34 PM  wrote:
> I don't disagree in principle, but the reality is users want shortcuts
> and as a result various packages, in particular tidyverse, have been
> providing them. Mostly based on formulas, mostly with significant
> issues since formulas weren't designed for this, and mostly
> incompatible (tidyverse ones are compatible within tidyverse but not
> with others). And of course none work in sapply or lapply.

The formulas as functions in the gsubfn package work with nearly any function
including sapply and lapply.

  library(gsubfn)
  fn$sapply(1:3, ~ x + 1)
  ## [1] 2 3 4


Re: [Rd] New pipe operator

2020-12-08 Thread Gabor Grothendieck

Duncan Murdoch:
> I agree it's all about call expressions, but they aren't all being
> treated equally:
>
> x |> f(...)
>
> expands to f(x, ...), while
>
> x |> `function`(...)
>
> expands to `function`(...)(x).  This is an exception to the rule for

Yes, this is the problem.  It is trying to handle two different sorts of right
hand sides, calls and functions, using only syntax level operations and
it really needs to either make use of deeper information or have some
method that is available at the syntax level for identifying whether the
right hand side is a call or function.  In the latter case having two
operators would be one way to do it.

  f <- \(x) x + 1
  x |> f()  # call
  x |:> f  # function
  x |:> \(x) x + 1  # function

In the other case where deeper information is used there would only be one
operator and it would handle all cases but would use more than just syntax
level knowledge.

R solved these sorts of problems long ago using S3 and other object oriented
systems which dispatch different methods based on what the right hand side is.
The attempt to avoid using the existing or equivalent mechanisms seems to have
led to this problem.
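
As a rough illustration of a single operator that treats the two kinds of
right-hand side differently, here is a sketch. The operator name %to% is
hypothetical, and it uses plain inspection of the unevaluated right-hand side
rather than S3 dispatch: a bare name or a function expression is evaluated
and applied, while any other call gets the left-hand side inserted as its
first argument.

```r
`%to%` <- function(lhs, rhs) {
  rhs_expr <- substitute(rhs)
  is_function_expr <- is.name(rhs_expr) ||
    (is.call(rhs_expr) && identical(rhs_expr[[1]], as.name("function")))
  if (is_function_expr) {
    # RHS denotes a function: evaluate it and apply it to the LHS
    fun <- eval(rhs_expr, parent.frame())
    fun(lhs)
  } else {
    # RHS is a call: insert the LHS value as its first argument
    call <- as.call(c(rhs_expr[[1]], quote(.lhs), as.list(rhs_expr)[-1]))
    eval(call, list(.lhs = lhs), parent.frame())
  }
}

f <- \(x) x + 1
r1 <- 1 %to% f()           # call
r2 <- 1 %to% f             # function given by name
r3 <- 1 %to% \(x) x + 1    # function expression
c(r1, r2, r3)
```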


Re: [Rd] New pipe operator

2020-12-07 Thread Gabor Grothendieck

On Mon, Dec 7, 2020 at 2:02 PM Kevin Ushey  wrote:
>
> IMHO the use of anonymous functions is a very clean solution to the
> placeholder problem, and the shorthand lambda syntax makes it much
> more ergonomic to use. Pipe implementations that crawl the RHS for
> usages of `.` are going to be more expensive than the alternatives. It

You wouldn't have to crawl the expression.  This does it at the syntax level.

  e <- quote( { gsub("x", "y", .) } )
  c(e[[1]], quote(. <- LHS), e[-1])
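
The two lines above show the idea; c() applied to language objects returns a list, so to evaluate the result it has to be turned back into a call. A runnable sketch, assuming the intent was a braced call whose first statement binds the placeholder (the variable name LHS and the sample string are only illustrative):

```r
e <- quote({ gsub("x", "y", .) })

# Rebuild `{` with `. <- LHS` inserted as the first statement.
# No walk over the body is needed: only the top-level `{` call is touched.
e2 <- as.call(c(list(e[[1]], quote(. <- LHS)), as.list(e)[-1]))
print(e2)

# Evaluate with LHS bound to a sample value.
result <- eval(e2, list(LHS = "xyz"))
```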

Re: [Rd] New pipe operator

2020-12-07 Thread Gabor Grothendieck

On Mon, Dec 7, 2020 at 12:54 PM Duncan Murdoch  wrote:
> An advantage of the current implementation is that it's simple and easy
> to understand.  Once you make it a user-modifiable binary operator,
> things will go kind of nuts.
>
> For example, I doubt if there are many users of magrittr's pipe who
> really understand its subtleties, e.g. the example in Luke's paper where
> 1 %>% c(., 2) gives c(1,2), but 1 %>% c(c(.), 2) gives c(1, 1, 2). (And
> I could add 1 %>% c(c(.), 2, .) and  1 %>% c(c(.), 2, . + 2)  to
> continue the fun.)

The rule is not so complicated.  Automatic insertion is done unless
you use dot in the top level function or if you surround it with
{...}.  It really makes sense since if you use gsub(pattern,
replacement, .) then surely you don't want automatic insertion and if
you surround it with { ... } then you are explicitly telling it not
to.
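
The rule can be checked directly with the examples from the quoted message. This sketch assumes the magrittr package is installed; the variable names are only for illustration.

```r
library(magrittr)

a <- 1 %>% c(., 2)          # dot used at the top level: no automatic insertion
b <- 1 %>% c(c(.), 2)       # dot only in a nested call: lhs is also inserted first
d <- 1 %>% { c(c(.), 2) }   # braces: no automatic insertion
```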

Assuming the existence of placeholders a possible simplification would
be to NOT do automatic insertion if { ... } is used and to use it
otherwise although personally having used it for some time I find the
existing rule in magrittr generally does what you want.

Re: [Rd] anonymous functions

2020-12-07 Thread Gabor Grothendieck

It is easier to understand a function if you can see the entire
function body at once on a page or screen and excessive verbosity
interferes with that.

On Mon, Dec 7, 2020 at 12:04 PM Therneau, Terry M., Ph.D. via R-devel
 wrote:
>
> “The shorthand form \(x) x + 1 is parsed as function(x) x + 1. It may be
> helpful in making code containing simple function expressions more readable.”
>
> Color me unimpressed.
> Over the decades I've seen several "who can write the shortest code" threads:
> in Fortran, in C, in Splus, ...   The same old idea that "short" is a synonym
> for either elegant, readable, or efficient is now being recycled in the
> tidyverse.   The truth is that "short" is actually an antonym for all of these
> things, at least for anyone else reading the code; or for the original coder
> 30-60 minutes after the "clever" lines were written.  Minimal use of the
> spacebar and/or the return key isn't usually held up as a goal, but creeps
> into many practitioners' code as well.
>
> People are excited by replacing "function(" with "\("?  Really?   Are people
> typing code with their thumbs?
> I am ambivalent about pipes: I think it is a great concept, but too many of
> my colleagues think that using pipes = no need for any comments.
>
> As time goes on, I find my goal is to make my code less compact and more
> readable.  Every bug fix or new feature in the survival package now adds more
> lines of comments or other documentation than lines of code.  If I have to
> puzzle out what a line does, what about the poor sod who inherits the
> maintenance?
>
>
> --
> Terry M Therneau, PhD
> Department of Health Science Research
> Mayo Clinic
> thern...@mayo.edu
>
> "TERR-ree THUR-noh"



Re: [Rd] [External] Re: New pipe operator

2020-12-07 Thread Gabor Grothendieck

On Mon, Dec 7, 2020 at 10:11 AM  wrote:
> Or, keeping dplyr but with R-devel pipe and function shorthand:
>
> DF <- "myfile.csv" %>%
>     readLines() |>
>     \(.) gsub(r'{(c\(.*?\)|integer\(0\))}', r'{"\1"}', .) |>
>     \(.) read.csv(text = .) |>
>     mutate(across(2:3, \(col) lapply(col, \(x) eval(parse(text = x)))))
>
> Using named arguments to redirect to the implicit first does work,
> also in magrittr, but for me at least it is the kind of thing I would
> probably regret a month later when trying to figure out the code.

The gsub issue suggests that if one were to start afresh
that the arguments to gsub (and many other R functions)
should be rearranged.  Of course, that is precisely what
the tidyverse did.
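
As a sketch of what "rearranged" means in practice: a wrapper that takes the string first pipes without a placeholder. The name gsub_first is hypothetical (it is not in base R or the tidyverse), and the example assumes R >= 4.1 for |>.

```r
# Hypothetical wrapper: same as gsub() but with the data argument first.
gsub_first <- function(x, pattern, replacement, ...) {
  gsub(pattern, replacement, x, ...)
}

# The piped value lands in x, so no placeholder or argument naming is needed.
out <- c("xax", "bxb") |> gsub_first("x", "y")
```

stringr::str_replace_all() uses the same data-first order (string, pattern, replacement).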

Re: [Rd] New pipe operator

2020-12-07 Thread Gabor Grothendieck

One could examine how magrittr works as a reference implementation if
there is a question on how something should function.  It's in
widespread use and seems to work well.

On Mon, Dec 7, 2020 at 10:20 AM Deepayan Sarkar
 wrote:
>
> On Mon, Dec 7, 2020 at 6:53 PM Gabor Grothendieck
>  wrote:
> >
> > On Mon, Dec 7, 2020 at 5:41 AM Duncan Murdoch  
> > wrote:
> > > I agree it's all about call expressions, but they aren't all being
> > > treated equally:
> > >
> > > x |> f(...)
> > >
> > > expands to f(x, ...), while
> > >
> > > x |> `function`(...)
> > >
> > > expands to `function`(...)(x).  This is an exception to the rule for
> > > other calls, but I think it's a justified one.
> >
> > This admitted inconsistency is justified by what?  No argument has been
> > presented.  The justification seems to be implicitly driven by 
> > implementation
> > concerns at the expense of usability and language consistency.
>
> Sorry if I have missed something, but is your consistency argument
> basically that if
>
> foo <- function(x) x + 1
>
> then
>
> x |> foo
> x |> function(x) x + 1
>
> should both work the same? Suppose it did. Would you then be OK if
>
> x |> foo()
>
> no longer worked as it does now, and produced foo()(x) instead of foo(x)?
>
> If you are not OK with that and want to retain the current behaviour,
> what would you want to happen with the following?
>
> bar <- function(x) function(n) rnorm(n, mean = x)
>
> 10 |> bar(runif(1))() # works 'as expected' ~ bar(runif(1))(10)
> 10 |> bar(runif(1)) # currently bar(10, runif(1))
>
> both of which you probably want. But then
>
> baz <-  bar(runif(1))
> 10 |> baz
>
> (not currently allowed) will not be the same as what you would want from
>
> 10 |> bar(runif(1))
>
> which leads to a different kind of inconsistency, doesn't it?
>
> -Deepayan



Re: [Rd] [External] Re: New pipe operator

2020-12-07 Thread Gabor Grothendieck

On Sat, Dec 5, 2020 at 1:19 PM  wrote:
> Let's get some experience

Here is my last SO post using dplyr rewritten to use R 4.1 devel.  Seems
not too bad.  Was able to work around the placeholder for gsub by specifying
the arg names and used \(...)... elsewhere.  This does not address the
inconsistency discussed though.  I have indented by 2 spaces in case the
email wraps around.  The objective is to read myfile.csv including columns that
contain c(...) and integer(0), parsing and evaluating them.


  # taken from:
  # https://stackoverflow.com/questions/65174764/reading-in-a-csv-that-contains-vectors-cx-y-in-r/65175172#65175172

  # create input file for testing
  Lines <- "\"col1\",\"col2\",\"col3\"\n\"a\",1,integer(0)\n\"c\",c(3,4),5\n\"e\",6,7\n"
  cat(Lines, file = "myfile.csv")

  #
  # base R 4.1 (devel)
  DF <- "myfile.csv" |>
    readLines() |>
    gsub(pattern = r'{(c\(.*?\)|integer\(0\))}', replacement = r'{"\1"}') |>
    \(.) read.csv(text = .) |>
    \(.) replace(., 2:3, lapply(.[2:3], \(col) lapply(col, \(x)
      eval(parse(text = x)))))

  #
  # dplyr/magrittr
  library(dplyr)

  DF <- "myfile.csv" %>%
    readLines %>%
    gsub(r'{(c\(.*?\)|integer\(0\))}', r'{"\1"}', .) %>%
    { read.csv(text = .) } %>%
    mutate(across(2:3, ~ lapply(., function(x) eval(parse(text = x)))))

Re: [Rd] New pipe operator

2020-12-07 Thread Gabor Grothendieck

On Mon, Dec 7, 2020 at 5:41 AM Duncan Murdoch  wrote:
> I agree it's all about call expressions, but they aren't all being
> treated equally:
>
> x |> f(...)
>
> expands to f(x, ...), while
>
> x |> `function`(...)
>
> expands to `function`(...)(x).  This is an exception to the rule for
> other calls, but I think it's a justified one.

This admitted inconsistency is justified by what?  No argument has been
presented.  The justification seems to be implicitly driven by implementation
concerns at the expense of usability and language consistency.
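
The expansion being discussed can be inspected with quote(), since the rewrite happens in the parser. This sketch assumes a released R >= 4.1.0, where a bare function expression on the right hand side is not accepted and the parenthesized call form is used instead; the symbols x, y and f are placeholders.

```r
# The pipe is gone by the time the expression is quoted.
e1 <- quote(x |> f(y))
print(e1)
same <- identical(e1, quote(f(x, y)))

# A function expression has to be wrapped and called explicitly.
e2 <- quote(3 |> (function(x) x + 1)())
val <- eval(e2)
```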

Re: [Rd] New pipe operator

2020-12-06 Thread Gabor Grothendieck

This is really irrelevant.

On Sun, Dec 6, 2020 at 9:23 PM Gabriel Becker  wrote:
>
> Hi Gabor,
>
> On Sun, Dec 6, 2020 at 3:22 PM Gabor Grothendieck  
> wrote:
>>
>> I understand very well that it is implemented at the syntax level;
>> however, in any case the implementation is irrelevant to the principles.
>>
>> Here a similar example to the one I gave before but this time written out:
>>
>> This works:
>>
>>   3 |> function(x) x + 1
>>
>> but this does not:
>>
>>   foo <- function(x) x + 1
>>   3 |> foo
>>
>> so it breaks the principle of functions being first class objects.  foo and
>> its definition are not interchangeable.
>
>
> I understood what you meant as well.
>
> The issue is that neither foo nor its definition are being operated on, or
> even exist within the scope of what |> is defined to do. You are used to
> magrittr's %>% where arguably what you are saying would be true. But it's not
> here, in my view.
>
> Again, I think the issue is that |>, in as much as it "operates" on anything 
> at all (it not being a function, regardless of appearances), operates on call 
> expression objects, NOT on functions, ever.
>
> function(x) x parses to a call expression as does RHSfun(), while RHSfun does 
> not, it parses to a name, regardless of whether that symbol will eventually 
> evaluate to a closure or not.
>
> So in fact, it seems to me that, technically, all name symbols are being 
> treated exactly the same (none are allowed, including those which will lookup 
> to functions during evaluation), while all* call expressions are also being 
> treated the same. And again, there are no functions anywhere in either case.
>
> * except those that include that the parser flags as syntactically special.
>
>>
>> You have
>> to write 3 |> foo() but don't have to write 3 |> (function(x) x + 1)().
>
>
> I think you should probably be careful what you wish for here. I'm not 
> involved with this work and do not speak for any of those who were, but the 
> principled way to make that consistent while remaining entirely in the parser 
> seems very likely to be to require the latter, rather than not require the 
> former.
>
>>
>> This isn't just a matter of notation, i.e. foo vs foo(), but is a
>> matter of breaking
>> the way R works as a functional language with first class functions.
>
>
> I don't agree. Consider `+`
>
> Having
>
> foo <- get("+") ## note no `` here
> foo(x,y)
>
> parse and work correctly while
>
> +(x,y)
>
> does not, does not mean + isn't a function or that it is a "second class
> citizen"; it simply means that the parser has constraints on the syntax for
> writing code that calls it that calls to other functions are not subject to.
> The fact that such syntactic constraints can exist proves that there is not
> some overarching inviolable principle being violated here, I think. Now you
> may say "well that's just the parser, it has to parse + specially because it's
> an operator with specific precedence etc". Well, the same exact thing is true
> of |> I think.
>
> Best,
> ~G
>>
>>
>> On Sun, Dec 6, 2020 at 4:06 PM Gabriel Becker  wrote:
>> >
>> > Hi Gabor,
>> >
>> > On Sun, Dec 6, 2020 at 12:52 PM Gabor Grothendieck 
>> >  wrote:
>> >>
>> >> I think the real issue here is that functions are supposed to be
>> >> first class objects in R and |> would break that if it is possible
>> >> to write function(x) x + 1 on the RHS but not foo (assuming foo
>> >> was defined as that function).
>> >>
>> >> I don't think getting experience with using it can change that
>> >> inconsistency which seems serious to me and needs to
>> >> be addressed even if it complicates the implementation
>> >> since it drives to the heart of what R is.
>> >>
>> >
>> > With respect I think this is a misunderstanding of what is happening here.
>> >
>> > Functions are first class citizens. |> is, for all intents and purposes, a 
>> > macro.
>> >
>> > LHS |> RHS(arg2=5)
>> >
>> > parses to
>> >
>> > RHS(LHS, arg2 = 5)
>> >
>> > There are no functions at the point in time when the pipe transformation 
>> > happens, because no code has been evaluated. To know if a symbol is going 
>> > to evaluate to a function requires evaluation which is a ste

Re: [Rd] New pipe operator

2020-12-06 Thread Gabor Grothendieck

I think the real issue here is that functions are supposed to be
first class objects in R and |> would break that if it is possible
to write function(x) x + 1 on the RHS but not foo (assuming foo
was defined as that function).

I don't think getting experience with using it can change that
inconsistency which seems serious to me and needs to
be addressed even if it complicates the implementation
since it drives to the heart of what R is.

On Sat, Dec 5, 2020 at 1:08 PM Gabor Grothendieck
 wrote:
>
> The construct utils::head  is not that common but bare functions are
> very common and to make it harder to use the common case so that
> the uncommon case is slightly easier is not desirable.
>
> Also it is trivial to write this which does work:
>
> mtcars %>% (utils::head)
>
> On Sat, Dec 5, 2020 at 11:59 AM Hugh Parsonage  
> wrote:
> >
> > I'm surprised by the aversion to
> >
> > mtcars |> nrow
> >
> > over
> >
> > mtcars |> nrow()
> >
> > and I think the decision to disallow the former should be
> > reconsidered.  The pipe operator is only going to be used when the rhs
> > is a function, so there is no ambiguity with omitting the parentheses.
> > If it's disallowed, it becomes inconsistent with other treatments like
> > sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
> > noise.  I'm not sure why this decision was taken
> >
> > If the only issue is with the double (and triple) colon operator, then
> > ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
> > -- in other words, demote the precedence of |>
> >
> > Obviously (looking at the R-Syntax branch) this decision was
> > considered, put into place, then dropped, but I can't see why
> > precisely.
> >
> > Best,
> >
> >
> > Hugh.
> >
> >
> >
> >
> >
> >
> >
> > On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar  
> > wrote:
> > >
> > > On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch  
> > > wrote:
> > > >
> > > > On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:
> > > > >>   Error: function '::' not supported in RHS call of a pipe
> > > > >
> > > > > To me, this error looks much more friendly than magrittr's error.
> > > > > Some of them got too used to specify functions without (). This
> > > > > is OK until they use `::`, but when they need to use it, it takes
> > > > > hours to figure out why
> > > > >
> > > > > mtcars %>% base::head
> > > > > #> Error in .::base : unused argument (head)
> > > > >
> > > > > won't work but
> > > > >
> > > > > mtcars %>% head
> > > > >
> > > > > works. I think this is a too harsh lesson for ordinary R users to
> > > > > learn `::` is a function. I've been wanting for magrittr to drop the
> > > > > support for a function name without () to avoid this confusion,
> > > > > so I would very much welcome the new pipe operator's behavior.
> > > > > Thank you all the developers who implemented this!
> > > >
> > > > I agree, it's an improvement on the corresponding magrittr error.
> > > >
> > > > I think the semantics of not evaluating the RHS, but treating the pipe
> > > > as purely syntactical is a good decision.
> > > >
> > > > I'm not sure I like the recommended way to pipe into a particular 
> > > > argument:
> > > >
> > > >mtcars |> subset(cyl == 4) |> \(d) lm(mpg ~ disp, data = d)
> > > >
> > > > or
> > > >
> > > >mtcars |> subset(cyl == 4) |> function(d) lm(mpg ~ disp, data = d)
> > > >
> > > > both of which are equivalent to
> > > >
> > > >mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = 
> > > > d))()
> > > >
> > > > It's tempting to suggest it should allow something like
> > > >
> > > >mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = .)
> > >
> > > Which is really not that far off from
> > >
> > > mtcars |> subset(cyl == 4) |> \(.) lm(mpg ~ disp, data = .)
> > >
> > > once you get used to it.
> > >
> > > One consequence of the implementation is that it's not clear how
> > > multiple occurrences of the placeholder would be in

Re: [Rd] [External] Re: New pipe operator

2020-12-06 Thread Gabor Grothendieck

Why is that ambiguous?  It works in magrittr.

> library(magrittr)
> 1 %>% `+`()
[1] 1

On Sun, Dec 6, 2020 at 1:09 PM  wrote:
>
> On Sun, 6 Dec 2020, Gabor Grothendieck wrote:
>
> > The following gives an error.
> >
> >   1 |> `+`(2)
> >   ## Error: function '+' is not supported in RHS call of a pipe
> >
> >   1 |> `+`()
> >   ## Error: function '+' is not supported in RHS call of a pipe
> >
> > but this does work:
> >
> >   1 |> (`+`)(2)
> >   ## [1] 3
> >
> >   1 |> (`+`)()
> >   ## [1] 1
> >
> > The error message suggests that this was intentional.
> > It isn't mentioned in ?"|>"
>
> ?"|>" says:
>
>   To avoid ambiguities, functions in ‘rhs’ calls may not
>   be syntactically special, such as ‘+’ or ‘if’.
>
> (used to say lhs; fixed now).
>
> Best,
>
> luke
>
> >
> > On Sat, Dec 5, 2020 at 1:19 PM  wrote:
> >>
> >> We went back and forth on this several times. The key advantage of
> >> requiring parentheses is to keep things simple and consistent.  Let's
> >> get some experience with that. If experience shows requiring
> >> parentheses creates too many issues then we can add the option of
> >> dropping them later (with special handling of :: and :::). It's easier
> >> to add flexibility and complexity than to restrict it after the fact.
> >>
> >> Best,
> >>
> >> luke
> >>
> >> On Sat, 5 Dec 2020, Hugh Parsonage wrote:
> >>
> >>> I'm surprised by the aversion to
> >>>
> >>> mtcars |> nrow
> >>>
> >>> over
> >>>
> >>> mtcars |> nrow()
> >>>
> >>> and I think the decision to disallow the former should be
> >>> reconsidered.  The pipe operator is only going to be used when the rhs
> >>> is a function, so there is no ambiguity with omitting the parentheses.
> >>> If it's disallowed, it becomes inconsistent with other treatments like
> >>> sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
> >>> noise.  I'm not sure why this decision was taken
> >>>
> >>> If the only issue is with the double (and triple) colon operator, then
> >>> ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
> >>> -- in other words, demote the precedence of |>
> >>>
> >>> Obviously (looking at the R-Syntax branch) this decision was
> >>> considered, put into place, then dropped, but I can't see why
> >>> precisely.
> >>>
> >>> Best,
> >>>
> >>>
> >>> Hugh.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar  
> >>> wrote:
> >>>>
> >>>> On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch  
> >>>> wrote:
> >>>>>
> >>>>> On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:
> >>>>>>>   Error: function '::' not supported in RHS call of a pipe
> >>>>>>
> >>>>>> To me, this error looks much more friendly than magrittr's error.
> >>>>>> Some of them got too used to specify functions without (). This
> >>>>>> is OK until they use `::`, but when they need to use it, it takes
> >>>>>> hours to figure out why
> >>>>>>
> >>>>>> mtcars %>% base::head
> >>>>>> #> Error in .::base : unused argument (head)
> >>>>>>
> >>>>>> won't work but
> >>>>>>
> >>>>>> mtcars %>% head
> >>>>>>
> >>>>>> works. I think this is a too harsh lesson for ordinary R users to
> >>>>>> learn `::` is a function. I've been wanting for magrittr to drop the
> >>>>>> support for a function name without () to avoid this confusion,
> >>>>>> so I would very much welcome the new pipe operator's behavior.
> >>>>>> Thank you all the developers who implemented this!
> >>>>>
> >>>>> I agree, it's an improvement on the corresponding magrittr error.
> >>>>>
> >>>>> I think the semantics of not evaluating the RHS, but treating the pipe
> >>>>> as purely syntactical is a go

Re: [Rd] [External] Re: New pipe operator

2020-12-06 Thread Gabor Grothendieck

The following gives an error.

   1 |> `+`(2)
   ## Error: function '+' is not supported in RHS call of a pipe

   1 |> `+`()
   ## Error: function '+' is not supported in RHS call of a pipe

but this does work:

   1 |> (`+`)(2)
   ## [1] 3

   1 |> (`+`)()
   ## [1] 1

The error message suggests that this was intentional.
It isn't mentioned in ?"|>"

On Sat, Dec 5, 2020 at 1:19 PM  wrote:
>
> We went back and forth on this several times. The key advantage of
> requiring parentheses is to keep things simple and consistent.  Let's
> get some experience with that. If experience shows requiring
> parentheses creates too many issues then we can add the option of
> dropping them later (with special handling of :: and :::). It's easier
> to add flexibility and complexity than to restrict it after the fact.
>
> Best,
>
> luke
>
> On Sat, 5 Dec 2020, Hugh Parsonage wrote:
>
> > I'm surprised by the aversion to
> >
> > mtcars |> nrow
> >
> > over
> >
> > mtcars |> nrow()
> >
> > and I think the decision to disallow the former should be
> > reconsidered.  The pipe operator is only going to be used when the rhs
> > is a function, so there is no ambiguity with omitting the parentheses.
> > If it's disallowed, it becomes inconsistent with other treatments like
> > sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
> > noise.  I'm not sure why this decision was taken
> >
> > If the only issue is with the double (and triple) colon operator, then
> > ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
> > -- in other words, demote the precedence of |>
> >
> > Obviously (looking at the R-Syntax branch) this decision was
> > considered, put into place, then dropped, but I can't see why
> > precisely.
> >
> > Best,
> >
> >
> > Hugh.
> >
> >
> >
> >
> >
> >
> >
> > On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar  
> > wrote:
> >>
> >> On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch  
> >> wrote:
> >>>
> >>> On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:
> >>>>>   Error: function '::' not supported in RHS call of a pipe
> >>>>
> >>>> To me, this error looks much more friendly than magrittr's error.
> >>>> Some of them got too used to specify functions without (). This
> >>>> is OK until they use `::`, but when they need to use it, it takes
> >>>> hours to figure out why
> >>>>
> >>>> mtcars %>% base::head
> >>>> #> Error in .::base : unused argument (head)
> >>>>
> >>>> won't work but
> >>>>
> >>>> mtcars %>% head
> >>>>
> >>>> works. I think this is a too harsh lesson for ordinary R users to
> >>>> learn `::` is a function. I've been wanting for magrittr to drop the
> >>>> support for a function name without () to avoid this confusion,
> >>>> so I would very much welcome the new pipe operator's behavior.
> >>>> Thank you all the developers who implemented this!
> >>>
> >>> I agree, it's an improvement on the corresponding magrittr error.
> >>>
> >>> I think the semantics of not evaluating the RHS, but treating the pipe
> >>> as purely syntactical is a good decision.
> >>>
> >>> I'm not sure I like the recommended way to pipe into a particular
> >>> argument:
> >>>
> >>>    mtcars |> subset(cyl == 4) |> \(d) lm(mpg ~ disp, data = d)
> >>>
> >>> or
> >>>
> >>>    mtcars |> subset(cyl == 4) |> function(d) lm(mpg ~ disp, data = d)
> >>>
> >>> both of which are equivalent to
> >>>
> >>>    mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = d))()
> >>>
> >>> It's tempting to suggest it should allow something like
> >>>
> >>>    mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = .)
> >>
> >> Which is really not that far off from
> >>
> >> mtcars |> subset(cyl == 4) |> \(.) lm(mpg ~ disp, data = .)
> >>
> >> once you get used to it.
> >>
> >> One consequence of the implementation is that it's not clear how
> >> multiple occurrences of the placeholder would be interpreted. With
> >> magrittr,
> >>
> >> sort(runif(10)) %>% ecdf(.)(.)
> >> ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> >>
> >> This is probably what you would expect, if you expect it to work at all, 
> >> and not
> >>
> >> ecdf(sort(runif(10)))(sort(runif(10)))
> >>
> >> There would be no such ambiguity with anonymous functions
> >>
> >> sort(runif(10)) |> \(.) ecdf(.)(.)
> >>
> >> -Deepayan
> >>
> >>> which would be expanded to something equivalent to the other versions:
> >>> but that makes it quite a bit more complicated.  (Maybe _ or \. should
> >>> be used instead of ., since those are not legal variable names.)
> >>>
> >>> I don't think there should be an attempt to copy magrittr's special
> >>> casing of how . is used in determining whether to also include the
> >>> previous value as first argument.
> >>>
> >>> Duncan Murdoch
> >>>
> >>>
> >>>>
> >>>> Best,
> >>>> Hiroaki Yutani
> >>>>
> >>>> On Fri, Dec 4, 2020 at 20:51, Duncan Murdoch wrote:
> >>>>>
> >>>>> Just saw this on the R-devel news:
> >>>>>
> >>>>>
> >>>>> R now provides a simple native pipe syntax ‘|>’ as well as a shorthand
> >>>>> notation for creating functions,

Re: [Rd] installling R-devel on Windows

2020-12-06 Thread Gabor Grothendieck

I meant on the R devel download page. (I was just installing Rtools40
on another computer.)

On Sun, Dec 6, 2020 at 10:27 AM Gabor Grothendieck
 wrote:
>
> I tried it from another computer and it did work.  Is there some way
> of installing R devel using the analog of the R --vanilla flag so that I can
> do it in a reproducible manner. It seems to remember prior settings
> and maybe that is a problem although one would not expect a setting
> that could lead to what occurs.  I don't see anything documenting flags
> on the Rtools40 page.
>
> On Sat, Dec 5, 2020 at 9:52 AM Gabor Grothendieck
>  wrote:
> >
> > I clicked on the download link at
> > https://cran.r-project.org/bin/windows/base/rdevel.html
> > and then opened the downloaded file which starts the installation process.
> > I specified a new directory that does not exist, R-test, to be sure that
> > it would not get confused with an old directory.
> >
> > I repeated this using different directories and on different days.
> >
> > I tried it from a user and an Admin account.
> >
> > If I use the exact same procedure to install R-4.0.3patched it works.
> >
> > I have successfully downloaded and installed R maybe hundreds
> > of times over the last 10 to 20 years and have never before
> > encountered this.
> >
> >
> >
> >
> >
> > On Sat, Dec 5, 2020 at 9:13 AM Jeroen Ooms  wrote:
> > >
> > > On Sat, Dec 5, 2020 at 3:00 PM Gabor Grothendieck
> > >  wrote:
> > > >
> > > > When I try to install r-devel on Windows all I get is this.  No other
> > > > files.  This also occurred yesterday as well.
> > >
> > > > I just tested it to be sure, but it works fine for me. Are you using
> > > > the official installer from
> > > > https://cran.r-project.org/bin/windows/base/rdevel.html ?
> > > >
> > > > The default install path is not R-test but C:\Program Files\R\R-devel.
> > > > Perhaps you have old files lingering from previous installations that
> > > > cause permission problems during the installation process?



Re: [Rd] installling R-devel on Windows

2020-12-06 Thread Gabor Grothendieck

I tried it from another computer and it did work.  Is there some way
of installing R devel using the analog of the R --vanilla flag so that I can
do it in a reproducible manner. It seems to remember prior settings
and maybe that is a problem although one would not expect a setting
that could lead to what occurs.  I don't see anything documenting flags
on the Rtools40 page.

On Sat, Dec 5, 2020 at 9:52 AM Gabor Grothendieck
 wrote:
>
> I clicked on the download link at
> https://cran.r-project.org/bin/windows/base/rdevel.html
> and then opened the downloaded file which starts the installation process.
> I specified a new directory that does not exist, R-test, to be sure that
> it would not get confused with an old directory.
>
> I repeated this using different directories and on different days.
>
> I tried it from a user and an Admin account.
>
> If I use the exact same procedure to install R-4.0.3patched it works.
>
> I have successfully downloaded and installed R maybe hundreds
> of times over the last 10 to 20 years and have never before
> encountered this.
>
>
>
>
>
> On Sat, Dec 5, 2020 at 9:13 AM Jeroen Ooms  wrote:
> >
> > On Sat, Dec 5, 2020 at 3:00 PM Gabor Grothendieck
> >  wrote:
> > >
> > > When I try to install r-devel on Windows all I get is this.  No other
> > > files.  This also occurred yesterday as well.
> >
> > I just tested it to be sure, but it works fine for me. Are you using
> > the official installer from
> > https://cran.r-project.org/bin/windows/base/rdevel.html ?
> >
> > The default install path is not R-test but C:\Program Files\R\R-devel.
> > Perhaps you have old files lingering from previous installations that
> > cause permission problems during the installation process?



Re: [Rd] as.POSIXct.numeric change default of origin argument

2020-12-06 Thread Gabor Grothendieck

For example, this works:

  library(zoo)
  as.Date(0)
  ## [1] "1970-01-01"
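
For comparison, a sketch of the base R form without zoo, which spells the origin out explicitly (this is the argument whose default is under discussion; the variable names and the UTC time zone are only illustrative):

```r
# Days since the epoch, origin given explicitly.
d <- as.Date(0, origin = "1970-01-01")

# Seconds since the epoch, origin and time zone given explicitly.
p <- as.POSIXct(0, origin = "1970-01-01", tz = "UTC")

print(d)
print(p)
```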

On Sun, Dec 6, 2020 at 7:10 AM Achim Zeileis  wrote:
>
> On Sun, 6 Dec 2020, Jan Gorecki wrote:
>
> > Hello all,
> >
> > I would like to propose to change the default value for "origin"
> > argument in as.POSIXct.numeric method, from current missing to new
> > "1970-01-01".
> > My proposal is motivated by the fact that this is the most commonly
> > needed value for "origin" argument and having that as a default seems
> > reasonable.
> > Proposed change seems to be pretty safe because it is now an error.
>
> I would also be in favor of this (and have been for years), mostly to make
> it consistent with the as.numeric() method. Same for "Date".
>
> To support the latter, the "zoo" package provides a separate as.Date()
> generic that enables the as.Date.numeric() with different default.
>
> The main argument of R Core against it is that it is too uncertain whether
> the origin is really 1970-01-01, e.g., when importing from Excel or SAS.
>
> Best wishes,
> Z



Re: [Rd] New pipe operator

2020-12-05 Thread Gabor Grothendieck

The construct utils::head is not that common, but bare functions are
very common, and making the common case harder to use so that
the uncommon case is slightly easier is not desirable.

Also, it is trivial to write this, which does work:

mtcars %>% (utils::head)
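
A small sketch of the two spellings side by side, assuming R >= 4.1 for the
native pipe and treating magrittr as optional (it is only loaded if
installed). The variable names are illustrative.

```r
# Native pipe: the RHS must be a call, so a namespaced function needs ()
res_native <- mtcars |> utils::head()

# magrittr: a parenthesized RHS is evaluated and then applied as a function
if (requireNamespace("magrittr", quietly = TRUE)) {
  `%>%` <- magrittr::`%>%`
  res_magrittr <- mtcars %>% (utils::head)
  stopifnot(identical(res_magrittr, res_native))
}

identical(res_native, utils::head(mtcars))
```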

On Sat, Dec 5, 2020 at 11:59 AM Hugh Parsonage  wrote:
>
> I'm surprised by the aversion to
>
> mtcars |> nrow
>
> over
>
> mtcars |> nrow()
>
> and I think the decision to disallow the former should be
> reconsidered.  The pipe operator is only going to be used when the rhs
> is a function, so there is no ambiguity with omitting the parentheses.
> If it's disallowed, it becomes inconsistent with other treatments like
> sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
> noise.  I'm not sure why this decision was taken
>
> If the only issue is with the double (and triple) colon operator, then
> ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
> -- in other words, demote the precedence of |>
>
> Obviously (looking at the R-Syntax branch) this decision was
> considered, put into place, then dropped, but I can't see why
> precisely.
>
> Best,
>
>
> Hugh.
>
>
>
>
>
>
>
> On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar  
> wrote:
> >
> > On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch  
> > wrote:
> > >
> > > On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:
> > > >>   Error: function '::' not supported in RHS call of a pipe
> > > >
> > > > To me, this error looks much more friendly than magrittr's error.
> > > > Some of them got too used to specify functions without (). This
> > > > is OK until they use `::`, but when they need to use it, it takes
> > > > hours to figure out why
> > > >
> > > > mtcars %>% base::head
> > > > #> Error in .::base : unused argument (head)
> > > >
> > > > won't work but
> > > >
> > > > mtcars %>% head
> > > >
> > > > works. I think this is a too harsh lesson for ordinary R users to
> > > > learn `::` is a function. I've been wanting for magrittr to drop the
> > > > support for a function name without () to avoid this confusion,
> > > > so I would very much welcome the new pipe operator's behavior.
> > > > Thank you all the developers who implemented this!
> > >
> > > I agree, it's an improvement on the corresponding magrittr error.
> > >
> > > I think the semantics of not evaluating the RHS, but treating the pipe
> > > as purely syntactical is a good decision.
> > >
> > > > I'm not sure I like the recommended way to pipe into a particular
> > > > argument:
> > > >
> > > >    mtcars |> subset(cyl == 4) |> \(d) lm(mpg ~ disp, data = d)
> > > >
> > > > or
> > > >
> > > >    mtcars |> subset(cyl == 4) |> function(d) lm(mpg ~ disp, data = d)
> > > >
> > > > both of which are equivalent to
> > > >
> > > >    mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = d))()
> > > >
> > > > It's tempting to suggest it should allow something like
> > > >
> > > >    mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = .)
> >
> > Which is really not that far off from
> >
> > mtcars |> subset(cyl == 4) |> \(.) lm(mpg ~ disp, data = .)
> >
> > once you get used to it.
> >
> > One consequence of the implementation is that it's not clear how
> > multiple occurrences of the placeholder would be interpreted. With
> > magrittr,
> >
> > sort(runif(10)) %>% ecdf(.)(.)
> > ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> >
> > This is probably what you would expect, if you expect it to work at all, 
> > and not
> >
> > ecdf(sort(runif(10)))(sort(runif(10)))
> >
> > There would be no such ambiguity with anonymous functions
> >
> > sort(runif(10)) |> \(.) ecdf(.)(.)
> >
> > -Deepayan
> >
> > > which would be expanded to something equivalent to the other versions:
> > > but that makes it quite a bit more complicated.  (Maybe _ or \. should
> > > be used instead of ., since those are not legal variable names.)
> > >
> > > I don't think there should be an attempt to copy magrittr's special
> > > casing of how . is used in determining whether to also include the
> > > previous value as first argument.
> > >
> > > Duncan Murdoch
> > >
> > >
> > > >
> > > > Best,
> > > > Hiroaki Yutani
> > > >
> > > > 2020年12月4日(金) 20:51 Duncan Murdoch :
> > > >>
> > > >> Just saw this on the R-devel news:
> > > >>
> > > >>
> > > >> R now provides a simple native pipe syntax ‘|>’ as well as a shorthand
> > > >> notation for creating functions, e.g. ‘\(x) x + 1’ is parsed as
> > > >> ‘function(x) x + 1’. The pipe implementation as a syntax transformation
> > > >> was motivated by suggestions from Jim Hester and Lionel Henry. These
> > > >> features are experimental and may change prior to release.
> > > >>
> > > >>
> > > >> This is a good addition; by using "|>" instead of "%>%" there should be
> > > >> a chance to get operator precedence right.  That said, the ?Syntax help
> > > >> topic hasn't been updated, so I'm not sure where it fits in.
> > > >>
> > > >> There are some choices that take a little getting used to:
> > > >>
> > > >>   > mtcars |> head
> > >

Re: [Rd] installing R-devel on Windows

2020-12-05 Thread Gabor Grothendieck

I clicked on the download link at
https://cran.r-project.org/bin/windows/base/rdevel.html
and then opened the downloaded file which starts the installation process.
I specified a new directory that does not exist, R-test, to be sure that
it would not get confused with an old directory.

I repeated this using different directories and on different days.

I tried it from a user and an Admin account.

If I use the exact same procedure to install R-4.0.3patched it works.

I have successfully downloaded and installed R maybe hundreds
of times over the last 10 to 20 years and have never before
encountered this.

On Sat, Dec 5, 2020 at 9:13 AM Jeroen Ooms  wrote:
>
> On Sat, Dec 5, 2020 at 3:00 PM Gabor Grothendieck
>  wrote:
> >
> > When I try to install r-devel on Windows all I get is this.  No other
> > files.  This also occurred yesterday as well.
>
> I just tested it to be sure, but it works fine for me. Are you using
> the official installer from
> https://cran.r-project.org/bin/windows/base/rdevel.html ?
>
> The default install path is not R-test but C:\Program Files\R\R-devel.
> Perhaps you have old files lingering from previous installations that
> cause permission problems during the installation process?

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] installing R-devel on Windows

2020-12-05 Thread Gabor Grothendieck

When I try to install r-devel on Windows all I get is this.  No other
files.  This also occurred yesterday as well.

 Directory of C:\Program Files\R\R-test

12/05/2020  08:56 AM    <DIR>          .
12/05/2020  08:56 AM    <DIR>          ..
12/05/2020  08:56 AM            11,503 unins000.dat
12/05/2020  08:56 AM         2,594,145 unins000.exe
               2 File(s)      2,605,648 bytes

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [R-pkg-devel] import with except(ion)

2020-10-31 Thread Gabor Grothendieck

coxreg could search for frailty and issue a warning or error if found.  The
function below returns TRUE if frailty is used as a function in the formula
argument and FALSE otherwise.  That would allow implementation of a nicer
message than if it were just reported as a missing function.

find_frailty <- function(e) {
    # only calls (including formulas) can contain a call to frailty()
    if (length(e) > 1) {
        # e is itself a call whose function is frailty
        if (identical(e[[1]], as.name("frailty"))) return(TRUE)
        # otherwise search each component recursively
        for (i in seq_along(e)) if (isTRUE(Recall(e[[i]]))) return(TRUE)
    }
    FALSE
}
find_frailty(frailty ~ frailty)
## [1] FALSE
fo <- Surv(time, status) ~ age + frailty(inst)
find_frailty(fo)
## [1] TRUE
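
As a rough sketch of how such a check might be wired into a modelling
function: check_no_frailty() and its message are hypothetical names made up
for illustration, not part of eha or survival. The formula is only inspected,
never evaluated, so neither package needs to be loaded here.

```r
# Recursive search for a call to frailty() inside an expression or formula
find_frailty <- function(e) {
  if (length(e) > 1) {
    if (identical(e[[1]], as.name("frailty"))) return(TRUE)
    for (i in seq_along(e)) if (isTRUE(find_frailty(e[[i]]))) return(TRUE)
  }
  FALSE
}

# Hypothetical guard that a function like coxreg could call first
check_no_frailty <- function(formula) {
  if (find_frailty(formula)) {
    stop("frailty() terms are not supported by this function", call. = FALSE)
  }
  invisible(formula)
}

ok  <- tryCatch(check_no_frailty(Surv(time, status) ~ age),
                error = function(e) e)
bad <- tryCatch(check_no_frailty(Surv(time, status) ~ age + frailty(inst)),
                error = function(e) e)
conditionMessage(bad)
```

Stopping early keeps the user from getting a fitted model in which frailty(inst) was silently treated as an ordinary covariate.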

On Fri, Oct 30, 2020 at 2:46 PM Göran Broström  wrote:
>
> My CRAN package eha depends on the survival package, and that creates
> problems with innocent users: It is about the 'frailty' function
> (mainly). While (after 'library(eha)')
>
> f1 <- coxph(Surv(time, status) ~ age + frailty(inst), data = lung)
>
> produces what you would expect (a frailty survival analysis), the use of
> the coxreg function from eha
>
> f2 <- coxreg(Surv(time, status) ~ age + frailty(inst), data = lung)
>
> produces (almost) nonsense. That's because the survival::frailty
> function essentially returns its input and coxreg is happy with that,
> treats it as an ordinary numeric (or factor) covariate, and nonsense is
> produced, but some users think otherwise. (Maybe it would be better to
> introduce frailty in a separate argument?)
>
> I want to prevent this to happen, but I do not understand how to do it
> in the best way. I tried to move survival from Depends: to Imports: and
> adding import(survival, except = c(frailty, cluster)) to NAMESPACE. This
> had the side effect that a user must qualify the Surv function by
> survival::Surv, not satisfactory (similarly for other popular functions
> in survival).
>
> Another option I thought of was to define my own Surv function as
> Surv <- survival::Surv in my package, but it doesn't feel right.
> It seems to work, though.
>
> As you may understand from this, I am not very familiar with these
> issues. I have used Depends: survival for a long time and been happy
> with that.
>
> Any help on this is highly appreciated.
>
> Göran
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

Re: [Rd] ftable <-> data.frame etc {was "justify hard coded in format.ftable"}

2020-05-23 Thread Gabor Grothendieck



That's not the problem.  The problem is that if you have

   ft <- ftable(UCBAdmissions, row.vars = 3:2)
   ft
   ##             Admit Admitted Rejected
   ## Dept Gender
   ## A    Male              512      313
   ##      Female             89       19
   ## B    Male              353      207
   ##      Female             17        8
   ## C    Male              120      205
   ## ... etc ...

then as.data.frame(ft) gives a deconstructed 24x4 data.frame
like this:

  as.data.frame(ft)
  ##   Dept Gender    Admit Freq
  ## 1    A   Male Admitted  512
  ## 2    B   Male Admitted  353
  ## 3    C   Male Admitted  120
  ## 4    D   Male Admitted  138
  ## ... etc ...

which is fine but it does not address the problem here.  The
problem here is that we want a usable data.frame
having columns that correspond to ft.  We want this 12x4
data.frame:

  ##   Dept Gender Admitted Rejected
  ## 1    A   Male      512      313
  ## 2    A Female       89       19
  ## 3    B   Male      353      207
  ## 4    B Female       17        8
  ## ... etc ...

The links I provided already pointed to the code below which
someone posted on SO and solves the problem but I would
have thought this would be easy to do in base R and natural
to provide.

  ftable2df <- function(mydata) {
    # coerce to ftable if needed
    if (!inherits(mydata, "ftable")) mydata <- ftable(mydata)
    # one row per combination of the row variables, first variable slowest
    dfrows <- rev(expand.grid(rev(attr(mydata, "row.vars"))))
    dfcols <- as.data.frame.matrix(mydata)
    # column names are the col.vars level combinations pasted with "_"
    names(dfcols) <- do.call(
      paste, c(rev(expand.grid(rev(attr(mydata, "col.vars")))), sep = "_"))
    cbind(dfrows, dfcols)
  }
  ftable2df(ft)
  ##   Dept Gender Admitted Rejected
  ## 1    A   Male      512      313
  ## 2    A Female       89       19
  ## 3    B   Male      353      207
  ## 4    B Female       17        8
  ## ... etc ...
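
For what it is worth, one base R route to the same 12x4 shape goes through
the long form and reshape(). This is only a sketch of an alternative, not the
ftable2df code from the SO answer, and the renaming and sorting steps are my
own choices.

```r
ft <- ftable(UCBAdmissions, row.vars = 3:2)

# Long form: one row per Dept x Gender x Admit cell (24 x 4)
long <- as.data.frame(ft)

# Spread the Admit levels into columns, one row per Dept x Gender
wide <- reshape(long, idvar = c("Dept", "Gender"),
                timevar = "Admit", direction = "wide")

# reshape() names the new columns Freq.<level>; drop the prefix
names(wide) <- sub("^Freq\\.", "", names(wide))

# Order rows as the ftable prints them: Dept slowest, Gender fastest
wide <- wide[order(wide$Dept, wide$Gender), ]
rownames(wide) <- NULL

head(wide, 4)
```

It works, but it takes several steps and a rename, which is rather the point being made about this not being as direct as one would expect.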

On Fri, May 15, 2020 at 12:25 PM Martin Maechler
 wrote:
>
> >>>>> Gabor Grothendieck
> >>>>> on Thu, 14 May 2020 06:56:06 -0400 writes:
>
> > If you are looking at ftable could you also consider adding
> > a way to convert an ftable into a usable data.frame such as
> > the ftable2df function defined here:
>
> > https://stackoverflow.com/questions/11141406/reshaping-an-array-to-data-frame/11143126#11143126
>
> > and there is an example of using it here:
>
> > https://stackoverflow.com/questions/61333663/manipulating-an-array-into-a-data-frame-in-base-r/61334756#61334756
>
> > Being able to move back and forth between various base class representations
> > seems like something that would be natural to provide.
>
> Sure!
>
> But there is already an  as.data.frame() method for "ftable",
> {and I would not want the  if(! .. ftable)  ftable(x)  part anyway}.
>
> What I think many useRs / programmeRs  very often forget about
> is more-than-2-dimensional arrays {which *are* at the beginning
> of that SO question} and that these are often by far the most
> efficient data structure (rather than the corresponding data frames).
>
> and even fewer people remember that a "table" in base R is just a
> special case of a 1-D, 2-D, 3-D, ... array.
> (Semantically a special case: "array" with non-negative integer content)
>
> I'd claim that everything you ask for here ("move back and forth between
> ...") is already there in the "ftable" implementation in stats,
> notably in the source file  src/library/stats/R/ftable.R
>  -> https://svn.r-project.org/R/trunk/src/library/stats/R/ftable.R
>
> The problem may be in
>
> 1) too sparse documentation about the close relations
>"ftable" <-> "array" <-> "table" <-> "data.frame"
>
> 2) people not thinking often enough about more-than-2D-arrays and the
>   special corresponding class "table" in R.
>
> To start with one:
>
> > str(UCBAdmissions)
>  'table' num [1:2, 1:2, 1:6] 512 313 89 19 353 207 17 8 120 205 ...
>  - attr(*, "dimnames")=List of 3
>   ..$ Admit : chr [1:2] "Admitted" "Rejected"
>   ..$ Gender: chr [1:2] "Male" "Female"
>   ..$ Dept  : chr [1:6] "A" "B" "C" "D" ...
> >
>
> and look at the *examples* in the help files and the S3 methods
>
> methods(class = "ftable")
> [1] as.data.frame as.matrix     as.table      format        head          print
> [7] tail
> see '?methods' for accessing help and source code
> > methods(class = "table")
>  [1] [             aperm         as.data.frame Axis          coerce        initialize
>  [7] lines         plot          points        print         show          sl

Re: [Rd] justify hard coded in format.ftable

2020-05-15 Thread Gabor Grothendieck



If you are looking at ftable could you also consider adding
a way to convert an ftable into a usable data.frame such as
the ftable2df function defined here:

https://stackoverflow.com/questions/11141406/reshaping-an-array-to-data-frame/11143126#11143126

and there is an example of using it here:

https://stackoverflow.com/questions/61333663/manipulating-an-array-into-a-data-frame-in-base-r/61334756#61334756

Being able to move back and forth between various base class representations
seems like something that would be natural to provide.

Thanks.

On Thu, May 14, 2020 at 5:32 AM Martin Maechler
 wrote:
>
> > SOEIRO Thomas
> > on Wed, 13 May 2020 20:27:15 + writes:
>
> > Dear all,
> > I haven't received any feedback so far on my proposal to make "justify"
> > argument available in stats:::format.ftable
>
> > Is this list the appropriate place for this kind of proposal?
>
> Yes, it is.. Actually such a post is even a "role model" post
> for R-devel.
>
> > I hope this follow-up to my message won't be taken as rude. Of course
> > it's not meant to be, but I'm not used to the R mailing lists...
>
> well, there could be said much, and many stories told here ... ;-)
>
> > Thank you in advance for your comments,
>
> > Best,
> > Thomas
>
> The main reasons for "no reaction" (for such nice post) probably
> are combination of the following
>
> - we are busy
> - if we have time, we think other things are more exciting
> - we have not used ftable much/at all and are not interested.
>
> Even though the first 2 apply to me, I'll have a 2nd look into
> your post now, and may end up well agreeing with your proposal.
>
> Martin Maechler
> ETH Zurich  and  R Core team
>
>
>
>
> >> Dear all,
> >>
> >> justify argument is hard coded in format.ftable:
> >>
> >> cbind(apply(LABS, 2L, format, justify = "left"),
> >> apply(DATA, 2L, format, justify = "right"))
> >>
> >> It would be useful to have the possibility to modify the argument
> >> between c("left", "right", "centre", "none") as in format.default.
> >>
> >> The lines could be changed to:
> >>
> >> if(length(justify) != 2)
> >> stop("justify must be length 2")
> >> cbind(apply(LABS, 2L, format, justify = justify[1]),
> >> apply(DATA, 2L, format, justify = justify[2]))
> >>
> >> The argument justify could default to c("left", "right") for backward
> >> compatibility.
> >>
> >> It could then allow:
> >> ftab <- ftable(wool + tension ~ breaks, warpbreaks)
> >> format.ftable(ftab, justify = c("none", "none"))
> >>
> >> Best regards,
> >>
> >> Thomas
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] justify hard coded in format.ftable

2020-05-15 Thread Gabor Grothendieck



One can use as.data.frame(as.matrix(tab)) to avoid calling
as.data.frame.matrix directly
(although I find I do use as.data.frame.matrix anyways sometimes even
though it is generally
better to call the generic.).

Also note that the various as.data.frame methods do not address the examples
in the SO links I posted which is why I mentioned it.
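
A short sketch of what these conversions return, using the datasets already
mentioned in this thread. One caveat that I believe holds but is worth
checking on your own R version: for a plain 2-D "table" (as opposed to an
"ftable"), as.matrix() returns the object with its class intact, so the
generic still dispatches to the table method; unclass() is used below as one
way around that.

```r
# ftable: as.matrix() has an ftable method, so the generic route gives wide form
ft <- ftable(UCBAdmissions, row.vars = 3:2)
wide_ft <- as.data.frame(as.matrix(ft))      # 12 x 2, row labels in rownames

# plain table: as.data.frame() reshapes to long form
tab <- table(warpbreaks$wool, warpbreaks$tension)
long_tab <- as.data.frame(tab)               # 6 x 3: Var1, Var2, Freq

# as.matrix() leaves a 2-D table as a "table", so drop the class instead
still_table <- inherits(as.matrix(tab), "table")
wide_tab <- as.data.frame(unclass(tab))      # 2 x 3: columns L, M, H

dim(wide_ft); dim(long_tab); dim(wide_tab)
```

Note that wide_ft keeps the Dept and Gender labels only as pasted row names, not as columns, which is why it does not cover the SO examples.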

On Thu, May 14, 2020 at 9:22 AM SOEIRO Thomas  wrote:
>
> Thanks for the links. I agree that such a feature would be a nice addition, 
> and could make ftable even more useful.
>
> In the same spirit, I think it could be useful to mention the undocumented 
> base::as.data.frame.matrix function in documentation of table and xtabs (in 
> addition to the already mentioned base::as.data.frame.table). The conversion 
> from ftable/table/xtabs to data.frame is a common task that some users seem 
> to struggle with 
> (https://stackoverflow.com/questions/10758961/how-to-convert-a-table-to-a-data-frame).
>
> tab <- table(warpbreaks$wool, warpbreaks$tension)
> as.data.frame(tab) # reshaped table
> as.data.frame.matrix(tab) # non-reshaped table
>
> To sum up, for the sake of clarity, these proposals address two different 
> topics:
> - The justify argument would reduce the need to reformat the exported ftable
> - An ftable2df-like function (and the mention of as.data.frame.matrix in the 
> documentation) would facilitate the reuse of ftable results for further 
> analysis.
>
> Thank you very much,
>
> Thomas
>
> > If you are looking at ftable could you also consider adding a way to 
> > convert an ftable into a usable data.frame such as the ftable2df function 
> > defined here:
> >
> > https://stackoverflow.com/questions/11141406/reshaping-an-array-to-data-frame/11143126#11143126
> >
> > and there is an example of using it here:
> >
> > https://stackoverflow.com/questions/61333663/manipulating-an-array-into-a-data-frame-in-base-r/61334756#61334756
> >
> > Being able to move back and forth between various base class 
> > representations seems like something that would be natural to provide.
> >
> > Thanks.
> >
> > On Thu, May 14, 2020 at 5:32 AM Martin Maechler 
> >  wrote:
> >>
> >>> SOEIRO Thomas
> >>> on Wed, 13 May 2020 20:27:15 + writes:
> >>
> >>> Dear all,
> >>> I haven't received any feedback so far on my proposal to make
> >> "justify" argument available in stats:::format.ftable
> >>
> >>> Is this list the appropriate place for this kind of proposal?
> >>
> >> Yes, it is.. Actually such a post is even a "role model" post for
> >> R-devel.
> >>
> >>> I hope this follow-up to my message won't be taken as rude. Of course 
> >>> it's not meant to be, but I'm not used to the R mailing lists...
> >>
> >> well, there could be said much, and many stories told here ... ;-)
> >>
> >>> Thank you in advance for your comments,
> >>
> >>> Best,
> >>> Thomas
> >>
> >> The main reasons for "no reaction" (for such nice post) probably are
> >> combination of the following
> >>
> >> - we are busy
> >> - if we have time, we think other things are more exciting
> >> - we have not used ftable much/at all and are not interested.
> >>
> >> Even though the first 2 apply to me, I'll have a 2nd look into your
> >> post now, and may end up well agreeing with your proposal.
> >>
> >> Martin Maechler
> >> ETH Zurich  and  R Core team
> >>
> >>
> >>
> >>
> >>>> Dear all,
> >>>>
> >>>> justify argument is hard coded in format.ftable:
> >>>>
> >>>> cbind(apply(LABS, 2L, format, justify = "left"),
> >>>> apply(DATA, 2L, format, justify = "right"))
> >>>>
> >>>> It would be useful to have the possibility to modify the argument
> >>>> between c("left", "right", "centre", "none") as in format.default.
> >>>>
> >>>> The lines could be changed to:
> >>>>
> >>>> if(length(justify) != 2)
> >>>> stop("justify must be length 2")
> >>>> cbind(apply(LABS, 2L, format, justify = justify[1]),
> >>>> apply(DATA, 2L, format, justify = justify[2]))
> >>>>
> >>>> The argument justify could defaults to c("left", "right") for backward
> >>>> compatibility.
> >>>>
> >>>> It could then allow:
> >>>> ftab <- ftable(wool + tension ~ breaks, warpbreaks)
> >>>> format.ftable(ftab, justify = c("none", "none"))
> >>>>
> >>>> Best regards,
> >>>>
> >>>> Thomas



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] suggestion: "." in [lsv]apply()

2020-04-20 Thread Gabor Grothendieck

I wouldn't drive my choices using unlikely edge cases
but rather focus on the vast majority of practical cases.

The popularity of tidyverse shows that this philosophy
works well from a user's perspective.

For the vast majority of practical cases it works well, and for the
others you can either use function as usual or do it like this:

lapply(quote(a + b), fn$identity(~ as.character(x)))

or (we used dot here but you can use any name you like)

. <- fn$identity
lapply(quote(a + b), .(~ as.character(x)))

For the vast majority of practical cases it has the advantages that the function
can be represented more naturally using whatever argument names are most
convenient rather than being forced to use reserved names and it supports
multiple arguments and dot dot dot.

Also it does more than just represent functions. It also interpolates strings
so it can be used for multiple purposes.

library(sqldf)
mytime <- 4
fn$sqldf("select * from BOD where Time < $mytime")


On Mon, Apr 20, 2020 at 9:32 AM Sokol Serguei  wrote:
>
> On 19/04/2020 at 20:46, Gabor Grothendieck wrote:
> > You can get pretty close to that already using fn$ in the gsubfn package:
> >> library(gsubfn)
> >> fn$sapply(split(mtcars, mtcars$cyl), x ~ summary(lm(mpg ~ wt, x))$r.squared)
> > 4 6 8
> > 0.5086326 0.4645102 0.4229655
> Right, I thought about similar syntax but this implementation has
> similar flaws pointed by Simon, i.e. it reduces the domain of valid
> inputs (though not on the same parameters). Take an example:
>
> library(gsubfn)
> fn$sapply(quote(x+y), as.character)
> #Error in lapply(X = X, FUN = FUN, ...) : object 'x' not found
>
> while
>
> sapply(quote(x+y), as.character)
> #[1] "+" "x" "y"
>
> This makes me think that it could be advantageous to replace
> match.fun(FUN) in *apply() family by as.function(FUN) with obvious
> additional methods:
> as.function.character <- function(x) match.fun(x)
> as.function.name <- function(x) match.fun(x)
>
> Such replacement would leave current usage of *apply() as is but at the
> same time would leave enough space for users who want to adapt *apply()
> to their objects like formula or whatever class that is currently not
> convertible to functions by match.fun()
>
> Would it be possible?
>
> Best,
> Serguei.
>
> > It is not specific to sapply but rather fn$ can preface most
> > functions. If the only free variables are the arguments to the
> > function then you can omit the left hand side of the formula, i.e. the
> > arguments to the function are implied by the free variables in the
> > right hand side. Here x is the implied argument to the function
> > because it is a free variable. We did not have to use the name x. Any
> > name could be used. It is the fact that it is a free variable, not its
> > name, that matters.
> >> fn$sapply(split(mtcars, mtcars$cyl), ~ sum(dim(x)))
> > 4 6 8
> > 22 18 25
> > On Fri, Apr 17, 2020 at 4:11 AM Sokol Serguei wrote:
> >> Thanks Simon,
> >>
> >> Now, I see better your argument.
> >>
> >> On 16/04/2020 at 22:48, Simon Urbanek wrote:
> >>> ... I'm not arguing against the principle, I'm arguing about your
> >>> particular proposal as it is inconsistent and not general.
> >> This sounds promising for me. May be in a (new?) future, R core will
> >> come with a correct proposal for this principle? Meanwhile, to avoid
> >> substitute(), I'll look on the side of formula syntax deviation as
> >> your example x ~> i + x suggested.
> >>
> >> Best,
> >> Serguei.
> >>> Personally, I find the current syntax much clearer and readable
> >>> (defining anything by convention like . being the function variable
> >>> seems arbitrary and "dirty" to me), but if you wanted to define a
> >>> shorter syntax, you could use something like x ~> i + x. That said,
> >>> I really don't see the value of not using function(x) [especially
> >>> these days when people are arguing for long variable names with the
> >>> justification that IDEs do all the work anyway], but as I said, my
> >>> argument was against the actual proposal, not general ideas about
> >>> syntax improvement.
> >>> Cheers, Simon
> >>>> On 17/04/2020, at 3:53 AM, Sokol Serguei wrote:
> >>>> Simon,
> >>>> Thanks for replying. In what follows I won't try to
> >>>> argue (I understood that you find this a bad idea) but I would like
> >>>> to make clearer some of your point for me (and may be for others).
> >>>> On 16/04/2020 at 16:48, Simon Urbanek wrote:
> >>>>> Serguei,
> >>>>>> On 17/04/2020, at 2:24 AM,
Re: [Rd] suggestion: "." in [lsv]apply()

2020-04-19 Thread Gabor Grothendieck

You can get pretty close to that already using fn$ in the gsubfn package:

> library(gsubfn)
> fn$sapply(split(mtcars, mtcars$cyl), x ~ summary(lm(mpg ~ wt, x))$r.squared)
4 6 8
0.5086326 0.4645102 0.4229655

It is not specific to sapply but rather fn$ can preface most functions.
If the only free variables are the arguments to the function then you
can omit the left hand side of the formula, i.e. the arguments to the
function are implied by the free variables in the right hand side.  Here
x is the implied argument to the function because it is a free variable.
We did not have to use the name x.  Any name could be used.  It is the
fact that it is a free variable, not its name, that matters.

> fn$sapply(split(mtcars, mtcars$cyl), ~ sum(dim(x)))
 4  6  8
22 18 25
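
For readers without gsubfn, here is a sketch of the same two computations
written with ordinary anonymous functions in base R. The second uses the
\(x) shorthand, which assumes R >= 4.1; the variable names are illustrative.

```r
by_cyl <- split(mtcars, mtcars$cyl)

# Equivalent of: fn$sapply(by_cyl, x ~ summary(lm(mpg ~ wt, x))$r.squared)
r2 <- sapply(by_cyl, function(x) summary(lm(mpg ~ wt, x))$r.squared)

# Equivalent of: fn$sapply(by_cyl, ~ sum(dim(x)))
dims <- sapply(by_cyl, \(x) sum(dim(x)))

r2
dims
```

The formula form saves some typing, while the explicit form makes the argument list visible; both give the values shown above.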

On Fri, Apr 17, 2020 at 4:11 AM Sokol Serguei  wrote:
>
> Thanks Simon,
>
> Now, I see better your argument.
>
> On 16/04/2020 at 22:48, Simon Urbanek wrote:
> > ... I'm not arguing against the principle, I'm arguing about your
> > particular proposal as it is inconsistent and not general.
> This sounds promising for me. May be in a (new?) future, R core will
> come with a correct proposal for this principle?
> Meanwhile, to avoid substitute(), I'll look on the side of formula
> syntax deviation as your example x ~> i + x suggested.
>
> Best,
> Serguei.
>
> > Personally, I find the current syntax much clearer and readable
> > (defining anything by convention like . being the function variable
> > seems arbitrary and "dirty" to me), but if you wanted to define a
> > shorter syntax, you could use something like x ~> i + x. That said, I
> > really don't see the value of not using function(x) [especially these
> > days when people are arguing for long variable names with the
> > justification that IDEs do all the work anyway], but as I said, my
> > argument was against the actual proposal, not general ideas about
> > syntax improvement. Cheers, Simon
> >> On 17/04/2020, at 3:53 AM, Sokol Serguei wrote:
> >> Simon,
> >> Thanks for replying. In what follows I won't try to
> >> argue (I understood that you find this a bad idea) but I would like
> >> to make clearer some of your point for me (and may be for others).
> >> On 16/04/2020 at 16:48, Simon Urbanek wrote:
> >>> Serguei,
> >>>> On 17/04/2020, at 2:24 AM, Sokol Serguei wrote:
> >>>> Hi,
> >>>> I would like to make a suggestion for a small syntactic
> >>>> modification of FUN argument in the family of functions
> >>>> [lsv]apply(). The idea is to allow one-liner expressions without
> >>>> typing "function(item) {...}" to surround them. The argument to the
> >>>> anonymous function is simply referred as ".". Let take an example.
> >>>> With this new feature, the following call
> >>>>
> >>>>   sapply(split(mtcars, mtcars$cyl), function(d) summary(lm(mpg ~ wt, d))$r.squared)
> >>>>   #         4         6         8
> >>>>   # 0.5086326 0.4645102 0.4229655
> >>>>
> >>>> could be rewritten as
> >>>>
> >>>>   sapply(split(mtcars, mtcars$cyl), summary(lm(mpg ~ wt, .))$r.squared)
> >>>>
> >>>> "Not a big saving in typing" you can say but
> >>>> multiplied by the number of [lsv]apply usage and a neater look, I
> >>>> think, the idea merits to be considered.
> >>> It's not in any way "neater", not only is it less readable, it's
> >>> just plain wrong. What if the expression returned a function?
> >> do you mean like in
> >>   l=sapply(1:3, function(i) function(x) i+x)
> >>   l[[1]](3) # 4
> >>   l[[2]](3) # 5
> >> This is indeed a corner case but a pair
> >> of () or {} can keep wsapply() in course:
> >>   l=wsapply(1:3, (function(x) .+x))
> >>   l[[1]](3) # 4
> >>   l[[2]](3) # 5
> >>> How do you know that you don't want to apply the result of the call?
> >> A small example (if it is significantly different from the one above)
> >> would be very helpful for me to understand this point.
> >>> For the same reason the implementation below won't work - very often
> >>> you just pass a symbol that evaluates to a function and always en
> >>> expression that returns a function and there is no way to
> >>> distinguish that from your new proposed syntax.
> >> Even with () or {} around such "dotted" expression? Best, Serguei.
> >>> When you feel compelled to use substitute() you should hear alarm
> >>> bells that something is wrong ;). You can certainly write a new
> >>> function that uses a different syntax (and I'm sure someone has
> >>> already done that in the package space), but what you propose is
> >>> incompatible with *apply in R (and very much not R syntax). Cheers,
> >>> Simon
> >>>> To illustrate a possible implementation, I propose a wrapper
> >>>> example for sapply():
> >>>>
> >>>>   wsapply=function(l, fun, ...) {
> >>>>     s=substitute(fun)
> >>>>     if (is.name(s) || is.call(s) && s[[1]]==as.name("function")) {
> >>>>       sapply(l, fun, ...) # legacy call
> >>>>     } else {
> >>>>       sapply(l, function(d) eval(s, list(.=d)), ...)
> >>>>     }
> >>>>   }
> >>>>
> >>>> Now, we can do:
> >>>>   wsapply(split(mtcars, mtcars$cyl), summary(lm(mpg ~ wt, .))$r.squared)
> >>>> or, traditional way:
> >>>>   wsapply(split(mtcars, mtcars$cyl), function(d) summary(lm(mpg ~ wt,
Re: [Rd] New matrix function

2019-10-11 Thread Gabor Grothendieck

The link you posted used the same inputs as in my example. If that is
not what you meant maybe
a different example is needed.
Regards.

On Fri, Oct 11, 2019 at 2:39 PM Pages, Herve  wrote:
>
> Has someone looked into the image processing area for this? That sounds
> a little bit too high-level for base R to me (and I would be surprised
> if any mainstream programming language had this kind of functionality
> built-in).
>
> H.
>
> On 10/11/19 03:44, Morgan Morgan wrote:
> > Hi All,
> >
> > I was looking for a function to find a small matrix inside a larger matrix
> > in R similar to the one described in the following link:
> >
> > https://www.mathworks.com/matlabcentral/answers/194708-index-a-small-matrix-in-a-larger-matrix
> >
> > I couldn't find anything.
> >
> > The above function can be seen as a "generalisation" of the "which"
> > function as well as the function described in the following post:
> >
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__coolbutuseless.github.io_2018_04_03_finding-2Da-2Dlength-2Dn-2Dneedle-2Din-2Da-2Dhaystack_=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=v96tqHMO3CLNBS7KTmdshM371i6W_v8_2H5bdVy_KHo=qZ3SJ8t8zEDA-em4WT7gBmN66qvvCKKKXRJunoF6P3k=
> >
> > Would it be possible to add such a function to base R?
> >
> > I am happy to work with someone from the R core team (if you wish) and
> > suggest an implementation in C.
> >
> > Thank you
> > Best regards,
> > Morgan
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org mailing list
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_r-2Ddevel=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=v96tqHMO3CLNBS7KTmdshM371i6W_v8_2H5bdVy_KHo=tyVSs9EYVBd_dmVm1LSC23GhUzbBv8ULvtsveo-COoU=
> >
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fredhutch.org
> Phone:  (206) 667-5791
> Fax:(206) 667-1319
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] New matrix function

2019-10-11 Thread Gabor Grothendieck

I pressed return too soon.

If we had such a generalized multiply, written here as %==.&%, then the
search could be expressed as:

   which(embed(A, length(x)) %==.&% rev(x))
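
A rough sketch of what such an operator might look like follows. The
operator %==.&% is hypothetical and not part of base R; it is defined here
only for illustration, with == taking the place of * and all() (a reduction
with &) taking the place of the sum. Note that embed() expects a window
length and lists the most recent value first in each row, which is why the
needle is passed through rev().

```r
# Hypothetical operator (an assumption for illustration, not base R):
# a "matrix times vector" product where '*' becomes '==' and the
# row-wise '+' becomes '&' (implemented via all()).
`%==.&%` <- function(m, v) {
  apply(m, 1, function(row) all(row == v))
}

A <- c(2, 3, 4, 1, 2, 3, 4, 1, 1, 2)
x <- c(1, 2)

# Row i of embed(A, k) holds A[i + k - 1], ..., A[i], so the row index is
# the start position of the window and the needle must be reversed.
hits <- which(embed(A, length(x)) %==.&% rev(x))
hits
```

With the inputs from the earlier example this should agree with the zoo and
gregexpr one-liners.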

On Fri, Oct 11, 2019 at 10:57 AM Gabor Grothendieck
 wrote:
>
> Also note that the functionality discussed could be regarded as a generalization
> of matrix multiplication where *  and + are general functions and in this case
> we have * replaced by == and + replaced by &.
>
> On Fri, Oct 11, 2019 at 10:46 AM Gabor Grothendieck
>  wrote:
> >
> > Using the example in the link here are two one-liners:
> >
> >   A <- c(2,3,4,1,2,3,4,1,1,2)
> >   x <- c(1,2)
> >
> >   # 1 - zoo
> >   library(zoo)
> >   which( rollapply(A, length(x), identical, x, fill = FALSE, align = "left") )
> >   ## [1] 4 9
> >
> >   # 2 - Base R using conversion to character
> >   gregexpr(paste(x, collapse = ""), paste(A, collapse = ""))[[1]]
> >   ## [1] 4 9
> >   ...snip ...
> >
> > On Fri, Oct 11, 2019 at 3:45 AM Morgan Morgan  
> > wrote:
> > >
> > > Hi All,
> > >
> > > I was looking for a function to find a small matrix inside a larger matrix
> > > in R similar to the one described in the following link:
> > >
> > > https://www.mathworks.com/matlabcentral/answers/194708-index-a-small-matrix-in-a-larger-matrix
> > >
> > > I couldn't find anything.
> > >
> > > The above function can be seen as a "generalisation" of the "which"
> > > function as well as the function described in the following post:
> > >
> > > https://coolbutuseless.github.io/2018/04/03/finding-a-length-n-needle-in-a-haystack/
> > >
> > > Would it be possible to add such a function to base R?
> > >
> > > I am happy to work with someone from the R core team (if you wish) and
> > > suggest an implementation in C.
> > >
> > > Thank you
> > > Best regards,
> > > Morgan
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] New matrix function

2019-10-11 Thread Gabor Grothendieck

Also note that the functionality discussed could be regarded as a generalization
of matrix multiplication where *  and + are general functions and in this case
we have * replaced by == and + replaced by &.

On Fri, Oct 11, 2019 at 10:46 AM Gabor Grothendieck
 wrote:
>
> Using the example in the link here are two one-liners:
>
>   A <- c(2,3,4,1,2,3,4,1,1,2)
>   x <- c(1,2)
>
>   # 1 - zoo
>   library(zoo)
>   which( rollapply(A, length(x), identical, x, fill = FALSE, align = "left") )
>   ## [1] 4 9
>
>   # 2 - Base R using conversion to character
>   gregexpr(paste(x, collapse = ""), paste(A, collapse = ""))[[1]]
>   ## [1] 4 9
>   ...snip ...
>
> On Fri, Oct 11, 2019 at 3:45 AM Morgan Morgan  
> wrote:
> >
> > Hi All,
> >
> > I was looking for a function to find a small matrix inside a larger matrix
> > in R similar to the one described in the following link:
> >
> > https://www.mathworks.com/matlabcentral/answers/194708-index-a-small-matrix-in-a-larger-matrix
> >
> > I couldn't find anything.
> >
> > The above function can be seen as a "generalisation" of the "which"
> > function as well as the function described in the following post:
> >
> > https://coolbutuseless.github.io/2018/04/03/finding-a-length-n-needle-in-a-haystack/
> >
> > Would it be possible to add such a function to base R?
> >
> > I am happy to work with someone from the R core team (if you wish) and
> > suggest an implementation in C.
> >
> > Thank you
> > Best regards,
> > Morgan
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] New matrix function

2019-10-11 Thread Gabor Grothendieck

Using the example in the link here are two one-liners:

  A <- c(2,3,4,1,2,3,4,1,1,2)
  x <- c(1,2)

  # 1 - zoo
  library(zoo)
  which( rollapply(A, length(x), identical, x, fill = FALSE, align = "left") )
  ## [1] 4 9

  # 2 - Base R using conversion to character
  gregexpr(paste(x, collapse = ""), paste(A, collapse = ""))[[1]]
  ## [1] 4 9
  ...snip ...
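
The one-liners above cover the vector case. For the matrix case asked about
in the original post, a brute-force sketch in base R might look like the
following. The name find_submatrix is made up for this example, and checking
every possible top-left corner is only one assumption about what is wanted;
it is not meant as an efficient implementation.

```r
# Hypothetical helper (not from base R or any package): return the
# top-left (row, col) positions at which 'small' occurs inside 'big'.
find_submatrix <- function(big, small) {
  nr <- nrow(small)
  nc <- ncol(small)
  empty <- data.frame(row = integer(0), col = integer(0))
  # A needle larger than the haystack cannot match anywhere.
  if (nr > nrow(big) || nc > ncol(big)) return(empty)

  # All candidate top-left corners.
  corners <- expand.grid(row = seq_len(nrow(big) - nr + 1),
                         col = seq_len(ncol(big) - nc + 1))

  # Compare the window at each corner with the needle, element by element.
  keep <- mapply(function(i, j) {
    all(big[i:(i + nr - 1), j:(j + nc - 1), drop = FALSE] == small)
  }, corners$row, corners$col)

  corners[keep, , drop = FALSE]
}

big <- matrix(1:20, nrow = 4)
small <- big[2:3, 3:4]
res <- find_submatrix(big, small)
res
```

Since all entries of big are distinct here, the only match is the corner the
needle was cut from, row 2 and column 3.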

On Fri, Oct 11, 2019 at 3:45 AM Morgan Morgan  wrote:
>
> Hi All,
>
> I was looking for a function to find a small matrix inside a larger matrix
> in R similar to the one described in the following link:
>
> https://www.mathworks.com/matlabcentral/answers/194708-index-a-small-matrix-in-a-larger-matrix
>
> I couldn't find anything.
>
> The above function can be seen as a "generalisation" of the "which"
> function as well as the function described in the following post:
>
> https://coolbutuseless.github.io/2018/04/03/finding-a-length-n-needle-in-a-haystack/
>
> Would it be possible to add such a function to base R?
>
> I am happy to work with someone from the R core team (if you wish) and
> suggest an implementation in C.
>
> Thank you
> Best regards,
> Morgan
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Recall

2018-09-23 Thread Gabor Grothendieck

Please ignore. Looking at this again I realize the problem is that
Recall is not directly within my.compose2 but rather is within the
anonymous function in the else branch, so it re-calls that anonymous
function instead of my.compose2.
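
One possible way around this, as a sketch only: evaluate Recall(...) in the
frame of the outer function and capture the result, so that the anonymous
function never calls Recall itself. The name my.compose3 is made up for this
example.

```r
# Sketch: Recall() is evaluated while my.compose3 itself is running, so it
# re-calls my.compose3 rather than the anonymous function(x).
my.compose3 <- function(f, ...) {
  if (missing(f)) identity
  else {
    g <- Recall(...)        # composition of the remaining functions
    function(x) f(g(x))
  }
}

result <- my.compose3(sin, cos, tan)(pi/4)
result
```

This should give the same value as sin(cos(tan(pi/4))).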
On Sun, Sep 23, 2018 at 9:23 AM Gabor Grothendieck
 wrote:
>
> This works:
>
>   my.compose <- function(f, ...) {
>     if (missing(f)) identity
>     else function(x) f(my.compose(...)(x))
>   }
>
>   my.compose(sin, cos, tan)(pi/4)
>   ## [1] 0.5143953
>
>   sin(cos(tan(pi/4)))
>   ## [1] 0.5143953
>
> But replacing my.compose with Recall in the else causes it to fail:
>
>   my.compose2 <- function(f, ...) {
>     if (missing(f)) identity
>     else function(x) f(Recall(...)(x))
>   }
>
>   my.compose2(sin, cos, tan)(pi/4)
>   ## Error in my.compose2(sin, cos, tan)(pi/4) : unused argument (tan)
>
> Seems like a bug in R.
>
> This is taken from:
> https://stackoverflow.com/questions/52463170/a-recursive-compose-function-in-r
>



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Recall

2018-09-23 Thread Gabor Grothendieck

This works:

  my.compose <- function(f, ...) {
    if (missing(f)) identity
    else function(x) f(my.compose(...)(x))
  }

  my.compose(sin, cos, tan)(pi/4)
  ## [1] 0.5143953

  sin(cos(tan(pi/4)))
  ## [1] 0.5143953

But replacing my.compose with Recall in the else causes it to fail:

  my.compose2 <- function(f, ...) {
    if (missing(f)) identity
    else function(x) f(Recall(...)(x))
  }

  my.compose2(sin, cos, tan)(pi/4)
  ## Error in my.compose2(sin, cos, tan)(pi/4) : unused argument (tan)

Seems like a bug in R.

This is taken from:
https://stackoverflow.com/questions/52463170/a-recursive-compose-function-in-r

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] apply with zero-row matrix

2018-07-30 Thread Gabor Grothendieck

Try pmap and related functions in purrr:

  library(purrr)
  pmap(as.data.frame(m), ~ { cat("Called...\n"); print(c(...)) })
  ## list()
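
If adding a package is not desired, a base R sketch along the same lines is
to loop over the row indices explicitly; with zero rows seq_len() is empty
and the function is never called. The counter calls is only there to make
that visible and is not part of any API.

```r
m <- matrix(NA, 0, 5)

calls <- 0L   # counts how often the function body runs (illustration only)
res <- lapply(seq_len(nrow(m)), function(i) {
  calls <<- calls + 1L
  m[i, ]
})

length(res)   # no rows, so no results
calls         # and the function was never invoked
```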

On Mon, Jul 30, 2018 at 12:33 AM, David Hugh-Jones
 wrote:
> Forgive me if this has been asked many times before, but I couldn't find
> anything on the mailing lists.
>
> I'd expect apply(m, 1, foo) not to call `foo` if m is a matrix with zero
> rows.
> In fact:
>
> m <- matrix(NA, 0, 5)
> apply(m, 1, function (x) {cat("Called...\n"); print(x)})
> ## Called...
> ## [1] FALSE FALSE FALSE FALSE FALSE
>
> Similarly for apply(m, 2,...) if m has no columns.
> Is there a reason for this? Could it be documented?
>
> David
> --
> Sent from Gmail Mobile
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] odd behavior of names

2018-07-29 Thread Gabor Grothendieck

The first component name has backticks around it and the second does
not. Though not wrong, it seems inconsistent.

list(a = 1, b = 2)
## $`a`
## [1] 1
##
## $b
## [1] 2

R.version.string
## [1] "R version 3.5.1 Patched (2018-07-02 r74950)"



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] oddity in transform

2018-07-24 Thread Gabor Grothendieck

The idea is that one wants to write the line of code below in a general
way that works the same whether ix specifies one column or several.
However, the naming changes entirely between the two cases, and BOD[, 1],
transform(BOD, X=..., Y=...) or other hard-coded solutions still require
writing multiple cases.

ix <- 1:2
transform(BOD, X = BOD[ix] * seq(6))
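
As a sketch of one workaround that avoids transform() altogether, the new
columns can be named explicitly before binding them on, so the result does
not depend on how many columns ix selects. The helper name scaled_cols and
the "X." prefix are assumptions made for this example.

```r
# Hypothetical helper: multiply the selected columns by 1..nrow and attach
# them with a fixed prefix, regardless of how many columns are selected.
scaled_cols <- function(data, ix, prefix = "X.") {
  new_cols <- data[ix] * seq_len(nrow(data))   # still a data frame
  names(new_cols) <- paste0(prefix, names(new_cols))
  cbind(data, new_cols)
}

names(scaled_cols(BOD, 1))
names(scaled_cols(BOD, 1:2))
```

Both calls should yield X.-prefixed names (X.Time, and X.Time plus X.demand).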



On Tue, Jul 24, 2018 at 7:14 AM, Emil Bode  wrote:
> I think you meant to call BOD[,1]
> From ?transform, the ... arguments are supposed to be vectors, and BOD[1] is 
> still a data.frame (with one column). So I don't think it's surprising 
> transform gets confused by which name to use (X, or Time?), and kind of 
> compromises on the name "Time". It's also in a note in ?transform: "If some 
> of the values are not vectors of the appropriate length, you deserve whatever 
> you get!"
> And if you want to do it with multiple extra columns (and are not satisfied 
> with these labels), I think the proper way to go would be " transform(BOD, 
> X=BOD[,1]*seq(6), Y=BOD[,2]*seq(6))"
>
> If you want to trace it back further, it's not in transform but in 
> data.frame. Column-names are prepended with a higher-level name if the object 
> has more than one column.
> And it uses the tag-name if simply supplied with a vector:
> data.frame(BOD[1:2], X=BOD[1]*seq(6)) takes the name of the only column of 
> BOD[1], Time. Only because that column name is already present, it's changed 
> to Time.1
> data.frame(BOD[1:2], X=BOD[,1]*seq(6)) gives third column-name X (as X is now 
> a vector)
> data.frame(BOD[1:2], X=BOD[1:2]*seq(6)) or with BOD[,1:2] gives columns names 
> X.Time and X.demand, to show these (multiple) columns are coming from X
>
> So I don't think there's much to fix here. In this case having X.Time in all 
> cases would have been better, but in general the column-naming of data.frame 
> works, changing it would likely cause a lot of problems.
> You can always change the column-names later.
>
> Best regards,
> Emil Bode
>
> Data-analyst
>
> +31 6 43 83 89 33
> emil.b...@dans.knaw.nl
>
> DANS: Netherlands Institute for Permanent Access to Digital Research Resources
> Anna van Saksenlaan 51 | 2593 HW Den Haag | +31 70 349 44 50 | 
> i...@dans.knaw.nl <mailto:i...@dans.kn> | dans.knaw.nl 
> 
> DANS is an institute of the Dutch Academy KNAW <http://knaw.nl/nl> and 
> funding organisation NWO <http://www.nwo.nl/>.
>
> On 23/07/2018, 16:52, "R-devel on behalf of Gabor Grothendieck" 
>  wrote:
>
> Note the inconsistency in the names in these two examples.  X.Time in
> the first case and Time.1 in the second case.
>
>   > transform(BOD, X = BOD[1:2] * seq(6))
>     Time demand X.Time X.demand
>   1    1    8.3      1      8.3
>   2    2   10.3      4     20.6
>   3    3   19.0      9     57.0
>   4    4   16.0     16     64.0
>   5    5   15.6     25     78.0
>   6    7   19.8     42    118.8
>
>   > transform(BOD, X = BOD[1] * seq(6))
>     Time demand Time.1
>   1    1    8.3      1
>   2    2   10.3      4
>   3    3   19.0      9
>   4    4   16.0     16
>   5    5   15.6     25
>   6    7   19.8     42
>



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] oddity in transform

2018-07-23 Thread Gabor Grothendieck

Note the inconsistency in the names in these two examples.  X.Time in
the first case and Time.1 in the second case.

  > transform(BOD, X = BOD[1:2] * seq(6))
    Time demand X.Time X.demand
  1    1    8.3      1      8.3
  2    2   10.3      4     20.6
  3    3   19.0      9     57.0
  4    4   16.0     16     64.0
  5    5   15.6     25     78.0
  6    7   19.8     42    118.8

  > transform(BOD, X = BOD[1] * seq(6))
    Time demand Time.1
  1    1    8.3      1
  2    2   10.3      4
  3    3   19.0      9
  4    4   16.0     16
  5    5   15.6     25
  6    7   19.8     42

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] A few suggestions and perspectives from a PhD student

2017-05-05 Thread Gabor Grothendieck

Regarding the anonymous-function-in-a-pipeline point one can already
do this which does use brackets but even so it involves fewer
characters than the example shown.  Here { . * 2 } is basically a
lambda whose argument is dot. Would this be sufficient?

  library(magrittr)

  1.5 %>% { . * 2 }
  ## [1] 3

Regarding currying note that with magrittr Ista's code could be written as:

  1:5 %>% lapply(foo, y = 3)

or at the expense of slightly more verbosity:

  1:5 %>% Map(f = . %>% foo(y = 3))
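
For comparison, partial application can also be sketched in base R with a
small closure. The helper partial below is hypothetical (it is not a base R
or magrittr function), and foo is assumed to be the two-argument function
from the quoted message.

```r
foo <- function(x, y) x + y   # as in the quoted message

# Hypothetical helper: fix some arguments now, supply the rest later.
partial <- function(f, ...) {
  fixed <- list(...)
  function(...) do.call(f, c(fixed, list(...)))
}

add3 <- partial(foo, x = 3)   # a function of the remaining argument y
out <- sapply(1:5, add3)
out
```

Each element of 1:5 is passed as y, so the result is 3 added to each value.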


On Fri, May 5, 2017 at 1:00 PM, Antonin Klima  wrote:
> Dear Sir or Madam,
>
> I am in 2nd year of my PhD in bioinformatics, after taking my Master’s in 
> computer science, and have been using R heavily during my PhD. As such, I 
> have put together a list of certain features in R that, in my opinion, would 
> be beneficial to add, or could be improved. The first two are already 
> implemented in packages, but given that it is implemented as user-defined 
> operators, it greatly restricts its usefulness. I hope you will find my 
> suggestions interesting. If you find time, I will welcome any feedback as to 
> whether you find the suggestions useful, or why you do not think they should 
> be implemented. I will also welcome if you enlighten me with any features I 
> might be unaware of, that might solve the issues I have pointed out below.
>
> 1) piping
> Currently available in package magrittr, piping makes the code better 
> readable by having the line start at its natural starting point, and 
> following with functions that are applied - in order. The readability of 
> several nested calls with a number of parameters each is almost zero, it’s 
> almost as if one would need to come up with the solution himself. Pipeline in 
> comparison is very straightforward, especially together with the point (2).
>
> The package here works rather good nevertheless, the shortcomings of piping 
> not being native are not quite as severe as in point (2). Nevertheless, an 
> intuitive symbol such as | would be helpful, and it sometimes bothers me that 
> I have to parenthesize anonymous function, which would probably not be 
> required in a native pipe-operator, much like it is not required in f.ex. 
> lapply. That is,
> 1:5 %>% function(x) x+2
> should be totally fine
>
> 2) currying
> Currently available in package Curry. The idea is that, having a function 
> such as foo = function(x, y) x+y, one would like to write for example 
> lapply(foo(3), 1:5), and have the interpreter figure out ok, foo(3) does not 
> make a value result, but it can still give a function result - a function of 
> y. This would be indeed most useful for various apply functions, rather than 
> writing function(x) foo(3,x).
>
> I suggest that currying would make the code easier to write, and more 
> readable, especially when using apply functions. One might imagine that there 
> could be some confusion with such a feature, especially from people 
> unfamiliar with functional programming, although R already does take function 
> as first-order arguments, so it could be just fine. But one could address it 
> with special syntax, such as $foo(3) [$foo(x=3)] for partial application.  
> The current currying package has very limited usefulness, as, being limited 
> by the user-defined operator framework, it only rarely can contribute to less 
> code/more readability. Compare yourself:
> $foo(x=3) vs foo %<% 3
> goo = function(a,b,c)
> $goo(b=3) vs goo %><% list(b=3)
>
> Moreover, one would often like currying to have highest priority. For 
> example, when piping:
> data %>% foo %>% foo1 %<% 3
> if one wants to do data %>% foo %>% $foo(x=3)
>
> 3) Code executable only when running the script itself
> Whereas the first two suggestions are somewhat stealing from Haskell and the 
> like, this suggestion would be stealing from Python. I’m building quite a 
> complicated pipeline, using S4 classes. After defining the class and its 
> methods, I also define how to build the class to my likings, based on my 
> input data, using various now-defined methods. So I end up having a list of 
> command line arguments to process, and the way to create the class instance 
> based on them. If I write it to the class file, however, I end up running the 
> code when it is sourced from the next step in the pipeline, that needs the 
> previous class definitions.
>
> A feature such as pythonic “if __name__ == __main__” would thus be useful. As 
> it is, I had to create run scripts as separate files. Which is actually not 
> so terrible, given the class and its methods often span a few hundred lines, 
> but still.
>
> 4) non-exported global variables
> I also find it lacking, that I seem to be unable to create constants that 
> would not get passed to files that source the class definition. That is, if 
> class1 features global constant CONSTANT=3, then if class2 sources class1, it 
> will also include the constant. This 1) clutters the namespace when running 
>

Re: [Rd] RFC: tapply(*, ..., init.value = NA)

2017-01-27 Thread Gabor Grothendieck

If xtabs is enhanced then as.data.frame.table may also need to be
modified so that it continues to be usable as an inverse, at least to
the degree feasible.
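
To illustrate what "inverse" means here, a small sketch with made-up data:
as.data.frame() on the table gives back a long data frame, and feeding that
to xtabs() again reproduces the table. The data frame and its column names
are invented for the example.

```r
df <- data.frame(g = c("a", "a", "b"),
                 h = c("x", "y", "y"),
                 n = c(1, 2, 3))

tab  <- xtabs(n ~ g + h, df)                    # sums of n by g and h
back <- as.data.frame(tab, responseName = "n")  # long form, one row per cell
tab2 <- xtabs(n ~ g + h, back)                  # round trip

tab
tab2
```

The empty (b, x) cell comes back as an explicit row with value 0, which is
exactly the kind of case where a change in how xtabs treats NA or missing
combinations would need matching treatment in as.data.frame.table.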


On Thu, Jan 26, 2017 at 5:42 AM, Martin Maechler
 wrote:
> Last week, we've talked here about "xtabs(), factors and NAs",
>  ->  https://stat.ethz.ch/pipermail/r-devel/2017-January/073621.html
>
> In the mean time, I've spent several hours on the issue
> and also committed changes to R-devel "in two iterations".
>
> In the case there is a *Left* hand side part to xtabs() formula,
> see the help page example using 'esoph',
> it uses  tapply(...,  FUN = sum)   and
> I now think there is a missing feature in tapply() there, which
> I am proposing to change.
>
> Look at a small example:
>
>> D2 <- data.frame(n = gl(3,4), L = gl(6,2, labels=LETTERS[1:6]), 
>> N=3)[-c(1,5), ]; xtabs(~., D2)
> , , N = 3
>
>    L
> n   A B C D E F
>   1 1 2 0 0 0 0
>   2 0 0 1 2 0 0
>   3 0 0 0 0 2 2
>
>> DN <- D2; DN[1,"N"] <- NA; DN
>    n L  N
> 2  1 A NA
> 3  1 B  3
> 4  1 B  3
> 6  2 C  3
> 7  2 D  3
> 8  2 D  3
> 9  3 E  3
> 10 3 E  3
> 11 3 F  3
> 12 3 F  3
>> with(DN, tapply(N, list(n,L), FUN=sum))
>    A  B  C  D  E  F
> 1 NA  6 NA NA NA NA
> 2 NA NA  3  6 NA NA
> 3 NA NA NA NA  6  6
>>
>
> and as you can see, the resulting matrix has NAs, all the same
> NA_real_, but semantically of two different kinds:
>
> 1) at ["1", "A"], the  NA  comes from the NA in 'N'
> 2) all other NAs come from the fact that there is no such factor combination
>*and* from the fact that tapply() uses
>
>array(dim = .., dimnames = ...)
>
> i.e., initializes the array with NAs  (see definition of 'array').
>
> My proposition is the following patch to  tapply(), adding a new
> option 'init.value':
>
> -
>
> -tapply <- function (X, INDEX, FUN = NULL, ..., simplify = TRUE)
> +tapply <- function (X, INDEX, FUN = NULL, ..., init.value = NA, simplify = TRUE)
>  {
>  FUN <- if (!is.null(FUN)) match.fun(FUN)
>  if (!is.list(INDEX)) INDEX <- list(INDEX)
> @@ -44,7 +44,7 @@
>  index <- as.logical(lengths(ans))  # equivalently, lengths(ans) > 0L
>  ans <- lapply(X = ans[index], FUN = FUN, ...)
>  if (simplify && all(lengths(ans) == 1L)) {
> -   ansmat <- array(dim = extent, dimnames = namelist)
> +   ansmat <- array(init.value, dim = extent, dimnames = namelist)
> ans <- unlist(ans, recursive = FALSE)
>  } else {
> ansmat <- array(vector("list", prod(extent)),
>
> -
>
> With that, I can set the initial value to '0' instead of array's
> default of NA :
>
>> with(DN, tapply(N, list(n,L), FUN=sum, init.value=0))
>    A B C D E F
> 1 NA 6 0 0 0 0
> 2  0 0 3 6 0 0
> 3  0 0 0 0 6 6
>>
>
> which now has 0 counts and NA  as is desirable to be used inside
> xtabs().
>
> All fine... and would not be worth a posting to R-devel,
> except for this:
>
> The change will not be 100% back compatible -- by necessity: any new argument 
> for
> tapply() will make that argument name not available to be
> specified (via '...') for 'FUN'.  The new function would be
>
>> str(tapply)
> function (X, INDEX, FUN = NULL, ..., init.value = NA, simplify = TRUE)
>
> where the '...' are passed FUN(),  and with the new signature,
> 'init.value' then won't be passed to FUN  "anymore" (compared to
> R <= 3.3.x).
>
> For that reason, we could use   'INIT.VALUE' instead (possibly decreasing
> the probability the arg name is used in other functions).
>
>
> Opinions?
>
> Thank you in advance,
> Martin
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] On implementing zero-overhead code reuse

2016-10-03 Thread Gabor Grothendieck

Have a look at the CRAN modules package and the import package.

On Sun, Oct 2, 2016 at 1:29 PM, Kynn Jones  wrote:
> I'm looking for a way to approximate the "zero-overhead" model of code
> reuse available in languages like Python, Perl, etc.
>
> I've described this idea in more detail, and the motivation for this
> question in an earlier post to R-help
> (https://stat.ethz.ch/pipermail/r-help/2016-September/442174.html).
>
> (One of the responses I got advised that I post my question here instead.)
>
> The best I have so far is to configure my PROJ_R_LIB environment
> variable to point to the directory with my shared code, and put a
> function like the following in my .Rprofile file:
>
> import <- function(name){
>     ## usage:
>     ## import("foo")
>     ## foo$bar()
>     path <- file.path(Sys.getenv("PROJ_R_LIB"),paste0(name,".R"))
>     if(!file.exists(path)) stop('file "',path,'" does not exist')
>     mod <- new.env()
>     source(path,local=mod)
>     list2env(setNames(list(mod),list(name)),envir=parent.frame())
>     invisible()
> }
>
> (NB: the idea above is an elaboration of the one I showed in my first post.)
>
> But this is very much of an R noob's solution.  I figure there may
> already be more solid ways to achieve "zero-overhead" code reuse.
>
> I would appreciate any suggestions/critiques/pointers/comments.
>
> TIA!
>
> kj
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] strcapture enhancement

2016-09-21 Thread Gabor Grothendieck

Note that read.pattern in gsubfn does accept stringsAsFactors = FALSE,
e.g. using your input lines and pattern:

library(gsubfn)
Lines <- c("Three 3", "Twenty 20")
pat <- "([[:alpha:]]*) +([[:digit:]]*)"

s2 <- read.pattern(text = Lines, pattern = pat, stringsAsFactors = FALSE,
 col.names = c("Name", "Number"))

giving:

> str(s2)
'data.frame':   2 obs. of  2 variables:
 $ Name  : chr  "Three" "Twenty"
 $ Number: int  3 20
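
For readers without gsubfn installed, a base R sketch that gets a similar
result is shown below. It uses regexec() and regmatches() and converts the
second capture to integer by hand; the variable names m, parts and s3 are
just for this example.

```r
Lines <- c("Three 3", "Twenty 20")
pat <- "([[:alpha:]]*) +([[:digit:]]*)"

# Each element of m is c(full match, capture 1, capture 2).
m <- regmatches(Lines, regexec(pat, Lines))

# Drop the full match and stack the captures into a character matrix.
parts <- do.call(rbind, lapply(m, function(v) v[-1]))

s3 <- data.frame(Name = parts[, 1],
                 Number = as.integer(parts[, 2]),
                 stringsAsFactors = FALSE)
str(s3)
```

Unlike read.pattern, the column types here are set manually rather than
inferred.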


On Wed, Sep 21, 2016 at 2:06 PM, William Dunlap via R-devel
 wrote:
> The new strcapture function in R-devel is handy, capturing
> the matches to the parenthesized subpatterns in a regular
> expression in the columns of a data.frame, whose column
> names and classes are given by the 'proto' argument.  E.g.,
>
>> p1 <- data.frame(Name="", Number=0)
>> str(strcapture("([[:alpha:]]*) +([[:digit:]]*)", c("Three 3", "Twenty
> 20"), proto=p1))
> 'data.frame':   2 obs. of  2 variables:
>  $ Name  : Factor w/ 2 levels "Three","Twenty": 1 2
>  $ Number: num  3 20
>
> I think it would be even nicer if it constructed its data.frame
> using the check.names=FALSE and stringsAsFactors=FALSE
> arguments.  Then the names and types specified in the proto
> argument would be respected instead of changing them as
> in the following example
>
>> p2 <- data.frame("The Name"="", "The Number"=0, stringsAsFactors=FALSE,
> check.names=FALSE)
>> str(strcapture("([[:alpha:]]*) +([[:digit:]]*)", c("Three 3", "Twenty
> 20"), proto=p2))
> 'data.frame':   2 obs. of  2 variables:
>  $ The.Name  : Factor w/ 2 levels "Three","Twenty": 1 2
>  $ The.Number: num  3 20
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] stack problem

2016-06-27 Thread Gabor Grothendieck

One would normally want the original order so that one can stack
a list, operate on the result and then unstack it back with the
unstacked result having the same ordering as the original.

LL <- list(z = 1:3, a = list())
# since we can't do s <- stack(LL, drop = FALSE) do this instead:
s <- transform(stack(LL), ind = factor(as.character(ind), levels = names(LL)))
unstack(s)




On Mon, Jun 27, 2016 at 2:55 PM, Michael Lawrence
<lawrence.mich...@gene.com> wrote:
> I'll add the drop argument but I'm wondering about the order of the
> levels. Should we set the levels to unique(names(x)) or sort them,
> too?
>
> On Mon, Jun 27, 2016 at 10:39 AM, Gabor Grothendieck
> <ggrothendi...@gmail.com> wrote:
>> stack() seems to drop empty levels.  Perhaps there could be a
>> drop=FALSE argument if one wanted all the original levels.  In the
>> example below, we may wish to retain level "b" in s$ind even though
>> component LL$b has length 0.
>>
>>> LL <- list(a = 1:3, b = list())
>>> s <- stack(LL)
>>> str(s)
>> 'data.frame':   3 obs. of  2 variables:
>>  $ values: int  1 2 3
>>  $ ind   : Factor w/ 1 level "a": 1 1 1
>>
>>



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] stack problem

2016-06-27 Thread Gabor Grothendieck

stack() seems to drop empty levels.  Perhaps there could be a
drop=FALSE argument if one wanted all the original levels.  In the
example below, we may wish to retain level "b" in s$ind even though
component LL$b has length 0.

> LL <- list(a = 1:3, b = list())
> s <- stack(LL)
> str(s)
'data.frame':   3 obs. of  2 variables:
 $ values: int  1 2 3
 $ ind   : Factor w/ 1 level "a": 1 1 1


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] ctrl-R in Rgui

2016-05-05 Thread Gabor Grothendieck

When in the Rgui editor sometimes ctrl-R does not cause anything to be
sent to the R console.

It can be reproduced like this:

- when in the Rgui console press ctrl-F N to get a new editor window
- enter: pi + 3 followed by Enter
- while still in the editor window press ctrl-A ctrl-R and pi + 3 gets
entered into the console and runs as expected
- while still in the editor window press ctrl-A ctrl-R again

In the last case nothing happens whereas one would have expected it to
be entered into the console again and run again.

I am using [1] "R version 3.3.0 Patched (2016-05-03 r70575)"



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] for in r-devel

2016-03-18 Thread Gabor Grothendieck

Regarding, this news item for r-devel:

‘for()’ loops are generalized to iterate over any object with ‘[[’ and
‘length()’ methods. Thanks to Hervé Pagès for the idea and the patch.

Below dd is an object for which [[ and length work but the result is
still numeric rather than Date class in  "R Under development
(unstable) (2016-03-15 r70334)" as observed in the comments to:
http://stackoverflow.com/questions/36074344/why-does-for-convert-date-to-numeric#comment59794873_36074344
Expanding on that:

dd <- Sys.Date() + 0:1

dd[[1]]  # [[ works
## [1] "2016-03-18"

length(dd)  # length works
## [1]  2

for(d in dd) str(d)  # gives numeric rather than Date class
## num 16878
## num 16879
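
Until for() dispatches on such objects, the class can be kept by indexing by
position or by looping over a list. This is only a sketch: the names dates
and classes_seen are illustrative and not part of the news item.

```r
dates <- as.Date("2016-03-18") + 0:1   # fixed dates so the run is reproducible

classes_seen <- character()

# Index by position: [[ on a Date vector returns a Date
for (i in seq_along(dates)) {
  d <- dates[[i]]
  classes_seen <- c(classes_seen, class(d))
}

# Loop over a list: as.list() on a Date vector yields Date elements
for (d in as.list(dates)) {
  classes_seen <- c(classes_seen, class(d))
}

classes_seen
```

Both loops see Date objects, whereas for(d in dates) hands the loop body the
underlying numbers.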



Re: [Rd] Puzzled by eval

2015-11-06 Thread Gabor Grothendieck

This code which I think I wrote but might have gotten from elsewhere a
long time ago shows the environments that are searched from a given
function, in this case chart.RelativePerformance in
PerformanceAnalytics package.   Try it on some of your functions in
and out of packages to help determine the sequence of environments R
searches along:

library(PerformanceAnalytics)  ## change as needed
x <- environment(chart.RelativePerformance)  ## change as needed
str(x)
while (!identical(x, emptyenv())) {
    p <- parent.env(x)
    cat(" child is above this line and parent is below \n")
    str(p)
    if (isBaseNamespace(p)) cat("Same as .BaseNamespaceEnv\n")
    if (identical(p, baseenv())) cat("Same as baseenv()\n")
    if (identical(p, emptyenv())) cat("Same as emptyenv()\n")
    if (identical(p, globalenv())) cat("Same as globalenv()\n")
    x <- p
}
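
The same walk can be tried without installing PerformanceAnalytics. The
following is a minimal sketch that starts from stats::sd and collects the
environment names instead of printing them; the helper name env_chain is an
assumption of this example.

```r
# Collect the names of the environments searched from a function's enclosure
env_chain <- function(f) {
  e <- environment(f)
  chain <- character()
  repeat {
    chain <- c(chain, environmentName(e))
    if (identical(e, emptyenv())) break   # the empty environment ends the chain
    e <- parent.env(e)
  }
  chain
}

chain <- env_chain(stats::sd)
chain
```

The chain should begin with the package namespace, pass through its imports
and the base namespace, and only then reach the global environment and the
rest of the search list.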

On Fri, Nov 6, 2015 at 9:47 AM, Duncan Murdoch  wrote:
> On 06/11/2015 8:20 AM, Therneau, Terry M., Ph.D. wrote:
>>
>> Duncan,
>> That's helpful.  Two follow-up questions:
>> 1. Where would I have found this information?  I had looked at eval and
>> model.frame.
>
>
> I think the best description is Luke's article on namespaces, "Name space
> management for R". Luke Tierney, R News, 3(1):2-6, June 2003. There's a link
> to it from the "Technical papers" section of the HTML help index.  There's
> also a short description of this in the R Language Definition manual in the
> "Search path" section 3.5.4.
>
>
>> 2. What stops the following code from falling down the same rabbit hole?
>> Shouldn't it
>> find base::cos first?
>>
>>  library(survival)
>>  cos <- lung
>>  coxph(Surv(time, status) ~ age, data=cos)
>
>
> If that code is in a function anywhere (package or not), cos will be a local
> variable created there in the evaluation environment created when you
> evaluate the function.  If you execute it at the command line, you'll create
> a variable called "cos" in the global environment.  Local variables come
> ahead of the 3 places I listed.  (This is why Luke's article is good:  it
> doesn't oversimplify.)
>
> There's one other twist.  Even with cos being a local variable, cos(theta)
> would find base::cos, because the evaluator knows it is looking for a
> function (since it's a function call) and will skip over the local dataframe
> named cos.
>
> Duncan Murdoch
>
>>
>> Terry T.
>>
>>
>> On 11/06/2015 07:51 AM, Duncan Murdoch wrote:
>>>
>>> On 06/11/2015 7:36 AM, Therneau, Terry M., Ph.D. wrote:

>>>> I am currently puzzled by a search path behavior.  I have a library of
>>>> a dozen routines getlabs(), getssn(), getecg(), ... that interface to
>>>> local repositories and pull back patient information.  All have the
>>>> first 6 arguments in common, and immediately call a second routine to
>>>> do initial processing of these 6.  The functions "joe" and "fred"
>>>> below capture the relevant portion of them.
>>>>  My puzzle is this: the last test in the "test" file works fine if
>>>> these routines are sourced and executed at the command line, it fails
>>>> if the routines are bundled up and loaded as a library. That test is
>>>> motivated by a user who called his data set "t", and ended up with a
>>>> match to base:::t instead of his data, resulting in a strange error
>>>> message out of model.frame  --- you can always count on the users!
>>>> (There are a few hundred.)
>>>>   I'm attempting to be careful with envir and enclos arguments -- how
>>>> does base end up earlier in the search path?   Perhaps this is clearly
>>>> stated in the docs and just not clear to me?  A working solution to
>>>> the dilemma is of course more than welcome.
>>>
>>>
>>> I haven't followed through all the details in fred(), but I can answer
>>> the last question.
>>> In package code, the search order is:
>>>
>>> - the package environment
>>> - the imports to the package (with base being an implicit import)
>>> - the global environment and the rest of the search list.
>>>
>>> In code sourced to the global environment, only the third of these is
>>> searched.  Since
>>> base is in the second one, it is found first in the package version.
>>>
>>> Duncan Murdoch
>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
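
Duncan's point that a function call skips over a local non-function object
of the same name can be checked with a small sketch; lookup_demo is a
made-up name for illustration.

```r
lookup_demo <- function(theta) {
  cos <- data.frame(x = 1:3)  # local non-function object named cos
  cos(theta)                  # the call skips the data frame and finds base::cos
}

result <- lookup_demo(0)
result
```

Used as a value (for example data = cos) the local data frame is found first;
used as a call, R keeps searching until it finds a function.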




Re: [Rd] returnValue()

2015-05-22 Thread Gabor Grothendieck

Please disregard. I was running an older version of R at the time.  In
R version 3.2.0 Patched (2015-04-19 r68205) returnValue() does work.
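
For readers on R 3.2.0 or later, here is a minimal sketch of the same idea
on a toy problem. The objective function quadratic and the variable names
are assumptions of this example; print = FALSE is used only to suppress the
tracing banner.

```r
# Toy objective with its minimum at c(1, 2)
quadratic <- function(p) sum((p - c(1, 2))^2)

# On exit from optim(), show the structure of the parameters it returns
invisible(trace(optim, exit = quote(str(returnValue()$par)), print = FALSE))
exit_output <- capture.output(fit <- optim(c(0, 0), quadratic))
invisible(untrace(optim))   # always remove the trace afterwards

exit_output
```

The captured text comes from the exit expression, so the same approach works
when optim() is called deep inside another function such as arima().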

On Fri, May 22, 2015 at 6:25 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
> In R devel rev.66393 (2014-08-15) it was possible to do this:
>
>    trace(optim, exit = quote(str(returnValue())))
>
> but returnValue() does not seem to be available any more.  The above
> was useful to get the output of a function when it was called deep
> within another function that I have no control over.
>
> Has this been replaced by some other equivalent function?
>
> P.S. This demonstrates that it no longer works.  The error message is
> that it cannot find function "returnValue":
>
> > trace(optim, exit = quote(str(returnValue())))
> Tracing function "optim" in package "stats"
> [1] "optim"
> > arima(presidents, order = c(1, 0, 0))
> Tracing optim(init[mask], armafn, method = optim.method, hessian =
> TRUE,   on exit
> Error in str(returnValue()) : could not find function "returnValue"




[Rd] returnValue()

2015-05-22 Thread Gabor Grothendieck

In R devel rev.66393 (2014-08-15) it was possible to do this:

   trace(optim, exit = quote(str(returnValue())))

but returnValue() does not seem to be available any more.  The above
was useful to get the output of a function when it was called deep
within another function that I have no control over.

Has this been replaced by some other equivalent function?

P.S. This demonstrates that it no longer works.  The error message is
that it cannot find function "returnValue":

> trace(optim, exit = quote(str(returnValue())))
Tracing function "optim" in package "stats"
[1] "optim"
> arima(presidents, order = c(1, 0, 0))
Tracing optim(init[mask], armafn, method = optim.method, hessian =
TRUE,   on exit
Error in str(returnValue()) : could not find function "returnValue"



Re: [Rd] xtabs and NA

2015-02-09 Thread Gabor Grothendieck

On Mon, Feb 9, 2015 at 8:52 AM, Kirill Müller
kirill.muel...@ivt.baug.ethz.ch wrote:
> Hi
>
> I haven't found a way to produce a tabulation from factor data with NA
> values using xtabs. Please find a minimal example below, it's also on R-pubs
> [1]. Tested with R 3.1.2 and R-devel r67720.
>
> It doesn't seem to be documented explicitly that it's not supported. From
> reading the code [2] it looks like the relevant call to table() doesn't set
> the useNA parameter, which I think is necessary to make NAs show up in the
> result.
>
> Am I missing anything? If this a bug -- would a patch be welcome? Do we need
> compatibility with the current behavior?
>
> I'm aware of workarounds, I just prefer xtabs() over table() for its
> interface.


Passing table the output of model.frame would still allow the use of a
formula interface:

> mf <- model.frame( ~ data, na.action = na.pass)
> do.call(table, c(mf, useNA = "ifany"))

   a    b    c <NA>
   1    1    1    1
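
A self-contained version of the same approach, as a sketch; the factor grp
stands in for the original poster's data, which is not shown in this thread.

```r
grp <- factor(c("a", "b", "c", NA))   # illustrative data with one missing value

# na.pass keeps the NA row in the model frame instead of dropping it
mf <- model.frame(~ grp, na.action = na.pass)

# mf is a list of columns, so do.call() passes them to table() by name
tab <- do.call(table, c(mf, useNA = "ifany"))
tab
```

Because the formula is handled by model.frame(), transformed or multiple
variables on the right-hand side work the same way.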



Re: [Rd] CRAN and ggplot2 geom and stat extensions

2014-12-23 Thread Gabor Grothendieck

On Tue, Dec 23, 2014 at 11:21 AM, Ista Zahn istaz...@gmail.com wrote:
> On Tue, Dec 23, 2014 at 10:34 AM, Frank Harrell
> f.harr...@vanderbilt.edu wrote:
>> I am thinking about adding several geom and stat extensions to ggplot2
>> in the Hmisc package.  To do this requires using non-exported ggplot2
>> functions as discussed in
>> http://stackoverflow.com/questions/18108406/creating-a-custom-stat-object-in-ggplot2
>>
>> If I use the needed ggplot2::: notation the package will no longer pass
>> CRAN checks.  Does anyone know of a solution?
>
> the ggthemes package is on CRAN and uses ggplot2::: so it is at least
> possible that this will be allowed for Hmisc as well.


Packages ggmap, ggtern and ggsubplot also define their own geoms and/or stats.


Re: [Rd] Options that are local to the package that sets them

2014-10-31 Thread Gabor Grothendieck

On Fri, Oct 31, 2014 at 7:34 PM, Gábor Csárdi csardi.ga...@gmail.com wrote:
> Dear All,
>
> I am trying to do the following, and could use some hints.
>
> Suppose I have a package called pkgA. pkgA exposes an API that
> includes setting some options, e.g. pkgA works with color palettes,
> and the user of the package can define new palettes. pkgA provides an
> API to manipulate these palettes, including defining them.
>
> pkgA is intended to be used in other packages, e.g. in pkgB1 and
> pkgB2. Now suppose pkgB1 and pkgB2 both set new palettes using pkgA.
> They might set palettes with the same name, of course, they do not
> know about each other.
>
> My question is, is there a straightforward way to implement pkgA's
> API, such that pkgB1 and pkgB2 do not interfere? In other words, if
> pkgB1 and pkgB2 both define a palette 'foo', but they define it
> differently, each should see her own version of it.
>
> I guess this requires that I put something (a function?) in both
> pkgB1's and pkgB2's package namespace. As I see it, this can only
> happen when pkgA's API is called from pkgB1 (and pkgB2).
>
> So at this time I could just walk up the call tree and put the palette
> definition in the first environment that is not pkgA's. This looks
> somewhat messy, and I am probably missing some caveats.
>
> Is there a better way? I have a feeling that this is already supported
> somehow, I just can't find out how.


Try the settings package.


Re: [Rd] Options that are local to the package that sets them

2014-10-31 Thread Gabor Grothendieck

On Fri, Oct 31, 2014 at 8:43 PM, Gábor Csárdi csardi.ga...@gmail.com wrote:
> On Fri, Oct 31, 2014 at 8:10 PM, Gabor Grothendieck
> ggrothendi...@gmail.com wrote:
>>> [...]
>>> Is there a better way? I have a feeling that this is already supported
>>> somehow, I just can't find out how.
>>
>> Try the settings package.
>
> I could, but I don't see how it would solve my problem.
> https://github.com/markvanderloo/settings/issues/1

Isn't your problem really just that you want multiple sets of
settings?  That's what settings provides.

pkgA would provide a class whose instances are created by the clients.
Assuming you wrap this in a function create:

inst1 <- create(a = 1, b = 2)

where create sets up a settings object and does anything else
returning the handle inst1.

When you want to do something you would pass the instance to the
function or method that actually carries it out.  This could be done
with any OO system in R without settings but if you are looking for an
options type interface which I thought you were then you might be able
to leverage that package.
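
To make the instance idea concrete without relying on any particular
package, here is a minimal base R sketch. The function create and its
getter/setter convention are hypothetical; this is not the API of the
settings package.

```r
# Each call to create() returns an independent handle holding its own values
create <- function(...) {
  store <- list(...)
  function(...) {
    args <- list(...)
    if (length(args) == 0L) return(store)                    # list everything
    if (is.null(names(args))) return(store[[args[[1L]]]])    # get one value
    store[names(args)] <<- args                              # set named values
    invisible(store)
  }
}

inst1 <- create(a = 1, b = 2)    # e.g. created by pkgB1
inst2 <- create(a = 10, b = 20)  # e.g. created by pkgB2

inst1(a = 5)                     # changes inst1 only
a_in_inst1 <- inst1("a")
a_in_inst2 <- inst2("a")
```

Because each client holds its own handle, two packages can use the same
option names without seeing each other's values.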





Re: [Rd] Using Rtools with gcc 4.8.3

2014-10-05 Thread Gabor Grothendieck

On Sun, Oct 5, 2014 at 6:51 AM, Uwe Ligges
lig...@statistik.tu-dortmund.de wrote:


> On 05.10.2014 12:20, Jeroen Ooms wrote:
>>
>> I started working on some R bindings for mongo-c-driver [1]. The C
>> library compiles fine on Ubuntu Trusty (gcc 4.8.2) and osx (clang),
>> however on my windows machine (gcc 4.6.3 from Rtools 3.1) it fails
>> with:  'INIT_ONCE_STATIC_INIT' undeclared. Google suggests that this
>> might be a problem in older versions of mingw-w64. So I grabbed a copy
>> of mingw-w64 version 4.8.3 and indeed, here the library compiles
>> without errors.
>>
>> Now I am unsure how to make mingw 4.8.3 work with Rtools. I extracted
>> the contents of [2] into C:\RBuildTools\3.1\gcc-4.8.3\ and my
>> package Makevars contains
>>
>>    CC = c:/RBuildTools/3.1/gcc-4.8.3/bin/gcc
>>
>> However it seems like R still uses the old gcc 4.6.3 for R CMD
>> INSTALL. What am I doing wrong? Is there a recommended setup for
>> building packages on Windows using a Rtools but with another compiler?
>>
>> In addition: will I be able to publish this package to CRAN, or do I
>> have to wait for Rtools to get updated with a more recent gcc?
>
> Currently only 4.6.3 is supported and that is the one used to build binary
> packages on CRAN. Hence you need to wait until it is updated.
>
> Best,
> Uwe Ligges
>
>> [1] https://github.com/mongodb/mongo-c-driver
>> [2]
>> http://sourceforge.net/projects/mingw-w64/files/Toolchains%20targetting%20Win32/Personal%20Builds/mingw-builds/4.8.3/threads-posix/dwarf/


Are there any plans for this?  gcc is already up to 4.9.1 and I am
sure a lot of people would like to see the latest version available as
part of Rtools.



Re: [Rd] Re R CMD check checking in development version of R

2014-08-28 Thread Gabor Grothendieck

Yes, Depends certainly has a role. The ability of one package to
automatically provide all the facilities of another package to the
user is important. There are many situations where the functionality
you want to provide to the user is split among multiple packages.

For example,

1. xts uses zoo and the user ought to be able to use all the
functionality of zoo when they issue a library(xts) call.  The
alternatives are that the user must tediously issue library(zoo) every
time they issue library(xts) or else that the zoo code be copied or
partially replicated into xts which would be undesirable
maintenance-wise.

2. Another example is sqldf.  The user wants to be able to use fn$ and
other string manipulation functions in gsubfn when using sqldf in
order to perform string substitution on the SQL statement.  Also it's
desirable to be able to directly access sqlite which means the user
needs access to RSQLite.

3. At one time one could just issue library(ggplot2) but now that
ggplot2 does not use Depends for scales one annoyingly needs to issue
library(scales) if one wants to specify a scale.   I use ggplot2
enough that I can remember it despite the ongoing annoyance but I
would hate to think that every package with split functionality
suddenly adds such onerous requirements onto all its users.  (I am not
really picking on ggplot2 which is a very nice package - just this one
aspect.)

I am not sure but there might be additional problems if the secondary
package defines an S3 generic that the primary package needs to use if
one does not use Depends.
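
The distinction the thread turns on, loading a namespace versus attaching a
package, can be seen in a short sketch. It uses the tools package only
because it ships with R; the variable names are illustrative.

```r
attached_before <- "package:tools" %in% search()

# What Imports gives a package: the namespace is loaded, not attached
requireNamespace("tools", quietly = TRUE)
loaded <- "tools" %in% loadedNamespaces()

# Loading alone does not change what the user sees on the search path
attached_after <- "package:tools" %in% search()

# Code can still reach the functions through ::
ext <- tools::file_ext("report.txt")
```

A package listed in Depends is additionally attached, which is what lets the
user call its functions directly after library() on the primary package.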


On Thu, Aug 28, 2014 at 2:43 PM, Gavin Simpson ucfa...@gmail.com wrote:
 I fully agree.

 This is how I have come to understand Depends vs Imports and why I
 currently will not be removing vegan from Depends for my analogue package.
 This is also why I was pushing back against the notion that was voiced
 early in this thread that *nothing* should be in Depends.

 Cheers

 G


 On 28 August 2014 08:47, Bert Gunter bgun...@gene.com wrote:

 This is a nice explanation of the Imports/Depends distinction. It
 ought to go into the Extensions ref manual imho.

 Cheers,
 Bert

 Bert Gunter
 Genentech Nonclinical Biostatistics
 (650) 467-7374

 Data is not information. Information is not knowledge. And knowledge
 is certainly not wisdom.
 Clifford Stoll




 On Thu, Aug 28, 2014 at 7:39 AM, Simon Urbanek
 simon.urba...@r-project.org wrote:
 
  On Aug 27, 2014, at 6:01 PM, Gavin Simpson ucfa...@gmail.com wrote:
 
  On 27 August 2014 15:24, Hadley Wickham h.wick...@gmail.com wrote:
 
  Is that the cause of these NOTEs? Is the expectation that if I am
 using a
  function from a package, even a package that I have in Depends:, that
 I
  have to explicitly declare these imports in NAMESPACE?
 
  Yes.
 
  (Otherwise your package won't work if it's only attached and not
  loaded. i.e. if someone does analogue::foo() only the imported
  functions are available, not the functions in packages you depend on)
 
 
  Cheers Hadley. Thanks for the confirmation, but...
 
  ...I don't get this; what is the point of Depends? I thought it was my
  package needs these other packages to work, i.e. be loaded. Hence it is
  user error (IMHO ;-) to do `analogue::foo()` without having the
  dependencies loaded too.
 
 
  No. The point of Depends is that if your package is attached, it also
 attaches the other packages to make them available for the user.
 Essentially you're saying if you want to use my package interactively, you
 will also want to use those other packages interactively. You still need
 to use import() to define what exactly is used by your package - as opposed
 to what you want to be available to the user in case it is attached.
 
  Cheers,
  Simon
 
 
 
  This check (whilst having found some things I should have imported and
  didn't - which is a good thing!) seems to be circumventing the
 intention of
  having something in Depends. Is Depends going to go away?
 
 
  (And really you shouldn't have any packages in depends, they should
  all be in imports)
 
 
  I disagree with *any*; having say vegan loaded when one is using
 analogue
  is a design decision as the latter borrows heavily from and builds upon
  vegan. In general I have moved packages that didn't need to be in
 Depends
  into Imports; in the version I am currently doing final tweaks on
 before it
  goes to CRAN I have remove all but vegan from Depends.
 
  Or am I thinking about this in the wrong way?
 
  Thanks again
 
  Gavin
 
 
 
  Hadley
 
 
  --
  http://had.co.nz/
 
 
 
 
  --
  Gavin Simpson, PhD
 

Re: [Rd] Licence for datasets in a R-package

2014-07-21 Thread Gabor Grothendieck

On Mon, Jul 21, 2014 at 12:54 PM, Gábor Csárdi csardi.ga...@gmail.com wrote:
> In practice, CRAN maintainers do not allow multiple licenses for parts
> of the same package. At least they did not for my package a couple of
> months ago.


If that is the case then you could put your data files in a separate
package from the code with one depending on the other.


[Rd] useDynLib

2014-07-06 Thread Gabor Grothendieck

I would like to be able to load two versions of a package at once and,
to do that, was thinking of giving each version a different package
name in the DESCRIPTION file and then building and installing each such
version separately.

library(myPkg1)
library(myPkg2)

and then use myPkg1::myFun() and myPkg2::myFun().

To do that easily it would be convenient if one could change the
package name in only one place (the DESCRIPTION file) and have that
propagate to all other uses of the package name in the package.

Suppose the package were named myPkg.  Then the problem areas are:

1. The NAMESPACE file has myPkg hard-coded like this:

useDynLib(myPkg, .registration=TRUE)

2. The configure.ac file has myPkg hard-coded like this:

AC_INIT([myPkg], 1.0.0)

3. There are various references to myPkg hard-coded throughout the R
code, e.g. myPkg::myFun, but I am ok here as I assume this would work
where an .onLoad would be used to grab the package name from the
.onLoad's pkgname argument:
   `::`(pkgname, myFun)
(Also some or all of these may not be needed in the first place.)
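
One caveat on item 3: `::` does not evaluate its arguments, so passing a
variable that holds the package name looks up a package literally called
"pkgname". A sketch of a lookup by name held in a variable, using stats and
median as stand-ins for myPkg and myFun:

```r
pkgname <- "stats"   # in a package this would come from .onLoad(libname, pkgname)

# getExportedValue() takes the package and object names as character strings
fun <- getExportedValue(pkgname, "median")
value <- fun(c(1, 3, 10))
value
```

The same call works unchanged when the package is renamed, since the name is
only ever supplied as data.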

1. Is there some way to cause these instances to change when the
package name in the DESCRIPTION file changes or is there some other
approach that would make it easy to change the package name in just
one spot or some other way to load two packages that are versions of
each other at one time.  (The fact that the package has C++ code seems
to be the complicating factor.)

2. Are there any examples of CRAN packages that have been set up to
make this easy to do?

Thanks.


Re: [Rd] type.convert and doubles

2014-04-19 Thread Gabor Grothendieck

On Sat, Apr 19, 2014 at 1:06 PM, Simon Urbanek
simon.urba...@r-project.org wrote:
> On Apr 19, 2014, at 9:00 AM, Martin Maechler maech...@stat.math.ethz.ch
> wrote:
>
> I think there should be two separate discussions:
>
> a) have an option (argument to type.convert and possibly read.table) to
> enable/disable this behavior. I'm strongly in favor of this.
>
> b) decide what the default for a) will be. I have no strong opinion, I can
> see arguments in both directions
>
> But most importantly I think a) is better than the status quo - even if the
> discussion about b) drags out.
>
> Cheers,
> Simon

Another possibility is:

(c) Return the column as factor/character but with a distinguishing
class so that the user can reset its class later. e.g.

DF <- read.table(...)
DF[] <- lapply(DF, function(x) if (inherits(x, "special.class"))
as.numeric(x) else x)

Personally I would go with (a) in both type.convert and read.table
with a default that reflects the historical behavior rather than the
current 3.1 behavior.
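
Option (c) can be mocked up without changing read.table. In this sketch the
class name "special.class" and the hand-tagged column are assumptions
standing in for what type.convert would do.

```r
DF <- data.frame(id  = c("a", "b"),
                 big = c("0.12345678901234567890", "1.5"),
                 stringsAsFactors = FALSE)

# Pretend the reader tagged the column it declined to convert
class(DF$big) <- c("special.class", "character")

# The user opts in to the lossy conversion for tagged columns only
DF[] <- lapply(DF, function(x) {
  if (inherits(x, "special.class")) as.numeric(x) else x
})

str(DF)
```

The tag tells the user which columns were affected, which is the information
that is otherwise missing after the fact.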



Re: [Rd] Can the output of Sys.getenv() be improved?

2014-04-18 Thread Gabor Grothendieck

On Fri, Apr 18, 2014 at 12:38 PM, Zhang,Jun jhzh...@mdanderson.org wrote:
> Within an R session, type Sys.getenv() will list all the environment
> variables, but each one of them occupies about a page, so scrolling to find
> one is difficult. Is this because I don't know how to use it or something
> could be improved? Usually I'm not sure the exact name of a variable but want
> to look it up. Recently I installed rjags, with the JAGS-3.4.0's lib,
> include, modules information provided during compilation. When I load rjags,
> I was told that it linked to JAGS 3.3.0 (a package also available in my
> environment). This made me think there must be a variable to make that to
> happen. What can I do to make rjags to link to JAGS 3.4.0?


Try this:

str(as.list(Sys.getenv()))

or this:

View(as.matrix(Sys.getenv()))
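
When only part of a variable name is known, the names can be searched
directly. This is a sketch; GR_DEMO_JAGS_HOME is a made-up variable set here
only so the example has something to find.

```r
Sys.setenv(GR_DEMO_JAGS_HOME = "/opt/jags/3.4.0")   # hypothetical variable

all_names <- names(Sys.getenv())

# Case-insensitive search of variable names for "jags"
hits <- grep("jags", all_names, ignore.case = TRUE, value = TRUE)
vals <- Sys.getenv(hits, names = TRUE)

Sys.unsetenv("GR_DEMO_JAGS_HOME")                   # clean up the demo variable
vals
```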


Re: [Rd] type.convert and doubles

2014-04-17 Thread Gabor Grothendieck

On Thu, Apr 17, 2014 at 2:21 PM, Murray Stokely mur...@stokely.org wrote:
> If you later want to do arithmetic on them, you can choose to lose
> precision by using as.numeric() or use one of the large number
> packages on CRAN (GMP, int64, bit64, etc.).  But once you've dropped
> the precision with as.numeric you can never get it back, which is why
> the previous behavior was clearly dangerous.

Only if you knew that that column was supposed to be numeric. There is
nothing in type.convert or read.table to allow you to override how it
works (colClasses only works if you knew which columns are which in
the first place) nor is there anything to allow you to know which
columns were affected so that you know which columns to look at to fix
it yourself afterwards.


Re: [Rd] R 3.1.0 and C++11

2014-04-10 Thread Gabor Grothendieck

On Tue, Oct 29, 2013 at 1:58 AM,  rom...@r-enthusiasts.com wrote:
> On 2013-10-29 03:01, Whit Armstrong wrote:
>>
>> I would love to see optional c++0x support added for R.
>
> c++0x was the name given for when this was in development. Now c++11 is a
> published standard backed by implementations by major compilers.
> people need to stop calling it c++0x
>
>> If there is anything I can do to help, please let me know.
>
> Come here https://github.com/romainfrancois/cpp11_article where I'm writing
> an article on C++11 and what would be the benefits.


Unless you are willing to do it yourself, note that Rtools on Windows
currently uses g++ 4.6.3, which requires specifying -std=c++0x or
-std=gnu++0x.

Ubuntu 12.04 LTS also provides g++ 4.6.3.

g++ 4.7 is the first version of g++ that accepts -std=c++11 or -std=gnu++11

More info at:
http://gcc.gnu.org/projects/cxx0x.html


Re: [Rd] file.exists does not like path names ending in /

2014-01-17 Thread Gabor Grothendieck

On Fri, Jan 17, 2014 at 6:16 AM, Martin Maechler
maech...@stat.math.ethz.ch wrote:
> Gabor Grothendieck ggrothendi...@gmail.com
>     on Fri, 17 Jan 2014 00:10:43 -0500 writes:
>
>> If a path name ends in slash then file.exists says it does
>> not exist.  I would have expected these to all return
>> TRUE.
>>
>> > file.exists("/Program Files")
>> [1] TRUE
>> > file.exists("/Program Files/")
>> [1] FALSE
>> > file.exists(normalizePath("/Program Files/"))
>> [1] FALSE
>> > R.version.string
>> [1] "R version 3.0.2 Patched (2013-11-25 r64299)"
>
> I would also have expected all these to work,
> but that is only because I do not use Windows;
> indeed, for me  (Linux Fedora 19, but I'm pretty sure *any*
> version):
>
>   > dir.create("/tmp/foo bar")
>   > normalizePath("/tmp/foo bar/")
>   [1] "/tmp/foo bar"
>   > file.exists("/tmp/foo bar/")
>   [1] TRUE
>   > file.exists(normalizePath("/tmp/foo bar/"))
>   [1] TRUE
>
>> I am using Windows 8.1 .
>
> poor you  ;-)   yes, don't take it personally
>
> Last but not least   ?file.exists   in its  'Details' section
> mentions several times how Windows behaves differently from the
> civilized world (:-)
> notably that it also says
>
>   (However,
>   directory names must not include a trailing backslash or slash on
>   Windows.)
>
> So, all seems as documented, but unfortunately your (and most
> probably not only yours !!) expectations are different.
> I agree that this is quite unfortunate, and we all would be
> happy if OSes were consistent here.
>
> I wonder if R couldn't help you (and many others) better to
> gloss over these OS differences in a helpful way.
> This list may be a good place to discuss such proposals ...


I do find that sometimes I have to deal with paths that I did not
create that end in / or \.  At the moment I am using this to avoid the
problem:

File.exists <- function(x) {
   if (.Platform$OS == "windows" && grepl("[/\\]$", x)) {
      file.exists(dirname(x))
   } else file.exists(x)
}

but it would be nice if that could be done by file.exists itself.
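
One thing to watch in a wrapper of this kind: dirname() removes trailing
separators before splitting the path, so it returns the parent directory
rather than the directory itself. The following variant strips the trailing
separators instead. It is a sketch with a hypothetical name (File.exists2)
and has not been checked against Windows drive roots such as "C:/".

```r
File.exists2 <- function(x) {
  trimmed <- sub("[/\\\\]+$", "", x, perl = TRUE)  # drop trailing / or \
  trimmed[trimmed == ""] <- "/"                    # input was only separators
  file.exists(trimmed)
}

d <- tempfile("dir")
dir.create(d)
with_slash <- paste0(d, "/")

# dirname() of the slash-terminated path is the parent, same as for d itself
same_parent <- identical(dirname(with_slash), dirname(d))

exists_with_slash  <- File.exists2(with_slash)
missing_with_slash <- File.exists2(file.path(d, "nope/"))

unlink(d, recursive = TRUE)
```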



[Rd] file.exists does not like path names ending in /

2014-01-16 Thread Gabor Grothendieck

If a path name ends in slash then file.exists says it does not exist.
I would have expected these to all return TRUE.

> file.exists("/Program Files")
[1] TRUE
> file.exists("/Program Files/")
[1] FALSE
> file.exists(normalizePath("/Program Files/"))
[1] FALSE
> R.version.string
[1] "R version 3.0.2 Patched (2013-11-25 r64299)"

I am using Windows 8.1 .


[Rd] win.version() incorrect

2014-01-16 Thread Gabor Grothendieck

On Windows 8.1 I get this.  win.version() indicates build 9200 but I
actually have build 9600 as can be seen from the ver command.
shell("winver") also indicates 9600.  I assume ver and winver are
correct and win.version() is not.


> win.version()
[1] "Windows 8 x64 (build 9200)"
> shell("ver")

Microsoft Windows [Version 6.3.9600]


> R.version.string
[1] "R version 3.0.2 Patched (2013-11-25 r64299)"


Re: [Rd] In-string variable/symbol substitution: What formats/syntax is out there?

2013-12-17 Thread Gabor Grothendieck

On Tue, Dec 17, 2013 at 4:44 PM, Henrik Bengtsson h...@biostat.ucsf.edu wrote:
> Hi,
>
> I'm trying to collect a list of methods/packages available in R for doing
> in-string variable/symbol substitution, e.g. someFcn("pi=${pi}"),
> anotherFcn("pi=@pi@") and so on becomes "pi=3.141593".  I am aware of
> the following:
>
> ** gsubfn() in the 'gsubfn' package, e.g.
> > gsubfn( , , "pi = $pi, 2pi = `2*pi`")
> [1] "pi = 3.14159265358979, 2pi = 6.28318530717959"
>
>
> ** gstring() in the 'R.utils' package, e.g.
> > gstring("pi = ${pi}, 2pi = ${`2*pi`}")
> [1] "pi = 3.14159265358979, 2pi = 6.28318530717959"
>
>
> I'm sure there are other approaches - do you know of any in R?  They
> don't have to support in-line calculations such as in the first two
> examples, but if they do, it's a bonus.  I'm looking for simpler
> functions and not full blown literate programming methods (e.g.
> Sweave, noweb, knitr, brew, RSP, ...).  It should also be *in-string*
> substitution out of the box, so sub(), sprintf() and friends do not
> count.
>
> Thanks
>
> Henrik
>
> PS. The following is on the borderline because it does not do
> automatic variable look up, but since others may bring it up and/or
> know of a neater approach, I mention it too:
>
> ** copySubstitute() in the 'Biobase' package (with some efforts), e.g.
> > bbsubst <- function(fmt, ...) {
>   args <- lapply(list(...), FUN=as.character)
>   inp <- textConnection(fmt)
>   out <- textConnection("res", open="w")
>   on.exit({ close(inp); close(out) })
>   copySubstitute(inp, out, symbolValues=args)
>   res
> }
> > bbsubst("pi = @pi@", pi=pi)
> [1] "pi = 3.14159265358979"

Note that the gsubfn example above is the default only but by
specifying the pattern argument (first arg) it can be changed. e.g.

library(gsubfn)

pat <- "[$]([[:alpha:]][[:alnum:].]*)|[$][{]([^}]*)[}]"
gsubfn(pat,, "pi=$pi 2pi=${2*pi}")

pat2 <- "@([^@]*)@"
gsubfn(pat2,, "pi=@pi@ 2pi=@2*pi@")

pat3 <- "%([^%]*)%"
gsubfn(pat3,, "pi=%pi% 2pi=%2*pi%")

pat4 <- "{{(.*?)}}"
gsubfn(pat4,, "pi={{pi}} 2pi={{2*pi}}")
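
For comparison, the idea behind a configurable delimiter pattern can be
sketched in base R alone. This is not how gsubfn is implemented; the
function name interpolate and its arguments are assumptions of this example.

```r
# Replace each delimited expression in a string by its evaluated value
interpolate <- function(template, pattern = "@([^@]*)@", env = parent.frame()) {
  force(env)                                   # fix the caller's environment now
  m <- gregexpr(pattern, template, perl = TRUE)
  regmatches(template, m) <- lapply(regmatches(template, m), function(tokens) {
    vapply(tokens, function(tok) {
      expr <- sub(pattern, "\\1", tok, perl = TRUE)   # text between delimiters
      format(eval(parse(text = expr), env))           # evaluate and format
    }, character(1), USE.NAMES = FALSE)
  })
  template
}

x <- 2L
res_at  <- interpolate("x=@x@ y=@x + 1L@")
res_pct <- interpolate("x=%x% y=%x + 1L%", pattern = "%([^%]*)%")
res_pi  <- interpolate("pi=@pi@ 2pi=@2*pi@")
```

Changing the pattern argument switches the delimiter, which mirrors the role
of the first argument in the gsubfn calls above.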




Re: [Rd] Depending/Importing data only packages

2013-12-07 Thread Gabor Grothendieck

On Sat, Dec 7, 2013 at 1:35 PM, Paul Gilbert pgilbert...@gmail.com wrote:


 On 13-12-07 12:19 PM, Gábor Csárdi wrote:

 I don't know about this particular case, but in general it makes sense
 to rely on a data package. E.g. I am creating a package that does
 Bayesian inference for a particular problem, potentially relying on
 prior knowledge. I think it makes sense to put the data that is used
 to calculate the prior into another package, because it will be larger
 than the code, and it does not change that often.

 Gabor

 On Sat, Dec 7, 2013 at 11:51 AM, Paul Gilbert pgilbert...@gmail.com
 wrote:

 Would Suggests not work in this situation? I don't understand why you
 would need Depends. In what sense do you rely on the data only package?


 HW> Because I want someone who downloads the package to be able to run
 HW> the examples without having to take additional action.
 HW>
 HW> Hadley

 I went through this myself, including thinking it was a nuisance for users
 to need to attach other packages to run examples. In the end I decided it is
 not so bad to be explicit about what package the example data comes from, so
 illustrate it in the examples. Users may not always want this data, and
 other packages that build on yours probably do not want it.

 Even in the Bayesian inference case pointed out by Gábor, I am not
 convinced. It means the prior knowledge base cannot be exchanged for another
 one. The package would be more general if it allowed the possibility of
 attaching a different database of prior information. But this is clearly a
 more important case, since the code probably does not work without some
 database. (There are a few other situations where something like
 RequireOneOf: would be useful.)


Requiring users to load packages which could be loaded automatically
seems to go against ease of use.  It's just one more thing that they
have to remember to do.

It really should be possible to write a batteries-included package
while leveraging off of other packages.
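
For concreteness, the explicit style Paul describes might look like the following in an examples section. This is only a sketch: 'datasets' (which ships with R) stands in for a hypothetical data-only package listed under Suggests, and BOD stands in for its example data.

```r
# 'datasets' is a stand-in for a hypothetical data-only package listed
# under Suggests; BOD stands in for the example data it would provide.
if (requireNamespace("datasets", quietly = TRUE)) {
  bod <- datasets::BOD                    # explicit about where the data comes from
  fit <- lm(demand ~ Time, data = bod)    # the "real" example code
  print(coef(fit))
}
```

The cost is the one noted above: the user has to know about, and spell out, the extra package.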


Re: [Rd] Huge performance difference between implicit and explicit print

2013-10-30 Thread Gabor Grothendieck

On Wed, Oct 30, 2013 at 6:22 PM, Hadley Wickham h.wick...@gmail.com wrote:
 Hi all,

 Can anyone help me understand why an implicit print (i.e. just typing
 df at the console), is so much slower than an explicit print (i.e.
 print(df)) in the example below?  I see the difference in both Rstudio
 and in a terminal.

 # Construct large df as quickly as possible
 dummy <- 1:18e6
 df <- lapply(1:10, function(x) dummy)
 names(df) <- letters[1:10]
 class(df) <- c("myobj", "data.frame")
 attr(df, "row.names") <- .set_row_names(18e6)

 print.myobj <- function(x, ...) {
   print.data.frame(head(x, 2))
 }

 start <- proc.time(); df; flush.console(); proc.time() - start
 #  user  system elapsed
 # 0.408   0.557   0.965
 start <- proc.time(); print(df); flush.console(); proc.time() - start
 #  user  system elapsed
 # 0.019   0.002   0.020

If I change print(df) to print.data.frame(df) it hangs.
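
A much smaller sketch of what that change does: print(df) dispatches to print.myobj(), which only prints head(x, 2), whereas calling print.data.frame(df) directly bypasses that method and prints every row. The 1000-row size below is an arbitrary choice for illustration, and this does not attempt to reproduce the implicit/explicit timing difference.

```r
# Small stand-in for the large data frame in the quoted example
df <- data.frame(a = 1:1000, b = 1:1000)
class(df) <- c("myobj", "data.frame")

print.myobj <- function(x, ...) {
  print.data.frame(head(x, 2))
  invisible(x)
}

# print() dispatches on the first class, so only two rows are printed
short <- capture.output(print(df))
# calling the data.frame method directly skips print.myobj entirely
full <- capture.output(print.data.frame(df))

print(length(short))  # header line plus two rows
print(length(full) > length(short))
```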

R version 3.0.2 Patched (2013-10-06 r64031) -- "Frisbee Sailing"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)


Re: [Rd] Multivariate time series in R 3 vs R 2

2013-10-26 Thread Gabor Grothendieck

On Wed, Oct 23, 2013 at 2:56 PM, Андрей Парамонов cmr.p...@gmail.com wrote:
 Hello!

 Recently I got a report that my package mar1s doesn't pass checks any more on
 R 3.0.2. I started to investigate and found the following difference in
 multivariate time series handling in R 3.0.2 compared to R 2 (I've checked
 on 2.14.0).

 Suppose I wish to calculate seasonal component for time series. In case of
 multivariate time series, I wish to process each column independently. Let
 f be a simple (trivial) model of seasonal component:

 f <- function(x)
   return(ts(rep(0, length(x)), start = 0, frequency = frequency(x)))

 In previous versions of R, I used the following compact and efficient
 expression to calculate seasonal component:

 y <- do.call(cbind, lapply(x, f))

 It worked equally well for univariate and multivariate time series:

 R.Version()$version.string
 [1] "R version 2.14.0 (2011-10-31)"
 t <- ts(1:10, start = 100, frequency = 10)

 x <- t
 y <- do.call(cbind, lapply(x, f))
 y
 Time Series:
 Start = c(0, 1)
 End = c(0, 10)
 Frequency = 10
  [1] 0 0 0 0 0 0 0 0 0 0

 x <- cbind(t, t)
 y <- do.call(cbind, lapply(x, f))
 y
 Time Series:
 Start = c(0, 1)
 End = c(0, 10)
 Frequency = 10
     t t
 0.0 0 0
 0.1 0 0
 0.2 0 0
 0.3 0 0
 0.4 0 0
 0.5 0 0
 0.6 0 0
 0.7 0 0
 0.8 0 0
 0.9 0 0

 But in version 3, I get some frustrating results:

 R.Version()$version.string
 [1] "R version 3.0.2 (2013-09-25)"
 t <- ts(1:10, start = 100, frequency = 10)

 x <- t
 y <- do.call(cbind, lapply(x, f))
 y
 Time Series:
 Start = 0
 End = 0
 Frequency = 1
   structure(0, .Tsp = c(0, 0, 1), class = "ts")
 0 0
   structure(0, .Tsp = c(0, 0, 1), class = "ts")
 0 0
   structure(0, .Tsp = c(0, 0, 1), class = "ts")
 0 0
   structure(0, .Tsp = c(0, 0, 1), class = "ts")
 0 0
   structure(0, .Tsp = c(0, 0, 1), class = "ts")
 0 0
   structure(0, .Tsp = c(0, 0, 1), class = "ts")
 0 0
   structure(0, .Tsp = c(0, 0, 1), class = "ts")
 0 0
   structure(0, .Tsp = c(0, 0, 1), class = "ts")
 0 0
   structure(0, .Tsp = c(0, 0, 1), class = "ts")
 0 0
   structure(0, .Tsp = c(0, 0, 1), class = "ts")
 0 0



I get the same results in R-2.14.0 and R-3.0.2.  They both give the
result shown above with the structures in the output.  I used
R version 2.14.0 (2011-10-31).

Try starting a clean session in R 2.14.0 using:

R --vanilla

and try it again.
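
Independently of which R version is in use, lapply() over a ts object visits its individual elements rather than its columns, which is where the one-observation series in the output come from. One way to process columns explicitly is sketched below; the helper name by_column() is made up for this illustration, and f is the trivial model from the quoted message.

```r
# f is the trivial seasonal model from the quoted message
f <- function(x)
  ts(rep(0, length(x)), start = 0, frequency = frequency(x))

# Hypothetical helper: apply f to each column of a multivariate series,
# or to the series itself when it is univariate.
by_column <- function(x, f) {
  if (is.matrix(x)) {
    cols <- lapply(seq_len(ncol(x)), function(j) f(x[, j]))
    names(cols) <- colnames(x)   # keep the original column names
    do.call(cbind, cols)
  } else {
    f(x)
  }
}

tt <- ts(1:10, start = 100, frequency = 10)
y1 <- by_column(tt, f)             # univariate input
y2 <- by_column(cbind(tt, tt), f)  # two-column input
print(dim(y2))
```

Naming the list elements before cbind() also avoids the deparsed structure(...) column names seen in the output above.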


Re: [Rd] Question about selective importing of package functions...

2013-10-20 Thread Gabor Grothendieck

On Sun, Oct 20, 2013 at 4:49 PM, Duncan Murdoch
murdoch.dun...@gmail.com wrote:
 On 13-10-20 4:43 PM, Jonathan Greenberg wrote:

 I'm working on an update for my CRAN package spatial.tools and I noticed
 a new warning when running R CMD CHECK --as-cran:

 * checking CRAN incoming feasibility ... NOTE
 Maintainer: 'Jonathan Asher Greenberg spatial-to...@estarcion.net'
 Depends: includes the non-default packages:
'sp' 'raster' 'rgdal' 'mmap' 'abind' 'parallel' 'foreach'
'doParallel' 'rgeos'
 Adding so many packages to the search path is excessive
 and importing selectively is preferable.

 Is this a warning that would need to be fixed pre-CRAN (not really sure
 how, since I need functions from all of those packages)?  Is there a way
 to import only a single function from a package, if that function is a
 dependency?


 You really want to use imports.  Those are defined in the NAMESPACE file;
 you can import everything from a package if you want, but the best style is
 in fact to just import exactly what you need.  This is more robust than
 using Depends, and it doesn't add so much to the user's search path, so it's
 less likely to break something else (e.g. by putting a package on the path
 that masks some function the user already had there.)

That may answer the specific case of the poster, but how does one
handle the case where one wants the user to be able to access the
functions in the dependent package?

For example, sqldf depends on gsubfn, which provides fn, which is used
with sqldf to perform substitutions in the SQL string.

library(sqldf)
tt <- 3
fn$sqldf("select * from BOD where Time > $tt")

I don't want to ask the user to tediously issue a library(gsubfn) too, since
fn is frequently needed and for literally years this has not been necessary.
Also, I don't want to duplicate fn's code in sqldf since that makes the whole
thing less modular: it would imply having to change fn in two places
if anything in fn changed.
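
One pattern that might cover this case is to import the object and re-export it from the importing package's own NAMESPACE. The fragment below is only a sketch of that idea, not sqldf's actual NAMESPACE file, and it assumes gsubfn is moved to Imports: in DESCRIPTION; whether it suits fn in practice would need checking.

```r
# NAMESPACE fragment for a package that wants to expose an imported object
# (illustrative only; not taken from sqldf)
importFrom(gsubfn, fn)   # import just what is needed, as suggested above
export(sqldf)            # the package's own function
export(fn)               # re-export so users can write fn$sqldf(...) after
                         # library(sqldf) alone
```

This keeps a single copy of fn's code in gsubfn while avoiding a second library() call for the user.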
