Re: [R] data.frame() versus as.data.frame() applied to a matrix.

2019-02-05 Thread William Dunlap via R-help
I think of the methods of as.data.frame as helper functions for
data.frame and don't usually call as.data.frame directly.  data.frame()
will call as.data.frame for each of its arguments and then put the
results together into one big data.frame.

> for(method in c("as.data.frame.list", "as.data.frame.character",
+                 "as.data.frame.integer", "as.data.frame.numeric",
+                 "as.data.frame.matrix"))
+     trace(method, quote(str(x)))
Tracing function "as.data.frame.list" in package "base"
Tracing function "as.data.frame.character" in package "base"
Tracing function "as.data.frame.integer" in package "base"
Tracing function "as.data.frame.numeric" in package "base"
Tracing function "as.data.frame.matrix" in package "base"
> d <- data.frame(Mat=cbind(m1=11:12,M2=13:14), Num=c(15.5,16.6),
+                 Int=17:18, List=list(L1=19:20,L2=c(20.2,21.2)))
Tracing as.data.frame.matrix(x[[i]], optional = TRUE) on entry
 int [1:2, 1:2] 11 12 13 14
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:2] "m1" "M2"
Tracing as.data.frame.numeric(x[[i]], optional = TRUE) on entry
 num [1:2] 15.5 16.6
Tracing as.data.frame.integer(x[[i]], optional = TRUE) on entry
 int [1:2] 17 18
Tracing as.data.frame.list(x[[i]], optional = TRUE, stringsAsFactors =
stringsAsFactors) on entry
List of 2
 $ L1: int [1:2] 19 20
 $ L2: num [1:2] 20.2 21.2
Tracing as.data.frame.integer(x[[i]], optional = TRUE) on entry
 int [1:2] 19 20
Tracing as.data.frame.numeric(x[[i]], optional = TRUE) on entry
 num [1:2] 20.2 21.2
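
Putting the pieces together is also the step where the final column
names are formed.  A small sketch, assuming the same call as above but
without the tracing: a matrix or list argument that is supplied with a
name contributes that name as a prefix to its own column names.

```r
# Same call as in the traced example above.
d <- data.frame(Mat  = cbind(m1 = 11:12, M2 = 13:14),
                Num  = c(15.5, 16.6),
                Int  = 17:18,
                List = list(L1 = 19:20, L2 = c(20.2, 21.2)))

# The matrix and list arguments are split into one column per component,
# each named <argument name>.<component name>.
names(d)
dim(d)
```

The tracing set up earlier can be removed again with untrace() on the
same function names.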

If I recall correctly, that is how S did things and Splus tried to use
something like as.data.frameAux for the name of the helper function to
avoid some of the frustration you describe.
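
The name difference in the original question fits this picture.  As far
as I can tell from the documentation, data.frame() checks the assembled
names at the end (its check.names argument, TRUE by default), while
as.data.frame() applied to a matrix keeps existing column names as they
are.  A minimal sketch, reusing X from the question:

```r
set.seed(42)
X <- matrix(runif(40), 10, 4)
colnames(X) <- c("a", "b", "a:x", "b:x")

n_as    <- names(as.data.frame(X))                    # names kept as supplied
n_df    <- names(data.frame(X))                       # names run through make.names()
n_nochk <- names(data.frame(X, check.names = FALSE))  # check switched off: names kept

n_as
n_df
n_nochk
```

So data.frame(X, check.names = FALSE) is one way to get the same names
from both routes.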

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Tue, Feb 5, 2019 at 2:22 PM Rolf Turner wrote:

>
> Consider the following:
>
> set.seed(42)
> X <- matrix(runif(40),10,4)
> colnames(X) <- c("a","b","a:x","b:x") # Imitating the output
># of model.matrix().
> D1 <- as.data.frame(X)
> D2 <- data.frame(X)
> names(D1)
> [1] "a"   "b"   "a:x" "b:x"
> names(D2)
> [1] "a"   "b"   "a.x" "b.x"
>
> The names of D2 are syntactically valid; those of D1 are not.
>
> Why should I have expected this phenomenon? :-)
>
> The as.data.frame() syntax seems to me much more natural for converting
> a matrix to a data frame, yet it doesn't get it quite right, sometimes,
> in respect of the names.
>
> Is there some reason that as.data.frame() does not apply make.names()?
> Or was this just an oversight?
>
> cheers,
>
> Rolf Turner
>
> --
> Honorary Research Fellow
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] data.frame() versus as.data.frame() applied to a matrix.

2019-02-05 Thread Richard M. Heiberger
To me the interesting difference between matrix() and as.matrix() is
that as.matrix() retains the names of its argument as the row names of
the result.
> tmp <- structure(1:3, names=letters[1:3])
> tmp
a b c
1 2 3
> matrix(tmp)
     [,1]
[1,]    1
[2,]    2
[3,]    3
> as.matrix(tmp)
  [,1]
a    1
b    2
c    3
>

On Tue, Feb 5, 2019 at 6:53 PM Rolf Turner wrote:
>
>
> On 2/6/19 12:27 PM, Jeff Newmiller wrote:
>
> > I have no idea about "why it is this way" but there are many cases
> > where I would rather have to use backticks around
> > syntactically-invalid names than deal with arbitrary rules for
> > mapping column names as they were supplied to column names as R wants
> > them to be. From that perspective, making the conversion function
> > leave the names alone and limit the name-mashing to one function
> > sounds great to me. You can always call make.names yourself.
>
> Fair enough.  My real problem was getting ambushed by the fact that
> *different* names arise depending on whether one uses data.frame(X)
> or as.data.frame(X).  I'll spare you the details. :-)
>
> cheers,
>
> Rolf
>
> >
> > On February 5, 2019 2:22:24 PM PST, Rolf Turner wrote:
> >>
> >> Consider the following:
> >>
> >> set.seed(42)
> >> X <- matrix(runif(40),10,4)
> >> colnames(X) <- c("a","b","a:x","b:x") # Imitating the output
> >>                                       # of model.matrix().
> >> D1 <- as.data.frame(X)
> >> D2 <- data.frame(X)
> >> names(D1)
> >> [1] "a"   "b"   "a:x" "b:x"
> >> names(D2)
> >> [1] "a"   "b"   "a.x" "b.x"
> >>
> >> The names of D2 are syntactically valid; those of D1 are not.
> >>
> >> Why should I have expected this phenomenon? :-)
> >>
> >> The as.data.frame() syntax seems to me much more natural for
> >> converting a matrix to a data frame, yet it doesn't get it quite
> >> right, sometimes, in respect of the names.
> >>
> >> Is there some reason that as.data.frame() does not apply
> >> make.names()? Or was this just an oversight?
>



Re: [R] data.frame() versus as.data.frame() applied to a matrix.

2019-02-05 Thread Rolf Turner



On 2/6/19 12:27 PM, Jeff Newmiller wrote:


> I have no idea about "why it is this way" but there are many cases
> where I would rather have to use backticks around
> syntactically-invalid names than deal with arbitrary rules for
> mapping column names as they were supplied to column names as R wants
> them to be. From that perspective, making the conversion function
> leave the names alone and limit the name-mashing to one function
> sounds great to me. You can always call make.names yourself.


Fair enough.  My real problem was getting ambushed by the fact that
*different* names arise depending on whether one uses data.frame(X)
or as.data.frame(X).  I'll spare you the details. :-)

cheers,

Rolf



> On February 5, 2019 2:22:24 PM PST, Rolf Turner wrote:
>
>> Consider the following:
>>
>> set.seed(42)
>> X <- matrix(runif(40),10,4)
>> colnames(X) <- c("a","b","a:x","b:x") # Imitating the output
>>                                       # of model.matrix().
>> D1 <- as.data.frame(X)
>> D2 <- data.frame(X)
>> names(D1)
>> [1] "a"   "b"   "a:x" "b:x"
>> names(D2)
>> [1] "a"   "b"   "a.x" "b.x"
>>
>> The names of D2 are syntactically valid; those of D1 are not.
>>
>> Why should I have expected this phenomenon? :-)
>>
>> The as.data.frame() syntax seems to me much more natural for
>> converting a matrix to a data frame, yet it doesn't get it quite
>> right, sometimes, in respect of the names.
>>
>> Is there some reason that as.data.frame() does not apply
>> make.names()? Or was this just an oversight?




Re: [R] data.frame() versus as.data.frame() applied to a matrix.

2019-02-05 Thread Jeff Newmiller
I have no idea about "why it is this way" but there are many cases where I 
would rather have to use backticks around syntactically-invalid names than deal 
with arbitrary rules for mapping column names as they were supplied to column 
names as R wants them to be. From that perspective, making the conversion 
function leave the names alone and limit the name-mashing to one function 
sounds great to me. You can always call make.names yourself.
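
To make both options concrete, here is a minimal sketch.  The small
matrix is made up for illustration (it is not the X from the question),
and the column names are the ones from the original post.

```r
# Hypothetical 2 x 4 matrix with the column names from the question.
X <- matrix(1:8, 2, 4, dimnames = list(NULL, c("a", "b", "a:x", "b:x")))

# Option 1: keep the names as supplied and quote them with backticks
# (or use [[ ]]) when accessing a column.
D0 <- as.data.frame(X)
D0$`a:x`
D0[["a:x"]]

# Option 2: convert first, then call make.names() yourself.
D1 <- as.data.frame(X)
names(D1) <- make.names(names(D1), unique = TRUE)
names(D1)
```

Either way the renaming, if any, happens in one visible place rather
than as a side effect of the conversion.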

On February 5, 2019 2:22:24 PM PST, Rolf Turner wrote:
>
>Consider the following:
>
>set.seed(42)
>X <- matrix(runif(40),10,4)
>colnames(X) <- c("a","b","a:x","b:x") # Imitating the output
>   # of model.matrix().
>D1 <- as.data.frame(X)
>D2 <- data.frame(X)
>names(D1)
>[1] "a"   "b"   "a:x" "b:x"
>names(D2)
>[1] "a"   "b"   "a.x" "b.x"
>
>The names of D2 are syntactically valid; those of D1 are not.
>
>Why should I have expected this phenomenon? :-)
>
>The as.data.frame() syntax seems to me much more natural for converting
>a matrix to a data frame, yet it doesn't get it quite right, sometimes,
>in respect of the names.
>
>Is there some reason that as.data.frame() does not apply make.names()?
>Or was this just an oversight?
>
>cheers,
>
>Rolf Turner

-- 
Sent from my phone. Please excuse my brevity.



[R] data.frame() versus as.data.frame() applied to a matrix.

2019-02-05 Thread Rolf Turner



Consider the following:

set.seed(42)
X <- matrix(runif(40),10,4)
colnames(X) <- c("a","b","a:x","b:x") # Imitating the output
  # of model.matrix().
D1 <- as.data.frame(X)
D2 <- data.frame(X)
names(D1)
[1] "a"   "b"   "a:x" "b:x"
names(D2)
[1] "a"   "b"   "a.x" "b.x"

The names of D2 are syntactically valid; those of D1 are not.

Why should I have expected this phenomenon? :-)

The as.data.frame() syntax seems to me much more natural for converting
a matrix to a data frame, yet it doesn't get it quite right, sometimes,
in respect of the names.

Is there some reason that as.data.frame() does not apply make.names()?
Or was this just an oversight?

cheers,

Rolf Turner

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
