Re: [R] Unexpected behaviour as.data.frame

2011-05-16 Thread Ivan Calandra
I feel like I'm always asking this type of question, but would it be
possible to add a base function that creates an empty data.frame, as
matrix() does?

What I mean would be something like:
create.data.frame(number_of_columns, mode_of_columns).
I think it would make things easier than creating one or several
matrices and then combining them.

Is it possible; does it make sense?
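
For illustration, a function along the lines described above could be sketched in a few lines of base R. The name create_data_frame, its n_rows argument, and the V1, V2, ... column names are assumptions made for this example; nothing by that name exists in base R.

```r
# Hypothetical helper: n_cols columns of the given mode(s), n_rows rows.
# `modes` is recycled to length n_cols, much as matrix() recycles its data.
create_data_frame <- function(n_cols, modes = "numeric", n_rows = 0) {
  modes <- rep(modes, length.out = n_cols)
  # vector(mode, length) returns a zero-filled (or ""-filled) vector
  cols <- lapply(modes, function(m) vector(mode = m, length = n_rows))
  names(cols) <- paste0("V", seq_len(n_cols))
  as.data.frame(cols, stringsAsFactors = FALSE)
}

# Three columns; the two modes are recycled to integer, character, integer
d <- create_data_frame(3, c("integer", "character"), n_rows = 10)
str(d)
```

Passing stringsAsFactors = FALSE explicitly keeps character columns as character on R versions where the default was still TRUE.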

Ivan

On 5/15/2011 22:17, Bert Gunter wrote:

Inline below.

On Sun, May 15, 2011 at 11:11 AM, Jan van der Laan rh...@eoos.dds.nl wrote:

Thanks. I also noticed myself minutes after sending my message to the list.
My 'please ignore my question it was just a stupid typo' message was sent
with the wrong account and is now awaiting moderation.

However, my other question still stands: what is the
preferred/fastest/simplest way to create a data.frame with given column types
and dimensions?

I do not know, but why is simply

data.frame(numeric(10), character(10), integer(10), stringsAsFactors=FALSE)

not acceptable? Note that if you had, say, 500 numeric (= double) and
100 character columns to add, you might do something like:

z <- matrix(numeric(5000), nrow=10)
u <- matrix(character(1000), nrow=10)
frm <- data.frame(z, u, stringsAsFactors = FALSE) ## 600 columns

While this might save some typing, it may not be much more efficient
than typing it all out -- maybe just some parsing time is saved. You
can experiment and see.
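
One way to run that experiment is sketched below. The helper names via_matrix and via_list and the repeat count of 20 are arbitrary choices for this illustration. Timings depend on the machine and R version, so none are shown here.

```r
n_rows <- 10

# Approach 1: build two matrices and let data.frame() split them into columns
via_matrix <- function() {
  z <- matrix(numeric(500 * n_rows), nrow = n_rows)
  u <- matrix(character(100 * n_rows), nrow = n_rows)
  data.frame(z, u, stringsAsFactors = FALSE)
}

# Approach 2: build a named list of column vectors, then convert it
via_list <- function() {
  cols <- c(replicate(500, numeric(n_rows), simplify = FALSE),
            replicate(100, character(n_rows), simplify = FALSE))
  names(cols) <- paste0("V", seq_along(cols))
  as.data.frame(cols, stringsAsFactors = FALSE)
}

# Repeat each construction a few times so the elapsed time is measurable
t_matrix <- system.time(for (i in 1:20) frm1 <- via_matrix())
t_list   <- system.time(for (i in 1:20) frm2 <- via_list())
print(t_matrix)
print(t_list)
```

Both approaches should produce a 10 x 600 data.frame with 500 numeric and 100 character columns; only the construction cost differs.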

However, since a data.frame **is** a list with added attributes and a
great deal of the work of the constructor is in constructing and
checking these attributes (e.g. row and column names), I see nothing
terribly inefficient with what you did. It's just a bit obscure.  But
maybe someone with greater expertise will set us both straight.
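
The point that a data.frame is a list with added attributes can be seen directly. The sketch below sets the attributes by hand, which skips the checks the constructor performs; it is meant only to illustrate the structure, and the column names a, b and c are made up for the example.

```r
# Start from a plain named list of equal-length vectors
cols <- list(a = integer(10), b = character(10), c = double(10))

# A data.frame is that list plus a row.names attribute and a class attribute
df <- cols
attr(df, "row.names") <- seq_len(10)
class(df) <- "data.frame"

is.data.frame(df)
is.list(df)
str(df)
```

Because no validation happens here, columns of unequal length would go unnoticed, so this is not a general replacement for data.frame() or as.data.frame().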

Cheers,
Bert



Regards,
Jan


On 05/15/2011 04:43 PM, Bert Gunter wrote:

In your post, you're missing the final s on the stringsAsFactors
argument in the d1 assignment. When I typed it correctly, it works as
expected.

-- Bert

On Sun, May 15, 2011 at 4:25 AM, Jan van der Laan rh...@eoos.dds.nl wrote:

I use the following code to create two data.frames d1 and d2 from a list:
types  <- c("integer", "character", "double")
nlines <- 10
d1 <- as.data.frame(lapply(types, do.call, list(nlines)),
                    stringsAsFactor=FALSE)
l2 <- lapply(types, do.call, list(nlines))
d2 <- as.data.frame(l2, stringsAsFactors=FALSE)

I would expect d1 and d2 to be the same, however, in d1 the second column
is a factor while in d2 it is a character (which I would expect):


str(d1)

'data.frame':   10 obs. of  3 variables:
 $ c.0L..0L..0L..0L..0L..0L..0L..0L..0L..0L.: int  0 0 0 0 0 0 0 0 0 0
 $ c: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1
 $ c.0..0..0..0..0..0..0..0..0..0.  : num  0 0 0 0 0 0 0 0 0 0

str(d2)

'data.frame':   10 obs. of  3 variables:
 $ c.0L..0L..0L..0L..0L..0L..0L..0L..0L..0L.: int  0 0 0 0 0 0 0 0 0 0
 $ c: chr  "" "" "" "" ...
 $ c.0..0..0..0..0..0..0..0..0..0.  : num  0 0 0 0 0 0 0 0 0 0


As a different but related question: I use the commands above to create an
'empty' data.frame with specified column types and dimensions. I need this
data.frame to pass on to my C++ routines. Is there a simpler or more elegant
way of creating this data.frame?

Regards,

Jan


PS:
I am running R on 64 bit Ubuntu 11.04:


sessionInfo()

R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
 [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.










--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php



Re: [R] Unexpected behaviour as.data.frame

2011-05-16 Thread Santosh Srinivas
Hi Ivan, take a look at dataFrame in R.utils ... is that what you want?

From the help file:

Examples

  df <- dataFrame(colClasses=c(a="integer", b="double"), nrow=10)
  df[,1] <- sample(1:nrow(df))
  df[,2] <- rnorm(nrow(df))
  print(df)

Thanks,
Santosh


Re: [R] Unexpected behaviour as.data.frame

2011-05-16 Thread Ivan Calandra

Thanks Santosh!
The more I learn about R.utils, the more I think that many of its 
functions should be included in the base distribution.

Ivan


Re: [R] Unexpected behaviour as.data.frame

2011-05-16 Thread Ivan Calandra
Actually, what would be even better would be an extra argument to 
specify the column names.
I don't think it's very difficult to implement and it would make things 
even easier.

Ivan


Re: [R] Unexpected behaviour as.data.frame

2011-05-16 Thread Ivan Calandra

Forget this last email, I overlooked the implementation in the examples...
Ivan



Re: [R] Unexpected behaviour as.data.frame

2011-05-16 Thread Jan van der Laan

Santosh, Ivan,

This is also what I was looking for. Thanks. Looking at the source of
dataFrame.default, it seems that it uses the same approach as I did:
first create a list, then a data.frame from that list. I think I'll stick
with the code I already had, as I don't want another dependency (multiple,
actually, for R.utils). But thanks again for pointing it out.
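
A dependency-free version of that list-first approach, with column names supplied up front, might look like the sketch below. It uses only base R; the column names id, label and value are invented for the example. Naming the elements of types is what avoids auto-generated column names such as c.0L..0L..0L.

```r
types  <- c(id = "integer", label = "character", value = "double")
nlines <- 10

# Step 1: a named list of zero-filled vectors, one per column.
# do.call("integer", list(10)) is the same as integer(10), and so on.
cols <- lapply(types, do.call, list(nlines))

# Step 2: turn the list into a data.frame, keeping character columns as is.
d <- as.data.frame(cols, stringsAsFactors = FALSE)
str(d)
```

lapply() carries the names of types over to the list, so the data.frame picks them up without a separate names() assignment.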


Jan


Re: [R] Unexpected behaviour as.data.frame

2011-05-15 Thread Jan van der Laan
Forget I asked. There was a typo in my example (stringsAsFactor
instead of stringsAsFactors) which explained the difference. My
apologies.
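
Why the misspelled argument raised no error can be illustrated with a toy function. The function f below is hypothetical; it only mimics the shape of a signature in which stringsAsFactors comes after `...`, which is an assumption about how the as.data.frame method for lists is declared. In R, arguments placed after `...` must be named in full, so a shortened or misspelled name is collected into `...` and the default stays in effect.

```r
# Toy function: stringsAsFactors is declared after `...`
f <- function(x, ..., stringsAsFactors = TRUE) {
  list(stringsAsFactors = stringsAsFactors,
       swallowed = names(list(...)))
}

typo    <- f(1, stringsAsFactor = FALSE)   # final "s" missing
correct <- f(1, stringsAsFactors = FALSE)

typo$stringsAsFactors     # TRUE: the default was used
typo$swallowed            # "stringsAsFactor": it ended up in `...`
correct$stringsAsFactors  # FALSE
```

No error or warning is raised in the typo case because any named argument is a legal member of `...`.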


My second question, however, still stands: How does one create a
data.frame with given column types and given dimensions? Thanks.


Regards,
Jan




[R] Unexpected behaviour as.data.frame

2011-05-15 Thread Jan van der Laan

I use the following code to create two data.frames d1 and d2 from a list:

types  <- c("integer", "character", "double")
nlines <- 10
d1 <- as.data.frame(lapply(types, do.call, list(nlines)),
                    stringsAsFactor=FALSE)

l2 <- lapply(types, do.call, list(nlines))
d2 <- as.data.frame(l2, stringsAsFactors=FALSE)

I would expect d1 and d2 to be the same, however, in d1 the second  
column is a factor while in d2 it is a character (which I would expect):



str(d1)

'data.frame':   10 obs. of  3 variables:
 $ c.0L..0L..0L..0L..0L..0L..0L..0L..0L..0L.: int  0 0 0 0 0 0 0 0 0 0
 $ c: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1
 $ c.0..0..0..0..0..0..0..0..0..0.  : num  0 0 0 0 0 0 0 0 0 0

str(d2)

'data.frame':   10 obs. of  3 variables:
 $ c.0L..0L..0L..0L..0L..0L..0L..0L..0L..0L.: int  0 0 0 0 0 0 0 0 0 0
 $ c: chr  "" "" "" "" ...
 $ c.0..0..0..0..0..0..0..0..0..0.  : num  0 0 0 0 0 0 0 0 0 0


A different but related question: I use the commands above to create
an 'empty' data.frame with specified column types and dimensions. I
need this data.frame to pass on to my C++ routines. Is there a
simpler or more elegant way of creating this data.frame?


Regards,

Jan


PS:
I am running R on 64 bit Ubuntu 11.04:


sessionInfo()

R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base



Re: [R] Unexpected behaviour as.data.frame

2011-05-15 Thread Bert Gunter
In your post, you're missing the final s on the stringsAsFactors
argument in the d1 assignment. When I typed it correctly, it works as
expected.
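
The reason the typo goes unnoticed can be sketched as follows, assuming
that stringsAsFactors sits after `...` in the method's argument list (as
the as.data.frame help page suggests). The function f below is
hypothetical and only mimics that layout; it is not the actual
as.data.frame code. Arguments that follow `...` must be spelled out in
full, so the misspelled name is quietly collected by `...` and the
default value stays in effect:

```r
# Hypothetical stand-in with the same argument layout: a named argument
# placed after `...` is only matched when its name is given exactly.
f <- function(x, ..., stringsAsFactors = TRUE) {
  extra <- list(...)
  list(stringsAsFactors = stringsAsFactors,  # value actually used
       swallowed = names(extra))             # names absorbed by `...`
}

# Misspelled name (missing the final "s"): no error, no warning.
res <- f(1, stringsAsFactor = FALSE)
res$stringsAsFactors  # still the default
res$swallowed         # the misspelled argument ended up in `...`
```

Note that from R 4.0.0 onwards the default for stringsAsFactors is
FALSE, so the factor column reported in this thread would not appear
there even with the typo.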

-- Bert

-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://devo.gene.com/groups/devo/depts/ncb/home.shtml



Re: [R] Unexpected behaviour as.data.frame

2011-05-15 Thread Jan van der Laan
Thanks. I also noticed this myself minutes after sending my message to
the list. My 'please ignore my question, it was just a stupid typo'
message was sent with the wrong account and is now awaiting moderation.

However, my other question still stands: what is the
preferred/fastest/simplest way to create a data.frame with given column
types and dimensions?


Regards,
Jan



Re: [R] Unexpected behaviour as.data.frame

2011-05-15 Thread Bert Gunter
Inline below.

On Sun, May 15, 2011 at 11:11 AM, Jan van der Laan rh...@eoos.dds.nl wrote:
> Thanks. I also noticed this myself minutes after sending my message to
> the list. My 'please ignore my question, it was just a stupid typo'
> message was sent with the wrong account and is now awaiting moderation.
>
> However, my other question still stands: what is the
> preferred/fastest/simplest way to create a data.frame with given column
> types and dimensions?

I do not know, but why is simply

data.frame(numeric(10), character(10), integer(10), stringsAsFactors=FALSE)

not acceptable? Note that if you had, say, 500 numeric (= double) and
100 character columns to add, you might do something like:

z <- matrix(numeric(5000), nrow=10)
u <- matrix(character(1000), nrow=10)
frm <- data.frame(z, u, stringsAsFactors = FALSE) ## 600 columns

While this might save some typing, it may not be much more efficient
than typing it all out -- maybe just some parsing time is saved. You
can experiment and see.
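
One way to run that experiment is sketched below. The helper names
by_column and by_matrix, the column names and the repetition count of 20
are arbitrary choices for illustration; the timings depend on the
machine and the R version, so no particular outcome is implied.

```r
n_rows <- 10

# Approach 1: build every column as its own vector, then call data.frame().
by_column <- function() {
  cols <- c(replicate(500, numeric(n_rows), simplify = FALSE),
            replicate(100, character(n_rows), simplify = FALSE))
  names(cols) <- paste0("X", seq_along(cols))
  data.frame(cols, stringsAsFactors = FALSE)
}

# Approach 2: build one matrix per column type and combine them.
by_matrix <- function() {
  z <- matrix(numeric(500 * n_rows), nrow = n_rows)
  u <- matrix(character(100 * n_rows), nrow = n_rows)
  data.frame(z, u, stringsAsFactors = FALSE)
}

a <- by_column()
b <- by_matrix()

# Repeat each construction a few times and compare elapsed seconds.
t_column <- system.time(for (i in 1:20) by_column())[["elapsed"]]
t_matrix <- system.time(for (i in 1:20) by_matrix())[["elapsed"]]
print(c(by_column = t_column, by_matrix = t_matrix))
```

Both approaches produce a data.frame with 10 rows and 600 columns; only
the construction path differs.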

However, since a data.frame **is** a list with added attributes and a
great deal of the work of the constructor is in constructing and
checking these attributes (e.g. row and column names), I see nothing
terribly inefficient with what you did. It's just a bit obscure.  But
maybe someone with greater expertise will set us both straight.
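
To illustrate the "list with added attributes" point, here is a rough
sketch that builds the data.frame by setting the attributes by hand. The
function name make_empty_df and the V1, V2, ... column names are
hypothetical. Because this skips all the checks that data.frame()
performs, it should only be used when the columns are known to have the
same length.

```r
# Hypothetical helper: 'types' is a character vector of modes such as
# "integer", "character" or "double"; 'nrows' is the number of rows.
make_empty_df <- function(types, nrows) {
  # One zero-filled (or empty-string) vector per requested type.
  cols <- lapply(types, vector, length = nrows)
  names(cols) <- paste0("V", seq_along(cols))
  # A data.frame is a named list carrying row names and a class attribute.
  attr(cols, "row.names") <- seq_len(nrows)
  class(cols) <- "data.frame"
  cols
}

df <- make_empty_df(c("integer", "character", "double"), 10)
str(df)
```

Whether this is noticeably faster than calling data.frame() directly
would need to be measured for the column counts in question.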

Cheers,
Bert


