Re: [R] idiom for constructing data frame

2015-04-03 Thread peter dalgaard

> On 31 Mar 2015, at 20:55 , William Dunlap wdun...@tibco.com wrote:
>
> You can use structure() to attach the names to a list that is input to
> data.frame.
> E.g.,
>
> dfNames <- c("First", "Second Name")
> data.frame(lapply(structure(dfNames, names=dfNames),
>                   function(name) rep(NA_real_, 5)))


Yes, I cooked up something similar:

names <- c("foo","bar","baz")
names(names) <- names # confuse 'em
as.data.frame(lapply(names, function(x) rep(NA_real_,10)))

but wouldn't it be more to the point to do

df <- as.data.frame(rep(list(rep(NA_real_, 10)),3))
names(df) <- names

?

The lapply() approach could be generalized to a vector of column classes, 
though. 
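
A rough sketch of what such a generalization might look like, under stated assumptions: the names col_classes and n, and the particular classes chosen, are invented for illustration and do not come from the thread. Each column is created with vector() and then overwritten with NA, so it keeps its type. Factor columns are deliberately left out.

```r
# Hypothetical input: a named character vector mapping column name -> class.
col_classes <- c(id = "integer", score = "numeric", label = "character")
n <- 10

# lapply() keeps the names of col_classes, so the list is already named.
cols <- lapply(col_classes, function(cl) {
  x <- vector(cl, n)        # zero-filled (or "") vector of the requested mode
  is.na(x) <- seq_len(n)    # replace every element with NA, keeping the type
  x
})

df <- data.frame(cols, stringsAsFactors = FALSE)
str(df)
```

Because the names travel with the list, no second naming step is needed.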

A general solution looks impracticable; once you start considering how to 
specify factor columns, each with its own level set, things get a bit out of 
hand. 

-pd

 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] idiom for constructing data frame

2015-04-03 Thread William Dunlap
> but wouldn't it be more to the point to do
>
> df <- as.data.frame(rep(list(rep(NA_real_, 10)),3))
> names(df) <- names

As a matter of personal style (and functional programming
sensibility), I prefer not to make named objects and then modify them.
Also, the names coming out of that as.data.frame call are exceedingly
ugly and I'd rather not generate them at all.

Also, adding the names after calling data.frame() can give
different results from passing them into data.frame(), which can
mangle nonsyntactic names like "Second Name" into "Second.Name".
It is often preferable, but it is different.
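
To make that difference concrete, here is a small sketch. The variable names (nms, cols, passed_in, added_after, unchecked) are invented for illustration. It builds the same two-column frame three ways and compares the resulting column names.

```r
nms  <- c("First", "Second Name")
cols <- structure(rep(list(rep(NA_real_, 5)), length(nms)), names = nms)

# Names passed in: data.frame() runs them through make.names() by default,
# so the space becomes a dot ("Second.Name").
passed_in <- data.frame(cols)
names(passed_in)

# Names assigned afterwards: `names<-` does no checking, so the space survives.
added_after <- data.frame(cols)
names(added_after) <- nms
names(added_after)

# check.names = FALSE keeps the original names in a single call.
unchecked <- data.frame(cols, check.names = FALSE)
names(unchecked)
```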



Bill Dunlap
TIBCO Software
wdunlap tibco.com



Re: [R] idiom for constructing data frame

2015-04-03 Thread Hadley Wickham
On Tue, Mar 31, 2015 at 6:42 PM, Sarah Goslee sarah.gos...@gmail.com wrote:
> On Tue, Mar 31, 2015 at 6:35 PM, Richard M. Heiberger r...@temple.edu wrote:
>> I got rid of the extra column.
>>
>> data.frame(r=seq(8), foo=NA, bar=NA, row.names="r")
>
> Brilliant!
>
> After much fussing, including a disturbing detour into nested lapply
> statements from which I barely emerged with my sanity (arguable, I
> suppose), here is a one-liner that creates a data frame of arbitrary
> number of rows given an existing data frame as template for column
> number and name:
>
> n <- 8
> df1 <- data.frame(A=runif(9), B=runif(9))
>
> do.call(data.frame, setNames(c(list(seq(n), "r"), as.list(rep(NA,
> ncol(df1)))), c("r", "row.names", colnames(df1))))
>
> It's not elegant, but it is fairly R-ish. I should probably stop
> hunting for an elegant solution now.

Given a template df, you can create a new df with subsetting:

df2 <- df1[rep(NA_real_, 8), ]
rownames(df2) <- NULL
df2

This has the added benefit of preserving the types.
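
A short sketch of the type-preservation point, using a made-up mixed-type template. The names template and blank and the column names are assumptions for illustration only. Indexing rows by NA yields all-NA rows whose columns keep the template's classes.

```r
# Hypothetical template with three different column types.
template <- data.frame(
  id    = 1:3,
  score = c(0.5, 1.5, 2.5),
  label = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Eight rows selected by NA: every cell is NA, but each column keeps its class.
blank <- template[rep(NA_integer_, 8), ]
rownames(blank) <- NULL   # drop the generated "NA", "NA.1", ... row names

sapply(blank, class)
```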

Hadley

-- 
http://had.co.nz/



Re: [R] idiom for constructing data frame

2015-04-03 Thread peter dalgaard

> On 03 Apr 2015, at 16:46 , William Dunlap wdun...@tibco.com wrote:
>
>> df <- as.data.frame(rep(list(rep(NA_real_, 10)),3))
>> names(df) <- names
>
> As a matter of personal style (and functional programming
> sensibility), I prefer not to make named objects and then modify them.
> Also, the names coming out of that as.data.frame call are exceedingly
> ugly and I'd rather not generate them at all.
 

Ah, yes, I missed the generation of intermediate names. You can name the list 
before as.data.frame, though:

l <- rep(list(rep(NA_real_, 10)),3)
names(l) <- names
as.data.frame(l)

or as a one-liner:

as.data.frame(structure(rep(list(rep(NA_real_, 10)), 3), .Names=names))

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] idiom for constructing data frame

2015-03-31 Thread Ista Zahn
You can make it as elegant as you want, e.g.,

make.empty.df <- function(nrow, ncol, names) {
  # dimnames are not recycled, so the names must match the column count exactly
  if (length(names) != ncol) stop("Length of names must equal the number of columns")
  data.frame(matrix(NA, nrow, ncol, dimnames = list(NULL, names)))
}


Best,
Ista



Re: [R] idiom for constructing data frame

2015-03-31 Thread William Dunlap
You can use structure() to attach the names to a list that is input to
data.frame.
E.g.,

dfNames <- c("First", "Second Name")
data.frame(lapply(structure(dfNames, names=dfNames),
                  function(name) rep(NA_real_, 5)))


Bill Dunlap
TIBCO Software
wdunlap tibco.com



Re: [R] idiom for constructing data frame

2015-03-31 Thread Sarah Goslee
Hi,

Duncan Murdoch suggested:

> The matrix() function has a dimnames argument, so you could do this:
>
> names <- c("strat", "id", "pid")
> data.frame(matrix(NA, nrow=10, ncol=3, dimnames=list(NULL, names)))

That's a definite improvement, thanks. But no way to skip matrix()? It
just seems unRlike, although since it's only full of NA values there
are no coercion issues with column types or anything, so it doesn't
hurt. It's just inelegant. :)
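
For what it's worth, a small sketch of what "no coercion issues" means in practice here. The variable names nms and df are chosen only for this example. A matrix of plain NA is logical, so every column starts out logical and is quietly promoted on first assignment.

```r
nms <- c("strat", "id", "pid")
df  <- data.frame(matrix(NA, nrow = 10, ncol = 3, dimnames = list(NULL, nms)))

# Every column starts as logical, because plain NA is a logical constant.
sapply(df, class)

# Assigning an integer promotes that one column; the others are untouched.
df$id[1] <- 7L
class(df$id)
class(df$strat)
```

If the final column types are known in advance, a typed constant such as NA_real_ or NA_character_ in the matrix() call avoids the silent promotion.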

Sarah
-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] idiom for constructing data frame

2015-03-31 Thread Sarah Goslee
I just snagged this from Duncan Murdoch's reply to the same question:

# Create an empty dataframe to hold the results
df <- data.frame(strat=NA, id=NA, pid=NA)[rep(1, length(sel)),]

This skips matrix(), but how to set the column names programmatically
within a function?

Sarah, still sure I'm missing something obvious




-- 
Sarah Goslee
http://www.functionaldiversity.org



[R] idiom for constructing data frame

2015-03-31 Thread Sarah Goslee
Hi folks,

I KNOW there has to be a way to do this more elegantly, but I
consistently fail to come up with it, as I was just reminded while
writing an example for a query on this list.

What's a nifty way to construct a data frame of a given size? The only
way I know of is to use matrix(), e.g.

data.frame(matrix(NA, nrow=10, ncol=3))

and then to set the colnames in a second step.

This comes up a lot when pre-allocating a data frame before using a
loop: I know the size and column names, but want an empty structure to
fill later.
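
The pre-allocate-then-fill pattern described above might look roughly like this. The column names are borrowed from later in the thread and the loop body is a placeholder computation, both assumptions made purely for illustration.

```r
n <- 10

# Step 1: allocate an n x 3 frame of numeric NA.
results <- data.frame(matrix(NA_real_, nrow = n, ncol = 3))

# Step 2: set the column names separately.
colnames(results) <- c("strat", "id", "pid")

# Fill one row per iteration; the values here are stand-ins for real work.
for (i in seq_len(n)) {
  results[i, ] <- c(i %% 2, i, i * 100)
}

head(results, 3)
```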

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] idiom for constructing data frame

2015-03-31 Thread Duncan Murdoch

On 31/03/2015 1:52 PM, Sarah Goslee wrote:

> I just snagged this from Duncan Murdoch's reply to the same question:
>
> # Create an empty dataframe to hold the results
> df <- data.frame(strat=NA, id=NA, pid=NA)[rep(1, length(sel)),]
>
> This skips matrix(), but how to set the column names programmatically
> within a function?
>
> Sarah, still sure I'm missing something obvious


The matrix() function has a dimnames argument, so you could do this:

names <- c("strat", "id", "pid")
data.frame(matrix(NA, nrow=10, ncol=3, dimnames=list(NULL, names)))

Duncan Murdoch





Re: [R] idiom for constructing data frame

2015-03-31 Thread Sarah Goslee
On Tue, Mar 31, 2015 at 6:35 PM, Richard M. Heiberger r...@temple.edu wrote:
> I got rid of the extra column.
>
> data.frame(r=seq(8), foo=NA, bar=NA, row.names="r")

Brilliant!

After much fussing, including a disturbing detour into nested lapply
statements from which I barely emerged with my sanity (arguable, I
suppose), here is a one-liner that creates a data frame of arbitrary
number of rows given an existing data frame as template for column
number and name:


n <- 8
df1 <- data.frame(A=runif(9), B=runif(9))

do.call(data.frame, setNames(c(list(seq(n), "r"), as.list(rep(NA,
ncol(df1)))), c("r", "row.names", colnames(df1))))

It's not elegant, but it is fairly R-ish. I should probably stop
hunting for an elegant solution now.
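
For readers untangling the one-liner, here is the same call unpacked into steps. The intermediate name args and the result name df2 are introduced only for this sketch. The trick is that do.call() turns a named list into named arguments, so "row.names" becomes the row.names argument of data.frame().

```r
n   <- 8
df1 <- data.frame(A = runif(9), B = runif(9))

# Argument values: a row index, the name of the column to use as row names,
# and one NA per template column.
args <- c(list(seq(n), "r"), as.list(rep(NA, ncol(df1))))

# Argument names: r = seq(n), row.names = "r", then the template's columns.
names(args) <- c("r", "row.names", colnames(df1))

# Equivalent to data.frame(r = 1:8, row.names = "r", A = NA, B = NA);
# column r is consumed as row names, and the NAs recycle to n rows.
df2 <- do.call(data.frame, args)
dim(df2)
```

Note that the resulting columns are logical rather than numeric, since they are built from plain NA.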

Thanks, everyone!

Sarah



-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] idiom for constructing data frame

2015-03-31 Thread Sven E. Templer
If you don't mind an extra column, you could use something similar to:

data.frame(r=seq(8),foo=NA,bar=NA)

If you do, here is another approach (see function body):

empty.frame <- function (r = 1, n = 1, fill = NA_real_) {
  data.frame(setNames(lapply(rep(fill, length(n)), rep, times=r), n))
}
empty.frame()
empty.frame(, seq(3))
empty.frame(8, c("foo", "bar"))

I could not put it in one line either, without retyping at least one
argument (n in this case).
So I suggest a function is the way to go for a simplified syntax ...

Thanks to all for the ideas!
Sven



Re: [R] idiom for constructing data frame

2015-03-31 Thread Richard M. Heiberger
I got rid of the extra column.

data.frame(r=seq(8), foo=NA, bar=NA, row.names="r")

Rich



Re: [R] idiom for constructing data frame

2015-03-31 Thread Henrik Bengtsson
I've got dataFrame() in R.utils for this purpose, e.g.

> df <- dataFrame(colClasses=c(a="integer", b="double", c="character"), nrow=10L)
> str(df)
'data.frame':   10 obs. of  3 variables:
 $ a: int  0 0 0 0 0 0 0 0 0 0
 $ b: num  0 0 0 0 0 0 0 0 0 0
 $ c: chr  "" "" "" "" ...

Related: You can use the colClasses() function to generate the
'colClasses' argument dynamically, e.g.

> cols <- colClasses("idc")
> names(cols) <- c("a", "b", "c")
> str(cols)
 Named chr [1:3] "integer" "double" "character"
 - attr(*, "names")= chr [1:3] "a" "b" "c"

> cols <- colClasses(sprintf("c2d%di", 4))
> df <- dataFrame(colClasses=cols, nrow=10L)
> str(df)
'data.frame':   10 obs. of  7 variables:
 $ : chr  "" "" "" "" ...
 $ : num  0 0 0 0 0 0 0 0 0 0
 $ : num  0 0 0 0 0 0 0 0 0 0
 $ : int  0 0 0 0 0 0 0 0 0 0
 $ : int  0 0 0 0 0 0 0 0 0 0
 $ : int  0 0 0 0 0 0 0 0 0 0
 $ : int  0 0 0 0 0 0 0 0 0 0


dataFrame() is basically implemented as:

dataFrame <- function(colClasses, nrow=1L, ...) {
  df <- vector("list", length=length(colClasses))
  names(df) <- names(colClasses)
  for (kk in seq(along=df)) {
    df[[kk]] <- vector(colClasses[kk], length=nrow)
  }
  attr(df, "row.names") <- seq(length=nrow)
  class(df) <- "data.frame"
  df
} # dataFrame()

/Henrik
