Re: [R] unfold list (variable number of columns) into a data frame

Dennis Murphy Sun, 23 Oct 2011 09:57:24 -0700

Hi:

Here's one approach:


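# Function to process a list component into a data frame
# (the first three entries are the keys; everything after them is a runtime)
ff <- function(x) {
    data.frame(time = as.integer(x[1]),
               partitioning_mode = x[2],
               workload = x[3],
               runtime = as.numeric(x[-(1:3)]),
               stringsAsFactors = FALSE)
}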

# Apply it to each element of the list:
do.call(rbind, lapply(data, ff))

or equivalently, using the plyr package,

library('plyr')
ldply(data, ff)
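
A third option, in case the list is long and building one data frame per
element becomes slow: repeat each key once per runtime value with rep() and
flatten the runtimes with unlist(). This is only a sketch; the list `demo`
and the helper `key` are made-up names for illustration, and it assumes every
element has at least three entries (the keys).

```r
# Hypothetical sample input with the same shape as the poster's list
demo <- list(c("1", "sharding", "query", "607", "85", "52"),
             c("1", "sharding", "refresh", "2932", "2870"),
             c("2", "replication", "query", "2891"))

# Number of runtime values in each element (everything after the 3 keys)
n <- sapply(demo, length) - 3

# Hypothetical helper: pull the i-th key from every element
key <- function(i) sapply(demo, function(x) x[i])

# Repeat each key once per runtime and flatten the runtimes in list order
unfolded <- data.frame(
    time              = rep(as.integer(key(1)), times = n),
    partitioning_mode = rep(key(2), times = n),
    workload          = rep(key(3), times = n),
    runtime           = as.numeric(unlist(lapply(demo, function(x) x[-(1:3)]))),
    stringsAsFactors  = FALSE
)
str(unfolded)
```

The result should have one row per runtime value, in the same order as the
rbind() version.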

# Example:
L <- list(c("1", "sharding", "query", "607", "85", "52", "79", "77",
            "67", "98"),
          c("1", "sharding", "refresh", "2932", "2870", "2877", "2868"),
          c("1", "replication", "query", "2891", "2907", "2922", "2937"))
do.call(rbind, lapply(L, ff))
   time partitioning_mode workload runtime
1     1          sharding    query     607
2     1          sharding    query      85
3     1          sharding    query      52
4     1          sharding    query      79
5     1          sharding    query      77
6     1          sharding    query      67
7     1          sharding    query      98
8     1          sharding  refresh    2932
9     1          sharding  refresh    2870
10    1          sharding  refresh    2877
11    1          sharding  refresh    2868
12    1       replication    query    2891
13    1       replication    query    2907
14    1       replication    query    2922
15    1       replication    query    2937
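
Once the data are in this long format, grouped summaries are straightforward.
A minimal sketch using aggregate() from base R; the data frame `long` below is
a made-up subset of the output above, used only so the snippet runs on its own.

```r
# Hypothetical long-format data, a subset of the output above
long <- data.frame(
    time              = c(1L, 1L, 1L, 1L, 1L, 1L),
    partitioning_mode = c("sharding", "sharding", "sharding", "sharding",
                          "replication", "replication"),
    workload          = c("query", "query", "refresh", "refresh",
                          "query", "query"),
    runtime           = c(607, 85, 2932, 2870, 2891, 2907),
    stringsAsFactors  = FALSE
)

# Mean runtime for each time / mode / workload combination
means <- aggregate(runtime ~ time + partitioning_mode + workload,
                   data = long, FUN = mean)
means
```

The same data frame can be passed directly to ggplot2, with
partitioning_mode and workload mapped to colour or facets.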

HTH,
Dennis

On Sun, Oct 23, 2011 at 8:38 AM, Giovanni Azua <[email protected]> wrote:
> Hello,
>
> I used R a lot one year ago and now I am a bit rusty :)
>
> I have my raw data, which correspond to the list of runtimes per minute
> (minutes "1", "2", "3", in two database modes, "sharding" and "replication",
> and two workload types, "query" and "refresh"), stored as a list of character
> vectors that looks like this:
>
>> str(data)
> List of 122
>  $ : chr [1:163] "1" "sharding" "query" "607" "85" "52" "79" "77" "67" "98" ...
>  $ : chr [1:313] "1" "sharding" "refresh" "2932" "2870" "2877" "2868" ...
>  $ : chr [1:57] "1" "replication" "query" "2891" "2907" "2922" "2937" ...
>  $ : chr [1:278] "1" "replication" "refresh" "79" "79" "89" "79" "89" "79" "79" "79" ...
>  $ : chr [1:163] "2" "sharding" "query" "607" "85" "52" "79" "77" "67" "98" ...
>  $ : chr [1:313] "2" "sharding" "refresh" "2932" "2870" "2877" "2868" ...
>  $ : chr [1:57] "2" "replication" "query" "2891" "2907" "2922" "2937" ...
>  $ : chr [1:278] "2" "replication" "refresh" "79" "79" "89" "79" "89" "79" "79" "79" ...
>  $ : chr [1:163] "3" "sharding" "query" "607" "85" "52" "79" "77" "67" "98" ...
>  $ : chr [1:313] "3" "sharding" "refresh" "2932" "2870" "2877" "2868" ...
>  $ : chr [1:57] "3" "replication" "query" "2891" "2907" "2922" "2937" ...
>  $ : chr [1:278] "3" "replication" "refresh" "79" "79" "89" "79" "89" "79" "79" "79" ...
>
> I would like to transform the one above into a data frame where this
> structure is unfolded in the following way:
>
> 'data.frame': N obs. of  4 variables:
>  $ time : int  1 1 1 1 1 1 1 1 1 1 1 ...
>  $ partitioning_mode : chr "sharding" "sharding" "sharding" "sharding" 
> "sharding" "sharding" "sharding" "sharding" "sharding" "sharding" ...
>  $ workload : chr "query" "query" "query" "query" "query" "query" "query" 
> "refresh" "refresh" "refresh" "refresh" ...
>  $ runtime : num  607 85 52 79 77 67 98 2932 2870 2877 2868...
>
> So instead of having an associative array (variable number of columns) it 
> should become a simple list where the group or factors are repeated for every 
> occurrence of the  specific runtime. Basically my ultimate goal is to get a 
> data frame structure that is "summarizeBy"-friendly and "ggplot2-friendly" 
> i.e. using this data frame format.
>
> Help greatly appreciated!
>
> TIA,
> Best regards,
> Giovanni
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

