Ted,

Based upon your code below, you might be better off using two lapply() 
constructs to create the x and y results separately, taking advantage of 
lapply()'s built-in ability to create lists 'on the fly', while returning a 
NULL when the function will not be applied to the data based upon your test.

For example:

lapply(seq(n), function(i) if (test on ID[i]) funcX() else NULL)

and something like:

lapply(seq(n), function(i) if (test on ID[i]) do.call(rbind, funcY()) else NULL)


and then you can use the do.call() approach on the results of both.


Consider:

# Only return data if 'i' is even

Res1 <- lapply(1:5, function(i) if (i %% 2 == 0) iris[1:i, ] else NULL)

> Res1
[[1]]
NULL

[[2]]
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa

[[3]]
NULL

[[4]]
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa

[[5]]
NULL



When we use do.call() here the elements that are NULL do not result in any 
problems creating the result:

> do.call(rbind, Res1)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          5.1         3.5          1.4         0.2  setosa
4          4.9         3.0          1.4         0.2  setosa
5          4.7         3.2          1.3         0.2  setosa
6          4.6         3.1          1.5         0.2  setosa
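
One edge case is worth checking before relying on this. The sketch below (the object names 'mixed', 'combined', 'all_null' and 'empty' are only illustrative) confirms that NULL elements are dropped by rbind(), and shows that if every element is NULL, for example because no ID passed the test, the result is NULL rather than an empty data frame:

```r
# A list mixing NULLs and data frames, as lapply() would return it
mixed <- list(NULL, iris[1:2, ], NULL, iris[1:4, ])
combined <- do.call(rbind, mixed)
nrow(combined)    # 2 + 4 rows; the NULL elements contribute nothing

# If nothing passed the test, every element is NULL
all_null <- list(NULL, NULL, NULL)
empty <- do.call(rbind, all_null)
is.null(empty)    # TRUE: the result is NULL, not a zero-row data frame
```

So any code that uses the combined result may want an is.null() check first.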



Now consider the second example, where your function would return a list of 
data frames. I'll use replicate() with 'simplify = FALSE' so that the result 
within lapply() is either a single list of data frames or NULL. If the result 
would be a list of data frames, we'll use do.call() within the loop so that 
lapply() returns a single data frame rather than a list of data frames. 
Consider:


> replicate(3, iris[1:3, ], simplify = FALSE)
[[1]]
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa

[[2]]
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa

[[3]]
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa



> do.call(rbind, replicate(3, iris[1:3, ], simplify = FALSE))
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          5.1         3.5          1.4         0.2  setosa
5          4.9         3.0          1.4         0.2  setosa
6          4.7         3.2          1.3         0.2  setosa
7          5.1         3.5          1.4         0.2  setosa
8          4.9         3.0          1.4         0.2  setosa
9          4.7         3.2          1.3         0.2  setosa



So now:

Res2 <- lapply(1:5, function(i) if (i %% 2 == 0) 
                                   do.call(rbind, replicate(i, iris[1:i, ], 
                                                            simplify = FALSE)) 
                                   else NULL)

> Res2
[[1]]
NULL

[[2]]
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          5.1         3.5          1.4         0.2  setosa
4          4.9         3.0          1.4         0.2  setosa

[[3]]
NULL

[[4]]
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1           5.1         3.5          1.4         0.2  setosa
2           4.9         3.0          1.4         0.2  setosa
3           4.7         3.2          1.3         0.2  setosa
4           4.6         3.1          1.5         0.2  setosa
5           5.1         3.5          1.4         0.2  setosa
6           4.9         3.0          1.4         0.2  setosa
7           4.7         3.2          1.3         0.2  setosa
8           4.6         3.1          1.5         0.2  setosa
9           5.1         3.5          1.4         0.2  setosa
10          4.9         3.0          1.4         0.2  setosa
11          4.7         3.2          1.3         0.2  setosa
12          4.6         3.1          1.5         0.2  setosa
13          5.1         3.5          1.4         0.2  setosa
14          4.9         3.0          1.4         0.2  setosa
15          4.7         3.2          1.3         0.2  setosa
16          4.6         3.1          1.5         0.2  setosa

[[5]]
NULL


> do.call(rbind, Res2)
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1           5.1         3.5          1.4         0.2  setosa
2           4.9         3.0          1.4         0.2  setosa
3           5.1         3.5          1.4         0.2  setosa
4           4.9         3.0          1.4         0.2  setosa
5           5.1         3.5          1.4         0.2  setosa
6           4.9         3.0          1.4         0.2  setosa
7           4.7         3.2          1.3         0.2  setosa
8           4.6         3.1          1.5         0.2  setosa
9           5.1         3.5          1.4         0.2  setosa
10          4.9         3.0          1.4         0.2  setosa
11          4.7         3.2          1.3         0.2  setosa
12          4.6         3.1          1.5         0.2  setosa
13          5.1         3.5          1.4         0.2  setosa
14          4.9         3.0          1.4         0.2  setosa
15          4.7         3.2          1.3         0.2  setosa
16          4.6         3.1          1.5         0.2  setosa
17          5.1         3.5          1.4         0.2  setosa
18          4.9         3.0          1.4         0.2  setosa
19          4.7         3.2          1.3         0.2  setosa
20          4.6         3.1          1.5         0.2  setosa




So if you separate the two procedures, given that they return different data 
and structures, lapply() lets you avoid worrying about the returned data 
structure and about preallocating when you do not know in advance how many 
IDs there will be. By returning NULL when the respective function will not be 
applied based upon your test, you can still use do.call(rbind, TheList), 
since the list elements that are NULL will be ignored in the result.
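
Putting the two pieces together, here is a minimal sketch of the whole pattern. Everything in it is made up for illustration: 'samples' stands in for your per-ID data, 'min_n' for your sufficiency test, and make_x() and make_y() are hypothetical placeholders for your functions that return a data frame and a list of data frames, respectively.

```r
# Hypothetical per-ID data: IDs 2 and 4 have too few samples
set.seed(1)
samples <- list(rnorm(50), rnorm(3), rnorm(40), rnorm(2), rnorm(60))
min_n <- 10   # assumed minimum sample size for the test

# Placeholder for the function that returns one data frame
make_x <- function(id, v) {
  data.frame(ID = id, n = length(v), mean = mean(v))
}

# Placeholder for the function that returns a list of data frames
make_y <- function(id, v) {
  list(data.frame(ID = id, stat = "min", value = min(v)),
       data.frame(ID = id, stat = "max", value = max(v)))
}

ids <- seq_along(samples)

# One lapply() per result type; NULL when the test fails
x_list <- lapply(ids, function(i) {
  if (length(samples[[i]]) >= min_n) make_x(i, samples[[i]]) else NULL
})

y_list <- lapply(ids, function(i) {
  if (length(samples[[i]]) >= min_n) {
    do.call(rbind, make_y(i, samples[[i]]))   # collapse the inner list
  } else {
    NULL
  }
})

# Combine across IDs; the NULL elements are ignored
x_all <- do.call(rbind, x_list)
y_all <- do.call(rbind, y_list)
```

Here x_all has one row for each of the three IDs that pass the test and y_all has two rows for each of them, with no list created or sized before the calls to lapply().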

Does that help?

Marc



On Jul 15, 2010, at 4:32 PM, Ted Byers wrote:

> Thanks Marc
> 
> Part of the challenge here is that EVERYTHING is dynamic.  New data is being 
> added to the DB all the time.  Each active ID makes a new sample every day or 
> at a minimum every week, and new IDs are added every week.  So I can't hard 
> code anything.  If, for a given ID, I had 50 weekly samples last week, I'll 
> have 51 samples this week.
> 
> But some of the IDs have sample sizes that are so small, it would be pure BS 
> to try to use fitdist on their data.
> 
> I have figured out a way to handle this for a given ID, and so I have the 
> loop that iterates over the IDs, and processes the data for that ID IF there 
> is sufficient data.  And to make things interesting, the number of IDs I need 
> to process this week is greater than the number of IDs I had to process last 
> week.
> 
> So, I iterate over IDs, from 1 up through perhaps 500.  If a given ID has 
> sufficient data, I get the z lists.  And I have checked, applying rbind to 
> these works great!  Of all the IDs' datasets I have examined, perhaps 10% do 
> not yet have enough data to work with (but that, too, changes through time).
> 
> From what you have said, it would seem that I ought to make a master list.  
> So, I need to learn how to make a master list grow from nothing to include 
> all these z lists.  That reduces to a question of how can one append 
> dynamically created lists of varying size (from just a few list elements to a 
> few hundred list elements) to such a master list.
> 
> Actually, when it gets right down to it, I think I am ignorant of a key piece 
> of the puzzle (I have probably missed the key part of the documentation 
> dealing with this).  I do not yet know how to add even one element to a list 
> within a loop where the list does not exist (or at least is empty) at the 
> beginning of the loop.
> 
> I get your example "do.call(rbind, c(z1, z2, z3, z4))", but what do you do if 
> there is no list at the beginning of a loop and you need to handle something 
> like:
> 
> # n is some large number, and in about 10% of values of 'i' (not known a
> # priori) creation of x and y is skipped
> for (i in 1:n) {
>   if (test that returns TRUE only 90% of the time) {
>     x = function_that_makes_a_data_frame()
>     y = function_that_makes_a_list_of_data_frames()
>   }
> }
> 
> We have not created any lists on entry into the loop.  How do we create a 
> list containing all instances of x and another that contains all elements 
> that had been in each instance of y?  If I can learn how to do that, then I 
> can call  do.call(rbind,x_list) and do.call(rbind,y_element_list).
> 
> If you know C++, and specifically the STL containers and algorithms, one can 
> grow vectors or lists using a function called 'push_back' which is defined on 
> most STL containers.  I am looking for the R equivalent for objects, and the 
> R equivalent of the C++ STL algorithm std::copy (passed the begin and end 
> iterators of the source list and a back inserter for the recipient 
> container), for appending a source list to a master list.
> 
> Thanks
> 
> Ted

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
