Sarah,
This strategy works great for this small dataset, but when I attempt your
method with my data set I reach the maximum allowable memory allocation and
the operation just stalls and then stops completely before it is finished.
Do you know of a way around this?
Thanks
On Tue, Mar 10, 2015 at
The key to your problem may be that
x<-apply(missing,1,genRows)
converts 'missing' to a matrix, with the same type for all columns,
and then makes x either a list or a matrix, but never a data.frame.
Those features of apply may mess up the rest of your calculations.
Don't use apply().
Bill Dunlap
T
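A minimal sketch of the coercion described above, in case it helps to see it. The data frame here is a made-up stand-in for 'missing' (the column names and values are assumptions, not the poster's data), and the anonymous functions stand in for genRows, which is not shown in the thread.

```r
# Hypothetical stand-in for 'missing': one integer column, one character column.
missing <- data.frame(site = c(1L, 2L),
                      animals = c("bird", "dog"),
                      stringsAsFactors = FALSE)

# apply() works on as.matrix(missing); a matrix holds a single type,
# so the integer column is converted to character.
as_mat <- as.matrix(missing)
mat_type <- typeof(as_mat)

# Each row reaches the function as a character vector, not a one-row data frame.
row_classes <- apply(missing, 1, function(r) class(r))

# The result of apply() is a matrix (or a list), never a data.frame.
x <- apply(missing, 1, function(r) r)
x_is_df <- is.data.frame(x)
x_is_matrix <- is.matrix(x)
```

Here mat_type is "character" and x_is_df is FALSE, so any numeric work done inside the row function would be operating on strings.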
Sarah,
I realized what I was saying after I pressed send on the email. It makes
perfect sense now, thanks so much for your help and patience.
On Mar 10, 2015 5:57 PM, "Sarah Goslee" wrote:
> I think you're kind of missing the way this works:
>
> the data frame created by expand.grid() should ONL
I think you're kind of missing the way this works:
the data frame created by expand.grid() should ONLY have site, year,
sample (with the exact names used in the data itself).
Then the merged data frame will have the full site,year,sample
combinations, along with ALL the data variables. Your animal
You may find it beneficial to investigate packages dplyr, data.table, or a
combination of the two for handling large data sets in memory. Or, perhaps
dplyr with a SQL back end for working on disk (I have not tried that myself
yet).
I do find your excuse for manufacturing data records uncompelli
Thanks Sarah, one of my column names was missing a letter so it was
throwing things off. It works super fast now and is exactly what I needed.
My actual data set has about 6 other ancillary response data columns,
is there a way to combine the 'full' data set I just created with the
original i
Yeah, that's tiny:
> fullout <- expand.grid(site=1:669, year=1:7, sample=1:3)
> dim(fullout)
[1] 14049 3
Almost certainly the problem is that your expand.grid result doesn't
have the same column names as your actual data file, so merge() is
trying to make an enormous result. Note how when I
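A small sketch of the expand.grid() plus merge() approach discussed in this thread, with toy sizes. The 'obs' data frame and its 'count' column are invented for illustration; only the site/year/sample names come from the messages.

```r
# Hypothetical observed data: only 3 of the 12 possible combinations are present.
obs <- data.frame(site = c(1L, 1L, 2L),
                  year = c(1L, 1L, 2L),
                  sample = c(1L, 2L, 1L),
                  count = c(5, 3, 7))

# Every site x year x sample combination, using the same column names as 'obs'.
full <- expand.grid(site = 1:2, year = 1:2, sample = 1:3)

# all.x = TRUE keeps every row of 'full'; combinations absent from 'obs' get NA.
merged <- merge(full, obs, all.x = TRUE)
n_merged <- nrow(merged)
n_missing <- sum(is.na(merged$count))

# If the grid shares no column names with the data (R names are case sensitive),
# merge() has nothing to match on and returns the Cartesian product instead.
bad <- expand.grid(Site = 1:2, Year = 1:2, Sample = 1:3)
blown_up <- merge(bad, obs)
n_blown_up <- nrow(blown_up)
```

With matching names the result has 12 rows, 9 of them with NA counts. With no names in common it has 12 x 3 = 36 rows, which is the kind of growth that can exhaust memory at real data sizes.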
Sarah,
I have 669 sites and each site has 7 years of data, so if I'm thinking
correctly then there should be 4683 possible combinations of site x year.
For each year though I need 3 sampling periods so that there is something
like the following:
site 1 year1 sample 1
site 1 year1
William,
You say not to use apply here, but what would you use in its place?
Thanks
On Tue, Mar 10, 2015 at 2:13 PM, William Dunlap wrote:
> The key to your problem may be that
>x<-apply(missing,1,genRows)
> converts 'missing' to a matrix, with the same type for all columns
> then makes x
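One common alternative to apply() for row-wise work on a data frame is to loop over row indices with lapply() and bind the pieces back together. This is only a sketch under assumptions: the reply suggesting a replacement is not shown here, so it may not be what William had in mind, and 'missing' and 'genRowsDF' below are hypothetical stand-ins for the poster's objects.

```r
# Hypothetical stand-in for 'missing'.
missing <- data.frame(site = c(1L, 2L),
                      animals = c("bird", "dog"),
                      stringsAsFactors = FALSE)

# Hypothetical row function: receives a one-row data frame, so column types survive.
genRowsDF <- function(row) {
  data.frame(site = row$site,
             animals = row$animals,
             sample = 1:3,
             stringsAsFactors = FALSE)
}

# Index the rows explicitly instead of letting apply() coerce to a matrix.
pieces <- lapply(seq_len(nrow(missing)), function(i) {
  genRowsDF(missing[i, , drop = FALSE])
})

# Stack the per-row results into a single data frame.
x <- do.call(rbind, pieces)
x_is_df <- is.data.frame(x)
site_class <- class(x$site)
```

The result stays a data.frame and 'site' stays integer. For large inputs the merge()-based approach elsewhere in the thread is likely to be faster than building many small data frames.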
Hi,
I didn't work through your code, because it looked overly complicated.
Here's a more general approach that does what you appear to want:
# use dput() to provide reproducible data please!
comAn <- structure(list(animals = c("bird", "bird", "bird", "bird", "bird",
"bird", "dog", "dog", "dog", "
Hey everyone,
I've written a function that adds NAs to a dataframe where data is missing
and it seems to work great if I only need to run it once, but if I run it
two times in a row I run into problems. I've created a workable example to
explain what I mean and why I would do this.
In my datafram