Re: [R] Working With Variables Having Different Lengths

2011-10-24 Thread Rich Shepard
On Fri, 21 Oct 2011, David Winsemius wrote: The first thing I would try would be with(subset(chemdata, param %in% c('TDS', 'Cond', 'Mg', 'SO4', 'Cl', 'Na', and 'Ca') , 1:4) , xtabs(quant ~ site + sampdate + param) ) David, Need to remove the 'and' from the above. The results
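
A minimal sketch of the corrected call, with the stray 'and' removed from the vector of parameter names. The small `chemdata` data frame below is made up for illustration; only the column names (site, sampdate, param, quant) and the parameter names come from the thread.

```r
# Hypothetical long-format data standing in for the real chemdata
chemdata <- data.frame(
  site     = c("A", "A", "B", "B", "A"),
  sampdate = c("2001-01-01", "2001-01-01", "2001-01-01",
               "2001-01-01", "2001-02-01"),
  param    = c("TDS", "Cond", "TDS", "pH", "TDS"),
  quant    = c(450, 700, 380, 7.2, 470),
  stringsAsFactors = TRUE
)

# Parameter names as quoted in the thread, without the 'and'
keep <- c("TDS", "Cond", "Mg", "SO4", "Cl", "Na", "Ca")

# Keep matching rows and the first four columns, then cross-tabulate
# the summed quant values by site, sampling date and parameter
tab <- with(subset(chemdata, param %in% keep, 1:4),
            xtabs(quant ~ site + sampdate + param))
tab
```

Because `param` is a factor here, the excluded level (pH) still appears in the table as a slice of zeroes.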

Re: [R] Working With Variables Having Different Lengths

2011-10-24 Thread David Winsemius
On Oct 24, 2011, at 11:34 AM, Rich Shepard wrote: On Fri, 21 Oct 2011, David Winsemius wrote: The first thing I would try would be with(subset(chemdata, param %in% c('TDS', 'Cond', 'Mg', 'SO4', 'Cl', 'Na', and 'Ca') , 1:4) , xtabs(quant ~ site + sampdate + param) ) David, Need to

Re: [R] Working With Variables Having Different Lengths

2011-10-24 Thread Rich Shepard
On Mon, 24 Oct 2011, David Winsemius wrote: The appearance of levels with all zeroes is probably because I didn't include drop.unused.levels = FALSE in the xtabs specification. OK. Adding 'drop.unused.levels' does make a huge difference. Thanks, Rich
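
A small sketch of the effect of this argument, using made-up data. In `xtabs` the default is `drop.unused.levels = FALSE`, so it is setting the argument to `TRUE` that removes the all-zero levels left behind by subsetting a factor.

```r
# Hypothetical data; param is a factor with three levels
chemdata <- data.frame(
  site  = c("A", "A", "B", "B"),
  param = factor(c("TDS", "Cond", "TDS", "pH")),
  quant = c(450, 700, 380, 7.2)
)

# Subsetting rows does not remove the now-unused level "pH"
sub <- subset(chemdata, param %in% c("TDS", "Cond"))

# Default behaviour: the unused level shows up as a column of zeroes
with_empty <- xtabs(quant ~ site + param, data = sub)

# Dropping unused levels removes that column
no_empty <- xtabs(quant ~ site + param, data = sub,
                  drop.unused.levels = TRUE)

dim(with_empty)
dim(no_empty)
```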

Re: [R] Working With Variables Having Different Lengths

2011-10-24 Thread David Winsemius
On Oct 24, 2011, at 12:10 PM, Rich Shepard wrote: On Mon, 24 Oct 2011, David Winsemius wrote: The appearance of levels with all zeroes is probably because I didn't include drop.unused.levels = FALSE in the xtabs specification. OK. Adding 'drop.unused.levels' does make a huge difference.

Re: [R] Working With Variables Having Different Lengths

2011-10-24 Thread Rich Shepard
On Mon, 24 Oct 2011, David Winsemius wrote: You could also have saved the subsetted data, applied `factor` to the subsetted column and then used `xtabs`. temp <- subset(chemdata, your subset criteria, your column selection) temp$param <- factor(temp$param) (Now only levels that exist are in the
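
The alternative described here can be sketched as follows. The data frame, the subset criteria and the column selection are placeholders invented for illustration; the two-step pattern (subset, then re-apply `factor`) is the one quoted above.

```r
# Hypothetical data; param is a factor with three levels
chemdata <- data.frame(
  site  = c("A", "A", "B", "B"),
  param = factor(c("TDS", "Cond", "TDS", "pH")),
  quant = c(450, 700, 380, 7.2)
)

# Save the subset (the criteria and columns here are examples only)
temp <- subset(chemdata, param %in% c("TDS", "Cond"),
               select = c(site, param, quant))

# Re-applying factor() rebuilds the levels from the values present
temp$param <- factor(temp$param)
levels(temp$param)

# xtabs now has no empty level to report
tab <- xtabs(quant ~ site + param, data = temp)
tab
```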

[R] Working With Variables Having Different Lengths

2011-10-21 Thread Rich Shepard
Because of regulatory requirement changes over several decades and weather conditions preventing site access the variables in my data set have different lengths. I'd like guidance on how to perform linear regressions and other models with these variables. For example, there are 2206 rows for

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread Weidong Gu
Sounds like you are dealing with a missing data problem. By default, lm or glm would only keep observations with complete records (complete case analysis). This can be problematic if you have many missing variables and missing values occur not completely at random (i.e., missing values are dependent
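
To illustrate the complete-case behaviour mentioned here, a short sketch with invented values. It assumes a wide layout with one column per variable (hypothetical columns `TDS` and `Cond`) and the usual default `na.action` of `na.omit`.

```r
# Hypothetical wide-format data with gaps in both columns
wide <- data.frame(
  TDS  = c(450, 380, 470, NA, 520, 610),
  Cond = c(700, 590, NA, 650, 810, 940)
)

# lm() silently drops any row with a missing value in a model variable
fit <- lm(TDS ~ Cond, data = wide)

# Rows actually used in the fit versus rows supplied
nobs(fit)
nrow(wide)

# The same count, computed directly
sum(complete.cases(wide))
```

Comparing `nobs(fit)` with `nrow(wide)` is a quick way to see how much data a model is discarding.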

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread B77S
I know in my experience Cond (conductivity??) doesn't vary much within a stream except for during high flow events, and I would imagine the same is true for TDS. If these are all low flow values, you could possibly determine a mean/median value to use for the missing data points. Obviously this

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread Rich Shepard
On Fri, 21 Oct 2011, Weidong Gu wrote: No easy way out with missing data problems, all imputations are based on some strong and untestable assumptions. Thanks for the insights. Let me rephrase my question in a way that should work: is there a way to subset my comprehensive data frame

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread Rich Shepard
On Fri, 21 Oct 2011, B77S wrote: I know in my experience Cond (conductivity??) doesn't vary much within a stream except for during high flow events, and I would imagine the same is true for TDS. This is generally true, but not in the streams with which we're working. TDS values, for

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread David Winsemius
On Oct 21, 2011, at 1:04 PM, Rich Shepard wrote: On Fri, 21 Oct 2011, Weidong Gu wrote: No easy way out with missing data problems, all imputations are based on some strong and untestable assumptions. Thanks for the insights. Let me rephrase my question in a way that should work: is

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread Rich Shepard
On Fri, 21 Oct 2011, David Winsemius wrote: The last part (in the same column) does not make sense, since I was interpreting the term parameter to mean a value in a particular column. David, That's what I meant: two values from the 'param' column. Assuming these are R NA's then logical

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread Rich Shepard
On Fri, 21 Oct 2011, David Winsemius wrote: The last part (in the same column) does not make sense, since I was interpreting the term parameter to mean a value in a particular column. Assuming these are R NA's then logical indexing: with( chemdata, chemdata[!is.na(param1) & !is.na(param2) ,
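
A runnable sketch of this kind of logical indexing, on made-up data. `param1` and `param2` are the placeholder column names used in the quoted reply, not columns known to exist in the real data set, and the column selection after the comma is left empty here to keep all columns.

```r
# Hypothetical data with missing values in two columns
chemdata <- data.frame(
  site   = c("A", "B", "C", "D"),
  param1 = c(1.2, NA, 3.4, 4.1),
  param2 = c(NA, 2.5, 3.0, 4.8)
)

# Keep only the rows where both columns are non-missing
both <- with(chemdata, chemdata[!is.na(param1) & !is.na(param2), ])
both
```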

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread David Winsemius
On Oct 21, 2011, at 2:09 PM, Rich Shepard wrote: On Fri, 21 Oct 2011, David Winsemius wrote: The last part (in the same column) does not make sense, since I was interpreting the term parameter to mean a value in a particular column. Assuming these are R NA's then logical indexing: with(

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread Rich Shepard
On Fri, 21 Oct 2011, David Winsemius wrote: First you need to clarify whether TDS is the name of a column or a possible value in a column named param. This whole painful multi-question process would be greatly accelerated if you offered str(chemdata). Yes, I did on a different thread, but

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread David Winsemius
On Oct 21, 2011, at 3:02 PM, Rich Shepard wrote: On Fri, 21 Oct 2011, David Winsemius wrote: First you need to clarify whether TDS is the name of a column or a possible value in a column named param. This whole painful multi-question process would be greatly accelerated if you offered

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread Rich Shepard
On Fri, 21 Oct 2011, David Winsemius wrote: How are we to determine which lines contain information about the relationships of param==TDS with whatever cases or variable has values of Cond and SO4? Are you really trying to compare two disjoint groups on some statistic like the means and std-dev

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread David Winsemius
On Oct 21, 2011, at 4:38 PM, Rich Shepard wrote: On Fri, 21 Oct 2011, David Winsemius wrote: How are we to determine which lines contain information about the relationships of param==TDS with whatever cases or variable has values of Cond and SO4? Are you really trying to compare two

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread Rich Shepard
On Fri, 21 Oct 2011, David Winsemius wrote: What problem are you trying to solve? What I need now is to compare TDS (total dissolved solids) with specific conductivity and the ions that normally comprise TDS. Before running any regression models I need to look at these data from three
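
One possible way to line up TDS against conductivity when all measurements share a single `quant` column is to reshape the data so each parameter gets its own column. This is only a sketch under assumptions: the data are invented, and using base R `reshape()` is a suggestion made here, not something proposed in the thread.

```r
# Hypothetical long-format data: one row per site/date/parameter
chemdata <- data.frame(
  site     = c("A", "A", "B", "B", "A"),
  sampdate = c("2001-01-01", "2001-01-01", "2001-01-01",
               "2001-01-01", "2001-02-01"),
  param    = c("TDS", "Cond", "TDS", "Cond", "TDS"),
  quant    = c(450, 700, 380, 590, 470),
  stringsAsFactors = FALSE
)

# One row per site and date, one quant.<param> column per parameter;
# combinations that were never measured become NA
wide <- reshape(chemdata,
                idvar = c("site", "sampdate"),
                timevar = "param",
                direction = "wide")
wide

# Rows with both measurements available for comparison
paired <- wide[complete.cases(wide[, c("quant.TDS", "quant.Cond")]), ]
paired
```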

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread David Winsemius
On Oct 21, 2011, at 6:17 PM, Rich Shepard wrote: On Fri, 21 Oct 2011, David Winsemius wrote: What problem are you trying to solve? What I need now is to compare TDS (total dissolved solids) with specific conductivity and the ions that normally comprise TDS. Before running any

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread Rich Shepard
On Fri, 21 Oct 2011, David Winsemius wrote: The only variable in that dataframe with what appears to be a continuous value (which is how I would expect total dissolved solids to be measured) is quant Are you saying that the value of quant is measuring something with different units depending on

Re: [R] Working With Variables Having Different Lengths

2011-10-21 Thread David Winsemius
On Oct 21, 2011, at 8:14 PM, Rich Shepard wrote: On Fri, 21 Oct 2011, David Winsemius wrote: The only variable in that dataframe with what appears to be a continuous value (which is how I would expect total dissolved solids to be measured) is quant Are you saying that the value of quant is