Re: [R] Working With Variables Having Different Lengths

Weidong Gu Fri, 21 Oct 2011 09:41:26 -0700

It sounds like you are dealing with a missing-data problem. By default,
lm or glm keeps only observations with complete records (complete-case
analysis). This can be problematic if many variables have missing
values and those values are not missing completely at random (i.e.,
the missingness depends on other measured or unmeasured variables, or
on the missing values themselves). Imputation is a common tool for
analyzing incomplete data. In R, you can find packages that perform
single or multiple imputation, e.g. randomForest, norm, mice, mi,
etc.
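
To make the default behavior concrete, here is a minimal sketch in base R.
The data are simulated and the row counts are made up for illustration
(they are not Rich's data); only the column names TDS and Cond come from
his message. It assumes the "na.action" option has not been changed from
its usual na.omit setting.

```r
# Simulated example data: TDS is fully observed, Cond is missing in some rows.
set.seed(1)
n <- 20
dat <- data.frame(TDS = rnorm(n, mean = 500, sd = 50))
dat$Cond <- 1.5 * dat$TDS + rnorm(n, sd = 10)
dat$Cond[c(2, 5, 9, 14)] <- NA   # sampling dates with no Cond measurement

# lm() uses getOption("na.action"), which is na.omit unless changed,
# so rows with an NA in any model variable are dropped without a warning.
fit <- lm(TDS ~ Cond, data = dat)

n_complete <- sum(complete.cases(dat[, c("TDS", "Cond")]))
n_used <- nobs(fit)

cat("rows in data:    ", nrow(dat), "\n")
cat("complete cases:  ", n_complete, "\n")
cat("rows used by lm: ", n_used, "\n")
```

Comparing nrow(dat) with nobs(fit) is a quick way to see how many rows a
complete-case fit discarded.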

There is no easy way out with missing-data problems: all imputation
methods rest on some strong and untestable assumptions.
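
For what a multiple-imputation workflow might look like, here is a rough
sketch using the mice package. The data are again simulated, and the
choices of m = 5 and predictive mean matching ("pmm") are only
illustrative assumptions, not a recommendation for any particular data
set. The code does nothing if mice is not installed.

```r
# Sketch only: needs the 'mice' package, e.g. install.packages("mice").
if (requireNamespace("mice", quietly = TRUE)) {
  library(mice)

  # Simulated data with 10 of 50 Cond values set to missing.
  set.seed(1)
  n <- 50
  dat <- data.frame(TDS = rnorm(n, mean = 500, sd = 50))
  dat$Cond <- 1.5 * dat$TDS + rnorm(n, sd = 10)
  dat$Cond[sample(n, 10)] <- NA

  # Create m = 5 completed data sets by predictive mean matching.
  # The results are only as good as the assumed missingness mechanism.
  imp <- mice(dat, m = 5, method = "pmm", seed = 1, printFlag = FALSE)

  # Fit the same model to each completed data set, then pool the
  # estimates and standard errors across the five fits.
  fits <- with(imp, lm(TDS ~ Cond))
  pooled <- pool(fits)
  print(summary(pooled))
}
```

The pooled standard errors include the between-imputation variability,
which a single imputation would ignore.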

Weidong Gu

On Fri, Oct 21, 2011 at 12:13 PM, Rich Shepard <rshep...@appl-ecosys.com> wrote:
>  Because of regulatory requirement changes over several decades and weather
> conditions preventing site access the variables in my data set have
> different lengths. I'd like guidance on how to perform linear regressions
> and other models with these variables.
>
>  For example, there are 2206 rows for the parameter "TDS" but only 1191
> rows for the parameter "Cond." Such discrepancies are common in these data.
>
>  Is there a reference I can read to learn how to analyze such data?
>
> Rich
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

