[EMAIL PROTECTED] wrote:
> Hi!
>
> Is there a possibility in R to carry out LOCF (Last Observation Carried
> Forward) analysis or to create a new data frame (array, matrix) with
> LOCF? Or some helpful functions, packages?
>
> Karl

As I understand the methodology and the potential issues with imputing
data for missing observations, I have a couple of thoughts:

1. The missing observations can be imputed using standard R data
management functions. How complex this gets will likely depend upon
your exact data structure.

For example, if the missing values are all NA's, you can use
vector/matrix indexing to replace them based upon various conditions.
If the subsetting logic is more complex, you can use the replace()
function, which lets you specify a complex boolean construct. See
?replace for more information.

If your data (x) is sequenced left to right in a time series vector,
you can identify the position of the last known observation, for
example:

    > x <- c(23, 25, 24, NA, 25, NA, NA)
    > max(which(!is.na(x)))
    [1] 5

and fill to the right, repeating the last known data:

    > LOCF <- max(which(!is.na(x)))
    > x[LOCF:length(x)] <- x[LOCF]
    > x
    [1] 23 25 24 NA 25 25 25

Note that this fills only the trailing NA's; the NA in position 4,
which precedes the last known observation, is left as is.

A quick search on Google raises some known issues with the methodology,
depending upon the nature of the missing data and what sort of
assumptions you are willing to make or live with.

For more complex imputation, there are a variety of missing data
imputation functions available for R, for example in Frank Harrell's
Design and Hmisc packages on CRAN.

2. Another alternative to consider, depending upon how much missing
data you are dealing with and its etiology, would be an unbalanced
mixed effects approach using the model functions in package 'nlme'. I
might defer to others here, but it is something to consider.

HTH,

Marc Schwartz

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
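
The fill-to-the-right idea in the reply above can be extended to gaps in
the middle of a series and to a data frame. What follows is only a
minimal sketch in base R, not code from the original post: the helper
name locf() and the columns id, visit, and value are made up for
illustration, and the rows are assumed to be sorted by visit within
each subject.

```r
# Hypothetical helper (not from the original post): carry the last
# non-missing value forward over every NA, interior or trailing.
locf <- function(x) {
  observed <- !is.na(x)
  # cumsum() counts the observed values seen so far, so each position
  # gets the index of the most recent observation (0 if there is none).
  idx <- cumsum(observed)
  # Prepend NA so positions before the first observation stay NA.
  c(NA, x[observed])[idx + 1]
}

x <- c(23, 25, 24, NA, 25, NA, NA)
locf(x)
# [1] 23 25 24 24 25 25 25

# Illustrative data frame: one row per subject (id) and visit,
# already sorted by visit within each subject (an assumption).
df <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2),
  visit = c(1, 2, 3, 1, 2, 3),
  value = c(10, NA, NA, 7, NA, 9)
)

# ave() applies locf() separately within each subject, so values are
# never carried from one subject into the next.
df$value_locf <- ave(df$value, df$id, FUN = locf)
df$value_locf
# [1] 10 10 10  7  7  9
```

A leading NA stays NA, since there is nothing to carry forward. Whether
carrying values forward is appropriate at all still depends on the
assumptions about the missing data mentioned in the reply.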
