Hi Marc,

In my example, there has been a regime change at time 25, and I'd like to find a way to discover (1) what has changed and (2) when it changed.
The problem is that only the x and y values are observed. The unknownbetas are... unknown. If you look at x and y, you can't really tell that something has changed. It is not an outlier per se, as it involves a change in one of the unknownbetas. In other words, I'm trying to single out which unknownbeta vs. x relationships still hold after time 25.

I know it's complicated, but if you have any pointers, they will be appreciated.

Thanks

On 2/19/07, Marc Schwartz wrote:
> On Mon, 2007-02-19 at 09:58 -0500, Pierre Lapointe wrote:
> > Hello,
> >
> > I have a particular situation where a single "wrong" observation is
> > impacting the results of a traditional regression to the point that
> > betas become unreliable. I need a way to calculate the most likely
> > betas. Here's an example:
> >
> > set.seed(1)
> > unknownbeta <- matrix(seq(100,500,100),25,5,byrow=TRUE)
> > x <-matrix(runif(25*5),25)
> > y <- rowSums(unknownbeta*x)
> > summary(lm(y~0+x)) #gets back the unknown betas.
> >
> > #Now, let's introduce a single wrong data.
> >
> > unknownbeta[25,5] <-100
> > y <- rowSums(unknownbeta*x)
> > summary(lm(y~0+x)) #every beta changes.
> >
> > I need to find out what are the most likely betas in the second
> > example. There is no obvious way to know that row 25 has wrong input.
> > I would even be happy if the conclusion was that x1:x4 are 100, 200,
> > 300 and 400 and that x5 is zero.
> >
> > Thanks
>
> It is not clear what you mean by a "wrong" observation. Is the data
> completely bad because it was improperly collected? Is this an
> observation that has correct data, but is an "outlier" relative to the
> other observations? Is the observation missing data, where values can be
> reasonably imputed?
>
> Are you in a setting where the observation MUST be included in the
> regression rather than be deleted? For example an "Intent to Treat"
> analysis in a clinical trial?
>
> Depending upon the context, your options may range from simply removing
> the single observation from the regression, considering some form of
> weighting of the observations, to perhaps considering a robust
> regression methodology and others.
>
> This is not strictly an R question, but one of methodology.
> Clarification of which is potentially impacted upon by "community"
> standards and prior work within your particular discipline.
>
> HTH,
>
> Marc Schwartz
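
For what it's worth, here is a minimal sketch of the simplest option mentioned above (removing a single observation), applied to the example data from the original post. It is not a method proposed by anyone in this thread. It assumes that exactly one row is affected and that the data are otherwise noise-free, as in the example. The names `rss`, `suspect` and `clean` are made up for illustration. The idea is to refit the regression 25 times, leaving out one row each time, and to flag the row whose removal gives the smallest residual sum of squares.

```r
# Rebuild the example from the original post.
set.seed(1)
unknownbeta <- matrix(seq(100, 500, 100), 25, 5, byrow = TRUE)
x <- matrix(runif(25 * 5), 25)
unknownbeta[25, 5] <- 100            # the single changed coefficient
y <- rowSums(unknownbeta * x)

# Leave-one-out refits: drop row i, fit y ~ 0 + x on the remaining rows,
# and record the residual sum of squares of that fit.
rss <- sapply(seq_len(nrow(x)), function(i) {
  fit <- lm.fit(x[-i, , drop = FALSE], y[-i])
  sum(fit$residuals^2)
})

# Assumption: only one row is "wrong", so the row whose removal
# gives the smallest RSS is the suspect.
suspect <- which.min(rss)

# Refit without the suspect row to estimate the betas that hold elsewhere.
clean <- lm.fit(x[-suspect, , drop = FALSE], y[-suspect])
print(suspect)
print(round(unname(clean$coefficients), 3))
```

With noisy data, or more than one affected row, this simple scan would probably not be enough. A robust fit such as `rlm()` from the MASS package would be one alternative to look at.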
