Dear all,

I have a question about outliers and the number of data points.

For instance, I want to use regression to estimate the slope over 3
years, i.e. 36 data points, one point for each month. I use the
following method:

1. calculate the median value
2. find the standard deviation
3. set the threshold = median value + std dev * constant (e.g.
constant = 10)
4. outliers are the data points which are greater than the threshold.
5. replace an outlier with the mean of its neighbor data points.
6. regression
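
The steps above can be sketched roughly as follows in Python with
NumPy. This is only an illustration under my own assumptions: the
function name clean_and_fit is made up, I use the sample standard
deviation (ddof=1), the x values are simply 0, 1, 2, ... for the
months, and "neighbor" is taken to mean the nearest non-outlier point
on each side.

```python
import numpy as np

def clean_and_fit(y, constant=10.0):
    """Flag points above median + constant * std, patch them, fit a line."""
    y = np.asarray(y, dtype=float)

    # Steps 1-3: threshold = median + constant * standard deviation.
    threshold = np.median(y) + constant * np.std(y, ddof=1)

    # Step 4: outliers are the points greater than the threshold.
    is_outlier = y > threshold

    def nearest_good(i, step):
        # Walk left (step=-1) or right (step=+1) to the nearest non-outlier.
        j = i + step
        while 0 <= j < len(y) and is_outlier[j]:
            j += step
        return j if 0 <= j < len(y) else None

    # Step 5: replace each outlier with the mean of its neighbors
    # (assumption: only non-outlier neighbors are used).
    cleaned = y.copy()
    for i in np.flatnonzero(is_outlier):
        idx = [j for j in (nearest_good(i, -1), nearest_good(i, +1))
               if j is not None]
        if idx:
            cleaned[i] = y[idx].mean()

    # Step 6: least-squares straight line through the cleaned series.
    x = np.arange(len(y))
    slope, intercept = np.polyfit(x, cleaned, 1)
    return float(slope), is_outlier, cleaned
```

For example, with a made-up series y = 0.5 * np.arange(36) and a spike
of +100 added at index 10, clean_and_fit(y, constant=3) flags only that
point, while constant=10 flags nothing, because the spike inflates the
standard deviation itself.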

However, I also want to find the slope for each year using the same
method. Since I may not have all 12 data points for each calendar
year (e.g. Feb 01 - Jan 04: 36 data points in total, but 11 data
points for the first year and 1 data point for the last year), I found
that the method above did not detect the outliers very well. I'm
thinking about making the constant smaller when there are fewer data
points. Any ideas?
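
To make the uneven split concrete, here is a small sketch that counts
how many monthly points fall in each calendar year. The function name
months_per_year is hypothetical, and I am assuming "Feb 01" means
February 2001.

```python
from collections import Counter

def months_per_year(start_year, start_month, n_points):
    """Count consecutive monthly points falling in each calendar year."""
    counts = Counter()
    for k in range(n_points):
        months_from_january = (start_month - 1) + k
        counts[start_year + months_from_january // 12] += 1
    return dict(counts)

# 36 monthly points starting in February 2001 (assumed reading of "Feb 01")
print(months_per_year(2001, 2, 36))
# {2001: 11, 2002: 12, 2003: 12, 2004: 1}
```

A year with a single point has no sample standard deviation and no
slope, so whatever constant is used, that year would need special
handling.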

Thanks,
SChiu