- I did not think about this question much 
until I read Robert's comment.

On 6 Feb 2003 06:31:53 -0800, [EMAIL PROTECTED] (Robert J. MacG.
Dawson) wrote:
 
> Norm Loomer wrote:
> > 
> > A colleague brought me some annual time series data that shows a
> > downward trend from 1992 to 2001. In 1996 the agency that recorded the
> > data adopted a new program that, if successful, would cause the data to
> > trend downward more sharply. What they are looking for, I think, is
> > evidence that the slope after 1996 is greater (in absolute value) than
> > the slope before 1996. Can someone help me with this?
> 
>       Probably not, unless the trend change is enormous or the random
> variation miniscule. Fitting a three-degree-of-freedom model with ten
        = minuscule [ here is my minor effort to preserve the word ]
> data is usually overfitting.

If the aggregated means are  all that is available, 
someone might plot the points that show a nice fit,
with a changed slope;  but  I think I would have trouble
trusting the statistical test even if it came out with a
small p-value.  
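
For concreteness, here is a minimal sketch of the kind of test under
discussion: a "broken-stick" regression with an intercept, a slope, and
a change in slope at 1996, fitted by ordinary least squares. It is only
one way to set the problem up, not necessarily what the original poster
has in mind. The function name slope_change_fit and the numbers are
made up for illustration; they are NOT the data from this thread. The
sketch also assumes independent errors and a break year fixed in
advance, and annual series often violate the first of those.

```python
import numpy as np

def slope_change_fit(years, y, break_year):
    """Least-squares fit of y = b0 + b1*t + b2*max(t, 0), t = year - break_year.

    b2 is the change in slope after break_year.
    Returns (coefficients, standard errors, residual degrees of freedom).
    """
    t = np.asarray(years, dtype=float) - break_year
    y = np.asarray(y, dtype=float)
    # Columns: intercept, overall slope, extra slope after the break.
    X = np.column_stack([np.ones_like(t), t, np.clip(t, 0.0, None)])
    coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]          # 10 points - 3 parameters = 7
    sigma2 = resid @ resid / dof       # residual variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return coef, se, dof

# Hypothetical data: ten annual values with a built-in slope change.
years = np.arange(1992, 2002)
rng = np.random.default_rng(0)
t = years - 1996
y = 100.0 - 2.0 * t - 1.5 * np.clip(t, 0, None) + rng.normal(scale=3.0, size=years.size)

coef, se, dof = slope_change_fit(years, y, 1996)
t_stat = coef[2] / se[2]
print(f"slope change = {coef[2]:.2f}, se = {se[2]:.2f}, t = {t_stat:.2f}, dof = {dof}")
```

With seven residual degrees of freedom, the two-sided 5% critical value
of t is about 2.36, so the slope change has to be large relative to the
year-to-year noise before a test like this will detect it.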

How malleable are the data sources?  Who created
or generated the numbers?  Selective vision can cause
trouble in setting up a big data set, too.  However, I
think the hazard is greater, and harder to notice, with
just a few degrees of freedom.

I like the word  "overfitting" here, partly because it 
suggests to me all the alternate solutions that have 
not been considered.  I'm assuming that this is not a
randomized, controlled study, and so, even if a 
trend DOES exist in the data, there is still a burden of
proof if one wants to assert that there is one  "cause."
An example comes to mind:  There once were differences
in coronary-death rates between Appalachian states
and their neighboring states. [ I don't know how this
epidemiology ever played out.]   "Minerals in water
supplies"  was a popular hypothesis.  But 30 years ago,
there could have been a more primitive explanation, namely,
the credibility of the death certificates -- based on state
differences in doing autopsies.

Back to the question:  It would be a lot more believable, 
to me, if I could see numerous sub-sets of the data.  And
the subsets would exhibit a range of outcomes, which might
(say)  be associated with some explicit features of the 
implementation of the  'new program'.  Also, that range
should exhibit *random*  variability, so that I am not led
to conclude that I am looking at numbers that were
cobbled together -- consciously cobbled or not -- by partisans.
Then, the subsets provide additional degrees of freedom,
to confirm the initial hypothesis.  The 'proof'  of an 
overall trend might be believable if it is shown in numerous
subsets, with (statistically) reasonable consistency.
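
The subset idea can be sketched the same way, under loud assumptions:
suppose the totals could be broken out into several subsets (districts,
offices, whatever exists), each with its own annual series. Estimate
the slope change in each subset separately, then ask whether those
estimates agree, using the subset-to-subset spread as the yardstick.
The names slope_change and summarize_subsets, the six subsets, and all
the numbers below are hypothetical, and treating the subsets as
independent is itself an assumption that may not hold.

```python
import numpy as np

def slope_change(years, y, break_year):
    """Estimated change in slope after break_year (broken-stick OLS fit)."""
    t = np.asarray(years, dtype=float) - break_year
    X = np.column_stack([np.ones_like(t), t, np.clip(t, 0.0, None)])
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coef[2]

def summarize_subsets(years, series_by_subset, break_year):
    """One slope-change estimate per subset, then a one-sample t summary."""
    changes = np.array([slope_change(years, y, break_year)
                        for y in series_by_subset.values()])
    k = changes.size
    mean = changes.mean()
    sd = changes.std(ddof=1)            # spread across subsets
    return {
        "changes": changes,
        "mean": mean,
        "sd": sd,
        "t": mean / (sd / np.sqrt(k)),  # tests mean change = 0
        "dof": k - 1,                   # degrees of freedom from subsets
        "n_negative": int((changes < 0).sum()),
    }

# Hypothetical data: six subsets sharing a slope change of -0.8.
years = np.arange(1992, 2002)
t = years - 1996
rng = np.random.default_rng(1)
subsets = {
    name: 50.0 - 1.0 * t - 0.8 * np.clip(t, 0, None)
          + rng.normal(scale=2.0, size=years.size)
    for name in ["A", "B", "C", "D", "E", "F"]
}

out = summarize_subsets(years, subsets, 1996)
print(f"mean change = {out['mean']:.2f}, sd = {out['sd']:.2f}, "
      f"t = {out['t']:.2f} on {out['dof']} dof, "
      f"{out['n_negative']} of {len(subsets)} subsets negative")
```

A spread across subsets that looks too small to be random would be as
much a warning sign as one that is too large.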

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
