Re: when to use nonparametric statistics

jackson marshmallow Wed, 03 Dec 2003 06:19:14 -0800

"David Reilly" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> Bill Rowe <[EMAIL PROTECTED]> wrote in message
> news:<[EMAIL PROTECTED]>...
> > In article <[EMAIL PROTECTED]>,
> >  "jackson marshmallow" <[EMAIL PROTECTED]> wrote:
> >
> >
> > > 1) Two samples are given and I need to compare their means and
> > > variances. The distribution of the population is unknown. Can I use
> > > the F-test and the t-test? Is it necessary that the sample _means_
> > > have a Gaussian distribution? Is it sufficient? Maybe I misunderstand
> > > something here...
> >
> > Both the t-test and F-test assume samples are from a normal (Gaussian)
> > distribution.  The t-test is reasonably robust, i.e., the sample
> > distribution needs to deviate quite a bit from normal before conclusions
> > based on the t-test are likely to be invalid. But, the F-test is more
> > sensitive to deviations from normality and probably shouldn't be used if
> > there is reason to suspect the sample distribution isn't normal.
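
For the case where normality is in doubt, a randomization (permutation)
test is one distribution-free alternative to the t-test and F-test
discussed above. The following is only a minimal sketch, not anything
proposed in this thread: the function names and the sample values are
hypothetical, the test assumes the pooled observations are exchangeable
under the null hypothesis, and the variance comparison centres each sample
on its own mean first.

```python
import random
from statistics import mean, variance


def permutation_pvalue(a, b, stat, n_perm=5000, seed=0):
    """Two-sided Monte Carlo permutation p-value for stat(a, b).

    Assumes the pooled observations are exchangeable under the null
    hypothesis; no normality assumption is made.
    """
    rng = random.Random(seed)      # fixed seed so the result is repeatable
    pooled = list(a) + list(b)
    n_a = len(a)
    observed = abs(stat(a, b))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)        # random relabelling of the two groups
        if abs(stat(pooled[:n_a], pooled[n_a:])) >= observed:
            hits += 1
    # The +1 counts the observed arrangement itself, so p is never exactly 0.
    return (hits + 1) / (n_perm + 1)


def mean_difference(x, y):
    return mean(x) - mean(y)


def variance_difference(x, y):
    return variance(x) - variance(y)


def centred(x):
    """Subtract the sample mean so that a shift in location is not mistaken
    for a difference in spread once the groups are pooled."""
    m = mean(x)
    return [v - m for v in x]


# Hypothetical data, for illustration only.
sample_a = [4.1, 5.0, 3.8, 4.6, 5.3, 4.4, 4.9, 4.0]
sample_b = [5.9, 4.2, 7.1, 3.5, 6.4, 5.0, 7.8, 2.9]

p_mean = permutation_pvalue(sample_a, sample_b, mean_difference)
p_var = permutation_pvalue(centred(sample_a), centred(sample_b),
                           variance_difference)
print(f"means: p = {p_mean:.3f}   variances: p = {p_var:.3f}")
```

With very small samples the number of distinct relabellings is limited, so
the smallest attainable p-value is limited too.
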
> >
> > > 2) I need to calculate the significance of correlation between two
> > > sequences. I would actually prefer to use randomization, but the
> > > sequences may be too short. Another option is to perform linear
> > > regression and calculate the significance of the slope using a t-test
> > > (?). When is it valid?
> >
> > Linear regression (least squares) assumes a model of the form
> >
> > y_n = m x_n + b  + e_n
> >
> > where m and b are the desired regression parameters and e_n is the error
> > associated with observation y_n. Further, it is assumed the e_n are from
> > a normal distribution.
> >
> > It isn't clear to me whether this model is applicable to your problem
> > with sequences.
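
The two options raised in question 2 can be compared side by side. The
sketch below is illustrative only (function names and data are
hypothetical). For short sequences it enumerates every ordering of y, so
the randomization test is exact rather than sampled. It also computes the
usual t statistic for the slope, which would be referred to a t
distribution with n - 2 degrees of freedom under the normal-error model
above. Note that the permutation test assumes the observations are
exchangeable, which may not hold if the sequences are serially correlated.

```python
from itertools import permutations
from math import sqrt


def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def exact_permutation_pvalue(x, y):
    """Two-sided exact p-value: the share of all n! orderings of y whose
    |r| is at least as large as the observed |r|.

    Only practical for short sequences (n up to about 9 or 10).
    """
    observed = abs(pearson_r(x, y))
    total = hits = 0
    for perm in permutations(y):
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(pearson_r(x, perm)) >= observed - 1e-12:
            hits += 1
    return hits / total


def slope_t_statistic(x, y):
    """t statistic for the least-squares slope, written in terms of r.

    Undefined when |r| == 1 (a perfect fit leaves no residual variance).
    """
    r = pearson_r(x, y)
    return r * sqrt((len(x) - 2) / (1 - r * r))


# Hypothetical short sequences, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [2.3, 2.9, 4.1, 3.8, 5.2, 5.9, 6.1]

print("r =", round(pearson_r(x, y), 3))
print("exact permutation p =", exact_permutation_pvalue(x, y))
print("slope t statistic =", round(slope_t_statistic(x, y), 3))
```

With n observations the exact p-value cannot be smaller than 1/n!, which
puts a number on the worry that the sequences "may be too short".
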
>
> To pursue your reflection ...
>
> The word sequence, to me, implies a chronological set of values
> measured at fixed intervals of time.


I meant it in a more general sense, but in fact that is what I'm using.

>
> Thus one needs to treat any autoregressive structure evidenced in the
> e's such that the resultant error process, say a_n, is rendered
> N.I.I.D.
>
> a_n = e_n /[ARIMA]
>
> Care must be taken to ensure that a_n contains no outliers/inliers,
> seasonal pulses, level shifts or local time trends; if it does, then
> Intervention Variables (0,1), I_n, need to be introduced into the
> model:
>
> y_n = m x_n + b + a_n [ARIMA] + I_n
>
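
A quick first check on whether the regression errors e_n carry
autoregressive structure is to fit the straight line and look at the
lag-1 autocorrelation of the residuals, together with the Durbin-Watson
statistic. This is only a rough diagnostic sketch with hypothetical data
and function names; it is not the ARIMA/intervention modelling described
above, and it only detects first-order dependence.

```python
def ols_fit(x, y):
    """Least-squares slope m and intercept b for y = m x + b + e."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    m = sxy / sxx
    return m, my - m * mx


def residuals(x, y):
    """Residuals e_n = y_n - (m x_n + b) from the least-squares fit."""
    m, b = ols_fit(x, y)
    return [yi - (m * xi + b) for xi, yi in zip(x, y)]


def lag1_autocorrelation(e):
    """Sample autocorrelation of e at lag 1."""
    mean_e = sum(e) / len(e)
    num = sum((e[t] - mean_e) * (e[t - 1] - mean_e) for t in range(1, len(e)))
    den = sum((v - mean_e) ** 2 for v in e)
    return num / den


def durbin_watson(e):
    """Durbin-Watson statistic: near 2 suggests no lag-1 autocorrelation,
    well below 2 suggests positive and well above 2 negative autocorrelation."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(v * v for v in e)
    return num / den


# Hypothetical time-ordered data with slowly drifting errors.
x = [float(t) for t in range(12)]
y = [1.0, 1.9, 3.2, 4.4, 5.3, 6.1, 6.8, 7.6, 8.7, 10.0, 11.2, 12.1]

e = residuals(x, y)
print("lag-1 autocorrelation:", round(lag1_autocorrelation(e), 3))
print("Durbin-Watson:", round(durbin_watson(e), 3))
```

If the residuals do look serially dependent, the t-test on the slope from
the plain regression is likely to overstate or understate significance,
which is the motivation for modelling the error process explicitly.
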

I have a basic understanding of what you're saying; our system behaves like
a non-linear lowpass filter with different attack and decay times. I haven't
tried autoregression on the e's and am not sure yet whether that will give me
any useful information at this stage... I know they are non-random... I'll
have to look further into this...


> More on this can be found at
>
> http://www.autobox.com/afs-school_regression_vs_box-jenkins.doc
>
> and
>
> http://www.autobox.com/case_studies_-_general_mills_frozen_biscuits.doc
>
> or try searching
>
> http://www.autobox.com
>
> http://www.autobox.com/teach.html
>
> Hope this helps ...


Yes, thank you very much.

>
> Dave Reilly
> Automatic Forecasting Systems
> 215-675-0652
> http://www.autobox.com


=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================
