"David Reilly" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Bill Rowe <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>... > > In article <[EMAIL PROTECTED]>, > > "jackson marshmallow" <[EMAIL PROTECTED]> wrote: > > > > > > > 1) Two samples of are given and I need to compare their means and variances. > > > The distribution of the population is unknown. Can I use the F-test and the > > > t-test? Is it necessary that the sample _means_ have a Gaussian > > > distribution? Is it sufficient? Maybe I misunderstand something here... > > > > Both the t-test and F-test assume samples are from a normal (Gaussian) > > distribution. The t-test is reasonably robust, i.e., the sample > > distribution needs to deviate quite a bit from normal before conclusions > > based on the t-test are likely to be invalid. But, the F-test is more > > senistive to deviations from normality and probably shouldn't be used if > > there is reason to suspect the sample distribution isn't normal. > > > > > 2) I need to calculate the significance of correlation between two > > > sequences. I would actually prefer to use randomization, but the sequences > > > may be too short. Another option is to perform linear regression and > > > calculate the significance of the slope using a t-test (?). When is it > > > valid? > > > > Linear regression (least squares) assumes a model of the form > > > > y_n = m x_n + b + e_n > > > > where m and b are the desired regression parameters and e_n is the error > > associated with observation y_n. Further, it is assumed the e_n are from > > a normal distribution. > > > > It isn't clear to me whether this model is applicable to your problem > > with sequences. > > To pursue your reflection ... > > The word sequence, to me ,implies a chronological set of values > measured at fixed intervals of time.

I meant it in a more general sense, but in fact that is what I'm using.

> Thus one needs to treat any autoregressive structure evidenced in the
> e's such that the resultant error process, say a_n, is rendered
> N.I.I.D.
>
> a_n = e_n / [ARIMA]
>
> Care must be taken to ensure that a_n contains no outliers/inliers,
> seasonal pulses, level shifts or local time trends, and if it does
> then Intervention Variables (0,1) need to be introduced into the model
>
> y_n = m x_n + b + a_n [ARIMA] + I_n
>
> via I_n

I have a basic understanding of what you're saying; our system behaves like a
non-linear lowpass filter with different attack and decay times. I haven't
tried autoregression on the e's and am not sure yet whether that will give me
any useful information at this stage... I know they are non-random... I'll
have to look further into this...

> More on this can be found at
>
> http://www.autobox.com/afs-school_regression_vs_box-jenkins.doc
>
> and
>
> http://www.autobox.com/case_studies_-_general_mills_frozen_biscuits.doc
>
> or try searching
>
> http://www.autobox.com
>
> http://www.autobox.com/teach.html
>
> Hope this helps ...

Yes, thank you very much.

> Dave Reilly
> Automatic Forecasting Systems
> 215-675-0652
> http://www.autobox.com
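
For what it's worth, the checks discussed in this thread can be sketched in a few lines of Python. This is only a rough illustration using the standard library, not anyone's actual method: it fits y_n = m x_n + b + e_n by least squares, computes the lag-1 autocorrelation of the residuals e_n as a first crude look at autoregressive structure, and runs a randomization (permutation) test on the correlation. All function names (ols_fit, lag1_autocorrelation, pearson_r, permutation_p_value) are made up for this example. Note that the permutation test assumes the observations are exchangeable, so it should not be trusted if the residuals turn out to be clearly autocorrelated.

```python
import random
import statistics


def ols_fit(x, y):
    """Least-squares fit of y_n = m*x_n + b + e_n; returns (m, b, residuals)."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = mean_y - m * mean_x
    residuals = [yi - (m * xi + b) for xi, yi in zip(x, y)]
    return m, b, residuals


def lag1_autocorrelation(e):
    """Lag-1 sample autocorrelation of a sequence (e.g. the residuals e_n)."""
    mean_e = statistics.fmean(e)
    denom = sum((ei - mean_e) ** 2 for ei in e)
    num = sum((e[i] - mean_e) * (e[i - 1] - mean_e) for i in range(1, len(e)))
    return num / denom


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided randomization test for correlation.

    Shuffles y relative to x and counts how often |r| is at least as large
    as the observed |r|. Assumes exchangeable observations, which does not
    hold for autocorrelated sequences.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)
        if abs(pearson_r(x, y_shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

A typical use would be to call ols_fit(x, y) first, look at lag1_autocorrelation of the returned residuals, and only rely on permutation_p_value(x, y) if that autocorrelation is near zero. With short sequences the randomization p-value is coarse, since its resolution is limited by the number of distinct permutations.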
