Let me add a little detail... The intended use of the analysis is this (right or wrong): given all of the employees in a particular job title at a particular company location, perform the t-test to determine the number of standard deviations between the means of males and females. The assumption is that, since these are "similarly situated employees", there should not be a "statistically significant" difference in pay between males and females.
How is time on the job factored in? If the males at a site had been there much longer than the females, wouldn't we expect a pay difference? And if there were NOT a difference, wouldn't THAT be suspicious?
The EEOC deems a difference of 2 or more standard deviations "statistically significant."
They can consider it "significant," but that does not make it statistically significant.
Anyway, a simpler solution: put ALL the salaries in one column (you might do this site by site, or pool over ALL sites) and convert them to z-scores. Then take the mean of the z-scores for the males and the mean of the z-scores for the females, and see how large the difference is, if any. If that difference is 2 units or more, say that one group is being paid more than the other (significantly more by the EEOC definition, even though it is NOT really statistical significance) and be done with it: you are in trouble. If not, then I guess you are not in trouble.
Doing all this inferential work is an unnecessary step in what appears to me to be a rather simple problem, IF the criterion set by the EEOC really is a difference of 2 SDs (and not a difference of 2 standard ERRORS).
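To make the SD-versus-standard-ERROR distinction concrete, here is a minimal Python sketch of the z-score approach described above, next to the gap measured in standard errors. Everything in it is an assumption for illustration: the salary figures are made up, the function names mean_z_gap and se_gap are hypothetical, the z-scores use the population SD of the pooled salaries, and the standard error is the unpooled (Welch-style) one. It is not an EEOC procedure.

```python
from math import sqrt
from statistics import mean, pstdev, stdev


def split(values, groups, a, b):
    """Return the values belonging to group a and to group b."""
    xa = [v for v, g in zip(values, groups) if g == a]
    xb = [v for v, g in zip(values, groups) if g == b]
    return xa, xb


def mean_z_gap(salaries, groups, a="M", b="F"):
    """Gap between the group means of the pooled z-scores (SD units)."""
    mu = mean(salaries)
    sd = pstdev(salaries)  # assumption: population SD of ALL salaries
    z = [(s - mu) / sd for s in salaries]
    za, zb = split(z, groups, a, b)
    return mean(za) - mean(zb)


def se_gap(salaries, groups, a="M", b="F"):
    """Gap between the group means in standard ERROR units."""
    xa, xb = split(salaries, groups, a, b)
    # assumption: unpooled standard error of the difference in means
    se = sqrt(stdev(xa) ** 2 / len(xa) + stdev(xb) ** 2 / len(xb))
    return (mean(xa) - mean(xb)) / se


# Hypothetical salaries (in $1000s) for one job title at one site.
salaries = [52, 55, 58, 61, 50, 53, 56, 59]
groups = ["M", "M", "M", "M", "F", "F", "F", "F"]

print("gap in SD units:", round(mean_z_gap(salaries, groups), 3))
print("gap in SE units:", round(se_gap(salaries, groups), 3))
```

The two numbers answer different questions. The first says how far apart the group means are relative to the spread of salaries and does not shrink or grow just because the groups are larger. The second is what a t-test looks at, and it grows with sample size for the same raw pay gap.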
