Raul: Obviously we are at cross-purposes. From your last post, I think I have an inkling of what the problem is. The following represents my understanding of what we are talking about.
I assume I have a population with unknown distribution f. I want to learn something about this population, and in particular to estimate its mean and variance. To do this, I take a finite random sample X1,...,Xn. These are independent random variables, each having distribution f. I then define the sample mean \bar{X} = (1/n) \sum Xi and the sample variance S^2 = (1/(n-1)) \sum (Xi - \bar{X})^2. These are statistics, which means they are functions of X1,...,Xn, and so they are random variables whose distributions depend on f.

I show that the expected value of the sample mean is the population mean, and the expected value of the sample variance is the population variance (the n-1 divisor is what makes this come out exactly), which I describe by saying that these are unbiased estimators of the appropriate parameters. This calculation ultimately eliminates f, which is good because f is unknown. Now I take an actual sample x1,...,xn and evaluate the statistics on the sample. This gives me numbers which estimate the population mean and variance. (The first sketch below checks the unbiasedness claim numerically.)

If I wanted to know how good these estimates are, I would use confidence intervals. This requires some knowledge of the distribution of the statistics. For samples from a normal distribution exactly, or for large samples approximately (by the central limit theorem), the sample mean is normally distributed with mean equal to the population mean and variance \sigma^2/n, where \sigma^2 is the population variance. If I know \sigma, I am done. If I have to estimate \sigma from the sample, then the standardized statistic (\bar{X} - \mu)/(S/\sqrt{n}) has Student's t-distribution with n-1 degrees of freedom. For the variance, constructing a confidence interval is more difficult, and requires more knowledge of the population distribution f. For example, if f is normal, then (n-1)S^2/\sigma^2 has the chi-squared distribution with n-1 degrees of freedom. (The second and third sketches below compute these two intervals.)

The above summarizes the estimation process I am doing. I now believe you are doing something different, which explains why we are having trouble communicating. You have a population with known distribution f, and a random variable X whose distribution is f. You then construct a sample space of equiprobable outcomes and define a random variable Y on it whose distribution is g, where g is an approximation to f. The mean and variance of Y are then expected to approximate the mean and variance of X. For you, the sample mean is E(Y), a number approximating E(X), and the sample variance is \sigma^2(Y), a number approximating \sigma^2(X). This explains some of the confusion we have had, where I have been insisting that the sample mean is a random variable, and you have been insisting it is a number.

The term "sample space" is misleading. For the finite distributions we are discussing, the sample space is just the set of outcomes. Applying a random variable to this is a sample of size 1: if S is the sample space, X is a function X: S -> R with distribution f. When I talk about a sample of size n, I am talking about the function X1 x X2 x ... x Xn : S x S x ... x S -> R x R x ... x R, with distribution f x f x ... x f. This random variable ranges over all possible samples of size n. (The fourth sketch below enumerates this product explicitly for a small finite f.)

Please let me know if this accurately represents what you are doing. If so, I would suggest you proceed more straightforwardly. Since you know the population distribution f, you can estimate the population parameters without worrying about sampling or statistics. For example, for the population mean you want to estimate \sum x f(x) or \int x f(x) dx, depending on whether the population is discrete or continuous. This can be done using techniques from numerical analysis rather than statistics (the last sketch below). You are constructing an approximation g to f, and then figuring out the above with g substituted for f (I think).
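Here is a minimal Python sketch of that unbiasedness check, done by simulation rather than algebra; the exponential population, sample size, and trial count are arbitrary choices of mine for illustration.

    # Monte Carlo check: average the sample mean and sample variance
    # over many samples from a hypothetical exponential population.
    import numpy as np

    rng = np.random.default_rng(0)
    n, trials = 10, 200_000
    pop_mean, pop_var = 2.0, 4.0        # exponential(scale=2): mean 2, variance 4

    samples = rng.exponential(scale=2.0, size=(trials, n))
    xbar = samples.mean(axis=1)         # sample mean of each sample
    s2 = samples.var(axis=1, ddof=1)    # sample variance, n-1 divisor

    print(xbar.mean(), pop_mean)        # E[Xbar] ~ population mean
    print(s2.mean(), pop_var)           # E[S^2]  ~ population variance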
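A minimal sketch of the t interval for the mean, assuming SciPy is available; the data values are hypothetical.

    # 95% t confidence interval for the population mean when sigma
    # must be estimated from the sample.
    import numpy as np
    from scipy import stats

    x = np.array([4.1, 5.2, 3.8, 4.9, 5.5, 4.4, 4.7, 5.0])  # hypothetical data
    n = len(x)
    xbar, s = x.mean(), x.std(ddof=1)
    t = stats.t.ppf(0.975, df=n - 1)    # two-sided 95% critical value
    half = t * s / np.sqrt(n)
    print(xbar - half, xbar + half)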
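And the corresponding chi-squared interval for the variance, which requires the assumption that f is normal; same hypothetical data.

    # 95% confidence interval for sigma^2, inverting
    # (n-1) S^2 / sigma^2 ~ chi-squared with n-1 degrees of freedom.
    import numpy as np
    from scipy import stats

    x = np.array([4.1, 5.2, 3.8, 4.9, 5.5, 4.4, 4.7, 5.0])  # hypothetical data
    n = len(x)
    s2 = x.var(ddof=1)
    lo = (n - 1) * s2 / stats.chi2.ppf(0.975, df=n - 1)
    hi = (n - 1) * s2 / stats.chi2.ppf(0.025, df=n - 1)
    print(lo, hi)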
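A minimal sketch of the product construction, with a small hypothetical finite f: enumerate every sample of size n in S x S x ... x S, weight each by f x f x ... x f, and the expected value of the sample mean comes out equal to the population mean exactly, not just in simulation.

    # Enumerate all samples of size n from a finite distribution and
    # average the sample mean over them, weighted by f x f x ... x f.
    from itertools import product

    f = {0: 0.2, 1: 0.5, 2: 0.3}        # hypothetical finite distribution
    pop_mean = sum(x * p for x, p in f.items())

    n = 3
    total = 0.0
    for sample in product(f, repeat=n):  # every possible sample of size n
        prob = 1.0
        for x in sample:
            prob *= f[x]                 # product distribution f x f x ... x f
        total += prob * (sum(sample) / n)  # weight the sample mean

    print(total, pop_mean)               # equal, up to rounding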
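Last, a minimal sketch of the numerical-analysis route, again assuming SciPy; the density is a hypothetical exponential, chosen only because the exact answer is known.

    # Estimate the population mean directly as the integral of x f(x) dx
    # by adaptive quadrature, with no sampling involved.
    import numpy as np
    from scipy.integrate import quad

    def f(x):
        return np.exp(-x)                # hypothetical density on [0, inf)

    mean, err = quad(lambda x: x * f(x), 0, np.inf)
    print(mean)                          # exact answer here is 1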
Best wishes,
John

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
