Well, the data come from a population that follows a normal distribution.
In my case they are measurements from one disease, but part of the data
comes from a second, different disease.
The values from the second disease are always much bigger than the mean
of the first one. The problem is that they are too few to separate with
a mixture of Gaussians: because there are so few of them, my algorithm
does not work. I need to separate them because they will be noise in my
results. Looking at the histogram, they can be considered outliers
(they are far away from the peak), so by estimating the variance I can
keep the 90% of the data that belongs to the first disease and compute
my results from those.
David
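
A minimal sketch of the idea described above, not a tested method: the
spread of the first disease is estimated only from values at or below the
centre (which is the same as reflecting the lower half about the centre),
and values above a cutoff are then set aside. The function names, the use
of the median as the centre, and the default z = 1.2816 (the upper 90%
point of a standard normal) are assumptions made for illustration; the
simulated numbers are made up.

```python
import math
import random
import statistics


def lower_half_sigma(values, center=None):
    """Estimate the Gaussian standard deviation from values at or below the center.

    For a symmetric distribution this equals the root-mean-square deviation
    of the lower half reflected about the center.
    """
    if center is None:
        # Assumption: the median is used because a few large values barely move it.
        center = statistics.median(values)
    lower = [v for v in values if v <= center]
    return math.sqrt(sum((v - center) ** 2 for v in lower) / len(lower))


def split_by_cutoff(values, z=1.2816):
    """Split values at center + z * sigma, with sigma taken from the lower half."""
    center = statistics.median(values)
    sigma = lower_half_sigma(values, center)
    cutoff = center + z * sigma
    kept = [v for v in values if v <= cutoff]
    dropped = [v for v in values if v > cutoff]
    return kept, dropped, center, sigma


if __name__ == "__main__":
    rng = random.Random(0)
    first = [rng.gauss(10.0, 2.0) for _ in range(2000)]   # hypothetical first disease
    second = [rng.gauss(25.0, 2.0) for _ in range(60)]    # hypothetical second disease
    kept, dropped, center, sigma = split_by_cutoff(first + second)
    print(f"center={center:.2f} sigma={sigma:.2f} "
          f"kept={len(kept)} dropped={len(dropped)}")
```

Note that a one-sided cutoff like this also discards the upper tail of the
first disease, so statistics computed from the kept values describe a
truncated sample.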


Russell Martin wrote:

> David Delgado Gomez <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...
> > Good morning,
> >
> > I have data with a normal distribution.  Values higher than the mean are
> > corrupted with noise. Is it possible to estimate the variance of the
> > gaussian distribution just taking into account values smaller than the
> > mean?
> > Thanks
> > David
>
> Some others have already commented, but I'll add my $0.02.
>
> As often is the case, we might be able to give more useful answers
> if we had more information.  If we take what you've written at face
> value, then you would appear to know a lot about the process that
> generated the data that could be applied to the problem, i.e. you
> know (somehow) that the distribution is normal (Herman Rubin might
> say it isn't normal ;-) ), and you know (somehow) that only the
> values higher than the mean are corrupted (implying you know the
> mean, too).  Perhaps you also know (somehow) the characteristics
> of the noise?
>
> Someone's suggestion that you simply reflect the data less than the
> mean to above the mean and calculate the sample variance seems like
> an easy, practical solution as a first cut, with the caveat that
> the result would have, I think, a higher uncertainty than one would
> get from the same number of "real" values.  Depending on how much
> work it is worth to get a result, here's another idea.  If you know
> the mean and the characteristics of the noise, you can generate many
> samples of normal values plus noise with specified variances, and
> compare those to the observed distribution (with something like a
> K-S test, for instance).  Pick the variance that generates
> distributions closest to the observations.  You probably want to
> generate a number of samples of "fake" data for each proposed variance
> you want to test, plus maybe using some different values of the
> parameters of the noise, too.  You should be able to start with
> something fairly close to the "true" variance by taking the variance
> from the corrupted sample (if it doesn't spread the observed
> distribution too much) or getting a starting value with the
> "reflection" method given above.  It's a bit of work, and only
> doable if you have enough information, but it might be worth it
> if you really need to convince somebody about the results (like
> your dissertation committee).
>
> Regards,
> Russell

=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================