Dear Donald:
Thank you so much for your help. You can find a set of data in the attached
file. Most values in the data lie around 0.8, and some lie around 1. Each
group should be normally distributed. Since most of the data lie around 0.8,
if I assume that group is normally distributed, its PDF should have the
largest area. It is possible that several values clustered around 1 give a
higher peak. So I need to use the number of data points, which is the area
under the distribution, as the criterion for picking out the useful data.
I have tried some methods to get the mu and sigma by using a histogram in
Matlab, but I found that the bin width of the histogram strongly influences
the accuracy. I also tried to split the data and get rid of the other
distribution, but sometimes the boundary of a histogram bin falls in the
middle of one distribution, which causes problems. If I can find a solution
for either (1) or (2), I can solve this problem. The best solution in my mind
would be to estimate the mu and sigma directly from the data without
separating each distribution. I hope this clarifies my idea and question.
Actually, these data come from a wafer of integrated circuits. I analyze them
to find the good chips on the whole wafer. For this reason, I know that there
are two or more normal distributions. I don't think it would make a
difference if they were some other kind of distribution.
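To illustrate what estimating mu and sigma directly from the data could look
like, one possible approach is to fit a mixture of normal distributions by
expectation-maximization (EM) and then keep the component with the largest
weight, which plays the role of the "largest area". The sketch below is only
an illustration under several assumptions: it is written in Python with NumPy
rather than Matlab, it assumes exactly two components, it uses synthetic data
instead of the attached file, and the name fit_two_normals is hypothetical.

```python
import numpy as np

def fit_two_normals(x, n_iter=500):
    """EM for a two-component 1-D normal mixture (assumed model)."""
    x = np.asarray(x, dtype=float)
    # Crude starting point: one mean at each end of the data range.
    mu = np.array([x.min(), x.max()])
    sigma = np.full(2, x.std())
    weight = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: probability that each point belongs to each component.
        z = (x[:, None] - mu) / sigma
        dens = weight * np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2 * np.pi))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means and standard deviations.
        nk = resp.sum(axis=0)
        weight = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        sigma = np.maximum(np.sqrt(var), 1e-6)  # guard against collapse
    return weight, mu, sigma

# Synthetic stand-in for the wafer data: most values near 0.8, a few near 1.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.8, 0.05, 900),
                       rng.normal(1.0, 0.02, 100)])

weight, mu, sigma = fit_two_normals(data)
main = int(np.argmax(weight))  # component with the largest area
print("mu =", mu[main], "sigma =", sigma[main], "weight =", weight[main])
```

No histogram is involved here, so the result does not depend on a bin width.
With more than two distributions the same idea would need more components,
and the starting values would matter more.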
I appreciate all your help, and I also thank all your colleagues. If you have
any questions, please send me email. I am looking forward to hearing from you.
Donald F. Burrill wrote:
> On Wed, 5 Apr 2000, Xinxin Shao wrote:
>
> > I have a problem analyzing a group of data. The data consist of 2 or
> > more Normal distributions with different means.
>
> You describe two problems:
>
> (1) I want to find the sigma and mu of the distribution with the largest
> area.
> (2) How can I separate this normal distribution from others?
>
> If you can do (2), then (1) becomes easy; and if (2) is what you really
> want and need to do, that's what you need to focus on. But if all you
> really need is (1), that's a different sort of technical problem, and
> there probably are ways of estimating that mu and sigma without having
> first to separate the elements of that distribution from the elements of
> all the other distributions that may be present.
>
> It would probably be helpful (to whichever of our colleagues are moved
> to try to address this problem!) to have rather more context. What
> exactly do you mean by "the distribution with the largest area" ?
> How are you approaching the problem in the first place? (E.g., how do
> you know you have "two or more normal distributions with different
> means", and can you narrow that down to two, or three, or four such
> distributions? Would the problem be different if the distributions were
> binomial, or exponential, or Poisson, or ...?)
>
> [Don't reply only to me, by the way: reply to the list as well. I'm not
> at all sure I know precisely how to do whatever it is you want to do.
> Circulating the problem to more people can only be helpful...]
> -- DFB.
> ------------------------------------------------------------------------
> Donald F. Burrill [EMAIL PROTECTED]
> 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED]
> MSC #29, Plymouth, NH 03264 603-535-2597
> 184 Nashua Road, Bedford, NH 03110 603-471-7128
--
*****************************************************
Xinxin Shao
Department of Electrical and Computer Engineering
University of Minnesota
Tel: 651-641-1869(Home) 612-625-5053(Office)
Email: [EMAIL PROTECTED]
*****************************************************
all1.dat
all2.dat
all4.dat