RE: [R] density estimation: compute sum(value * probability) for
On 13-Nov-04 bogdan romocea wrote:
> Dear R users,
> However, how do I compute sum(values*probabilities)? The
> probabilities produced by the density function sum to only 26%:
>   sum(den$y)
>   [1] 0.2611142
> Would it perhaps be ok to simply do
>   sum(den$x*den$y) * (1/sum(den$y))
>   [1] 1073.22
> ?

What you're missing is the dx! A density estimate approximates the probability density function g(x), for which int[g(x)*dx] = 1, and R's 'density' function returns estimated values of g at a discrete set of points.

An integral can be approximated by a discrete summation of the form

  sum(g(x.i)*delta.x)

You can recover the set of x-values at which the density is estimated, and hence the implicit value of delta.x, from the returned density. Example:

  X <- rnorm(1000)
  f <- density(X)
  x <- f$x
  delta.x <- x[2]-x[1]
  g <- f$y
  sum(g*delta.x)
  [1] 1.000976

Hoping this helps,
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861 [NB: New number!]
Date: 14-Nov-04 Time: 08:50:53
-- XFMail --
__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
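The same summation idea carries over to the quantity originally asked about, sum(value * probability), by weighting each grid point with g(x.i)*delta.x. What follows is only a minimal sketch on simulated data: the seed, the normal sample with mean 5, and the names total and kde.mean are assumptions for illustration and do not come from the thread.

```r
# Riemann-sum approximation of the mean implied by a kernel density estimate.
set.seed(1)                         # arbitrary seed, for reproducibility only
X <- rnorm(1000, mean = 5)          # simulated stand-in for the poster's data
f <- density(X)                     # default grid extends past the data range
delta.x <- f$x[2] - f$x[1]          # the grid is equally spaced
total <- sum(f$y * delta.x)         # approximates int g(x) dx, close to 1
# Approximates int x g(x) dx, divided by 'total' to absorb discretisation error
kde.mean <- sum(f$x * f$y * delta.x) / total
print(c(total = total, kde.mean = kde.mean, sample.mean = mean(X)))
```

On data like this the result should land very close to mean(X), which is consistent with the point made later in the thread that the quantity being sought is simply the mean.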
RE: [R] density estimation: compute sum(value * probability) for given distribution
First thing you probably should realize is that density is _not_ probability. A probability density function _integrates_ to one; it does not _sum_ to one. If X is an absolutely continuous RV with density f, then Pr(X=x)=0 for all x, and Pr(a < X < b) = \int_a^b f(x) dx.

sum x*Pr(X=x) (over all possible values of x) for a discrete distribution is just the expectation, or mean, of the distribution. The expectation for a continuous distribution is \int x f(x) dx, where the integral is over the support of f. This is all elementary math stat that you can find in any textbook.

Could you tell us exactly what you are trying to compute, or why you're computing it?

HTH,
Andy

> From: bogdan romocea
>
> Dear R users,
>
> This is a KDE beginner's question. I have this distribution:
>   length(cap)
>   [1] 200
>   summary(cap)
>      Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>     459.9   802.3   991.6  1066.0  1242.0  2382.0
> I need to compute the sum of the values times their probability
> of occurrence. The graph is fine,
>   den <- density(cap, from=min(cap), to=max(cap), give.Rkern=F)
>   plot(den)
> However, how do I compute sum(values*probabilities)? The
> probabilities produced by the density function sum to only 26%:
>   sum(den$y)
>   [1] 0.2611142
> Would it perhaps be ok to simply do
>   sum(den$x*den$y) * (1/sum(den$y))
>   [1] 1073.22
> ?
>
> Thank you,
> b.
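To make the distinction concrete, the integrals above can be evaluated numerically with R's integrate(). This is only a sketch under stated assumptions: the N(3, 1) density and the limits a = 2, b = 5 are chosen purely for illustration and have nothing to do with the data in the original question.

```r
# Illustrative density: normal with mean 3 and sd 1 (an assumption).
f <- function(x) dnorm(x, mean = 3, sd = 1)
a <- 2
b <- 5

# Pr(a < X < b) is the integral of f from a to b
p.ab <- integrate(f, lower = a, upper = b)$value

# The density integrates to one over its support
total <- integrate(f, lower = -Inf, upper = Inf)$value

# The expectation is the integral of x * f(x) over the support
ex <- integrate(function(x) x * f(x), lower = -Inf, upper = Inf)$value

print(c(p.ab = p.ab, total = total, ex = ex))
```

For this particular density, p.ab should agree with pnorm(5, 3, 1) - pnorm(2, 3, 1) and ex should be close to 3, while f(3) itself is about 0.4 and is not the probability of observing exactly 3.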
RE: [R] density estimation: compute sum(value * probability) for given distribution
Andy,

Thanks a lot for the clarifications. I was running a simulation a number of times and trying to come up with a number to summarize the results. And I failed to realize from the beginning that what I was trying to compute was just the mean.

Regards,
b.
Re: [R] density estimation: compute sum(value * probability) for given distribution
bogdan romocea wrote:

> Dear R users,
>
> This is a KDE beginner's question. I have this distribution:
>   length(cap)
>   [1] 200
>   summary(cap)
>      Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>     459.9   802.3   991.6  1066.0  1242.0  2382.0
> I need to compute the sum of the values times their probability
> of occurrence. The graph is fine,
>   den <- density(cap, from=min(cap), to=max(cap), give.Rkern=F)
>   plot(den)
> However, how do I compute sum(values*probabilities)?

I don't get the point. You are estimating using a gaussian kernel. Hint: What's the probability to get x=0 for a N(0,1) distribution? So sum(values*probabilities) is zero!

> The probabilities produced by the density function sum to only 26%:

and could also sum to, e.g., 783453.9, depending on the number of observations and the estimated parameters of the density ...

>   sum(den$y)
>   [1] 0.2611142
> Would it perhaps be ok to simply do
>   sum(den$x*den$y) * (1/sum(den$y))
>   [1] 1073.22
> ?

No. den$x is a point where the density function is equal to den$y, but den$y is not the probability to get den$x (you know, the stuff with intervals)! I fear you are mixing theory from discrete with continuous distributions.

Uwe Ligges

> Thank you,
> b.
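One way to see that sum(den$y) carries no probabilistic meaning is to evaluate the same estimate on grids of different resolution. The sketch below uses simulated data; the seed, the sample, and the names d1, d2, raw and scaled are assumptions for illustration. It varies only the number of grid points (the n argument of density), which is one of several things the raw sum depends on.

```r
set.seed(42)                     # arbitrary seed, for reproducibility only
x <- rnorm(200)                  # simulated stand-in for the poster's data

d1 <- density(x, n = 512)        # default grid size
d2 <- density(x, n = 2048)       # same estimate on a grid four times finer

# Raw sums of density values: these scale with the number of grid points
raw <- c(n512 = sum(d1$y), n2048 = sum(d2$y))

# Multiplying by the grid spacing turns each sum into an approximate integral
dx1 <- d1$x[2] - d1$x[1]
dx2 <- d2$x[2] - d2$x[1]
scaled <- c(n512 = sum(d1$y) * dx1, n2048 = sum(d2$y) * dx2)

print(raw)
print(scaled)
```

The two raw sums should differ by roughly a factor of four, while both scaled sums should stay close to 1, so any particular value of sum(den$y), 26% included, reflects the grid rather than the distribution.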