Re: [R] Density estimation graphs

2007-03-15 Thread Mark Wardle
Mark Wardle wrote:
 Dear all,
 
 I'm struggling with a plot and would value any help!
 ...
 
 Is there a better way? As always, I'm sure there's a one-liner rather
 than my crude technique!
 

As always, I've spent ages trying to sort this, and then the minute
after sending an email, I find the polygon() function.

Ignore previous message!

Best wishes,

Mark

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Density estimation graphs

2007-03-15 Thread Mark Wardle
Dear all,

I'm struggling with a plot and would value any help!

I'm attempting to highlight a histogram and density plot to show a
proportion of cases above a threshold value. I wanted to cross-hatch the
area below the density curve. The breaks and bandwidth are deliberate
integer values because of the type of data I'm looking at.

I've managed to do this, but I don't think it is very good! It would be
difficult, for example, to do a cross-hatch using this technique.

allele.plot <- function(x, threshold=NULL, hatch.col='black',
                        hatch.border=hatch.col, lwd=par('lwd'), ...) {
  h <- hist(x, breaks=max(x), plot=FALSE)
  d <- density(x, bw=1)
  plot(d, lwd=lwd, ...)

  if (!is.null(threshold)) {
    # keep only the part of the curve above the threshold
    d.t <- d$x > threshold
    d.x <- d$x[d.t]
    d.y <- d$y[d.t]
    d.l <- length(d.x)
    # draw all but first line of hatch
    for (i in 2:d.l) {
      lines(c(d.x[i],d.x[i]), c(0,d.y[i]),
            col=hatch.col, lwd=1)
    }
    # draw first line in hatch border colour
    lines(c(d.x[1],d.x[1]), c(0,d.y[1]),
          col=hatch.border, lwd=lwd)

    # and now re-draw density plot lines
    lines(d, lwd=lwd)
  }
}

# some pretend data
s8 = rnorm(100, 15, 5)
threshold = 19  # an arbitrary cut-off
allele.plot(s8, threshold, hatch.col='grey',hatch.border='black')


Is there a better way? As always, I'm sure there's a one-liner rather
than my crude technique!

Best wishes,

Mark
-- 
Dr. Mark Wardle
Clinical research fellow and specialist registrar, Neurology
University Hospital Wales and Cardiff University, UK



Re: [R] Density estimation graphs

2007-03-15 Thread Charilaos Skiadas
On Mar 15, 2007, at 12:37 PM, Mark Wardle wrote:

 Dear all,

 I'm struggling with a plot and would value any help!

 I'm attempting to highlight a histogram and density plot to show a
 proportion of cases above a threshold value. I wanted to cross-hatch
 the area below the density curve. The breaks and bandwidth are deliberate
 integer values because of the type of data I'm looking at.

 I've managed to do this, but I don't think it is very good! It would be
 difficult, for example, to do a cross-hatch using this technique.

Don't know about a cross-hatch, but in general I use polygon for  
highlighting areas like that:

allele.plot <- function(x, threshold=NULL, hatch.col='black',
                        hatch.border=hatch.col, lwd=par('lwd'), ...) {
  h <- hist(x, breaks=max(x), plot=FALSE)
  d <- density(x, bw=1)
  plot(d, lwd=lwd, ...)
  if (!is.null(threshold)) {
    d.t <- d$x > threshold
    d.x <- d$x[d.t]
    d.y <- d$y[d.t]
    # close the shape along the baseline, from the last x back to the first
    polygon(c(d.x[1], d.x, d.x[length(d.x)]), c(0, d.y, 0),
            col=hatch.col, border=hatch.border, lwd=1)
  }
}
# some pretend data
s8 = rnorm(100, 15, 5)
threshold = 19  # an arbitrary cut-off
allele.plot(s8, threshold, hatch.col='grey',hatch.border='black')


Perhaps this can help a bit. Btw, what was d.l for?
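
On the cross-hatch itself: polygon() also takes `density` (shading lines per inch) and `angle` arguments, so drawing the same polygon twice at opposite angles may give the effect asked for. A minimal sketch, assuming the same pretend data as above; the variable names `keep`, `px` and `py` are placeholders and not from the thread:

```r
# Illustrative data, mirroring the "pretend data" in the post
set.seed(1)
s8 <- rnorm(100, 15, 5)
threshold <- 19

d <- density(s8, bw = 1)
keep <- d$x > threshold                 # part of the curve above the cut-off
px <- c(d$x[keep][1], d$x[keep], d$x[keep][sum(keep)])
py <- c(0, d$y[keep], 0)                # drop to the baseline at both ends

plot(d)
# Two passes at opposite angles produce a cross-hatch
polygon(px, py, density = 20, angle = 45,  col = "grey", border = NA)
polygon(px, py, density = 20, angle = -45, col = "grey", border = NA)
lines(d)                                # re-draw the curve on top
```

Lower values of `density` give a sparser hatch.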

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College



Re: [R] Density Estimation

2006-06-10 Thread Adelchi Azzalini
On Thu, Jun 08, 2006 at 08:31:26PM +0200, Pedro Ramirez wrote:
 In mathematical terms the optimal bandwidth for density estimation
 decreases at rate n^{-1/5}, while the one for distribution function
 decreases at rate n^{-1/3}, if n is the sample size. In practical terms,
 one must choose an appreciably smaller bandwidth in the second case
 than in the first one.
 
 Thanks a lot for your remark! I was not aware of the fact that the
 optimal bandwidths for density and distribution do not decrease
 at the same rate.
 
 Besides the computational aspect, there is a statistical one:
 the optimal choice of bandwidth for estimating the density function
 is not optimal (and possibly not even just sensible) for estimating
 the distribution function, and the stated problem is equivalent to
 estimation of the distribution function.
 
 The given interval 0<x<3 was only an example, in fact I would
 like to estimate the probability for intervals such as
 
 0<=x<1 , 1<=x<2 , 2<=x<3 , 3<=x<4 , 
 
 and compare it with the estimates of a corresponding histogram.
 In this case the stated problem is not anymore equivalent to the
 estimation of the distribution function. What do you think, can

why not? the probabilities you are interested in are of the form

F(1)-F(0), F(2)-F(1), and so on

where F(.) is the cumulative distribution function (and it must
be continuous, since its derivative exists).

 I go ahead in this case with the optimal bandwidth for the
 density? Thanks a lot for your help!

no
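
The interval probabilities written as differences of F can be sketched in R as follows. This is only an illustration: it uses a Gaussian-kernel estimate of F and, purely as a placeholder, the default density() bandwidth from bw.nrd0(), which by the argument in this thread is probably too large for estimating F. The names `Fhat`, `edges`, `p_kde` and `p_hist` are made up for the example.

```r
x <- c(2,1,3,2,3,0,4,5,10,11,12,11,10)   # data from the original post
h <- bw.nrd0(x)                           # placeholder bandwidth (assumption)

# Smooth CDF estimate: average of Gaussian kernel CDFs centred on the data
Fhat <- function(t) sapply(t, function(ti) mean(pnorm(ti, mean = x, sd = h)))

edges  <- 0:4
p_kde  <- diff(Fhat(edges))               # F(1)-F(0), F(2)-F(1), ...
# Matching histogram proportions for [0,1), [1,2), [2,3), [3,4)
p_hist <- as.vector(table(cut(x, edges, right = FALSE))) / length(x)

print(round(cbind(p_kde, p_hist), 3))
```

Replacing `h` by a smaller value shows how sensitive the interval estimates are to the bandwidth.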

best wishes,

Adelchi


-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/



Re: [R] Density Estimation

2006-06-08 Thread Adelchi Azzalini
On Wed, 07 Jun 2006 19:54:32 +0200, Pedro Ramirez wrote:

PR Not a direct answer to your question, but if you use a logspline
PR density estimate rather than a kernel density estimate then the
PR logspline package will help you and it has built in functions for
PR dlogspline, qlogspline, and plogspline that do the integrals for
PR you.
PR 
PR If you want to stick with the KDE, then you could find the area
PR under each of the kernels for the range you are interested in
PR (need to work out the standard deviation used from the bandwidth,
PR then use pnorm for the default gaussian kernel), then just sum
PR the individual areas.
PR 
PR Hope this helps,
PR 
PR Thanks a lot for your quick help! I think I will follow your first
PR suggestion (logspline density estimation) instead of summing over
PR the kernel areas because at the boundaries of the range truncated
PR kernel areas can occur, so I think it is easier to do it with
PR logsplines. Thanks again for your help!!
PR 
PR Pedro
PR 
PR 

Besides the computational aspect, there is a statistical one:
the optimal choice of bandwidth for estimating the density function 
is not optimal (and possibly not even just sensible) for estimating
the distribution function, and the stated problem is equivalent to
estimation of the distribution function. 

In mathematical terms the optimal bandwidth for density estimation
decreases at rate n^{-1/5}, while the one for distribution function 
decreases at rate n^{-1/3}, if n is the sample size. In practical terms, 
one must choose an appreciably smaller bandwidth in the second case 
than in the first one.
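
To get a feel for the two rates, the following small sketch tabulates n^{-1/5} and n^{-1/3} for a few sample sizes. It shows orders of magnitude only: the constants that multiply these rates depend on the kernel and the unknown density and are left out, and the variable names are made up for the illustration.

```r
n <- c(100, 1000, 10000)
rate_density <- n^(-1/5)   # order of the optimal bandwidth for the density
rate_cdf     <- n^(-1/3)   # order of the optimal bandwidth for the distribution function
ratio        <- rate_cdf / rate_density   # algebraically equal to n^(-2/15)

print(data.frame(n, rate_density, rate_cdf, ratio))
```

The ratio shrinks as n grows, so the gap between the two bandwidths widens with the sample size.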

best wishes,

Adelchi 




Re: [R] Density Estimation

2006-06-08 Thread Pedro Ramirez
In mathematical terms the optimal bandwidth for density estimation
decreases at rate n^{-1/5}, while the one for distribution function
decreases at rate n^{-1/3}, if n is the sample size. In practical terms,
one must choose an appreciably smaller bandwidth in the second case
than in the first one.

Thanks a lot for your remark! I was not aware of the fact that the
optimal bandwidths for density and distribution do not decrease
at the same rate.

Besides the computational aspect, there is a statistical one:
the optimal choice of bandwidth for estimating the density function
is not optimal (and possibly not even just sensible) for estimating
the distribution function, and the stated problem is equivalent to
estimation of the distribution function.

The given interval 0<x<3 was only an example, in fact I would
like to estimate the probability for intervals such as

0<=x<1 , 1<=x<2 , 2<=x<3 , 3<=x<4 , 

and compare it with the estimates of a corresponding histogram.
In this case the stated problem is not anymore equivalent to the
estimation of the distribution function. What do you think, can
I go ahead in this case with the optimal bandwidth for the
density? Thanks a lot for your help!

Best wishes
Pedro






[R] Density Estimation

2006-06-07 Thread Pedro Ramirez
Dear R-list,

I have made a simple kernel density estimation by

x <- c(2,1,3,2,3,0,4,5,10,11,12,11,10)
kde <- density(x,n=100)

Now I would like to know the estimated probability that a
new observation falls into the interval 0<x<3.

How can I integrate over the corresponding interval?
In several R packages for kernel density estimation I did
not find a corresponding function. I could apply
Simpson's Rule for integrating, but perhaps somebody
knows a better solution.

Thanks a lot for help!

Pedro



Re: [R] Density Estimation

2006-06-07 Thread Greg Snow
Not a direct answer to your question, but if you use a logspline density
estimate rather than a kernel density estimate then the logspline
package will help you and it has built in functions for dlogspline,
qlogspline, and plogspline that do the integrals for you.

If you want to stick with the KDE, then you could find the area under
each of the kernels for the range you are interested in (need to work
out the standard deviation used from the bandwidth, then use pnorm for
the default gaussian kernel), then just sum the individual areas. 
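
The second suggestion might look roughly like this. It is a sketch, assuming the default Gaussian kernel, for which density() scales the kernel so that `bw` is its standard deviation. Because each kernel carries weight 1/n, "summing the individual areas" becomes a mean. The names `a`, `b` and `p` are placeholders.

```r
x   <- c(2,1,3,2,3,0,4,5,10,11,12,11,10)   # data from the original post
kde <- density(x, n = 100)
h   <- kde$bw     # standard deviation of each Gaussian kernel

a <- 0; b <- 3
# Area of each kernel between a and b, averaged over the data points
p <- mean(pnorm(b, mean = x, sd = h) - pnorm(a, mean = x, sd = h))
print(p)
```

No numerical integration is needed, and the result does not depend on the grid size `n` passed to density().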

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 



Re: [R] Density Estimation

2006-06-07 Thread Rolf Turner

Pedro wrote:

 I have made a simple kernel density estimation by
 
 x <- c(2,1,3,2,3,0,4,5,10,11,12,11,10)
 kde <- density(x,n=100)
 
 Now I would like to know the estimated probability that a
 new observation falls into the interval 0<x<3.
 
 How can I integrate over the corresponding interval?
 In several R packages for kernel density estimation I did
 not find a corresponding function. I could apply
 Simpson's Rule for integrating, but perhaps somebody
 knows a better solution.

One possibility is to use splinefun():

 > spiffy <- splinefun(kde$x,kde$y)
 > integrate(spiffy,0,3)
0.2353400 with absolute error < 2e-09

cheers,

Rolf Turner
[EMAIL PROTECTED]



Re: [R] Density Estimation

2006-06-07 Thread Pedro Ramirez
Not a direct answer to your question, but if you use a logspline density
estimate rather than a kernel density estimate then the logspline
package will help you and it has built in functions for dlogspline,
qlogspline, and plogspline that do the integrals for you.

If you want to stick with the KDE, then you could find the area under
each of the kernels for the range you are interested in (need to work
out the standard deviation used from the bandwidth, then use pnorm for
the default gaussian kernel), then just sum the individual areas.

Hope this helps,

Thanks a lot for your quick help! I think I will follow your first
suggestion (logspline density estimation) instead of summing over the
kernel areas because at the boundaries of the range truncated kernel
areas can occur, so I think it is easier to do it with logsplines.
Thanks again for your help!!

Pedro






[R] Density Estimation

2006-03-12 Thread Jacob van Wyk
Hallo
I am trying to use the package locfit to follow the example given in an
introductory note by C. Loader concerning density estimation. It involves
the geyser dataset (107 observations on durations, included in the
package).
I have tried the following (using the latest version of R):

fit.of <- locfit(~geyser,flim=c(1,6),alpha=c(0.15,0.9))
plot(fit.of,get.data=T,mpv=200)

This produces a plot (after several warnings).
My question is: how can I get the plot to cover the range 1 to 6 for
durations? The plot covers the observed data range only.
It appears there is a problem with

flim=c(1,6)

flim is not actually correct, and consequently c(1,6) is not used
correctly. I have also tried to use xlim=c(1,6), but without success.

I need some help on this please.
Thanks
Jacob


Jacob L van Wyk
Department of Statistics
University of Johannesburg APK
P O Box 524
Auckland Park 2006
South Africa
Tel: +27-11-489-3080
Fax: +27-11-489-2832



Re: [R] Density estimation with monotonic constaints

2006-02-03 Thread Spencer Graves
  There are multiple functions for density estimation in R, but I don't 
know of any for estimating a monotonically decreasing density.  If you 
haven't already, I encourage you to use, e.g., the help.search and 
RSiteSearch functions to find and explore their capabilities.

  Why do you ask?  Are you interested in analyzing particular data 
set(s) or are you doing research on density estimation?

  If it were my problem, I might just try something like the function 
"density" and then evaluate the results to find out if it satisfied my 
constraints.  If it did and if I were only interested in that data set, 
I'd be done.  If not, I'd increase the smoothing until I got something 
that was monotonic.  If I wanted a more general method, I might wrap a 
call to a function like "density" inside another function, and 
automatically adjust the smoothing until it satisfied some optimality 
criterion I might devise.  If I didn't get what I wanted doing that, I 
might list, e.g., the "density" function and walk through it line by 
line until I figured out what I needed to change to get what I wanted. 
I just listed "density" and found that it consists solely of a call to 
"UseMethod".  To get beyond that, I tried 'methods("density")', which 
told me there was only one method called "density.default".  Then 
requesting "density.default" gave me the code for that.  Another tip:  I 
find "debug" extremely helpful for walking through code like this.
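
  The "wrap density in another function and increase the smoothing" idea might be sketched as below. This is an illustration only: the helper names `is_nonincreasing` and `smooth_until_monotone`, the grid of `adjust` values and the simulated data are all assumptions, and the search may well fail, since a Gaussian kernel estimate can dip near the lower end of the data however much it is smoothed.

```r
# Hypothetical helper: TRUE if the estimated curve never increases
is_nonincreasing <- function(d, tol = 1e-12) all(diff(d$y) <= tol)

# Hypothetical wrapper: try larger and larger bandwidth multipliers
smooth_until_monotone <- function(x, adjust_grid = seq(1, 20, by = 0.5)) {
  for (a in adjust_grid) {
    d <- density(x, adjust = a, from = min(x), to = max(x))
    if (is_nonincreasing(d)) return(list(adjust = a, density = d))
  }
  NULL   # nothing on the grid gave a monotone estimate
}

set.seed(1)
x   <- rexp(200)   # illustrative sample from a decreasing density
fit <- smooth_until_monotone(x)
if (is.null(fit)) {
  cat("no monotone estimate found on this grid\n")
} else {
  cat("first monotone estimate at adjust =", fit$adjust, "\n")
}
```

  A NULL result is a hint that plain kernel smoothing is not enough and a method built for monotone densities would be needed.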

  I suspect this will not solve your problem, but I hope at least it 
helps.  If you'd like further assistance from this list, please 
submit another post.  However, I encourage you first to "PLEASE do read 
the posting guide! www.R-project.org/posting-guide.html".  Doing so 
might increase your chances of getting useful information more quickly.

  spencer graves

Debayan Datta wrote:
 Hi All,
I have a sample x={x1,x2,..,xn} from a distribution with density f. I 
 wish to estimate the density. I know a priori that the density is 
 monotonically decreasing. Is there a way to do this in R?
 Thanks
 Debayan
 


[R] Density estimation with monotonic constaints

2006-01-31 Thread Debayan Datta
Hi All,
   I have a sample x={x1,x2,..,xn} from a distribution with density f. I 
wish to estimate the density. I know a priori that the density is 
monotonically decreasing. Is there a way to do this in R?
Thanks
Debayan



[R] density estimation

2005-05-10 Thread Hui Han
Hi,
I have been looking for a method of estimating a parametric model from 
the output (x, y) of the R function density. Below is my thinking; I 
wonder if it looks OK. Suppose that we build a single Gaussian model for 
each input data point x (x is the mean); the overall model may be a sum 
of these Gaussian models built on each x, i.e. P(y) = \sum_x P(y|x, 
\sigma), where y is any new data point. Is this right? Is any 
normalization applied?

Thanks in advance for any suggestion that you may offer me!
Best regards,
Hui
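
A quick check of the idea in this post, as a sketch rather than a definitive answer: with the default Gaussian kernel, the estimate behind density() is an equal-weight mixture of Gaussians centred on the data points, each with standard deviation equal to the bandwidth, and the 1/n weight is the normalization. The data and the names `kde_at`, `by_hand` and `from_grid` are made up for the illustration.

```r
set.seed(1)
x <- rnorm(200)    # illustrative data
d <- density(x)    # Gaussian kernel by default
h <- d$bw          # bandwidth = sd of each Gaussian component

# Mixture of Gaussians centred on the data; mean() supplies the 1/n weight
kde_at <- function(y) sapply(y, function(yi) mean(dnorm(yi, mean = x, sd = h)))

y0        <- c(-1, 0, 0.5, 2)
by_hand   <- kde_at(y0)
from_grid <- approx(d$x, d$y, xout = y0)$y   # interpolate density()'s grid

print(cbind(y0, by_hand, from_grid))
```

The two columns should agree closely but not exactly, because density() evaluates the estimate on a grid using a binned approximation.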


[R] density estimation

2005-04-22 Thread Bernard Palagos
Hello,
Sorry for my English.
I would like to estimate the density of a multivariate variable (f(x,y) or
f(x,y,z), for example) in order to calculate mutual information.
How is this possible with R?
Thanks
Bernard

Bernard Palagos
Unité Mixte de Recherche Cemagref - Agro.M - CIRAD
Information et Technologie pour les Agro-Procédés
Cemagref - BP 5095
34033 MONTPELLIER Cedex 1
France
http://www.montpellier.cemagref.fr/teap/default.htm
Tel: 04 67 04 63 13
Fax: 04 67 04 37 82
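
For the bivariate case, one possible route is kde2d() from the MASS package, turning the gridded estimate into a plug-in value for the mutual information. This is a rough sketch under assumptions, not an established recipe: the function name `mi_kde`, the grid size and the simulated data are made up, the marginals are taken from the grid itself, and the result (in nats) depends on the bandwidth and the grid.

```r
library(MASS)   # provides kde2d()

# Hypothetical helper: plug-in mutual information from a 2-D kernel estimate
mi_kde <- function(x, y, n = 50) {
  k  <- kde2d(x, y, n = n)        # k$z: rows follow x, columns follow y
  p  <- k$z / sum(k$z)            # joint cell probabilities on the grid
  px <- rowSums(p)                # marginal of x on the grid
  py <- colSums(p)                # marginal of y on the grid
  ratio <- p / outer(px, py)
  sum(ifelse(p > 0, p * log(ratio), 0))
}

set.seed(2)
a     <- rnorm(500)
b_ind <- rnorm(500)                  # independent of a
b_dep <- a + rnorm(500, sd = 0.3)    # strongly dependent on a

cat("independent:", mi_kde(a, b_ind), "\n")
cat("dependent:  ", mi_kde(a, b_dep), "\n")
```

The same grid idea extends to three variables in principle, but kde2d() itself handles two dimensions only.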




[R] density estimation with weighted sample

2005-04-07 Thread Tomassini, Lorenzo
Dear all

I would like to perform density estimation with a weighted sample
(output of an Importance Sampling procedure) in R. Could anybody give me
advice on what function to use (and in which package)?

Thanks a lot,
Lorenzo



Re: [R] density estimation with weighted sample

2005-04-07 Thread Prof Brian Ripley
On Thu, 7 Apr 2005, Tomassini, Lorenzo wrote:

 I would like to perform density estimation with a weighted sample
 (output of an Importance Sampling procedure) in R. Could anybody give me
 advice on what function to use (and in which package)?

This could mean

1) You have a sample with weights w, so `w=4' means `I have 4 of those'.
2) You have a sample from a density proportional to w(x)f(x) and want to 
estimate f.

Your title suggests the first, your comment the second.  If it is the 
second, use any package (even density() in R) to estimate the density g of 
the sampled distribution, form ghat/w and rescale to unit area.  If you 
know a lot about w (e.g. in stereology) there are specialized methods 
which are better.
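
Both readings can be sketched in a few lines. This is an illustration under assumptions: the toy data, the weight function w(x) = exp(x) and all variable names are invented for the example. Case 1 uses the `weights` argument that current versions of density() provide, and case 2 follows the ghat/w recipe on the evaluation grid.

```r
set.seed(1)

# Case 1: weights as multiplicities (illustrative values)
xs <- c(1.2, 2.5, 2.9, 4.1)
w1 <- c(4, 1, 2, 1)
d1 <- density(xs, bw = 0.5, weights = w1 / sum(w1))   # weights must sum to 1

# Case 2: sample drawn from g(x) proportional to w(x) f(x).
# Assumed toy setup: f = N(0,1) and w(x) = exp(x), which makes g = N(1,1).
w  <- function(x) exp(x)
x  <- rnorm(2000, mean = 1)
g  <- density(x)                   # estimate of the sampled density g
dx <- diff(g$x)[1]                 # grid spacing
f_raw <- g$y / w(g$x)              # ghat / w
f_hat <- f_raw / sum(f_raw * dx)   # rescale to unit area

f_mean <- sum(g$x * f_hat * dx)    # should be near 0 under this setup
print(f_mean)
```

Dividing by w amplifies the estimate wherever w is small, so the tails of `f_hat` deserve a look before it is used.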

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


RE: [R] density estimation: compute sum(value * probability) for

2004-11-14 Thread Ted Harding
On 13-Nov-04 bogdan romocea wrote:
 Dear R users,
 
 However, how do I compute sum(values*probabilities)? The
 probabilities produced by the density function sum to only 26%: 
 sum(den$y)
 [1] 0.2611142
 
 Would it perhaps be ok to simply do
 sum(den$x*den$y) * (1/sum(den$y))
 [1] 1073.22
 ?

What you're missing is the dx! A density estimation estimates
the probability density function g(x) such that int[g(x)*dx] = 1,
and R's 'density' function returns estimated values of g at a
discrete set of points.

An integral can be approximated by a discrete summation of the
form

sum(g(x.i)*delta.x)

You can recover the set of x-values at which the density is estimated,
and hence the implicit value of delta.x, from the returned density.

Example:

  X <- rnorm(1000)
  f <- density(X)
  x <- f$x
  delta.x <- x[2]-x[1]
  g <- f$y
  sum(g*delta.x)

  [1] 1.000976
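
The same device gives the quantity asked about in the original question, the sum of values times probabilities, which is the mean of the estimated density. A minimal sketch with simulated data; the names `dx`, `total` and `m` are placeholders:

```r
set.seed(42)
X  <- rnorm(1000, mean = 5)   # illustrative data
f  <- density(X)
dx <- diff(f$x)[1]            # spacing of the evaluation grid

total <- sum(f$y * dx)        # approximates int g(x) dx, close to 1
m     <- sum(f$x * f$y * dx)  # approximates int x g(x) dx, the mean

print(c(total = total, kde_mean = m, sample_mean = mean(X)))
```

With a symmetric kernel the mean of the estimate matches the sample mean up to discretization error, so mean(X) gives the same answer directly.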

Hoping this helps,
Ted.



E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861  [NB: New number!]
Date: 14-Nov-04   Time: 08:50:53
-- XFMail --



RE: [R] density estimation: compute sum(value * probability) for given distribution

2004-11-13 Thread Liaw, Andy
First thing you probably should realize is that density is _not_
probability.  A probability density function _integrates_ to one, not _sum_
to one.  If X is an absolutely continuous RV with density f, then Pr(X=x)=0
for all x, and Pr(a < X < b) = \int_a^b f(x) dx.

sum x*Pr(X=x) (over all possible values of x) for a discrete distribution is
just the expectation, or mean, of the distribution.  The expectation for a
continuous distribution is \int x f(x) dx, where the integral is over the
support of f.  This is all elementary math stat that you can find in any
textbook.
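
Both statements can be checked numerically for a known density. A small sketch, assuming a normal distribution with mean 3 as the example; the names are placeholders:

```r
f <- function(x) dnorm(x, mean = 3, sd = 1)   # example density (assumption)

f(3)   # a density value, not a probability

# Pr(a < X < b) is an integral of the density
a <- 2; b <- 4
p_int <- integrate(f, a, b)$value
p_cdf <- pnorm(b, mean = 3) - pnorm(a, mean = 3)

# The expectation is the integral of x f(x) over the support
ex <- integrate(function(x) x * f(x), -Inf, Inf)$value

print(c(p_int = p_int, p_cdf = p_cdf, expectation = ex))
```

The two probability values agree, and the expectation comes out at the mean of the example distribution.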

Could you tell us exactly what you are trying to compute, or why you're
computing it?

HTH,
Andy

 From: bogdan romocea
 
 Dear R users,
 
 This is a KDE beginner's question. 
 I have this distribution:
 > length(cap)
 [1] 200
 > summary(cap)
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   459.9   802.3   991.6  1066.0  1242.0  2382.0 
 I need to compute the sum of the values times their probability of
 occurrence.
 
 The graph is fine,
 den <- density(cap, from=min(cap), 
                to=max(cap), give.Rkern=F)
 plot(den)
 
 However, how do I compute sum(values*probabilities)? The
 probabilities produced by the density function sum to only 26%: 
 > sum(den$y)
 [1] 0.2611142
 
 Would it perhaps be ok to simply do
 > sum(den$x*den$y) * (1/sum(den$y))
 [1] 1073.22
 ?
 
 Thank you,
 b.
 


RE: [R] density estimation: compute sum(value * probability) for given distribution

2004-11-13 Thread bogdan romocea
Andy,

Thanks a lot for the clarifications. I was running a simulation a
number of times and trying to come up with a number to summarize the
results. And, I failed to realize from the beginning that what I was
trying to compute was just the mean.

Regards,
b.


--- Liaw, Andy [EMAIL PROTECTED] wrote:

 First thing you probably should realize is that density is _not_
 probability.  A probability density function _integrates_ to one,
 not _sum_ to one.  If X is an absolutely continuous RV with density
 f, then Pr(X=x)=0 for all x, and Pr(a < X < b) = \int_a^b f(x) dx.
 
 sum x*Pr(X=x) (over all possible values of x) for a discrete
 distribution is just the expectation, or mean, of the distribution.
 The expectation for a continuous distribution is \int x f(x) dx,
 where the integral is over the support of f.  This is all elementary
 math stat that you can find in any textbook.
 
 Could you tell us exactly what you are trying to compute, or why
 you're computing it?
 
 HTH,
 Andy
 


Re: [R] density estimation: compute sum(value * probability) for given distribution

2004-11-13 Thread Uwe Ligges
bogdan romocea wrote:

> Dear R users,
>
> This is a KDE beginner's question. 
> I have this distribution:
>
> length(cap)
> [1] 200
> summary(cap)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
>   459.9   802.3   991.6  1066.0  1242.0  2382.0 
>
> I need to compute the sum of the values times their probability of
> occurrence.
>
> The graph is fine,
> den <- density(cap, from=min(cap), to=max(cap), give.Rkern=FALSE)
> plot(den)
>
> However, how do I compute sum(values*probabilities)? 

I don't get the point. You are estimating using a Gaussian kernel.
Hint: What's the probability to get x=0 for a N(0,1) distribution?
So sum(values*probabilities) is zero!

> The probabilities produced by the density function sum to only 26%: 

and could also sum to, e.g., 783453.9, depending on the number of 
observations and the estimated parameters of the density ...

> sum(den$y)
> [1] 0.2611142
>
> Would it perhaps be ok to simply do
> sum(den$x*den$y) * (1/sum(den$y))
> [1] 1073.22
> ?

No. den$x is a point where the density function is equal to den$y, but 
den$y is not the probability to get den$x (you know, the stuff with 
intervals)! I fear you are mixing theory from discrete with continuous 
distributions.

Uwe Ligges
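
To make the point about intervals concrete, here is a minimal sketch. It uses simulated stand-in data, since the poster's cap vector is not shown; the names cap_sim, dx, total_mass and kde_mean are illustrative only. The values in den$y are density heights, so they have to be multiplied by the grid spacing before they behave like probabilities.

```r
set.seed(1)
# Simulated stand-in for the poster's 'cap' vector (an assumption; the real data are not shown)
cap_sim <- rlnorm(200, meanlog = 6.9, sdlog = 0.35)

# Let density() pick the range (by default 3 bandwidths beyond the data)
# so that the tails of the estimate are not cut off at min/max
den <- density(cap_sim)

# den$x is an equally spaced grid, so one difference gives the spacing
dx <- diff(den$x[1:2])

# Riemann sums: the total mass should be close to 1, and
# sum(x * f(x) * dx) approximates the mean of the estimated density
total_mass <- sum(den$y) * dx
kde_mean   <- sum(den$x * den$y) * dx

print(c(total_mass = total_mass, kde_mean = kde_mean, sample_mean = mean(cap_sim)))
```

On an equally spaced grid, dividing by sum(den$y) cancels the grid spacing, so it amounts to the same kind of rescaled sum; the interpretation, though, is an integral of x * f(x) over intervals, not a sum of point probabilities.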



[R] density estimation: compute sum(value * probability) for given distribution

2004-11-12 Thread bogdan romocea
Dear R users,

This is a KDE beginner's question. 
I have this distribution:
> length(cap)
[1] 200
> summary(cap)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  459.9   802.3   991.6  1066.0  1242.0  2382.0 
I need to compute the sum of the values times their probability of
occurrence.

The graph is fine,
den <- density(cap, from=min(cap), to=max(cap), give.Rkern=FALSE)
plot(den)

However, how do I compute sum(values*probabilities)? The
probabilities produced by the density function sum to only 26%: 
> sum(den$y)
[1] 0.2611142

Would it perhaps be ok to simply do
> sum(den$x*den$y) * (1/sum(den$y))
[1] 1073.22
?

Thank you,
b.



[R] Density Estimation

2004-09-15 Thread Brian Mac Namee
Hi there,

Sorry if this is a rather long post. I have a simple list of single
feature data points from which I would like to generate a probability
that an unseen point comes from the same distribution. To do this I am
trying to estimate the probability density of the list of points and
use this to generate a probability for the new unseen points. I have
managed to use the R density function to generate the density estimate
but have not been able to do anything with this - i.e. generate a
probability that a new point comes from the same distribution. Is
there a function to do this, or am I way off the mark using the
density function at all?

Thanks in advance,

Brian.



[R] Density Estimation

2004-09-15 Thread Vito Ricci
Dear Brian,

I suggest using the density() function to get an
estimate of the pdf you are looking for (I believe it is
unknown). You can then plot the points returned by
density() using plot(), which gives you a graphical
representation of your unknown pdf. From its shape
you can try to judge what kind of pdf it might be
(normal, gamma, Weibull, etc.).
You can then estimate the parameters of that pdf from your
data with LS or ML methods, calculate the goodness of fit
for each candidate pdf, and use the best one.

I hope this is of some help.

Cordially
Vito Ricci
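
The workflow suggested above could be sketched roughly as follows. This assumes maximum-likelihood fitting via fitdistr() from the MASS package and uses AIC as the comparison criterion; the data are simulated and the candidate list is only an example, not something the original post specifies.

```r
library(MASS)  # fitdistr() is in MASS, a recommended package shipped with R

set.seed(42)
# Simulated positive data standing in for the unknown sample (an assumption)
x <- rgamma(300, shape = 3, rate = 0.5)

# Maximum-likelihood fits for three candidate families
fits <- list(
  normal  = fitdistr(x, "normal"),
  gamma   = fitdistr(x, "gamma"),
  weibull = fitdistr(x, "weibull")
)

# Compare the candidates by AIC (smaller is better)
aic  <- sapply(fits, AIC)
best <- names(which.min(aic))

print(sort(aic))
print(fits[[best]])
```

AIC is only one possible goodness-of-fit measure; a QQ plot of the data against the fitted distribution is a useful visual check alongside it.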




Re: [R] Density Estimation

2004-09-15 Thread Bob Wheeler
Try fitting it with a Johnson function -- see SuppDists. If you can fit 
it you will then be able to use the functions in SuppDists just as you 
can for any other distribution supported by R.


--
Bob Wheeler --- http://www.bobwheeler.com/
ECHIP, Inc. ---
Randomness comes in bunches.


Re: [R] Density Estimation

2004-09-15 Thread Wolski
Hi!

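The function density returns an object of class "density".
This object has components x and y, which you can access with $x and $y.

Use approx and runif.

e.g.:

dd <- density(rnorm(100, 3, 5))
plot(dd)

Using the function approx (see ?approx) you can compute the density value for any x.

# Rejection sampling from the estimated density; the x argument is a dummy here.
mydist <- function(x, dd)
{
    while (TRUE)
    {
        # propose a point uniformly over the range of the density grid
        tmp <- runif(1, min = min(dd$x), max = max(dd$x))
        # interpolate the density estimate at the proposed point
        lev <- approx(dd$x, dd$y, tmp)$y
        # accept with probability lev / max(dd$y)
        if (runif(1, 0, max(dd$y)) <= lev)
        {
            return(tmp)
        }
    }
}

x <- 0
mydist(x, dd)

res <- rep(0, 500)
res <- sapply(res, mydist, dd)
lines(density(res), col = 2)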


/E.
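
For the original question of scoring new points, the same interpolation idea can be wrapped with approxfun. The sketch below uses simulated data and hypothetical new points; f_hat, new_pts and interval_prob are illustrative names. Note that the interpolated value is a density height; a probability only arises for an interval, here computed with the trapezoidal rule.

```r
set.seed(1)
old <- rnorm(200, 3, 5)            # simulated "training" points (an assumption)
dd  <- density(old, n = 2048)

# Interpolating function for the estimated density; zero outside the grid
f_hat <- approxfun(dd$x, dd$y, yleft = 0, yright = 0)

new_pts <- c(-30, 0, 3, 40)        # hypothetical unseen points
dens_at_new <- f_hat(new_pts)      # density heights, not probabilities

# Probability of falling within +/- half_width of z (trapezoidal rule)
interval_prob <- function(z, half_width = 1, m = 201) {
  g  <- seq(z - half_width, z + half_width, length.out = m)
  fg <- f_hat(g)
  sum((fg[-1] + fg[-m]) / 2) * (g[2] - g[1])
}
p_interval <- sapply(new_pts, interval_prob)

print(cbind(new_pts, dens_at_new, p_interval))
```

Points far outside the range of the old data get a density of zero here, which reflects the finite grid rather than a true impossibility.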






Dipl. bio-chem. Witold Eryk Wolski @ MPI-Moleculare Genetic
Ihnestrasse 63-73 14195 Berlin
tel: 0049-30-83875219
mail: [EMAIL PROTECTED]
http://www.molgen.mpg.de/~wolski



RE: [R] Density Estimation

2004-09-15 Thread Ted Harding
On 15-Sep-04 Brian Mac Namee wrote:
> Sorry if this is a rather long post. I have a simple list of single
> feature data points from which I would like to generate a probability
> that an unseen point comes from the same distribution. To do this I am
> trying to estimate the probability density of the list of points and
> use this to generate a probability for the new unseen points. I have
> managed to use the R density function to generate the density estimate
> but have not been able to do anything with this - i.e. generate a
> probability that a new point comes from the same distribution. Is
> there a function to do this, or am I way off the mark using the
> density function at all?

It's not clear what you're really after, but it looks as though you
may be wanting to sample from the distribution estimated by 'density'.

A possible approach, which you could refine, is exemplified by

  x <- rnorm(1000)
  d <- density(x, n=4096)
  y <- sample(d$x, size=1000, replace=TRUE, prob=d$y)

Check performance with

  hist(y)

Looks OK to me! See ?density and ?sample.
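
One possible refinement, sketched under the assumption that the default Gaussian kernel is used: a kernel density estimate is an equal mixture of kernels centred on the data, so one can draw from it without a grid by resampling the data and adding kernel noise. For density(), the bandwidth bw is the standard deviation of the kernel. The name y2 is illustrative.

```r
set.seed(123)
x <- rnorm(1000)
d <- density(x)          # Gaussian kernel by default; d$bw is the kernel's sd

# Resample the data with replacement, then add Gaussian noise with sd = bandwidth
y2 <- sample(x, size = 1000, replace = TRUE) + rnorm(1000, mean = 0, sd = d$bw)

# The draws should have roughly the mean of x and variance var(x) + bw^2
print(c(mean_x = mean(x), mean_y2 = mean(y2)))
print(c(var_target = var(x) + d$bw^2, var_y2 = var(y2)))
```

As before, hist(y2) gives a quick visual check against plot(d).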

On an alternative interpretation, perhaps you want to first estimate
the density based on data you already have, and then when you have
got further data (but these would then be "seen" and not "unseen")
come to a judgement about whether these new points are compatible
with coming from the distribution you have estimated.

A possible approach to this question (again susceptible to refinement)
would be as follows.

1. Use a fine-grained grid for 'density', i.e. a large value for n.

2. Replace each of the points in the new data by the nearest point
   in this grid. Call these values z1, z2, ... , zk corresponding
   to index values i1, i2, ... , ik in d$x.

3. Evaluate the probability P(z1,...,zk) from the density as the
   product of d$y[i] where i <- c(i1,...,ik).
   Better still, evaluate the logarithm of this. Call the result L.

4. Now simulate a large number of draws of k values from d on the
   lines of sample(d$x,size=k,prob=d$y) as above, and evaluate L
   for each  of these. Where is the value of L from (3) situated in
   the distribution of these values of L from (4)? If (say) only
   1 per cent of the simulated values of L from d are less than
   the value of L from (3), then you have a basis for a test that
   your new data did not come from the distribution you have estimated
   from your old data, in that the new data are from the low-density
   part of the estimated distribution.
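
Steps 1 to 4 above might be sketched as follows. The old and new data are simulated or made up for illustration, and log_lik, L_new, L_sim and p_value are hypothetical names; the number of simulations (2000) is an arbitrary choice.

```r
set.seed(2004)
old_data <- rnorm(500)             # simulated "old" data (an assumption)
new_data <- c(2.8, 3.1, 3.4)       # hypothetical new points, k = 3

# Step 1: fine-grained grid
d <- density(old_data, n = 4096)

# Steps 2 and 3: snap each point to the nearest grid point and sum the log-densities
log_lik <- function(z, d) {
  idx <- findInterval(z, d$x, all.inside = TRUE)
  # findInterval gives the grid point to the left; move right where that is closer
  idx <- ifelse(abs(d$x[idx + 1] - z) < abs(d$x[idx] - z), idx + 1, idx)
  sum(log(d$y[idx]))
}
L_new <- log_lik(new_data, d)

# Step 4: reference distribution of L for k draws from the estimated density
k <- length(new_data)
L_sim <- replicate(2000,
                   log_lik(sample(d$x, size = k, replace = TRUE, prob = d$y), d))

# Proportion of simulated L values at or below the observed one
p_value <- mean(L_sim <= L_new)
print(p_value)
```

A small p_value indicates that the new points sit in a lower-density region than most sets of k draws from the estimated distribution.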

There are of course alternative ways to view this question. The
value of k is relevant. In particular, if k is small (say 3
or 4) then the suggestion in (4) is probably the best way to
approach it. However, if k is large then you can use a test on
the lines of Kolmogorov-Smirnov with the reference distribution
estimated as the cumulative distribution of d$y and the distribution
being tested as the empirical cumulative distribution of your new
data.
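
A rough sketch of the Kolmogorov-Smirnov variant, again with simulated data: the reference CDF is built by accumulating d$y over the grid and interpolating, and ks.test accepts such a function as its second argument. The names cdf_grid and F_hat are illustrative.

```r
set.seed(99)
old_data <- rnorm(500)               # simulated "old" data (an assumption)
new_data <- rnorm(200, mean = 1)     # hypothetical new sample, deliberately shifted

d <- density(old_data, n = 4096)

# Cumulative distribution on the grid, normalised so that it ends at exactly 1
cdf_grid <- cumsum(d$y) / sum(d$y)

# Interpolate to get a CDF function: 0 to the left of the grid, 1 to the right
F_hat <- approxfun(d$x, cdf_grid, yleft = 0, yright = 1)

# One-sample Kolmogorov-Smirnov test of the new data against the estimated CDF
ks <- ks.test(new_data, F_hat)
print(ks)
```

The reported p-value treats the reference distribution as known, whereas here it was estimated from the old data, so it should be read as approximate.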

Even sharper focus is available if you are in a position to make
a parametric model for your data, but your description does not
suggest that this is the case.

Best wishes,
Ted.



E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 167 1972
Date: 15-Sep-04   Time: 15:07:33
-- XFMail --



RE: [R] Density Estimation

2004-04-10 Thread Prof Brian Ripley
help.search("kernel density") reports

KernSec(GenKern)        Univariate kernel density estimate
KernSur(GenKern)        Bivariate kernel density estimation
bkde(KernSmooth)        Compute a Binned Kernel Density Estimate
bkde2D(KernSmooth)      Compute a 2D Binned Kernel Density Estimate
dpik(KernSmooth)        Select a Bandwidth for Kernel Density Estimation
kde2d(MASS)             Two-Dimensional Kernel Density Estimation

amongst others, and package sm also has a user-friendly selection.

So, apart from pointing out alternatives, I wanted to point out how easy it 
was to find the information originally requested.


On Sat, 10 Apr 2004, Ko-Kang Kevin Wang wrote:

  -Original Message-
  From: [EMAIL PROTECTED]
 
  Dear Sir/Madam;
  Would you please tell me what is the command that allows the
  estimation of the Kernel Density for some data.
  Thanks,
 
 ?density

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



[R] Density Estimation

2004-04-09 Thread Thami Rachidi
Dear Sir/Madam;
Would you please tell me what is the command that allows the estimation
of the Kernel Density for some data.
Thanks,
Thami Rachidi



RE: [R] Density Estimation

2004-04-09 Thread Ko-Kang Kevin Wang
 -Original Message-
 From: [EMAIL PROTECTED]

 Dear Sir/Madam;
 Would you please tell me what is the command that allows the
 estimation of the Kernel Density for some data.
 Thanks,

?density
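
For readers who want more than the help page pointer, a minimal usage sketch with simulated data follows; the bandwidth selector and kernel in the second call are just examples of the available options.

```r
set.seed(10)
# Simulated bimodal data (an assumption, for illustration only)
x <- c(rnorm(100, 0, 1), rnorm(100, 4, 1))

# Default: Gaussian kernel, bandwidth chosen by bw.nrd0
d_default <- density(x)

# Alternative: Sheather-Jones bandwidth and an Epanechnikov kernel
d_sj <- density(x, bw = "SJ", kernel = "epanechnikov")

print(d_default$bw)
print(d_sj$bw)
```

plot(d_default) followed by lines(d_sj, col = 2) overlays the two estimates for comparison.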
