Re: [R] overlap histogram and density

2010-11-25 Thread Roslina Zakaria
Hi Ted,

Regarding your examples, is it possible to get a smooth line for the density
that overlaps the histogram?

Regards,

Roslina
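
One possible reading of this question, sketched in R under stated assumptions (this is not from any post in the thread): if a smoother curve is wanted, density() can be evaluated on a finer grid (n) and with a wider bandwidth (adjust), and because Ted's example data are simulated from a known log-Normal, the true density can be overlaid with dlnorm(). The values n = 2048 and adjust = 1.5 are arbitrary illustrative choices, and the breaks are computed from the data rather than copied from the examples.

```r
# Sketch only: same simulated data as in Ted's examples.
set.seed(54321)
N <- 1000
X <- exp(rnorm(N, sd = 0.4))

# Breaks in steps of 0.25 that are guaranteed to cover the data.
breaks <- seq(0, ceiling(max(X) / 0.25) * 0.25, by = 0.25)

# Kernel estimate on a finer grid with a wider bandwidth
# (n and adjust are illustrative, not recommended values).
d_smooth <- density(X, n = 2048, adjust = 1.5)

hist(X, prob = TRUE, breaks = breaks, xlim = c(-0.5, 4),
     ylim = c(0, 1.2), col = "gray")
lines(d_smooth, col = "red", lwd = 2)

# True density of the simulated data: log-Normal with meanlog 0, sdlog 0.4.
curve(dlnorm(x, meanlog = 0, sdlog = 0.4), add = TRUE,
      col = "blue", lty = 2, n = 501)
```

With real data such as the rainfall series, only the kernel estimate is available, unless a parametric model has been fitted.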



From: "ted.hard...@wlandres.net" 
To: r-help@r-project.org

Sent: Fri, November 12, 2010 6:42:31 AM
Subject: Re: [R] overlap histogram and density

[OOPS!! I accidentally reproduced my second example below
as my third example. Now corrected. See below.]

On 11-Nov-10 20:02:29, Ted Harding wrote:

On 11-Nov-10 18:39:34, Roslina Zakaria wrote:
> Hi,
> Does anybody encounter the same problem when we overlap histogram
> and density that the density line seems to shift to the right a
> little bit?
> 
> If you do have the same problem, what should we do to correct that?
> Thank you.
> 
> par(mar=c(4,4,2,1.2),oma=c(0,0,0,0))
> hist(datobs,prob=TRUE,
>      main ="Volume of a catchment from four stations",
>      col="yellowgreen", cex.axis=1, xlab="rainfall",
>      ylab="Relative frequency", ylim= c(0,.003), xlim=c(0,1200))
> 
> lines(density(dd), lwd=3,col="red")
> 
>#legend("topright",c("observed","generated"),
>#      lty=c(0,1),fill=c("blue",""),bty="n")
> 
> legend("topright", legend = c("observed","generated"),
> col = c("yellowgreen", "red"), pch=c(15,NA), lty = c(0, 1), 
> lwd=c(0,3),bty="n", pt.cex=2)
> box()
> 
> Thank you.

In theory that is not a problem. The density() function will
estimate a density whose integral over each of the intervals
in the histogram is equal to the probability of that interval,
and the proportion of the data expected in that interval will
also be its probability.

In practice, the extent to which you observe what you describe
(or a displacement to the left) will depend on how your data
are distributed within the intervals, and on the precision
with which density() happens to estimate the true density.

The following 3 cases of the same data, sampled from a log-Normal
distribution, illustrate different impressions of the kind that
one might get, depending on the details of the histogram. Note
that there is no overall effect of "displacement to the right"
in any histogram, while the extent to which one observes it
varies according to the histogram. Without knowledge of your
data it is not possible to comment further on the extent to
which you have observed it yourself!

set.seed(54321)
N  <- 1000
X  <- exp(rnorm(N,sd=0.4))
dd <- density(X)

# A coarse histogram
H  <- hist(X,prob=TRUE,
          xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.5*(0:8))
dx <- unique(diff(H$breaks))
lines(dd$x,dd$y)

## A finer histogram
H  <- hist(X,prob=TRUE,
          xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.25*(0:16))
dx <- unique(diff(H$breaks))
lines(dd$x,dd$y)

## A still finer histogram
H  <- hist(X,prob=TRUE,
## OOPS!!  xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.25*(0:16))
          xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.20*(0:20))
dx <- unique(diff(H$breaks))
lines(dd$x,dd$y)


Ted.


E-Mail: (Ted Harding) 
Fax-to-email: +44 (0)870 094 0861
Date: 11-Nov-10                                      Time: 20:12:27
-- XFMail --



  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] overlap histogram and density

2010-11-11 Thread Ted Harding
[OOPS!! I accidentally reproduced my second example below
 as my third example. Now corrected. See below.]

On 11-Nov-10 20:02:29, Ted Harding wrote:
 
 On 11-Nov-10 18:39:34, Roslina Zakaria wrote:
> Hi,
> Does anybody encounter the same problem when we overlap histogram
> and density that the density line seems to shift to the right a
> little bit?
> 
> If you do have the same problem, what should we do to correct that?
> Thank you.
> 
> par(mar=c(4,4,2,1.2),oma=c(0,0,0,0))
> hist(datobs,prob=TRUE,
>  main ="Volume of a catchment from four stations",
>  col="yellowgreen", cex.axis=1, xlab="rainfall",
>  ylab="Relative frequency", ylim= c(0,.003), xlim=c(0,1200))
> 
> lines(density(dd), lwd=3,col="red")
> 
>#legend("topright",c("observed","generated"),
>#   lty=c(0,1),fill=c("blue",""),bty="n")
> 
> legend("topright", legend = c("observed","generated"),
> col = c("yellowgreen", "red"), pch=c(15,NA), lty = c(0, 1), 
> lwd=c(0,3),bty="n", pt.cex=2)
> box()
> 
> Thank you.
 
In theory that is not a problem. The density() function will
estimate a density whose integral over each of the intervals
in the histogram is equal to the probability of that interval,
and the proportion of the data expected in that interval will
also be its probability.

In practice, the extent to which you observe what you describe
(or a displacement to the left) will depend on how your data
are distributed within the intervals, and on the precision
with which density() happens to estimate the true density.

The following 3 cases of the same data, sampled from a log-Normal
distribution, illustrate different impressions of the kind that
one might get, depending on the details of the histogram. Note
that there is no overall effect of "displacement to the right"
in any histogram, while the extent to which one observes it
varies according to the histogram. Without knowledge of your
data it is not possible to comment further on the extent to
which you have observed it yourself!

set.seed(54321)
N  <- 1000
X  <- exp(rnorm(N,sd=0.4))
dd <- density(X)

# A coarse histogram
H  <- hist(X,prob=TRUE,
   xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.5*(0:8))
dx <- unique(diff(H$breaks))
lines(dd$x,dd$y)
 
## A finer histogram
H  <- hist(X,prob=TRUE,
   xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.25*(0:16))
dx <- unique(diff(H$breaks))
lines(dd$x,dd$y)
 
## A still finer histogram
H  <- hist(X,prob=TRUE,
## OOPS!!  xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.25*(0:16))
   xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.20*(0:20))
dx <- unique(diff(H$breaks))
lines(dd$x,dd$y)
 
 
 Ted.


E-Mail: (Ted Harding) 
Fax-to-email: +44 (0)870 094 0861
Date: 11-Nov-10   Time: 20:12:27
-- XFMail --
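
A rough numerical check of the point about bin probabilities, offered as a sketch and not part of the original post. It assumes the default Gaussian kernel, for which the area under the kernel estimate over an interval has a closed form in terms of pnorm() and the bandwidth dd$bw. The names hist_prob and kde_prob are made up for this illustration, and the breaks are computed from the data so that they always cover it.

```r
# Sketch only: compare each bin's relative frequency with the area under
# the Gaussian kernel density estimate over the same bin.
set.seed(54321)
N  <- 1000
X  <- exp(rnorm(N, sd = 0.4))
dd <- density(X)          # default kernel is Gaussian; dd$bw is its sd

# Coarse bins of width 0.5 that cover the data.
breaks <- seq(0, ceiling(max(X) / 0.5) * 0.5, by = 0.5)
H <- hist(X, breaks = breaks, plot = FALSE)

# Proportion of the data falling in each bin.
hist_prob <- H$counts / N

# Area under the kernel estimate over each bin: the estimate is an
# average of Normal densities centred at the data points with sd = bw.
lower <- head(breaks, -1)
upper <- tail(breaks, -1)
kde_prob <- mapply(function(a, b) {
  mean(pnorm(b, mean = X, sd = dd$bw) - pnorm(a, mean = X, sd = dd$bw))
}, lower, upper)

print(round(cbind(lower, upper, hist_prob, kde_prob), 3))
```

The two columns should agree closely but not exactly: the kernel moves a little probability across each bin boundary, which is one source of the visual mismatch between bars and curve.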



Re: [R] overlap histogram and density

2010-11-11 Thread Ted Harding

On 11-Nov-10 18:39:34, Roslina Zakaria wrote:
> Hi,
> Does anybody encounter the same problem when we overlap histogram
> and density that the density line seems to shift to the right a
> little bit?
> 
> If you do have the same problem, what should we do to correct that?
> Thank you.
> 
> par(mar=c(4,4,2,1.2),oma=c(0,0,0,0))
> hist(datobs,prob=TRUE,
>  main ="Volume of a catchment from four stations",
>  col="yellowgreen", cex.axis=1, xlab="rainfall",
>  ylab="Relative frequency", ylim= c(0,.003), xlim=c(0,1200))
> 
> lines(density(dd), lwd=3,col="red")
> 
>#legend("topright",c("observed","generated"),
>#   lty=c(0,1),fill=c("blue",""),bty="n")
> 
> legend("topright", legend = c("observed","generated"),
> col = c("yellowgreen", "red"), pch=c(15,NA), lty = c(0, 1), 
> lwd=c(0,3),bty="n", pt.cex=2)
> box()
> 
> Thank you.

In theory that is not a problem. The density() function will
estimate a density whose integral over each of the intervals
in the histogram is equal to the probability of that interval,
and the proportion of the data expected in that interval will
also be its probability.

In practice, the extent to which you observe what you describe
(or a displacement to the left) will depend on how your data
are distributed within the intervals, and on the precision
with which density() happens to estimate the true density.

The following 3 cases of the same data, sampled from a log-Normal
distribution, illustrate different impressions of the kind that
one might get, depending on the details of the histogram. Note
that there is no overall effect of "displacement to the right"
in any histogram, while the extent to which one observes it
varies according to the histogram. Without knowledge of your
data it is not possible to comment further on the extent to
which you have observed it yourself!

set.seed(54321)
N  <- 1000
X  <- exp(rnorm(N,sd=0.4))
dd <- density(X)

## A coarse histogram
H  <- hist(X,prob=TRUE,
   xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.5*(0:8))
dx <- unique(diff(H$breaks))
lines(dd$x,dd$y)

## A finer histogram
H  <- hist(X,prob=TRUE,
   xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.25*(0:16))
dx <- unique(diff(H$breaks))
lines(dd$x,dd$y)

## A still finer histogram
H  <- hist(X,prob=TRUE,
   xlim=c(-0.5,4),ylim=c(0,max(dd$y)),breaks=0.25*(0:16))
dx <- unique(diff(H$breaks))
lines(dd$x,dd$y)


Ted.


E-Mail: (Ted Harding) 
Fax-to-email: +44 (0)870 094 0861
Date: 11-Nov-10   Time: 20:02:24
-- XFMail --



Re: [R] overlap histogram and density

2010-11-11 Thread Ben Bolker
Roslina Zakaria writes:

> 
> Hi,
> 
> Does anybody encounter the same problem when we overlap histogram and density 
>     that the density line seems to shift to the right a little bit?
>      

>     par(mar=c(4,4,2,1.2),oma=c(0,0,0,0))
>     hist(datobs,prob=TRUE, main ="Volume of a catchment from four stations",
>     col="yellowgreen", cex.axis=1,
>     xlab="rainfall",ylab="Relative frequency", ylim= c(0,.003),
>     xlim=c(0,1200))
>     lines(density(dd), lwd=3,col="red")
>     legend("topright", legend = c("observed","generated"),
>        col = c("yellowgreen", "red"), pch=c(15,NA), lty = c(0, 1), 
>        lwd=c(0,3),bty="n", pt.cex=2)
>     box()

   Are dd and datobs the same?
   There is nothing obviously (to me) wrong here.
   Density estimation by definition smears out sharp peaks, which
can lead to differences between the histogram and density estimate.
   Hard to say any more without a reproducible example.

z <- rnorm(5000)
hist(z,prob=TRUE,col="gray",breaks=100)
lines(density(z),col="red")

  looks fine to me.
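
The remark about smearing out sharp peaks can be illustrated with a small sketch (not part of the original post). It uses the adjust argument of density(), which multiplies the default bandwidth; the factor 3 and the seed are arbitrary choices for illustration.

```r
# Sketch only: a wider bandwidth flattens the peak of the estimate.
set.seed(1)
z <- rnorm(5000)

d_default <- density(z)              # default bandwidth
d_wide    <- density(z, adjust = 3)  # three times the default bandwidth

hist(z, prob = TRUE, col = "gray", breaks = 100)
lines(d_default, col = "red")
lines(d_wide, col = "blue", lty = 2)

# Peak heights of the two estimates.
peak_default <- max(d_default$y)
peak_wide    <- max(d_wide$y)
print(c(default = peak_default, wide = peak_wide))
```

The wider the bandwidth, the lower and broader the curve, so a strongly smoothed estimate drawn over a sharply peaked or skewed histogram can look displaced even when both come from the same data.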



[R] overlap histogram and density

2010-11-11 Thread Roslina Zakaria
Hi,

Does anybody encounter the same problem when we overlap histogram and density 
    that the density line seems to shift to the right a little bit?
     
    If you do have the same problem, what should we do to correct that?
     
    Thank you.
     
    par(mar=c(4,4,2,1.2),oma=c(0,0,0,0))
    hist(datobs,prob=TRUE, main ="Volume of a catchment from four stations",
    col="yellowgreen", cex.axis=1,
    xlab="rainfall",ylab="Relative frequency", ylim= c(0,.003), xlim=c(0,1200))
    lines(density(dd), lwd=3,col="red")
    
#legend("topright",c("observed","generated"),lty=c(0,1),fill=c("blue",""),bty="n")

    legend("topright", legend = c("observed","generated"),
       col = c("yellowgreen", "red"), pch=c(15,NA), lty = c(0, 1), 
       lwd=c(0,3),bty="n", pt.cex=2)
    box()

Thank you.
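
One thing worth checking in the code above is that hist() is given datobs while density() is given dd. The sketch below is not from the thread and uses made-up data: datobs is a simulated stand-in, and dd is deliberately built as datobs plus a constant offset of 40, purely to show what a mismatch between the two vectors looks like. The gamma parameters, the offset, and the names mode_obs and mode_dd are all assumptions for illustration.

```r
# Sketch only: a histogram of one vector with the density of another.
set.seed(42)
datobs <- rgamma(500, shape = 2, scale = 150)  # made-up "observed" data
dd     <- datobs + 40                          # made-up, deliberately shifted

hist(datobs, prob = TRUE, col = "yellowgreen", xlab = "rainfall",
     main = "Histogram and density from different vectors")
lines(density(dd), lwd = 3, col = "red")                # looks shifted right
lines(density(datobs), lwd = 3, lty = 2, col = "blue")  # matches the bars

# Location of the highest point of each density estimate.
d_obs <- density(datobs)
d_dd  <- density(dd)
mode_obs <- d_obs$x[which.max(d_obs$y)]
mode_dd  <- d_dd$x[which.max(d_dd$y)]
print(c(mode_obs = mode_obs, mode_dd = mode_dd))
```

If the red curve in the real plot comes from generated data and the bars from observed data, some offset between them may simply reflect a difference between the two samples and not a plotting problem.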


  