Density means that the AREAS of the bars add to 1, not the HEIGHTS of the bars. You probably have intervals (bin widths) that are less than 1, so the heights have to exceed 1 for the areas to add up. E.g.:

> set.seed(42)
> x <- rpois(1000, 5)/100
> info <- hist(x, prob=TRUE)
> info
$breaks
 [1] 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13

$counts
 [1]  42  88 151 177 178 131  97  70  43  14   6   2   1

$density
 [1]  4.2  8.8 15.1 17.7 17.8 13.1  9.7  7.0  4.3  1.4  0.6  0.2  0.1

$mids
 [1] 0.005 0.015 0.025 0.035 0.045 0.055 0.065 0.075 0.085 0.095 0.105 0.115
[13] 0.125

$xname
[1] "x"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"

> diff(info$breaks)*info$density # Areas of each bar
 [1] 0.042 0.088 0.151 0.177 0.178 0.131 0.097 0.070 0.043 0.014 0.006 0.002
[13] 0.001
> sum(diff(info$breaks)*info$density) # Sum of the areas
[1] 1

-------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sarah Goslee
Sent: Thursday, June 13, 2013 10:36 AM
To: Mohamed Badawy
Cc: r-help@r-project.org
Subject: Re: [R] Unexpected behavior from hist()

Hi,

On Thu, Jun 13, 2013 at 11:13 AM, Mohamed Badawy <mbad...@pm-engr.com> wrote:
> Hi... I'm still a beginner in R. While doing some curve-fitting with a raw data set of length 22,000, here is what I had:
>
>> hist(y,col="red")
>
> gives me the frequency histogram, 13 total rectangles, highest is near 5000.

You don't provide a reproducible example, so here's some fake data:

somedata <- runif(1000)

> Now
>
>> hist(y,prob=TRUE,col="red",ylim=c(0,1.5))
>
> gives me the density (probability?) histogram, same number f rectangles, but the highest rectangle is obviously higher than 1, how can this be?!!!

Because you misread the help.
Using freq=FALSE (equivalent to prob=TRUE, which is a legacy option), you are getting:

    freq: logical; if 'TRUE', the histogram graphic is a representation
          of frequencies, the 'counts' component of the result; if
          'FALSE', probability densities, component 'density', are
          plotted (so that the histogram has a total area of one).
          Defaults to 'TRUE' _if and only if_ 'breaks' are equidistant
          (and 'probability' is not specified).

It sounds like what you actually want is:

somehist <- hist(somedata, plot=FALSE)
somehist$counts <- somehist$counts/sum(somehist$counts)
plot(somehist)

> P.S. I had to post this thread via email as it got rejected as I posted it from Nabble, reason was "Message rejected by filter rule match"

Nabble is not the R-help mailing list. Posting via email is the correct thing to do.

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
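
For anyone finding this thread in the archive, here is a minimal sketch of how counts, density and bar area relate. It assumes the same kind of simulated Poisson data used earlier in the thread (not the original poster's y, which was never posted), and the names h, widths, dens_by_hand and rel_freq are only illustrative.

```r
# Simulated data whose bins come out narrower than 1 (an assumption for
# illustration; substitute your own vector for x).
set.seed(42)
x <- rpois(1000, 5) / 100

# Compute the histogram without drawing it.
h <- hist(x, plot = FALSE)
widths <- diff(h$breaks)

# Density is count / (n * bin width), so narrow bins push heights above 1.
dens_by_hand <- h$counts / (sum(h$counts) * widths)
print(all.equal(dens_by_hand, h$density, tolerance = 1e-6))

# Relative frequency is density * width (the bar area); these values are
# the ones that sum to 1 and can never exceed 1.
rel_freq <- h$density * widths
print(sum(rel_freq))
print(range(rel_freq))
```

If the bin widths happened to be exactly 1, density and relative frequency would coincide, which is probably why the two are so easy to confuse.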