Yep, that's it. Thanks a lot for the replies I got.
I guess the point I was struggling with (as I was curve fitting a
distribution to sample data) is the difference between discrete and
continuous densities. If one wants to model sample densities with a
continuous distribution, say a normal, then the histogram should have a
total area of 1.
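
A minimal sketch of that idea in R, overlaying a fitted normal curve on a
density-scaled histogram. The data `y` here are simulated stand-ins (the
real 22,000-value data set was not posted), and the mean and sd used to
simulate them are arbitrary assumptions:

```r
# Simulated stand-in for the real data (assumption: roughly normal).
set.seed(1)
y <- rnorm(22000, mean = 10, sd = 2)

# freq = FALSE puts the histogram on the density scale (total area 1),
# which is the same scale as dnorm().
h <- hist(y, freq = FALSE, col = "red")

# Overlay a normal density using the sample mean and sd.
curve(dnorm(x, mean = mean(y), sd = sd(y)), add = TRUE, lwd = 2)

# Total area of the bars: bin widths times bar heights.
total_area <- sum(diff(h$breaks) * h$density)
```

With a frequency histogram (the default) the curve would sit far below the
bars, because counts and densities are on different scales.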

Best.

From: David Carlson [via R] [mailto:ml-node+s789695n466946...@n4.nabble.com]
Sent: Thursday, June 13, 2013 10:58 AM
To: Mohamed Badawy
Subject: Re: Unexpected behavior from hist()

Density means that the AREAS of the bars add to 1, not the HEIGHTS
of the bars. You probably have intervals that are narrower than 1, so
the heights can exceed 1. E.g.:

> set.seed(42)
> x <- rpois(1000, 5)/100
> info <- hist(x, prob=TRUE)
> info
$breaks
 [1] 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13

$counts
 [1]  42  88 151 177 178 131  97  70  43  14   6   2   1

$density
 [1]  4.2  8.8 15.1 17.7 17.8 13.1  9.7  7.0  4.3  1.4  0.6  0.2  0.1

$mids
 [1] 0.005 0.015 0.025 0.035 0.045 0.055 0.065 0.075 0.085 0.095 0.105 0.115
[13] 0.125

$xname
[1] "x"

$equidist
[1] TRUE

attr(,"class")
[1] "histogram"
> diff(info$breaks)*info$density # Areas of each bar
 [1] 0.042 0.088 0.151 0.177 0.178 0.131 0.097 0.070 0.043 0.014 0.006 0.002
[13] 0.001
> sum(diff(info$breaks)*info$density) # Sum of the areas
[1] 1
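
The same check works when the bins are not all the same width. The sketch
below reuses the data from the example above; the break points in `brk`
are arbitrary values chosen only for illustration:

```r
set.seed(42)
x <- rpois(1000, 5) / 100

# Hypothetical unequal break points covering the range of x.
brk <- c(0, 0.02, 0.04, 0.05, 0.06, 0.08, 0.25)
info2 <- hist(x, breaks = brk, plot = FALSE)

widths <- diff(info2$breaks)        # bin widths, no longer all equal
areas  <- widths * info2$density    # area of each bar

sum(areas)           # the areas still add to 1
sum(info2$density)   # the heights do not
```

The densities are counts divided by (n * bin width), so a narrow bin gets
a tall bar even when it holds only a modest share of the data.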

-------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352


-----Original Message-----
From: [hidden email] On Behalf Of Sarah Goslee
Sent: Thursday, June 13, 2013 10:36 AM
To: Mohamed Badawy
Cc: [hidden email]
Subject: Re: [R] Unexpected behavior from hist()

Hi,

On Thu, Jun 13, 2013 at 11:13 AM, Mohamed Badawy <[hidden email]> wrote:
> Hi... I'm still a beginner in R. While doing some curve-fitting
> with a raw data set of length 22,000, here is what I had:
>
>
>
>> hist(y,col="red")
>
> gives me the frequency histogram, 13 total rectangles, highest is
> near 5000.
>

You don't provide a reproducible example, so here's some fake data:

somedata <- runif(1000)


> Now
>
>> hist(y,prob=TRUE,col="red",ylim=c(0,1.5))
>
> gives me the density (probability?) histogram, same number of
> rectangles, but the highest rectangle is obviously higher than 1,
> how can this be?!!!

Because you misread the help. Using freq=FALSE (equivalent to
prob=TRUE, which is a legacy option), you are getting:

freq: logical; if 'TRUE', the histogram graphic is a representation
          of frequencies, the 'counts' component of the result; if
          'FALSE', probability densities, component 'density', are
          plotted (so that the histogram has a total area of one).
          Defaults to 'TRUE' _if and only if_ 'breaks' are equidistant
          (and 'probability' is not specified).
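
As a rough illustration of why a density bar can be taller than 1, the
'density' component can be rebuilt by hand from the counts and the bin
widths. This is only a sketch using the same kind of fake data as above;
the seed and the names `h`, `width` and `dens` are arbitrary:

```r
set.seed(123)
somedata <- runif(1000)

# The returned object holds both 'counts' and 'density',
# whether or not anything is plotted.
h <- hist(somedata, plot = FALSE)

width <- diff(h$breaks)                      # bin widths
dens  <- h$counts / (sum(h$counts) * width)  # proportion per unit of x

all.equal(dens, h$density)   # same as what hist() computed
max(h$density)               # tallest bar on the density scale
sum(h$density * width)       # total area
```

For data on [0, 1] the bins are narrower than 1, so dividing each
proportion by its bin width pushes the heights up while the total area
stays at 1.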


It sounds like what you actually want is:

somehist <- hist(somedata, plot=FALSE)
somehist$counts <- somehist$counts/sum(somehist$counts)
plot(somehist)
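
A slightly expanded sketch of the same approach, checking that the bar
heights now sum to 1 and relabelling the y axis so the plot does not claim
to show counts. The seed, the name `props` and the axis labels are
assumptions added for illustration:

```r
set.seed(123)
somedata <- runif(1000)

somehist <- hist(somedata, plot = FALSE)

# Proportion of observations falling in each bin.
props <- somehist$counts / sum(somehist$counts)

# Store the proportions where plot() looks for bar heights.
somehist$counts <- props
plot(somehist, freq = TRUE, ylab = "Proportion",
     main = "Relative frequency histogram")

sum(props)   # heights add to 1
max(props)   # no single bar can exceed 1
```

Note that these heights are proportions, not densities, so a continuous
density curve such as dnorm() would not be on the same scale as this plot.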

> P.S. I had to post this thread via email as it got rejected as I
> posted it from Nabble, reason was "Message rejected by filter rule
> match"

Nabble is not the R-help mailing list. Posting via email is the
correct thing to do.

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
