Well ... In my own recent example, it was plotting the raw data as a histogram that finally directed me to the "truth" of what the data had to say. As you may recall, the dataset was inter-arrival times of calls to a computer routine, known only from timestamps truncated (not rounded) to whole seconds. I started with a kernel density estimate (sm.density, with the default parameters, to be precise) and was unsatisfied with the result. Yesterday, when I plotted the raw counts (how many values were 0, how many 1, etc.) as a histogram, I was struck by two things:

1. There really are only two peaks -- the "stuff" in between them is, for the purpose of business decisions, irrelevant.

2. The inter-arrival time value "0" in such a dataset represents all the values that are greater than or equal to zero and *less than 1*, and so on. There is a natural "histogramming" going on via the timestamp truncation, which implies to me that the *midpoint* of the "bin" -- say, for the 0 values, 0.5 -- is the "natural" value to choose for the "x-axis" in the absence of any better information. This also rather neatly disposes of the issue of zero-valued inter-arrival times. :)

Are the "old ways" best? Maybe not. Can I make reasonable business decisions without histograms? I'm not convinced that's the case; it certainly wasn't the case this time.

Finally, while I've never been fortunate enough to use S, the existence of R has caused a revolution in the way I analyze computer performance data. Before R came along, the only tools I had available were Excel, Minitab, and any special-purpose code I was willing to write to accomplish tasks not in the vocabulary of Excel or Minitab. For example, it's difficult, though not impossible, to do a non-linear regression or kernel density estimation with either tool. In R, they're one-liners. If there were a Nobel Prize for scientific software, I'd nominate R and its creators. (Of course, there *is* a Nobel in Economics.) :)

--
M. Edward (Ed) Borasky
mailto:[EMAIL PROTECTED]
http://www.borasky-research.net

> -----Original Message-----
> Things have moved on since the ASH work too, but I would
> agree that density estimation is often a better way than
> histograms. However, close to state-of-the-art density
> estimation is built into R (?density) and packages
> `polspline', `KernSmooth' and `sm' are also much more
> advanced than `ash'.
>
> It was the advent of enough computing power that changed
> this, and the S language has been in the forefront of
> making the state of the art available.
> You'll see that MASS (the book) covers histograms and
> alternatives in its chapter on Univariate Distributions,
> and it has since its 1994 first edition (when did you go
> to `school'?)
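
P.S. For what it's worth, here is a minimal sketch in R of the bin-midpoint idea from point 2 above. The `iat` values are made up for illustration (they are not the real dataset), and the reading that a recorded value k stands for the interval [k, k+1) is the assumption stated in point 2, not something the code checks. The `density` call is the default kernel density estimate on the midpoint-shifted values, included only for comparison.

```r
# Hypothetical truncated inter-arrival times in seconds.
# Made-up values for illustration only, not the real data.
iat <- c(0, 0, 0, 1, 0, 2, 7, 8, 0, 1, 8, 7, 0, 9, 8)

# Raw counts: how many values were 0, how many 1, etc.
counts <- table(iat)

# Assumption: a recorded value k stands for the interval [k, k+1),
# so the breaks sit on the integers and bins are closed on the left.
breaks <- seq(0, max(iat) + 1, by = 1)
h <- hist(iat, breaks = breaks, right = FALSE, plot = FALSE)

# h$mids holds the bin midpoints (k + 0.5), h$counts the count per bin.
mid_counts <- data.frame(midpoint = h$mids, count = h$counts)
print(mid_counts)

# Default kernel density estimate on the midpoint-shifted values.
d <- density(iat + 0.5)

# Draw the histogram only in an interactive session.
if (interactive()) {
  plot(h, main = "Inter-arrival times", xlab = "seconds")
}
```

With real data, `iat` would be replaced by the differences of the truncated timestamps; everything else stays the same.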