Gautam Mukunda wrote:

> Most players and commentators and so on still think that batting
> average is a meaningful statistic.  But it's not, and we've known that
> for a long time.

Why not?  Please explain.  Thanks.

        Julia

Doug gave a very good explanation, but I'll try to give a little bit more
detail.  I'm sure you know most of what I'm going to write, but for the
benefit of anyone who doesn't know anything about baseball I'll start out at
a very basic level.  Batting average is calculated by taking the number of
hits and dividing it by at bats, that is, by (# of plate appearances - walks
- sac. flies - sac. bunts - hit by pitch).  Essentially all it tells you is
the chance that someone will get a hit (of any sort) in a non-walk plate
appearance.  I don't remember the exact numbers from the regression analysis,
but I believe that variation in batting average explains less than 30% of
variation in run scoring for teams.  Something in that range, I'm fairly
confident.  When you think about it, it's clear that the statistic makes no
sense: it ignores walks entirely and counts a single the same as a home run.
Take two players.  One hits .300 with no walks and no home runs, the other
hits .270 with 100 walks and 40 home runs.  Which one is the better hitter?
It's immediately obvious that the second hitter is vastly superior - and, in
fact, he'll get paid much better too, suggesting that at some level team
executives understand this, even though they usually don't _talk_ as if they
do.
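
To put rough numbers on that comparison, here is a small Python sketch.  The
600 at bats per player are an assumption made up for illustration (the
comparison above gives only the averages, walks, and home runs), and the
function and variable names are hypothetical.

```python
def batting_average(hits, plate_appearances, walks=0, sac_flies=0,
                    sac_bunts=0, hit_by_pitch=0):
    """Hits divided by at bats, with at bats derived from plate appearances."""
    at_bats = (plate_appearances - walks - sac_flies
               - sac_bunts - hit_by_pitch)
    return hits / at_bats


# Assumed seasons: 600 at bats each (an illustration, not real data).
# Player A: .300 hitter, no walks  -> 600 plate appearances, 180 hits.
# Player B: .270 hitter, 100 walks -> 700 plate appearances, 162 hits.
avg_a = batting_average(hits=180, plate_appearances=600)
avg_b = batting_average(hits=162, plate_appearances=700, walks=100)

# Times on base (hits + walks), which batting average never sees.
on_base_a = 180 + 0
on_base_b = 162 + 100

print(f"A: AVG {avg_a:.3f}, times on base {on_base_a}")
print(f"B: AVG {avg_b:.3f}, times on base {on_base_b}")
```

Under these assumed numbers, B reaches base 262 times to A's 180 despite the
lower average, and that is before counting the 40 home runs.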

The best single offensive statistic (imo) was invented by Bill James about
20 years ago, I believe, and is called Runs Created.  Essentially Runs
Created takes into account every possible outcome of a plate appearance,
plus stolen bases and caught stealing.  Then it calculates the "expected"
runs of a lineup of 9 players with those statistics, and divides by 9.
Variation in Runs Created explains something like 90% of variation in run
scoring for teams.
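
Runs Created exists in several versions.  The sketch below uses only the
simplest one, usually called basic Runs Created: (hits + walks) * total bases
/ (at bats + walks).  It leaves out stolen bases, caught stealing, and the
other refinements mentioned above, so treat it as an illustration of the idea
rather than the full statistic.  The two seasons are hypothetical, and the
function name is my own.

```python
def basic_runs_created(hits, walks, total_bases, at_bats):
    """Basic Runs Created: (H + BB) * TB / (AB + BB).

    This is the simplest version only; fuller versions add stolen bases,
    caught stealing, and other events.
    """
    return (hits + walks) * total_bases / (at_bats + walks)


# Hypothetical seasons, 600 at bats each (made-up numbers).
# Player A: 180 hits (150 singles, 30 doubles), no walks.
#           Total bases = 150 + 2*30 = 210.
# Player B: 162 hits (92 singles, 30 doubles, 40 home runs), 100 walks.
#           Total bases = 92 + 2*30 + 4*40 = 312.
rc_a = basic_runs_created(hits=180, walks=0, total_bases=210, at_bats=600)
rc_b = basic_runs_created(hits=162, walks=100, total_bases=312, at_bats=600)

print(f"A: {rc_a:.1f} runs created")
print(f"B: {rc_b:.1f} runs created")
```

With these made-up lines, the .270 hitter with walks and power comes out
with close to twice the runs created of the .300 singles hitter.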

Runs Created is quite complicated, however, which led to the search for a
"quick and dirty" statistic that is as easily usable as batting average.
What sabermetricians ended up with was, amazingly enough, invented by Branch
Rickey decades ago, but only he had ever used it.  This probably explains a
large part of his success as a GM, in my opinion :-)  It's now named OPS,
and it is calculated as On Base Percentage + Slugging Percentage:
OBP = (hits + walks + hit by pitch) /
      (at bats + walks + hit by pitch + sac. flies)
SLG = (singles + 2 * doubles + 3 * triples + 4 * home runs) / (at bats)
OPS = OBP + SLG
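
As a minimal sketch, here is OPS in Python.  It uses the usual definition of
On Base Percentage, whose denominator is at bats + walks + hit by pitch +
sac. flies.  The Season class, the function names, and both stat lines are
hypothetical, made up only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class Season:
    """One hitter's counting stats (hypothetical container)."""
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int = 0
    hit_by_pitch: int = 0
    sac_flies: int = 0

    @property
    def hits(self):
        return self.singles + self.doubles + self.triples + self.home_runs


def on_base_percentage(s):
    # Times on base divided by the plate appearances that count toward OBP.
    reached = s.hits + s.walks + s.hit_by_pitch
    chances = s.at_bats + s.walks + s.hit_by_pitch + s.sac_flies
    return reached / chances


def slugging_percentage(s):
    # Total bases per at bat.
    total_bases = (s.singles + 2 * s.doubles
                   + 3 * s.triples + 4 * s.home_runs)
    return total_bases / s.at_bats


def ops(s):
    return on_base_percentage(s) + slugging_percentage(s)


# Made-up seasons: a .300 hitter with no walks or home runs, and a
# .270 hitter with 100 walks and 40 home runs, 600 at bats each.
player_a = Season(at_bats=600, singles=150, doubles=30, triples=0,
                  home_runs=0)
player_b = Season(at_bats=600, singles=92, doubles=30, triples=0,
                  home_runs=40, walks=100)

for name, season in (("A", player_a), ("B", player_b)):
    print(f"{name}: OBP {on_base_percentage(season):.3f}  "
          f"SLG {slugging_percentage(season):.3f}  OPS {ops(season):.3f}")
```

With these invented lines the .300 hitter has an OPS of about .650 and the
.270 hitter about .894, which is the gap that batting average hides.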

I have a feeling I might have missed something small in those formulae, but
they're at least very close to being right.  Variation in OPS explains
something over 80% of variation in run scoring.  It also makes things fairly
easy to judge by eye.  An OPS over .800 is good (depending on position), an
OPS over .900 is excellent, and an OPS over 1.000 is pretty much Hall of Fame
caliber.  Who is the all-time record holder for career OPS?  Babe Ruth.  Who
is second?  Ted Williams.  This is not only mathematically useful, it's
intuitively correct.  We all _know_ that Babe Ruth and Ted Williams were the
two best hitters in major league history, and a statistic (like batting
average) that tells us otherwise is clearly wrong.

If you don't feel like calculating RC (and who does?) the best way to judge
the offensive contributions of different players is to go to
www.baseballprospectus.com and look at their tables for Equivalent Average.
I'm not exactly sure how they calculate it (it might even be proprietary),
but essentially they have an offensive statistic that they've normalized to
look like batting average - so a player with a .300 EqA is very good, and so
on.  But OPS is what most people use for snap judgments, myself included.
EqA, however, has the advantage of being park-adjusted.  This is important in
ordinary circumstances and now critical with the advent of Coors Field, the
greatest hitting park of all time, which inflates run scoring by _30%_.

The end result of all this is that some teams pay attention to OPS, and some
teams to batting average.  The teams that pay attention to OPS teach their players
to draw walks and wait for a pitch they can drive.  The teams that pay
attention to batting average emphasize "making contact."  The two most
prominent exponents of the OPS strategy are the Oakland A's and the New York
Yankees.  The two most prominent exponents of the batting average strategy
are the Kansas City Royals and the Tampa Bay Devil Rays.  The outcomes are,
to say the least, suggestive.

Which, in the end, is what I meant by saying that I know stuff about
baseball that a lot of old timers don't.  They all know much _more_ about
baseball than I do.  But they know a lot of things that have the small
disadvantage of not being true.  The true stuff that they know is far more
than what I know too, of course.  But the untrue stuff - batting average is
important, stolen bases are critical, strikeouts are a bad thing - makes the
true stuff less valuable.  The two best GMs in the game are (imo) Billy
Beane and Brian Cashman.  Cashman never played baseball at a professional
level.  Beane did, but he had only a brief major league career and made his
name in the front office, not on the field.  Both decided to take a
mathematical approach to the game, and both have been well
rewarded for doing that.  The managers who emphasize a similar approach -
Larry Dierker and Davey Johnson, for example - have similarly been
exceptionally successful during their careers.

Gautam
