On Jan 12, 2009, at 2:40 PM, Bill Stoddard wrote:
I'm sure most of you know about Vadim's Apache stats project for
tracking download statistics for the various Apache projects:
http://people.apache.org/~vgritsenko/stats/index.html
A fun little project but exceedingly difficult (not to mention time
consuming) for Vadim to dig into the details of each project in
order to present project stats with finer details.
Just out of curiosity, I did some Ruby hacking to modify Vadim's
apache log mining script to filter out Geronimo project data with
finer resolution. Here are the results:
http://people.apache.org/~stoddard/stats/data/
Cool. Thanks Bill!
I won't bother commenting on or summarizing the different results
because it's exciting in exactly the same way as watching paint dry.
The one item that might need a bit of explaining is the reference to
'206W', so I'll cover that briefly... A 'successful' reply to an
HTTP Range request is a status '206' response (see RFC 2616 if you
want to know about range requests). So the '206' in 206W refers to
a successful reply to a Range request. The 'W' means 'weighted'....
more on 'W' in a bit.
An example... if the size of a file to download is 100 MB, a client
can make 10 range requests, each requesting a different 10 MB segment
of the file. There are various reasons why a client might issue a
range request (PDF viewers such as Acrobat, high-bandwidth but
high-latency connections between the server and client, and so
forth; the reason is not important to this explanation). Each of
the 10 Range requests will create a 206 reply entry in the web
server's log file. So... if we are counting downloads of that 100MB
file, it would be incorrect to count each 206 reply as a download.
This is where the 'W', which stands for 'weighted', comes in: in this
case, the '206W' download count would be '1'. The 10 206 replies are
equivalent to 1 download of the 100 MB file.
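
To make the weighting concrete, here is a minimal sketch in Python (not the Ruby script mentioned above, whose exact formula is not shown here). It assumes that each 206 reply is weighted by the fraction of the file it delivered, i.e. bytes sent divided by the full file size. The function name, the (path, status, bytes_sent) entry layout, and the example path are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

MB = 1024 * 1024


def weighted_206_counts(entries, file_sizes):
    """Return a weighted download count per file from 206 replies.

    entries    -- iterable of (path, status, bytes_sent) tuples, an assumed
                  simplification of a parsed access-log line
    file_sizes -- dict mapping path to the full file size in bytes
    """
    partial_bytes = defaultdict(int)
    for path, status, bytes_sent in entries:
        # Only partial-content replies for files of known size are weighted.
        if status == 206 and path in file_sizes:
            partial_bytes[path] += bytes_sent
    # Assumed weighting: total bytes delivered via 206 replies / file size.
    return {path: total / file_sizes[path]
            for path, total in partial_bytes.items()}


# Hypothetical example: ten 10 MB range replies for one 100 MB file.
example_path = "/geronimo/example.tar.gz"
entries = [(example_path, 206, 10 * MB)] * 10
file_sizes = {example_path: 100 * MB}

counts = weighted_206_counts(entries, file_sizes)
print(counts[example_path])
```

Under that assumption the ten replies add up to a weighted count of 1 rather than 10, which matches the example above.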
fyi...
Bill