A lot of opinions have been expressed about whether the favicon should be on
or off by default, since it might spam web servers with requests for a
nonexistent file.

It would be really interesting to get some hard numbers on this. Just
looking at the current logs will not really tell us anything, since very few
people browse with a Mozilla build that has this pref turned on. So we need
to come up with some way to approximate the number of 404s per (for example)
month if a browser with, say, 30% market share used the current
configuration.

Since the absence of a /favicon.ico is cached, the number of 404-ing requests
will be much lower than the number of page hits. Brendan says that the
absence is cached "persistently and with never-expire". Does that mean that
Mozilla won't request /favicon.ico again unless the user manually clears the
cache? In that case the number of 404s per month will be approximately equal
to the number of "new users" every month * 30%.

If it's not possible to extract the number of new users from the logs, I
think that the number of new IP addresses * 1.5 is a good enough estimate.
There are probably more than 1.5 users per IP on average, but probably not
all of them visit the site. If someone has a better number than 1.5,
please speak up; my guess is very uneducated.
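As a rough sketch of that estimate, the snippet below counts IP addresses that
appear in one month's log but not in the earlier logs, then multiplies by 1.5.
It assumes the log is in Common Log Format with the client address as the
first field; the function name and arguments are made up for illustration.

```python
def estimate_new_users(previous_lines, month_lines, users_per_ip=1.5):
    """Estimate new users as (IPs first seen this month) * users_per_ip.

    Assumes each log line starts with the client IP address, as in
    Common Log Format. The 1.5 factor is the guess from the text above.
    """
    def client_ip(line):
        # The client address is the first whitespace-separated field.
        return line.split()[0]

    seen_before = {client_ip(l) for l in previous_lines if l.strip()}
    this_month = {client_ip(l) for l in month_lines if l.strip()}

    # Only addresses that never showed up in the earlier logs count as new.
    new_ips = this_month - seen_before
    return len(new_ips) * users_per_ip
```

Note that this treats a returning user with a new dynamic IP as a new user,
so it will tend to overestimate on sites with many dial-up visitors.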

However, it seems a bit wrong to me that a resource is cached "forever". What
if a site wants to start supporting /favicon.ico? Will only new users see the
new icon? IMHO a resource should be reloaded at least once in a while, so
that if the resource appears or changes we will eventually catch it.

So say that we reload every 2 weeks. That means every user will reload
/favicon.ico once every 14 days, which means that the number of 404s per
month will be "number of distinct users during 14 days" * 30% * 30/14.

So, we've got:

Hits = newUsersPerMonth * 0.3                 if we cache indefinitely

Hits = distinctUsersPerXDays * 0.3 * 30/X     if we refetch every X days

where the number of IP addresses * 1.5 could approximate the number of users.
IMHO the right thing would be to use the second formula with X ~= 14.
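For anyone who wants to plug in their own numbers, the two formulas can be
written out as below. This is only a sketch: the function names are invented
here, and the 30% market share and X = 14 defaults are the guesses from this
mail, not measured values.

```python
def hits_if_cached_forever(new_users_per_month, market_share=0.3):
    """Monthly /favicon.ico 404s if the absence is cached indefinitely."""
    return new_users_per_month * market_share


def hits_if_refetched(distinct_users_per_x_days, x_days=14, market_share=0.3):
    """Monthly /favicon.ico 404s if the absence is refetched every x_days.

    Each user in the x_days window causes one 404 per window, and a
    30-day month contains 30 / x_days such windows.
    """
    return distinct_users_per_x_days * market_share * 30 / x_days
```

With made-up inputs of 10000 new users per month and 14000 distinct users per
14 days, the first formula gives about 3000 hits and the second about 9000.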

So it would be great if someone with access to the logs of a rather heavily
used site could run these formulas and compare the result to the number of
"normal" 404s.
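To get the "normal" 404 count for that comparison, something like the
following could be run over a month of log lines. It assumes Common Log
Format (host, ident, user, [date], "request", status, size); the regular
expression and function name are my own and would need adjusting for other
log formats.

```python
import re

# host ident user [date] "METHOD path ..." status size
CLF = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(\S+) (\S+)[^"]*" (\d{3}) ')


def count_404s(lines):
    """Return (favicon_404s, other_404s) for an iterable of log lines."""
    favicon = other = 0
    for line in lines:
        m = CLF.match(line)
        if not m or m.group(4) != "404":
            continue  # unparseable line, or not a 404
        # Ignore any query string when comparing the path.
        path = m.group(3).split("?", 1)[0]
        if path == "/favicon.ico":
            favicon += 1
        else:
            other += 1
    return favicon, other
```

The second number in the returned pair is the baseline to hold the estimated
favicon hits against; the first shows how many favicon 404s the site already
sees today.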

/ Jonas Sicking


