These stats are very interesting (especially BlackBerry 0.43%), and the use of a cookie (with the provisos listed at the bottom of the page) to track 'users' provides a good insight.
Would it be possible to provide these stats automatically, say on a daily basis, so that they can be used to track the use of browsers and platforms? The BBC, as a public service, would be doing the rest of the industry a great service by making these stats available as a 'live page', perhaps with some nice graphs and things. Having it as a resource would, IMHO, help UK web developers.

Please email me back if you need any more help.

Brian Butterworth
www.ukfree.tv

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Kim Plowright
> Sent: 28 March 2007 11:04
> To: [email protected]
> Subject: RE: [backstage] Browser Stats
>
> If you read Martin Belam (hello Martin!) on the methods he
> used to derive these figures, you'll note that he's extremely
> thorough in his data analysis.
> http://www.currybet.net/articles/user_agents/index.php
> I think you should read a little levity into Jem's use of a
> grin after the Linux comment!
>
> Below are the stats, taken from our Sage Analyst system
> (http://www.sagemetrics.com/content/sageanalyst/overview.html
> - about the system, currently very slow!), from the 24th of
> March - the most recent 24h period available. We tend to run
> a bit late, as, IIRC, the daily server logs run to around
> 5 gigabytes of data, which needs to be warehoused and processed.
>
> These figures are for all visits, to all pages of the whole
> of bbc.co.uk, not just the homepage.
>
> Automated requests (from bots, spiders etc) are stripped from
> our data; as far as I know we comply with JICWEBS and IFABC
> standards that require this. This is done using browser
> string filtering, against an industry standard set of strings
> supplied by IFABC.
>
> I provide these OS breakdowns both as % of Total Page Views,
> and % of users. Unique users are deduplicated, based on
> Cookie data - so you should caveat that with the usual cookie
> churn stuff*.
> However, as we're looking at percentage shares
> in a very large (6.5million+) user sample, I think it should
> be considered a good indicative slice.
>
>
> By Page Impression
> Operating Systems for Mar 24, 2007 for Entire Site from Entire World
> OS Type            % of Total Page Views
> Windows            88.37
> Macintosh          4.51
> Liberate           3.32
> Nokia              1.09
> SonyEricsson       0.67
> BlackBerry         0.43
> Motorola           0.36
> Samsung            0.23
> LG                 0.17
> NEC                0.08
> Orange             0.04
> Sagem              0.03
> O2                 0.02
> TMobile            0.01
> Sharp              0.01
> Linux              0.01
> DOS                0
> Panasonic          0
> BenQ               0
> Sprint             0
> ZTE                0
> Philips            0
> Unix               0
> VK                 0
> Siemens            0
> Toshiba            0
> Sun                0
> Sanyo              0
> IRIX               0
> OSF1               0
> Unidentified       0.65
>
> By User
> Operating Systems for Mar 24, 2007 for Entire Site from Entire World
> OS Type            % of Total Users
> Windows            85.39
> Macintosh          6.51
> Nokia              2.26
> Liberate           1.66
> SonyEricsson       1.5
> Motorola           0.84
> BlackBerry         0.76
> Samsung            0.55
> LG                 0.18
> Sagem              0.08
> Orange             0.06
> Sharp              0.04
> O2                 0.03
> TMobile            0.03
> Linux              0.02
> Panasonic          0.02
> NEC                0.02
> BenQ               0.01
> DOS                0.01
> Philips            0.01
> ZTE                0
> Sprint             0
> Toshiba            0
> VK                 0
> Unix               0
> Siemens            0
> Sanyo              0
> Sun                0
> IRIX               0
> OSF1               0
>
> - - -
>
> Breakdown of WINDOWS operating systems
> Operating Systems for Mar 24, 2007 for Entire Site from Entire World
> OS Type            % of Total Page Views
> Windows XP         53.71
> Windows XP SP2     31.96
> Windows 2000       6.94
> Windows NT         2.65
> Windows Vista      2.25
> Windows 98         1.23
> Windows ME         0.72
> Windows CE         0.35
> Windows 32         0.13
> Windows 95         0.06
> Windows 64         0.01
> Windows 31         0
>
> Breakdown of MAC os'es
> Operating Systems for Mar 24, 2007 for Entire Site from Entire World
> OS Type            % of Total Page Views
> Macintosh X        97.21
> Macintosh PowerPC  2.53
> Macintosh          0.26
> Macintosh OS8      0
>
> Breakdown of LINUX oses
> Operating Systems for Mar 24, 2007 for Entire Site from Entire World
> OS Type            % of Total Page Views
> Linux 24           43.17
> Linux 22           36.4
> Linux 20           20.43
>
> *From our guidance notes, internally:
> Figures for unique users
> are based on the BBCUID.
> This is a unique identifier - known as a cookie - which is
> sent to a user's computer the first time they request a page
> from a BBC web site. Provided the cookie is accepted by the
> requesting computer then it will be saved to that computer's
> memory and will be returned to the web server with all
> subsequent requests.
> The returned cookies are included in the log records for each
> request and because each cookie is unique it is then possible
> to track the activity of each user across time.
> The total number of unique users is really a count of the
> number of unique BBCUID values seen in the logs.
> Note that although each cookie may appear many times in the
> log it must only be counted once. It is this "de-duplication"
> that makes unique user figures difficult to calculate.
>
> Some important points to note about unique users:
>
> * Users are not "people". Cookies attach to browsers, to
> user logins or possibly to a combination of these. If 2
> people share the same machine and the same user login they
> would share the same BBCUID and appear as the same person.
> Equally if the same person were to use two different machines
> then they would be counted as two users.
> * Some browsers do not accept cookies. When this happens
> a new cookie will be sent out for every request that browser
> makes. If we counted these cookies as users it would push the
> number of users up. So we don't count cookies we send out,
> only those that we get back.
> * There may be a number of situations where cookies,
> including the BBCUID, will get deleted from a computer. Some
> companies wipe cookies from machines at regular intervals. In
> some environments, e.g. internet cafés or schools, computers
> will destroy cookies when a person logs off from a session.
> Many browsers offer options to easily delete cookies.
> In any case where the BBCUID cookie is deleted then the next time a
> request is made from that machine or user a new cookie will
> be issued and will appear as a new user.
> * Unique user figures should never be added (or
> subtracted) in case the same BBCUIDs are included in the
> numbers in the calculation. E.g. you could not add the users
> of Eastenders to the users of Radio 1 because the total would
> double count any users that had used both sites.
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of Andy
> > Sent: 27 March 2007 17:19
> > To: [email protected]
> > Subject: Re: [backstage] Browser Stats
> >
> > On 26/03/07, Jeremy Stone <[EMAIL PROTECTED]> wrote:
> > > 0.4% of users at the time used a Linux operating system ;)
> >
> > That's not entirely true is it?
> > Please do not try to mislead people.
> >
> > What is more likely is:
> > 0.4% of users WERE DETECTED AS using a Linux operating system
> > AT THE TIME THEY VISITED THE BBC SITE.
> >
> > This number can be wrong for a multitude of reasons.
> >
> > 1) the BBC stats are biased, the site is targeted at Windows users
> > and on certain pages blocks users of other OSes (bbc.co.uk uses
> > ActiveX for instance)
> >
> > 2) Detection software may not have been as tuned to recognize a
> > Linux OS, after all many distros don't call themselves 'Linux', it
> > may not be in the user agent string. (simply looking for the word
> > Linux is not good enough).
> >
> > 3) A Linux user may have been misreporting the Operating System
> > (commonly used to cater for sites that do user agent sniffing badly,
> > also used to blend in with the crowd for anonymity).
> >
> > 4) Someone may have a dual boot (or triple or more), and may only be
> > using Windows to view bbc.co.uk, possibly due to being locked out by
> > previously mentioned technological practices of the BBC.
> >
> > 5) Some 'users' may not be real people, they may be robots spoofing
> > their user agent. 90% of email is spam. How have you accounted for
> > web robots browsing your site looking for email addresses or trying
> > to post spam comments (they would not hit robots.txt or say robot in
> > the user agent, that would give them away)? I am thinking most spam
> > bots would impersonate IE on Windows as it probably has the highest
> > market share, so much harder to filter (by how high we are unsure).
> >
> > Additionally you could argue you would get the less knowledgeable
> > users in this sampling, I rarely hit the BBC home page, why bother?
> > I know where I want to go and I get the news feeds in a handy RSS so
> > I probably don't hit news.bbc.co.uk's homepage either.
> > I have the pages I need on bookmarks (Favourites for you IE users).
> >
> > This is the great thing about statistics: people like you claim they
> > show something and try to cover up the failings of how the sampling
> > was done.
> >
> > It shows only as much as it records. The number of recognized User
> > Agent strings for hits on the BBC website.
> >
> > (Quick question, is this per IP or per page hit? Page hit would be
> > bad as it would allow robots to skew the results badly as they would
> > hit far more pages).
> >
> > I really do dislike statistics, especially when people try to claim
> > that they prove something without accounting for the method of
> > gathering.
> >
> > And now a quote:
> > > There are three kinds of commonly recognised untruths:
> > >
> > > Lies, damn lies and statistics.
> > > - Mark Twain
> > >
> > > This quote from Mark Twain is accurate; statistics are often used
> > > to lie to the public because most people do not understand how
> > > statistics work.
> >
> > And this quote is from where you ask? Why it is from the BBC of
> > course! (well I had to use the BBC quote didn't I?
> > especially as it is the first result on Google for: lies damn lies
> > statistics)
> >
> > Maybe you should improve your stats?
> > 1. Group each unique header together and have a Skilled Human with
> > knowledge of all operating systems classify them according to OS.
> > 2. Make each visitor pass a Turing Test prior to using their User
> > Agent.
> > 3. Verify details of OS using other methods, e.g. Javascript could
> > check, or use OS fingerprinting (hopefully it wouldn't hit NAT
> > routers, otherwise you'd probably get the OS of a router, which
> > although interesting is not what we are looking for, is it?).
> >
> > On the subject of whether to support IE 5, is it supported by
> > Microsoft or has it been end of lifed? If it's been end of lifed
> > then maybe you don't need to support it.
> >
> > Why do you need to 'support' specific browsers anyway? This is what
> > standards are for, I don't need to check the compatibility with
> > every piece of software on every switch between here and my
> > destination node, they are using a standard, I just make sure I
> > follow that standard. Why should the HTML content be any different?
> >
> > The underlying TCP/IP and HTTP system seem to work much more
> > compatibly than all these websites, many of which display poorly if
> > you stray so slightly off the most common browser and settings, does
> > this not show that standards work better?
> >
> > Andy
> >
> > --
> > First they ignore you
> > then they laugh at you
> > then they fight you
> > then you win.
> > - Mohandas Gandhi
> > -
> > Sent via the backstage.bbc.co.uk discussion group. To unsubscribe,
> > please visit
> > http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html.
> > Unofficial list archive:
> > http://www.mail-archive.com/[email protected]/
> >
>
> -
> Sent via the backstage.bbc.co.uk discussion group. To
> unsubscribe, please visit
> http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html.
> Unofficial list archive:
> http://www.mail-archive.com/[email protected]/
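
As a rough illustration of the guidance notes quoted above (count each returned BBCUID once, ignore requests where no cookie came back, and never add unique user figures across sites), here is a minimal Python sketch. The record layout, the `unique_users` helper and the sample data are all invented for this example; nothing here reflects how the BBC's Sage Analyst system actually works.

```python
from typing import Iterable, Optional, Set, Tuple

# Hypothetical log record: (BBCUID returned by the browser, or None; site name).
LogRecord = Tuple[Optional[str], str]


def unique_users(records: Iterable[LogRecord], site: Optional[str] = None) -> Set[str]:
    """Return the distinct returned cookie values, optionally for one site only."""
    seen: Set[str] = set()
    for cookie, record_site in records:
        if cookie is None:
            # No cookie came back with this request (first visit, or cookies
            # refused), so the request is not attributed to any user.
            continue
        if site is None or record_site == site:
            seen.add(cookie)  # a set keeps each BBCUID only once
    return seen


# Invented sample data, for illustration only.
logs = [
    ("uid-a", "eastenders"),
    ("uid-a", "eastenders"),  # same cookie again: still one user
    ("uid-a", "radio1"),      # same cookie on a second site
    ("uid-b", "radio1"),
    (None, "radio1"),         # no cookie returned: not counted
    ("uid-c", "eastenders"),
]

eastenders = unique_users(logs, "eastenders")
radio1 = unique_users(logs, "radio1")

naive_total = len(eastenders) + len(radio1)  # double counts uid-a
combined_total = len(eastenders | radio1)    # union of the two sets
print(naive_total, combined_total)
```

With this sample data the naive sum is 4 while the combined figure is 3, because `uid-a` used both sites. That is the double counting the notes warn about: the per-site figures can only be combined by going back to the underlying cookie values, not by adding the totals.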

