These stats are very interesting (especially BlackBerry 0.43%), and the use
of a cookie (with the provisos listed at the bottom of the page) to track
'users' provides a good insight.

Is it possible for these stats to be provided automatically, say on a
daily basis, so they can be used to track the use of browsers and platforms?

The BBC, as a public service, would be doing the rest of the industry a
great service by making these stats available as a 'live page', perhaps with
some nice graphs and things.

Having it as a resource would, IMHO, help UK web developers.

Please email me back if you need any more help.
 
Brian Butterworth
www.ukfree.tv


> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Kim Plowright
> Sent: 28 March 2007 11:04
> To: backstage@lists.bbc.co.uk
> Subject: RE: [backstage] Browser Stats
> 
> If you read Martin Belam (hello Martin!) on the methods he 
> used to derive these figures, you'll note that he's extremely 
> thorough in his data analysis: 
> http://www.currybet.net/articles/user_agents/index.php
> I think you should read a little levity into Jem's use of a 
> grin after the Linux comment!
> 
> Below are the stats, taken from our Sage Analyst system 
> (http://www.sagemetrics.com/content/sageanalyst/overview.html 
> - about the system, currently very slow!), from the 24th of 
> March - the most recent 24h period available. We tend to run 
> a bit late, as, IIRC, the daily server logs run to around 
> 5 gigabytes of data, which needs to be warehoused and processed.
> 
> These figures are for all visits, to all pages of the whole 
> of bbc.co.uk, not just the homepage.
> 
> Automated requests (from bots, spiders etc) are stripped from 
> our data; as far as I know we comply with JICWEBS and IFABC  
> standards that require this. This is done using browser 
> string filtering, against an industry standard set of strings 
> supplied by IFABC.
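
For anyone wanting to try the same kind of browser-string filtering on their own logs, here is a minimal sketch in Python. The substrings are placeholders of my own and not the IFABC list (which isn't reproduced in this thread), and the record layout is an assumption. Note that a filter like this can only catch robots that identify themselves.

```python
# Placeholder substrings standing in for the IFABC-supplied list, which is
# not reproduced in this thread.
ROBOT_SUBSTRINGS = ("bot", "spider", "crawler")


def is_automated(user_agent: str) -> bool:
    """Return True if the browser string contains any exclusion substring."""
    ua = user_agent.lower()
    return any(token in ua for token in ROBOT_SUBSTRINGS)


def strip_automated(records):
    """Keep only log records whose user agent is not flagged as automated.

    Each record is assumed to be a dict with a "user_agent" key.
    """
    return [r for r in records if not is_automated(r["user_agent"])]


# Illustrative records only; the second user agent is made up.
sample = [
    {"user_agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"},
    {"user_agent": "ExampleBot/1.0 (+http://example.invalid/bot)"},
]
print(len(strip_automated(sample)))
```
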
> 
> I provide these OS breakdowns both as % of Total Page Views, 
> and % of users. Unique users are deduplicated, based on 
> Cookie data - so you should caveat that with the usual cookie 
> churn stuff*. However, as we're looking at percentage shares 
> in a very large (6.5million+) user sample, I think it should 
> be considered a good indicative slice. 
> 
> 
> By Page Impression
> Operating Systems for Mar 24, 2007 for Entire Site from Entire World
> OS Type        % of Total Page Views
> Windows        88.37
> Macintosh      4.51
> Liberate       3.32
> Nokia          1.09
> SonyEricsson   0.67
> BlackBerry     0.43
> Motorola       0.36
> Samsung        0.23
> LG             0.17
> NEC            0.08
> Orange         0.04
> Sagem          0.03
> O2             0.02
> TMobile        0.01
> Sharp          0.01
> Linux          0.01
> DOS            0
> Panasonic      0
> BenQ           0
> Sprint         0
> ZTE            0
> Philips        0
> Unix           0
> VK             0
> Siemens        0
> Toshiba        0
> Sun            0
> Sanyo          0
> IRIX           0
> OSF1           0
> Unidentified   0.65
> 
> By User
> Operating Systems for Mar 24, 2007 for Entire Site from Entire World
> OS Type        % of Total Users
> Windows        85.39
> Macintosh      6.51
> Nokia          2.26
> Liberate       1.66
> SonyEricsson   1.5
> Motorola       0.84
> BlackBerry     0.76
> Samsung        0.55
> LG             0.18
> Sagem          0.08
> Orange         0.06
> Sharp          0.04
> O2             0.03
> TMobile        0.03
> Linux          0.02
> Panasonic      0.02
> NEC            0.02
> BenQ           0.01
> DOS            0.01
> Philips        0.01
> ZTE            0
> Sprint         0
> Toshiba        0
> VK             0
> Unix           0
> Siemens        0
> Sanyo          0
> Sun            0
> IRIX           0
> OSF1           0
> 
> - - - 
> 
> Breakdown of WINDOWS operating systems
> Operating Systems for Mar 24, 2007 for Entire Site from Entire World
> OS Type          % of Total Page Views
> Windows XP       53.71
> Windows XP SP2   31.96
> Windows 2000     6.94
> Windows NT       2.65
> Windows Vista    2.25
> Windows 98       1.23
> Windows ME       0.72
> Windows CE       0.35
> Windows 32       0.13
> Windows 95       0.06
> Windows 64       0.01
> Windows 31       0
> 
> Breakdown of MAC OSes
> Operating Systems for Mar 24, 2007 for Entire Site from Entire World
> OS Type             % of Total Page Views
> Macintosh X         97.21
> Macintosh PowerPC   2.53
> Macintosh           0.26
> Macintosh OS8       0
>       
> Breakdown of LINUX OSes
> Operating Systems for Mar 24, 2007 for Entire Site from Entire World
> OS Type    % of Total Page Views
> Linux 24   43.17
> Linux 22   36.4
> Linux 20   20.43
> 
> *From our guidance notes, internally: 
> Figures for unique users are based on the BBCUID.
> This is a unique identifier - known as a cookie - which is 
> sent to a user's computer the first time they request a page 
> from a BBC web site. Provided the cookie is accepted by the 
> requesting computer then it will be saved to that computer's 
> memory and will be returned to the web server with all 
> subsequent requests.
> The returned cookies are included in the log records for each 
> request and because each cookie is unique it is then possible 
> to track the activity of each user across time.
> The total number of unique users is really a count of the 
> number of unique BBCUID values seen in the logs.
> Note that although each cookie may appear many times in the 
> log it must only be counted once. It is this "de-duplication" 
> that makes unique user figures difficult to calculate.
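
The de-duplication described in the guidance notes above amounts to counting distinct cookie values. A minimal sketch in Python, assuming each log record is a dict with a "bbcuid" key holding the returned cookie (that layout and the key name are my assumptions, not the BBC's actual log format). Records with no returned cookie are skipped, in line with the point further down about counting only cookies that come back.

```python
def count_unique_users(log_records):
    """Count distinct returned cookie values in an iterable of log records.

    Each record is assumed to be a dict; "bbcuid" holds the cookie value the
    browser sent back, or None when no cookie was returned.
    """
    seen = set()
    for record in log_records:
        uid = record.get("bbcuid")
        if uid is not None:
            seen.add(uid)  # a set keeps each value once, however often it appears
    return len(seen)


# Made-up records: cookie "a1" appears twice but counts as one user.
log = [
    {"path": "/", "bbcuid": "a1"},
    {"path": "/news", "bbcuid": "a1"},
    {"path": "/sport", "bbcuid": "b2"},
    {"path": "/", "bbcuid": None},
]
print(count_unique_users(log))  # 2
```

Four requests, two users. The set has to hold every distinct value seen in the period, which is presumably part of why this is expensive at the volumes mentioned above.
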
> 
> Some important points to note about unique users:
> 
>     * Users are not "people". Cookies attach to browsers, to 
> user logins or possibly to a combination of these. If 2 
> people share the same machine and the same user login they 
> would share the same BBCUID and appear as the same person. 
> Equally if the same person were to use two different machines 
> then they would be counted as two users.
>     * Some browsers do not accept cookies. When this happens 
> a new cookie will be sent out for every request that browser 
> makes. If we counted these cookies as users it would push the 
> number of users up. So we don't count cookies we send out, 
> only those that we get back.
>     * There may be a number of situations where cookies, 
> including the BBCUID, will get deleted from a computer. Some 
> companies wipe cookies from machines at regular intervals. In 
> some environments, e.g. internet cafés or schools, computers 
> will destroy cookies when a person logs off from a session. 
> Many browsers offer options to easily delete cookies. In any 
> case where the BBCUID cookie is deleted then the next time a 
> request is made from that machine or user a new cookie will 
> be issued and will appear as a new user.
>     * Unique user figures should never be added (or 
> subtracted) in case the same BBCUIDs are included in the 
> numbers in the calculation. E.g. you could not add the users 
> of EastEnders to the users of Radio 1 because the total would 
> double-count any users that had used both sites.
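
The last point, that unique user figures can't simply be added, is easy to show with sets. A small sketch in Python with made-up cookie values (the variable and function names are mine): the combined figure is the size of the union of the two cookie sets, not the sum of the two counts.

```python
def combined_unique_users(*sites):
    """Return the number of distinct cookie values across all given sets."""
    return len(set().union(*sites))


# Made-up cookie values; "b2" and "c3" used both sites.
eastenders_users = {"a1", "b2", "c3"}
radio1_users = {"b2", "c3", "d4", "e5"}

naive_total = len(eastenders_users) + len(radio1_users)
combined = combined_unique_users(eastenders_users, radio1_users)
overlap = len(eastenders_users & radio1_users)

print(naive_total, combined, overlap)  # 7 5 2
```

Adding the counts gives 7, but only 5 distinct users exist, because the 2 shared ones are counted twice. You need the underlying cookie values (or at least the overlap) to combine figures correctly.
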
> 
> > -----Original Message-----
> > From: [EMAIL PROTECTED] 
> > [mailto:[EMAIL PROTECTED] On Behalf Of Andy
> > Sent: 27 March 2007 17:19
> > To: backstage@lists.bbc.co.uk
> > Subject: Re: [backstage] Browser Stats
> > 
> > On 26/03/07, Jeremy Stone <[EMAIL PROTECTED]> wrote:
> > > 0.4% of users at the time used a Linux operating system  ;)
> > 
> > That's not entirely true is it?
> > Please do not try to mislead people.
> > 
> > What is more likely is:
> > 0.4% of users WERE DETECTED AS using a Linux operating system AT THE 
> > TIME THEY VISITED THE BBC SITE.
> > 
> > This number can be wrong for a multitude of reasons.
> > 
> > 1) the BBC stats are biased, the site is targeted at Windows users and 
> > on certain pages blocks users of other OSes (bbc.co.uk uses ActiveX 
> > for instance)
> > 
> > 2) Detection software may not have been as tuned to recognize a Linux 
> > OS, after all many distros don't call themselves 'Linux', it may not 
> > be in the user agent string. (simply looking for the word Linux is not 
> > good enough).
> > 
> > 3) A Linux user may have been misreporting the Operating System 
> > (commonly used to cater for sites that do user agent sniffing badly, 
> > also used to blend in with the crowd for anonymity).
> > 
> > 4) Someone may have a dual boot (or triple or more), and may only be 
> > using Windows to view bbc.co.uk, possibly due to being locked out by 
> > previously mentioned technological practices of the BBC.
> > 
> > 5) Some 'users' may not be real people, they may be robots spoofing 
> > their user agent. 90% of email is spam. How have you accounted for web 
> > robots browsing your site looking for email addresses or trying to 
> > post spam comments (they would not hit robots.txt or say robot in the 
> > user agent, that would give them away)? I am thinking most spam bots 
> > would impersonate IE on Windows as it probably has the highest market 
> > share so much harder to filter. (by how high we are unsure).
> > 
> > Additionally you could argue you would get the less knowledgeable users 
> > in this sampling, I rarely hit the BBC home page, why bother? I know 
> > where I want to go and I get the news feeds in a handy RSS so I 
> > probably don't hit news.bbc.co.uk's homepage either.
> > I have the pages I need on bookmarks, (Favourites for you IE users).
> > 
> > This is the great thing about statistics: people like you claim they 
> > show something and try to cover up the failings of how the sampling 
> > was done.
> > 
> > It shows only as much as it records: the number of recognized User 
> > Agent strings for hits on the BBC website.
> > 
> > (Quick question, is this per IP or per page hit? Page hit would be bad 
> > as it would allow robots to skew the results badly as they would hit 
> > far more pages).
> > 
> > I really do dislike statistics, especially when people try to claim 
> > that they prove something without accounting for the method of 
> > gathering.
> > 
> > And now a quote:
> > > There are three kinds of commonly recognised untruths:
> > >
> > >      Lies, damn lies and statistics.
> > >      - Mark Twain
> > >
> > > This quote from Mark Twain is accurate; statistics are often used to
> > > lie to the public because most people do not understand how
> > > statistics work.
> > 
> > And this quote is from where, you ask? Why, it is from the BBC of 
> > course! (Well, I had to use the BBC quote, didn't I? Especially as it 
> > is the first result on Google for: lies damn lies statistics)
> > 
> > Maybe you should improve your stats?
> > 1. Group each unique header together and have a Skilled Human with 
> > knowledge of all operating systems classify them according to OS.
> > 2. Make each visitor pass a Turing Test prior to using their User 
> > Agent.
> > 3. Verify details of OS using other methods, i.e. Javascript could 
> > check, or use OS fingerprinting (hopefully it wouldn't hit NAT 
> > routers, otherwise you'd probably get the OS of a router, which 
> > although interesting is not what we are looking for is it?).
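
The first suggestion above (group identical headers, then have a person classify them) is cheap to prototype. A rough sketch in Python, assuming you already have the raw User-Agent strings as a list. The function name and the sample strings are illustrative, not taken from the BBC logs.

```python
from collections import Counter


def user_agent_frequencies(user_agents):
    """Group identical user agent strings and return (string, hits) pairs,
    most frequent first, ready for manual classification."""
    return Counter(user_agents).most_common()


# Illustrative sample: three hits, two distinct strings.
hits = [
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
]
for ua, count in user_agent_frequencies(hits):
    print(count, ua)
```

Sorting by frequency means the person classifying only has to work down the list until the remaining strings account for a negligible share of hits.
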
> > 
> > On the subject of whether to support IE 5, is it supported by 
> > Microsoft or has it been end of lifed? If it's been end of lifed then 
> > maybe you don't need to support it.
> > 
> > Why do you need to 'support' specific browsers anyway? This is what 
> > standards are for, I don't need to check the compatibility with every 
> > piece of software on every switch between here and my destination 
> > node, they are using a standard, I just make sure I follow that 
> > standard. Why should the HTML content be any different?
> > 
> > The underlying TCP/IP and HTTP system seem to work much more 
> > compatibly than all these websites, many of which display poorly if 
> > you stray even slightly from the most common browser and settings. Does 
> > this not show that standards work better?
> > 
> > Andy
> > 
> > --
> > First they ignore you
> > then they laugh at you
> > then they fight you
> > then you win.
> > - Mohandas Gandhi
> > -
> > Sent via the backstage.bbc.co.uk discussion group.  To unsubscribe, 
> > please visit 
> > http://backstage.bbc.co.uk/archives/2005/01/mailing_list.html.
> >   Unofficial list archive: 
> > http://www.mail-archive.com/backstage@lists.bbc.co.uk/
> > 



