https://bugzilla.wikimedia.org/show_bug.cgi?id=71790

            Bug ID: 71790
           Summary: By counting HTTP redirects, webstatscollector
                    reporting too high numbers
           Product: Analytics
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: Unprioritized
         Component: General/Unknown
          Assignee: wikibugs-l@lists.wikimedia.org
          Reporter: christ...@quelltextlich.at
                CC: bugwatc...@sb-mail.wmflabs.org,
                    christ...@quelltextlich.at, kle...@wikimedia.org,
                    oke...@wikimedia.org, tneg...@wikimedia.org
       Web browser: ---
   Mobile Platform: ---

One of the longstanding issues with Webstatscollector is that it
counts redirects at the HTTP level.

So for example:
- Requesting a page with a lower-case first letter [1],
- Requesting a page from the desktop site on a mobile device [2], or
- Requesting www.wikipedia.org (first part is www, not a language) [3]
each causes two requests to the caches, and webstatscollector counts both,
although only a single page is shown to the user.
As a result, the reported numbers are too high.
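
To illustrate the double counting, here is a minimal sketch in Python. It
is not webstatscollector's code, and the (status, URL) record format, the
REDIRECT_STATUSES set and the count_pageviews() helper are all assumptions
made up for this example. The sample records mirror the three wget runs
[1], [2] and [3]: each one produces a redirect response followed by a
200 response.

```python
from collections import Counter

# Hypothetical, simplified log records: (HTTP status, requested URL).
# The real webstatscollector input looks different; this is illustration only.
REDIRECT_STATUSES = {301, 302, 303, 307, 308}


def count_pageviews(records):
    """Count requests per URL, skipping responses that are HTTP redirects."""
    counts = Counter()
    for status, url in records:
        if status in REDIRECT_STATUSES:
            # The client follows up with a second request for the target,
            # so counting this one as well would count the page twice.
            continue
        counts[url] += 1
    return counts


# Statuses and URLs taken from the wget transcripts [1], [2] and [3].
records = [
    (301, "http://en.wikipedia.org/wiki/main_page"),
    (200, "http://en.wikipedia.org/wiki/Main_page"),
    (302, "http://en.wikipedia.org/wiki/Main_Page"),
    (200, "http://en.m.wikipedia.org/wiki/Main_Page"),
    (301, "http://www.wikipedia.org/wiki/Main_Page"),
    (200, "http://en.wikipedia.org/wiki/Main_Page"),
]

pageviews = count_pageviews(records)
print(len(records))             # 6 requests reached the caches
print(sum(pageviews.values()))  # 3 pages were actually shown
```

Counting every record gives 6, while skipping the 301/302 responses gives
3, which matches the three pages a user would actually have seen.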

Since we're about to deploy a new webstatscollector anyway, and this
double counting should not be too hard to fix, let's get it fixed too.

(Note that redirects above the HTTP level are not affected. So for example
  http://en.wikipedia.org/wiki/Michael_J_Fox
(no dot after the J) is, was and will be counted as one request, although
it shows the content of
  http://en.wikipedia.org/wiki/Michael_J._Fox
(dot after the J), because that redirect happens at the wiki level.)





[1]
_________________________________________________________________
christian@spencer // jobs: 0 // time: 13:13:36 // exit code: 0
cwd: ~
wget -O /dev/null 'http://en.wikipedia.org/wiki/main_page'
--2014-10-08 13:13:39--  http://en.wikipedia.org/wiki/main_page
Resolving en.wikipedia.org... 91.198.174.192
Connecting to en.wikipedia.org|91.198.174.192|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://en.wikipedia.org/wiki/Main_page [following]
--2014-10-08 13:13:39--  http://en.wikipedia.org/wiki/Main_page
Reusing existing connection to en.wikipedia.org:80.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `/dev/null'

    [ <=>                                   ] 67,779      --.-K/s   in 0.1s

2014-10-08 13:13:39 (472 KB/s) - `/dev/null' saved [67779]



[2]
_________________________________________________________________
christian@spencer // jobs: 0 // time: 13:13:39 // exit code: 0
cwd: ~
wget -O /dev/null --user-agent 'iPhone' 'http://en.wikipedia.org/wiki/Main_Page'
--2014-10-08 13:13:44--  http://en.wikipedia.org/wiki/Main_Page
Resolving en.wikipedia.org... 91.198.174.192
Connecting to en.wikipedia.org|91.198.174.192|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://en.m.wikipedia.org/wiki/Main_Page [following]
--2014-10-08 13:13:44--  http://en.m.wikipedia.org/wiki/Main_Page
Resolving en.m.wikipedia.org... 91.198.174.204
Connecting to en.m.wikipedia.org|91.198.174.204|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `/dev/null'

    [ <=>                                   ] 22,002      --.-K/s   in 0.05s

2014-10-08 13:13:44 (416 KB/s) - `/dev/null' saved [22002]



[3]
_________________________________________________________________
christian@spencer // jobs: 0 // time: 13:13:44 // exit code: 0
cwd: ~
wget -O /dev/null 'http://www.wikipedia.org/wiki/Main_Page'
--2014-10-08 13:13:49--  http://www.wikipedia.org/wiki/Main_Page
Resolving www.wikipedia.org... 91.198.174.192
Connecting to www.wikipedia.org|91.198.174.192|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://en.wikipedia.org/wiki/Main_Page [following]
--2014-10-08 13:13:49--  http://en.wikipedia.org/wiki/Main_Page
Resolving en.wikipedia.org... 91.198.174.192
Reusing existing connection to www.wikipedia.org:80.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `/dev/null'

    [ <=>                                   ] 67,565      --.-K/s   in 0.1s

2014-10-08 13:13:49 (471 KB/s) - `/dev/null' saved [67565]
