So, in sequence:

Gergo: Either the false anchors are sent to the server or some conniving
elf has been inserting thousands of fake requests into our logs ;). I'm
seeing a lot of requests with #mediaviewer/ URLs, some internal and some
with referers from outside the WMF (implying someone following a link). The
proposed ways forward are useful, but as Erik M says, reorganising active
products for the sake of avoiding a pageviews filter is probably not worth
it unless it's a truly trivial change, so let's just stick with the status
quo for now and I'll build in a filter.

Gilles: see above, re Erik's comments.

Thanks to everyone for their commentary and help; I'll build a filter into
the definition this morning :)
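
For anyone following along, here is a minimal sketch of what such a filter could look like. It is an illustration only, not the actual pageview definition: it assumes the logged URL still carries its fragment, and the names is_mediaviewer_request and filter_pageviews are hypothetical.

```python
from urllib.parse import urlsplit

# Fragment prefix seen in the logged URLs (e.g. ...#mediaviewer/File:Foo.jpg).
MEDIAVIEWER_PREFIX = "mediaviewer/"


def is_mediaviewer_request(url: str) -> bool:
    """Return True if the logged URL carries a #mediaviewer/... fragment."""
    return urlsplit(url).fragment.startswith(MEDIAVIEWER_PREFIX)


def filter_pageviews(urls):
    """Keep only the logged URLs that are not MediaViewer hash loads."""
    return [u for u in urls if not is_mediaviewer_request(u)]
```

Matching on the parsed fragment rather than searching the whole string for "#mediaviewer/" avoids false positives from titles or query strings that happen to contain that text.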

On 26 November 2014 at 05:07, Gilles Dubuc <[email protected]> wrote:

> Server logs of page hits provide less and less value in terms of knowing
> what people are doing (was it ever possible to truly tell bots apart from
> humans? to compensate for caching proxies run by organizations?), the more
> client-side and mobile apps we develop. I think that it's inevitable that
> any meaningful tracking will have to be done client-side. Looking for ways
> to adapt our URL schemes for the sake of server logs seems like rearranging
> the deck chairs on the Titanic to me. We should be trying to put as little
> work into it as possible. Our stats efforts should rather be focused on
> more fine-grained client-side and mobile tracking, which is what we need to
> truly answer questions, even on our old "static" pages like the articles
> themselves. Just as, at the Amsterdam hackathon, I worked on tracking how
> long images are viewed, in preparation for Erik Zachte's RFC on image
> views, we should be doing the same sort of measurements on articles.
>
> On Wed, Nov 26, 2014 at 12:51 AM, Gergo Tisza <[email protected]>
> wrote:
>
>> On Tue, Nov 25, 2014 at 1:59 PM, Oliver Keyes <[email protected]>
>> wrote:
>>
>>> Actually, I'd argue it's not equivalent at all, for two reasons:
>>>
>>>
>>>    1. It doesn't present all of the same data. In fact, it presents
>>>    very little data compared to a pageview of the "File" page;
>>>    2. The argument behind MMV is, as I understand it, that people are
>>>    focusing on the images. It is designed so that people do so, on the basis
>>>    that people clicking on images probably want those images. As such, it'd
>>>    be inaccurate to weight it as equivalent to, say,
>>>    https://az.wikipedia.org/wiki/Mar%C3%A7ello_Malpigi
>>>    <https://az.wikipedia.org/wiki/Mar%C3%A7ello_Malpigi#mediaviewer/File:Marcello_Malpighi_large.jpg>
>>>    in textual value - we believe (correct me if I'm wrong) that someone
>>>    clicking for an image wants a media file, not a wall of text.
>>>
>>>
>> MediaViewer hash loads and File page requests have little to do with each
>> other. A File page request happens when 1) someone clicks on a thumbnail,
>> or 2) someone shares the URL of a file page and someone else follows that
>> URL. In the case of MediaViewer, only the second case (following a shared
>> URL) results in a text/html request to the server. The first case (a
>> thumbnail click, which is about 30x more frequent) only results in a bunch
>> of AJAX calls and an image request (actually more than one, due to
>> preloading). Those AJAX calls could easily be made unique, if that is of
>> any interest.
>>
>> So basically, when you click on an image, MediaViewer uses AJAX requests
>> to load some of the information from the file page, then creates an <img>
>> tag so the browser loads a large image thumbnail. When you visit a URL
>> ending in #mediaviewer/..., that just tells the MV code to simulate an
>> image click as soon as the page has loaded.
>>
>> _______________________________________________
>> Multimedia mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/multimedia
>>
>>
>


-- 
Oliver Keyes
Research Analyst
Wikimedia Foundation
