Well, the raw Double-entry_bookkeeping_system title only has 14k views
in that hour, so I have to assume that the remaining (55k-14k), roughly
41k, views are coming from some oddly localised URI. Not sanitising
input is... one of the many things we should fix.
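
For anyone who wants to check this against the dump themselves, here is
a rough sketch of how the variants could be broken out. It assumes each
line of an hourly pagecounts file is four space-separated fields
(project, title, view count, bytes), and that the odd variants differ
from the canonical title only by percent-encoding, spaces versus
underscores, or first-letter case. The function names are made up for
illustration; none of this is existing Analytics tooling.

```python
import gzip
from collections import defaultdict
from urllib.parse import unquote


def normalise_title(raw_title):
    # Assumption: variants differ only by percent-encoding, spaces vs
    # underscores, and the case of the first letter.
    title = unquote(raw_title).replace(" ", "_")
    return title[:1].upper() + title[1:]


def variant_counts(lines, target):
    """Sum views per (project, raw title) for rows that normalise to target."""
    counts = defaultdict(int)
    for line in lines:
        parts = line.rstrip("\n").split(" ")
        if len(parts) != 4:
            continue  # skip malformed rows rather than guessing at them
        project, raw_title, views, _size = parts
        if not views.isdigit():
            continue
        if normalise_title(raw_title) == target:
            counts[(project, raw_title)] += int(views)
    return dict(counts)


def variant_counts_from_dump(path, target):
    # Hourly dumps are gzip-compressed text; undecodable bytes are replaced.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return variant_counts(handle, target)
```

Calling variant_counts_from_dump on pagecounts-20150306-070000.gz with
the target "Double-entry_bookkeeping_system" should list every project
and raw title that contributes to the total, which is where an oddly
localised URI would show up if that is what is going on.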

But I would warn you that this is likely automata. Some things I have
seen that would explain it:

1. Live mirrors. Spammers (largely operating off WordPress instances)
steal our content because it looks like legit text and fools
particularly stupid spam/SEO filters. They normally do this through
live mirroring, so we get all the random hits from people who click
through on their emails.
2. Automata. There is not, to my knowledge, any automata filtering
performed on the pageviews data at the moment. I had hoped it would be
my next priority after the pageviews definition itself, and I hope
whoever is tasked with improving our pageviews infrastructure works
on it. Analytics can do very simple things to make this better; time
will tell whether making things better on this front is actually a
priority.
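
To give a flavour of what "very simple" could mean, here is a minimal
sketch of one possible heuristic: flag any hour whose count is both
large in absolute terms and far above the median of the other hours for
the same page. This is only an illustration, not something Analytics
runs; the function name and the ratio and floor thresholds are
arbitrary assumptions.

```python
from statistics import median


def flag_spikes(hourly_views, ratio=50, floor=1000):
    """Return the hours that look like isolated spikes.

    hourly_views maps an hour label to a view count. An hour is flagged
    when its count is at least `floor` and more than `ratio` times the
    median of all the other hours. Both thresholds are arbitrary.
    """
    flagged = []
    for hour, views in hourly_views.items():
        others = [v for h, v in hourly_views.items() if h != hour]
        if not others:
            continue  # nothing to compare a single hour against
        baseline = max(median(others), 1)  # avoid a zero baseline
        if views >= floor and views > ratio * baseline:
            flagged.append(hour)
    return sorted(flagged)
```

With counts like the ones Roni reported (55921 in one hour and 54 three
hours later), the 07:00 hour would be flagged provided the neighbouring
hours look like the 10:00 one. A flag only says the hour is anomalous;
it cannot by itself tell automata apart from a genuine burst of
interest.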

On 9 March 2015 at 11:59, Oliver Keyes <[email protected]> wrote:
> It's more likely that it's just an attack by automata, rather than a
> sharp peak of genuine interest. Since 20150306 is within the last 30
> days I can look and check, and will do so now.
>
> On 8 March 2015 at 15:18, Roni Wiener <[email protected]> wrote:
>> Hi
>>
>> I was goofing around with the Wikipedia page counts dumps and noticed some
>> strange anomalies.
>>
>> For example:
>>
>> The page “Double-entry_bookkeeping_system” had 55921 page views on
>> pagecounts-20150306-070000.gz
>>
>> Where it only had 54 views on pagecounts-20150306-100000.gz (3 hours later).
>>
>> Is there a bug in the page counting system? How likely is it to have a sharp
>> peak of interest in Double-entry_bookkeeping_system?
>>
>> Best regards
>>
>> Roni Wiener, Keotic
>>
>>
>> _______________________________________________
>> Analytics mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>
>
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation



-- 
Oliver Keyes
Research Analyst
Wikimedia Foundation

