Hi all,

I just caused another small webrequest log data loss.  I merged a puppet
change that was supposed to have no effect, but unfortunately it dropped an
important firewall rule dealing with IPSec.  Between 21:54 and 22:15 UTC
today, this kept all varnishkafkas in remote datacenters from producing to
Kafka.

I have documented this here:
https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequest#Changes_and_known_problems_since_2015-03-04

Apologies to all!

-Andrew Otto

---------- Forwarded message ----------
From: Marcel Ruiz Forns <[email protected]>
Date: Wed, Dec 16, 2015 at 10:29 AM
Subject: [Analytics] [Outage] Small data loss in raw_webrequest on
2015-12-15
To: "A mailing list for the Analytics Team at WMF and everybody who has an
interest in Wikipedia and analytics." <[email protected]>


Hi Analytics,

Yesterday, Dec 15, during the course of 1 hour (17h to 18h UTC) there was
an irrecoverable raw_webrequest data loss of ~30%: 25.6% (misc), 19.5%
(mobile), 19.1% (text), 39.1% (upload). This represents around 1% of the
data for that day.

The loss was due to the enabling of IPSec, which encrypts varnishkafka
traffic between caches in remote datacenters and the Kafka brokers in
eqiad. For about 40 minutes, no webrequest logs from remote datacenters
were successfully produced to Kafka.

Here's the outage note:
https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequest#Changes_and_known_problems_since_2015-03-04
Sorry for the inconvenience.

-- 
*Marcel Ruiz Forns*
Analytics Developer
Wikimedia Foundation

_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics