If people are interested in zooming into the loss data a little more,
there's a zoomable graph here: http://debugging.wmflabs.org/
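
For anyone filtering their own webrequest-derived numbers, here is a rough Python sketch of how the unreliable UTC hours listed in Andrew's message below could be flagged. The names UNRELIABLE_WINDOWS and is_unreliable are made up for this example. It also assumes that the end hour of each range is itself affected, which the announcement does not state explicitly, so adjust the comparison if that turns out to be wrong.

```python
from datetime import datetime, timezone

# Unreliable windows copied from the announcement (all UTC).
# Assumption: the end hour of each range is itself affected, so a window
# covers every hour from the start hour through the end hour inclusive.
UNRELIABLE_WINDOWS = [
    ("2015-08-03T18:00", "2015-08-03T23:00"),
    ("2015-08-10T15:00", "2015-08-10T21:00"),
    ("2015-08-11T17:00", "2015-08-11T18:00"),
]


def _parse(ts):
    """Parse one of the window boundaries above as a UTC datetime."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M").replace(tzinfo=timezone.utc)


def is_unreliable(ts):
    """Return True if the UTC hour containing ts falls in a listed window.

    Naive datetimes are treated as UTC (an assumption of this sketch).
    """
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)
    # Truncate to the start of the hour, since the windows are hourly.
    hour = ts.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    return any(_parse(start) <= hour <= _parse(end)
               for start, end in UNRELIABLE_WINDOWS)
```

Calling is_unreliable(datetime(2015, 8, 10, 16, 5, tzinfo=timezone.utc)) returns True under these assumptions, so rows for that hour could be dropped or marked before aggregating.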

On Wed, Aug 26, 2015 at 4:54 PM, Pine W <[email protected]> wrote:

> Thanks for reporting this.
>
> Pine
> On Aug 26, 2015 1:27 PM, "Andrew Otto" <[email protected]> wrote:
>
>> Hi all,
>>
>> Now that we’ve had a little space to analyze the problem, I wanted to
>> call out a recent webrequest data loss issue that we experienced on two
>> separate occasions.
>>
>> We attempted to upgrade to Kafka 0.8.2.1, and it wasn’t until the second
>> attempt that we actually found the problem.  Kafka 0.8.2.1 ships with a
>> buggy version of Snappy[1] that causes messages to not be compressed
>> properly.  This caused a ~4x increase in network and disk I/O around the
>> cluster all at once.
>>
>> We’ve documented the incidents and the occasions of significant data loss
>> here:
>>
>> https://wikitech.wikimedia.org/wiki/Incident_documentation/20150803-Kafka
>>
>>
>> https://wikitech.wikimedia.org/wiki/Incident_documentation/20150810-Kafka#Conclusions
>>
>> https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequest
>>
>> This loss will affect the output of pagecount* and pageview datasets, as
>> well as other webrequest generated statistics.  Please consider statistics
>> that are generated from webrequest data using the following UTC hours
>> unreliable:
>>
>>   2015-08-03T18:00 - 2015-08-03T23:00
>>   2015-08-10T15:00 - 2015-08-10T21:00
>>   2015-08-11T17:00 - 2015-08-11T18:00
>>
>> Many apologies for any inconvenience this causes.  We’ve learned a lot
>> during this turmoil, and have a lot of ideas on how to hopefully prevent
>> this from happening in the future, and also how to reduce loss and
>> complexity if and when it does.  The analytics engineering team will be
>> doing a post mortem on this soon, in which we will document these ideas.
>>
>> Thanks,
>> -Andrew Otto
>>
>> [1] https://issues.apache.org/jira/browse/KAFKA-2189
>>
>>
>> _______________________________________________
>> Analytics mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>>
