Thanks for reporting this.

Pine
On Aug 26, 2015 1:27 PM, "Andrew Otto" <[email protected]> wrote:

> Hi all,
>
> Now that we’ve had a little space to analyze the problem, I wanted to call
> out a recent webrequest data loss issue that we experienced on two separate
> occasions.
>
> We attempted to upgrade to Kafka 0.8.2.1, and it wasn’t until the second
> attempt that we actually found the problem.  Kafka 0.8.2.1 ships with a
> buggy version of Snappy[1] that causes messages to not be compressed
> properly.  This caused a ~4x increase in network and disk I/O around the
> cluster all at once.
>
> We’ve documented the incidents and the occasions of significant data loss
> here:
>
> https://wikitech.wikimedia.org/wiki/Incident_documentation/20150803-Kafka
>
>
> https://wikitech.wikimedia.org/wiki/Incident_documentation/20150810-Kafka#Conclusions
>
> https://wikitech.wikimedia.org/wiki/Analytics/Data/Webrequest
>
> This loss will affect the output of pagecount* and pageview datasets, as
> well as other webrequest-generated statistics.  Please consider statistics
> that are generated from webrequest data using the following UTC hours
> unreliable:
>
>   2015-08-03T18:00 - 2015-08-03T23:00
>   2015-08-10T15:00 - 2015-08-10T21:00
>   2015-08-11T17:00 - 2015-08-11T18:00
>
> Many apologies for any inconvenience this causes.  We’ve learned a lot
> during this turmoil, and have a lot of ideas on how to hopefully prevent
> this from happening in the future, and also how to reduce loss and
> complexity if and when it does.  The analytics engineering team will be
> doing a post mortem on this soon, in which we will document these ideas.
>
> Thanks,
> -Andrew Otto
>
> [1] https://issues.apache.org/jira/browse/KAFKA-2189
>
>
> _______________________________________________
> Analytics mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>