Hi all,

We attempted a Kafka upgrade last week[1] and this week[2], and on both 
occasions we had incidents of webrequest data loss.  We are still resolving 
these, and still working out how much data was lost and when.

One thing we do know: all webrequest_text related data since about 
2016-08-11T16:00 is missing roughly 8% of its data, possibly more.  Camus is 
currently reimporting the missing data from Kafka, starting from that time, 
and jobs that have already run on data since then will be rerun.  This 
includes pageview_hourly and any other webrequest related jobs.

Once we know more, we will document which data is permanently lost.  We will 
also let you know when the refined webrequest data after 2016-08-11T16:00 is 
ready for use.

Really sorry for this inconvenience.  We are scrambling to get everything back 
in order.

-Andrew + Analytics Engineering Team

[1] https://wikitech.wikimedia.org/wiki/Incident_documentation/20150803-Kafka 
[2] https://wikitech.wikimedia.org/wiki/Incident_documentation/20150810-Kafka 


_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
