Hi,

Quick follow-up: all data has been backfilled, so you can get back to normal cluster activity :)
Sorry for the inconvenience.

Joseph

On Tue, Mar 1, 2016 at 2:26 PM, Joseph Allemandou <[email protected]> wrote:

> Hi,
>
> *TL;DR: Please don't use hive / spark / hadoop before next week.*
>
> Last week the Analytics Team performed an upgrade of the Hadoop cluster.
> It went reasonably well, except that many of the Hadoop processes were
> launched with a special option to NOT use UTF-8 as the default encoding.
> This issue caused trouble particularly in page title extraction and was
> detected last Sunday (many kudos to the people who filed bugs on the
> Analytics API about encoding :)
> We found the bug and fixed it yesterday, and the backfill starts today, with
> the cluster recomputing every dataset from 2016-02-23 onward.
> This means you shouldn't query last week's data during this week, first
> because it is incorrect, and second because you'll curse the cluster for
> being too slow :)
>
> We are sorry for the inconvenience.
> Don't hesitate to contact us if you have any questions.
>
> --
> *Joseph Allemandou*
> Data Engineer @ Wikimedia Foundation
> IRC: joal

--
*Joseph Allemandou*
Data Engineer @ Wikimedia Foundation
IRC: joal

_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics
