Hi,
Quick follow-up: all data has been backfilled, so you can get back to normal
cluster activity :)
Sorry for the inconvenience.
Joseph


On Tue, Mar 1, 2016 at 2:26 PM, Joseph Allemandou <[email protected]> wrote:

> Hi,
>
> *TL;DR: Please don't use Hive / Spark / Hadoop before next week.*
>
> Last week the Analytics Team performed an upgrade to the Hadoop cluster.
> It went reasonably well, except that many of the Hadoop processes were
> launched with a special option to NOT use UTF-8 as the default encoding.
> This issue caused trouble particularly in page title extraction and was
> detected last Sunday (many kudos to the people who filed bugs on the
> Analytics API about encoding :)
> We found the bug and fixed it yesterday, and the backfill starts today, with
> the cluster recomputing every dataset from 2016-02-23 onward.
> This means you shouldn't query last week's data during this week, first
> because it is incorrect, and second because you'll curse the cluster for
> being too slow :)
>
> We are sorry for the inconvenience.
> Don't hesitate to contact us if you have any questions.
>
>
> --
> *Joseph Allemandou*
> Data Engineer @ Wikimedia Foundation
> IRC: joal
>



-- 
*Joseph Allemandou*
Data Engineer @ Wikimedia Foundation
IRC: joal
_______________________________________________
Analytics mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/analytics