Hi again folks,

The operation has finished successfully.
The `wmf.webrequest` hive table now contains data coming from HAProxy. We still have to rerun downstream jobs (mostly pageviews) for today's first hours so that we have a clean cut in the data at April 1st. This will be done in the next few hours and shouldn't disrupt your work.

*An important thing to note* is that, for cluster space reasons, we only have data since mid-March in the `wmf.webrequest` table. It'll build up to the 90-day retention over the next months. If you need older data, you can use the `wmf_deprecated.webrequest` hive table, which contains the 'old' Varnish data we keep in case something goes wrong with HAProxy. We will keep this table for at least a month.

Of course, if you spot something odd or have questions, please let us know :)

On Tue, Apr 1, 2025 at 10:25 AM Joseph Allemandou <[email protected]> wrote:

> Good morning data folks,
> This morning we migrate our webrequest dataset to feed from HAProxy
> instead of Varnish.
> The expected differences are documented in this google doc
> <https://docs.google.com/document/d/1cCSGzLUfVWUHjqG5v5VdLADsbzmMklczQ1YG7oghGl8/edit?tab=t.0#heading=h.501d0uw4oyze>,
> as well as the detailed analysis and the migration plan.
> We expect to be done in a few hours, in the meantime you may experience
> failing queries if you use the webrequest data, and new data will not be
> flowing until we are done.
> For those interested in following the operation, we'll post on slack in this
> thread
> <https://wikimedia.slack.com/archives/C05RHK7PS6Q/p1743494545802669>.
> We'll post here again at the end of the operation.
> Thank you for your understanding :)
>
> --
> Joseph Allemandou (joal) (he / him)
> Staff Data Engineer
> Wikimedia Foundation
>

--
Joseph Allemandou (joal) (he / him)
Staff Data Engineer
Wikimedia Foundation
_______________________________________________
Analytics mailing list -- [email protected]
To unsubscribe send an email to [email protected]
