On Fri, Feb 13, 2015 at 10:11 PM, Dan Andreescu <[email protected]> wrote:

> Ah, John, sorry. That's a known problem with the dumps process. It's been
> taking longer and longer and is harder and harder to manage because of the
> increased size. We weren't even able to update our reportcard lately
> because the process is taking so long it doesn't leave Erik Z. the time to
> run his analysis. I have started talking to people privately about
> revamping the dumps process. We need it in Analytics for some very
> important work that Aaron Halfaker is doing on diff analysis, and folks like
> you need it for your work. From the start it's clear we need:
>
> * incremental dumps
> * fast access to them
> * reliable bandwidth or a cluster to explore on
>
> This is a million times easier said than done, but I'll keep making the case
> for it.
This epic in Phabricator <https://phabricator.wikimedia.org/T88728> would be
a great place to document desired use cases and user stories for the dumps
process. If we can find enough interest to justify this project, it could
make it onto the MediaWiki-Core team's priorities. Even if we can't find a
team to pick it up that soon, getting the problem better defined will make
it easier to pitch the project.

Bryan

--
Bryan Davis                         Wikimedia Foundation
[[m:User:BDavis_(WMF)]]             <[email protected]>
Sr Software Engineer                Boise, ID USA
irc: bd808                          v:415.839.6885 x6855

_______________________________________________
Labs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/labs-l
