Ah, John, sorry. That's a known problem with the dumps process. It's been taking longer and longer, and it's harder and harder to manage because of the increased size. We weren't even able to update our reportcard lately because the process takes so long that it leaves Erik Z. no time to run his analysis. I have started talking to people privately about revamping the dumps process. We need it in Analytics for some very important work that Aaron Halfaker is doing on diff analysis, and folks like you need it for your work. From the start it's clear we need:

* incremental dumps
* fast access to them
* reliable bandwidth or a cluster to explore on

This is a million times easier said than done, but I'll keep making the case for it.

On Fri, Feb 13, 2015 at 11:51 PM, John <[email protected]> wrote:

> I thought I included the link.... https://phabricator.wikimedia.org/T47646
> is for the two-year-old ticket. (That should make the context a little clearer.)
>
> Dan, the basic dumps from dumps.wikimedia.org are all that I need. If you
> take a look at the path I provided, the dumps for
>
> 20150112
> 20150204
> 20150205
>
> are all missing.
>
> On Fri, Feb 13, 2015 at 11:39 PM, Dan Andreescu <[email protected]>
> wrote:
>
>> Sorry to hear, John. While I'm not ops, is there anything I can help
>> with to get your immediate need filled? What would you do with the dump?
>> Is labsdb a good alternative or do you already have scripts? Do you use
>> http://dumps.wikimedia.org/ ? Are the dumps you need not there? I know
>> that site's experiencing some rate limiting but that's simply a budget
>> issue.
>>
>> I'm on the analytics team and one of my goals is to make datasets and raw
>> data publicly available, so I appreciate your perspective and I'm sorry in
>> advance if I can't help.
>>
>> On Fri, Feb 13, 2015 at 11:23 PM, John <[email protected]> wrote:
>>
>>> I am looking at a ticket filed almost two years ago for labs to support
>>> the -latest format that the toolserver had, and guess what? Zero progress
>>> has been made.
>>>
>>> This is getting sad. When labs was created it was supposed to be a
>>> replacement for and improvement on the toolserver, yet a basic feature of
>>> running tools on database dumps has yet to be implemented.
>>>
>>> So knowing that, I got a request to run a database scan today. I took a
>>> look at /public/dumps/public/enwiki to figure out the path to the most
>>> current dump. Guess what? We don't have it on labs. The most current dump
>>> for enwiki is from last year....
>>> /public/dumps/public/enwiki/20141208/
>>>
>>> Something needs to happen. Key, basic functionality of the toolserver is
>>> still missing. It's not rocket science, yet ops has consistently failed to
>>> provide needed functionality in this area, and filing tickets gets me
>>> nowhere. So the real question here is: why is this still an issue, and who
>>> do I need to call in order to get things resolved?

_______________________________________________ Labs-l mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/labs-l
