On Thu, Feb 26, 2009 at 10:53 AM, Aryeh Gregor <[email protected]> wrote:
> On Thu, Feb 26, 2009 at 9:49 AM, Anthony <[email protected]> wrote:
>> Accepted by whom? Co-lo a box on the Internet, and ask the Foundation for
>> permission to create the dump. A single thread downloading articles to a
>> single server isn't going to impact the project. It probably wouldn't even
>> be *noticed*.
>
> It would also take even longer than the Wikimedia dumps.

What's your estimate of how long it's going to take to get the next full
history English Wikipedia dump?

> Say 500ms average to request a single revision.

Why say that? 500ms is a long time. Besides, through the API, you can get
multiple revisions at once.

> Multiply by 250,000,000 revisions.

You'd only need to get revisions since the last successful dump - maybe
150,000,000.

> I get a figure of four years. You're going to need a lot
> more than a single thread to get a remotely recent dump. You probably
> couldn't even keep up with the rate of new revision creation with a
> single thread blocking on each HTTP request.

I just got 50 revisions of [[Georgia]] in 6.389 seconds using the API and my
slow internet connection. Even at that rate, all the revisions since the last
dump could be downloaded in seven months, which is much less than the time
since the last successful full-history dump. Ongoing, it would take about 7
hours to download a new day's edits, but more realistically a live feed could
be set up at that point.

And this is a worst-case scenario. It assumes the WMF doesn't help *at all*
aside from allowing a single thread to access its servers.

_______________________________________________
foundation-l mailing list
[email protected]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
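The back-of-envelope arithmetic in the exchange above can be checked with a short script. This is only a sketch: the 50-revisions-in-6.389-seconds measurement, the 250M/150M revision counts, and the 500 ms figure all come from the thread itself, while the ~200,000 edits/day rate is an assumed round number for 2009-era English Wikipedia, not something the message states.

```python
# Sanity check of the dump-download estimates from the thread.
# The measured rate came from a single MediaWiki API request, e.g.:
#   /w/api.php?action=query&prop=revisions&titles=Georgia&rvlimit=50
#     &rvprop=timestamp|user|content

SECONDS_PER_DAY = 86_400

# Aryeh's estimate: 500 ms per revision, across all 250M revisions.
pessimistic_seconds = 250_000_000 * 0.5
pessimistic_years = pessimistic_seconds / (SECONDS_PER_DAY * 365)

# Anthony's measurement: 50 revisions per 6.389 s API request.
rate = 50 / 6.389  # revisions per second, roughly 7.8

# Catching up on the ~150M revisions since the last successful dump.
catchup_seconds = 150_000_000 / rate
catchup_months = catchup_seconds / (SECONDS_PER_DAY * 30)

# Keeping up with one day of new edits (~200,000/day is an assumption
# on my part, not a figure from the message).
daily_hours = 200_000 / rate / 3600

print(round(pessimistic_years, 1))  # 4.0 -- the "four years" figure
print(round(catchup_months, 1))     # 7.4 -- the "seven months" claim
print(round(daily_hours, 1))        # 7.1 -- the "7 hours" per day claim
```

Both sides' numbers hold up: one thread at 500 ms/revision really does take about four years for the full history, and one thread batching 50 revisions per API call really does clear the backlog in roughly seven months.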
