On Sun, Nov 30, 2008 at 12:58 PM, Thomas Dalton <[EMAIL PROTECTED]> wrote:
>> I saw this the other day as well and found it odd. While enwiki dumps
>> do take the longest, this does seem like an _incredibly_ long time for
>> "All pages with complete page edit history (.bz2)" to finish (May 2009).
>
> Do you know how many pages enwiki has and how much edit history they
> each have? It's a lot!
>
> I think the dumps work by starting with the last successful dump and
> just adding in anything that's changed, but because there haven't been
> any successful dumps of the whole of enwiki in a long time, it
> basically has to start from scratch, which is going to take a long
> time (and means it probably won't succeed - ie. we have a catch-22).
> It seems to me that (if my understanding of the problem is correct),
> the answer is to devote a more powerful computer to the dump for just
> this one so that we can get things moving again - I'm sure if we asked
> around someone could lend us a really powerful computer for a few
> weeks to do the dump on.
No, the dumps are total, not incremental. And it will take more than throwing a big computer at the problem: the dumping process ought to be redesigned to be more fault tolerant and faster. It is ridiculous to have a process that is expected to run for months and yet has no way of saving its progress as it goes and restarting from that point when something fails.

-Robert Rohde
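
To make the "saving its progress" point concrete, here is a minimal sketch of per-page checkpointing in Python. This is not the actual MediaWiki dump code: CHECKPOINT_FILE, all_page_ids, and dump_page_history are hypothetical stand-ins. The idea is simply that a crash should cost at most one page's worth of work, not months of it.

import os

CHECKPOINT_FILE = "dump.checkpoint"  # hypothetical progress file

def load_checkpoint():
    """Return the last successfully dumped page ID, or 0 on a fresh start."""
    if os.path.exists(CHECKPOINT_FILE):
        with open(CHECKPOINT_FILE) as f:
            return int(f.read().strip())
    return 0

def save_checkpoint(page_id):
    """Record progress via an atomic rename so a crash cannot corrupt it."""
    tmp = CHECKPOINT_FILE + ".tmp"
    with open(tmp, "w") as f:
        f.write(str(page_id))
    os.replace(tmp, CHECKPOINT_FILE)  # atomic on POSIX filesystems

def run_dump(all_page_ids, dump_page_history):
    """Dump every page's history, resuming after the last checkpoint.

    Assumes all_page_ids is sorted ascending and that dump_page_history
    is a hypothetical function writing one page's full edit history.
    """
    last_done = load_checkpoint()
    for page_id in all_page_ids:
        if page_id <= last_done:
            continue  # already dumped in a previous run
        dump_page_history(page_id)
        save_checkpoint(page_id)

With this shape, a failed run is simply restarted with the same command: the loop skips everything at or below the recorded page ID, so the dump picks up where it left off instead of starting over from scratch.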
