Re: [Wikitech-l] [Xmldatadumps-admin-l] 2010-03-11 01:10:08: enwiki Checksumming pages-meta-history.xml.bz2 :D

2010-04-08 Thread Anthony
I'd like to add that the md5 of the *uncompressed* file is cd4eee6d3d745ce716db2931c160ee35 . That's what I got from both the uncompressed 7z and the uncompressed bz2. They matched, whew. Uncompressing and md5ing the bz2 took well over a week. Uncompressing and md5ing the 7z took less than a
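A minimal sketch of the check described above, assuming Python's standard bz2 and hashlib modules: it decompresses a bz2 stream in chunks and hashes the output, so the uncompressed XML never needs to be written to disk. The function name and chunk size are illustrative and do not come from the thread.

```python
import bz2
import hashlib
import io


def md5_of_decompressed_bz2(source, chunk_size=1 << 20):
    """Return the hex MD5 of the data produced by decompressing a bz2 stream.

    `source` may be a file name or a binary file object (hypothetical helper,
    not something used in the original thread).
    """
    digest = hashlib.md5()
    with bz2.open(source, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break  # end of the decompressed data
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Small in-memory demonstration; a real dump would be passed by file name.
    payload = b"<page>example</page>" * 1000
    compressed = io.BytesIO(bz2.compress(payload))
    print(md5_of_decompressed_bz2(compressed))
```

The same digest should come out of any other archive of the same data (for example the 7z), which is the comparison made above.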

2010-04-08 Thread Anthony
On Thu, Apr 8, 2010 at 7:34 PM, Q overlo...@gmail.com wrote: On 4/8/2010 4:28 PM, Anthony wrote: I'd like to add that the md5 of the *uncompressed* file is cd4eee6d3d745ce716db2931c160ee35 . That's what I got from both the uncompressed 7z

2010-03-29 Thread Anthony
Got an md5sum? On Mon, Mar 29, 2010 at 5:46 PM, Tomasz Finc tf...@wikimedia.org wrote: I love lzma compression.
enwiki-20100130-pages-meta-history.xml.bz2 280.3 GB
enwiki-20100130-pages-meta-history.xml.7z 31.9 GB
Download at http://tinyurl.com/yeelbse Enjoy! --tomasz Tomasz Finc

2010-03-29 Thread Tomasz Finc
You can find all the md5sums at http://download.wikipedia.org/enwiki/20100130/enwiki-20100130-md5sums.txt --tomasz Anthony wrote: Got an md5sum? On Mon, Mar 29, 2010 at 5:46 PM, Tomasz Finc tf...@wikimedia.org wrote: I love lzma compression.
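A rough sketch of checking a downloaded file against such a listing. It assumes the md5sums file uses the usual md5sum layout (hex digest, whitespace, file name per line); that layout and all function names here are assumptions, not something stated in the thread.

```python
import hashlib


def parse_md5sums(text):
    """Map file name -> hex digest, assuming md5sum-style 'digest  name' lines."""
    sums = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        digest, name = parts
        # md5sum marks binary mode with a leading '*' on the file name
        sums[name.strip().lstrip("*")] = digest.lower()
    return sums


def file_md5(path, chunk_size=1 << 20):
    """Hash a file in chunks so very large dumps do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, name, listing_text):
    """True if the MD5 of `path` equals the digest listed for `name`."""
    expected = parse_md5sums(listing_text).get(name)
    return expected is not None and file_md5(path) == expected
```

Note that the listing covers the compressed files as published, so it would be compared against the downloaded .bz2 or .7z, not against the uncompressed XML.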

2010-03-19 Thread zh509
On Mar 19 2010, Platonides wrote: Zeyi wrote: Hi, firstly, congratulations on this! As far as I know, it has taken a long time. May I ask a small question: what is the difference between the current dump and the history dump? I know the current one only includes current edits, and the history one has all

2010-03-19 Thread Conrad Irwin
On 03/19/2010 11:02 AM, zh...@york.ac.uk wrote: What I mean is that if the current dump shows there are 30 edits under a particular article name, and the history dump shows there are 100 edits under the same article, what's different between these 30 and 100? The current dump shows 1 edit for
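One way to look into this question directly is to count the revision elements under each page in a dump. The sketch below assumes the usual MediaWiki export layout (page elements containing a title and one or more revision elements) and ignores the XML namespace; the function names are illustrative only.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def count_revisions(xml_stream):
    """Return {page title: number of <revision> elements} for a dump stream."""
    counts = {}
    title = None
    revisions = 0
    # iterparse streams the file, so the whole dump is never held in memory
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "revision":
            revisions += 1
            elem.clear()  # drop revision text once it has been counted
        elif name == "page":
            counts[title] = revisions
            title, revisions = None, 0
            elem.clear()
    return counts


if __name__ == "__main__":
    # Tiny made-up sample; a real dump would be opened as a (decompressed) file.
    sample = (
        b"<mediawiki><page><title>A</title>"
        b"<revision><id>1</id></revision><revision><id>2</id></revision>"
        b"</page></mediawiki>"
    )
    print(count_revisions(io.BytesIO(sample)))
```

Running the same count over a current dump and a history dump of the same wiki would show how the per-page revision counts differ between the two.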

2010-03-19 Thread zh509
On Mar 19 2010, Conrad Irwin wrote: On 03/19/2010 11:02 AM, zh...@york.ac.uk wrote: What I mean is that if the current dump shows there are 30 edits under a particular article name, and the history dump shows there are 100 edits under the same article, what's different between these 30 and

2010-03-18 Thread zh509
Hi, firstly, congratulations on this! As far as I know, it has taken a long time. May I ask a small question: what is the difference between the current dump and the history dump? I know the current one only includes current edits, and the history one has all edits, as the introduction said. More specifically, how

2010-03-18 Thread Platonides
Zeyi wrote: Hi, firstly, congratulations on this! As far as I know, it has taken a long time. May I ask a small question: what is the difference between the current dump and the history dump? I know the current one only includes current edits, and the history one has all edits, as the introduction said. You have

2010-03-17 Thread Jamie Morken
Date: Wed, 17 Mar 2010 15:15:24 +0100 From: Platonides platoni...@gmail.com Subject: Re: [Wikitech-l] [Xmldatadumps-admin-l] 2010-03-11 01:10:08: enwiki Checksumming pages-meta-history.xml.bz2 :D To: wikitech-l@lists.wikimedia.org Message-ID: hnqo49$it...@dough.gmane.org Content

2010-03-17 Thread Felipe Ortega
--- On Tue, 16/3/10, Kevin Webb kpw...@gmail.com wrote: From: Kevin Webb kpw...@gmail.com Subject: Re: [Xmldatadumps-admin-l] 2010-03-11 01:10:08: enwiki Checksumming pages-meta-history.xml.bz2 :D To: Tomasz Finc tf...@wikimedia.org CC: Wikimedia developers

2010-03-17 Thread Felipe Ortega
Let alone that, for some of us outside the USA (and even with a good connection to the EU research network), the download process takes, so to say, slightly more time than expected (and is prone to errors as the file gets larger). So another +1 to replace bzip with 7zip. F. --- On Tue, 16/3/10,

2010-03-16 Thread Tomasz Finc
Tomasz Finc wrote: New full history en wiki snapshot is hot off the presses! It's currently being checksummed which will take a while for 280GB+ of compressed data but for those brave souls willing to test please grab it from

2010-03-11 Thread Felipe Ortega
--- On Thu, 11/3/10, Tomasz Finc tf...@wikimedia.org wrote: From: Tomasz Finc tf...@wikimedia.org Subject: [Xmldatadumps-admin-l] 2010-03-11 01:10:08: enwiki Checksumming pages-meta-history.xml.bz2 :D To: Wikimedia developers wikitech-l@lists.wikimedia.org,

2010-03-11 Thread Platonides
Tomasz Finc wrote: Brian J Mingus wrote: On Wed, Mar 10, 2010 at 8:54 PM, Tomasz Finc tf...@wikimedia.org wrote: Yup, that's the one. If you have a fast upload pipe then I'm more than happy to set up space for it. Otherwise it should be arriving in our

2010-03-10 Thread Tomasz Finc
Thankfully due to an awesome volunteer we'll be able to get that 2008 snapshot in our archive. I'll mail out when it shows up in our snail mail. --tomasz Erik Zachte wrote: I'm thrilled. Big thanks to Tim and Tomasz for pulling this off. For the record the 2008-10-03 dump existed for a short

2010-03-10 Thread Tomasz Finc
Yup, that's the one. If you have a fast upload pipe then I'm more than happy to set up space for it. Otherwise it should be arriving in our snail mail after a couple of days. -tomasz Kevin Webb wrote: Many thanks to everyone involved. Also, in case it's of use to anyone I have a copy of the

2010-03-10 Thread Tomasz Finc
Brian J Mingus wrote: On Wed, Mar 10, 2010 at 8:54 PM, Tomasz Finc tf...@wikimedia.org wrote: Yup, that's the one. If you have a fast upload pipe then I'm more than happy to set up space for it. Otherwise it should be arriving in our snail mail after