Re: [Foundation-l] Old Wikipedia backups discovered
phoebe ayers wrote: It's not news but AFAIK an actual image of the flag used is missing. So if that turns up, that would be cool :) But I think it was already gone by Feb. 2001. -- phoebe Isn't it the first piece of http://meta.wikimedia.org/wiki/File:Terribly_wrong.png ? ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
Larry didn't have an exaggerated role; he really did run the project in the early days.
On Tue, Dec 14, 2010 at 7:50 PM, Tim Starling tstarl...@wikimedia.org wrote:
On 15/12/10 11:17, Brian J Mingus wrote:
Browsing through the earliest revisions in the revision index (http://grey.colorado.edu/wikipedia_2001/revisions.html) is rather interesting and full of fodder for founder debates. Consider these very early revisions:
[http://www.nupedia.com Nupedia.com] is an open content, international, peer reviewed project run by LarrySanger, who got the idea of supplementing NuPedia with a less formal wiki encyclopedia project. - http://grey.colorado.edu/wikipedia_2001/979694938.txt
EditorInChief of NuPedia and instigator of Nupedia's wiki. http://grey.colorado.edu/wikipedia_2001/979690096.txt
Sanger's claims to coming up with the idea of adding the wiki concept to the online encyclopedia concept clearly go all the way back to the beginning. Of course, that doesn't speak to offline conversations that gave rise to the idea.
I've long suspected that the early FAQs and history pages gave Larry Sanger an exaggerated role because he wrote them himself. It will be interesting to see if any such conclusion can be drawn from the archives. Note that 979694938 was by dhcp058.246.lvcm.com, which appears to be Larry. By the way, the numbers in the revisions, e.g. 979694938, are UNIX timestamps. That one was 17 Jan 2001, 01:28:58 UTC. -- Tim Starling
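Tim's remark that the revision numbers are UNIX timestamps is easy to check. A minimal sketch using only Python's standard library; the helper name `revision_time` is made up for illustration and is not part of any tool mentioned in the thread:

```python
from datetime import datetime, timezone

def revision_time(revision_id: int) -> datetime:
    """Interpret a revision number as seconds since the UNIX epoch, in UTC."""
    return datetime.fromtimestamp(revision_id, tz=timezone.utc)

# The two revision numbers quoted in the message above.
for rev in (979690096, 979694938):
    print(rev, revision_time(rev).isoformat())
# 979694938 prints as 2001-01-17T01:28:58+00:00, matching the time Tim gives.
```

Passing `tz=timezone.utc` explicitly matters here: without it, `fromtimestamp` returns local time, and the result would differ by the reader's UTC offset.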
Re: [Foundation-l] Old Wikipedia backups discovered
Good news from Wiki-research-l in case you're not subscribed to it... Nemo
Original Message
Subject: Re: [Wiki-research-l] [WikiEN-l] Old Wikipedia backups discovered
Date: Thu, 16 Dec 2010 13:53:14 -0500
From: Joseph Reagle
I have the first 10K edits up reconstructed in their various pages at: http://cyber.law.harvard.edu/~reagle/wp-redux/
Original Message
Subject: Re: [Wiki-research-l] [WikiEN-l] Old Wikipedia backups discovered
Date: Fri, 17 Dec 2010 00:03:00 +1100
From: Tim Starling
On 16/12/10 23:10, Joseph Reagle wrote:
On Wednesday, December 15, 2010, Tim Starling wrote:
There were some changes made to the page text that weren't represented in diff_log, specifically changing certain camel-case links to free links.
It appears my problems were related to some CR/LF issues not round-tripping between diff and patch, but I hope to be able to address that. And yes, in addition to some of the CamelCase issues, I expect another problem is that if a page is blanked, "Describe the new page here." will reappear outside of the diff_log.
I don't think that will be a problem. But there are other problems that I've encountered. UseMod had a deletion feature. It turns out to be easy enough to skip deleted pages, since they don't have a corresponding entry in rclog. It also had an admin-only rename feature, which optionally fixed links in all pages. This accounts for the free link changes I was seeing earlier. And it had a link replacement feature which could be invoked without a page move. These features were rarely used, due to the arcane interface; usually people just moved pages by copying and pasting. But during the free-link conversion, a lot of pages were renamed using the admin-only feature. All these admin-only features were unlogged, but it turns out to be possible to reconstruct page moves, because when a page was moved, its name was updated in rclog but not in diff_log. By finding the first diff_log entry with the new name, you can roughly work out when the page moves were done.
Anyway, I'm developing a script which will import the dump into a modified MediaWiki instance, the idea being that I can then export XML from it. Once it works, I'll upload the XML to somewhere. I'm not sure when that will be. -- Tim Starling
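The page-move heuristic Tim describes can be sketched roughly as follows. This is only an illustration under stated assumptions, not his import script: both logs are assumed to have been parsed already into (page name, timestamp) pairs, and the function names and sample data are invented.

```python
def first_seen(entries):
    """Map each page name to the earliest timestamp at which it appears."""
    earliest = {}
    for name, ts in entries:
        if name not in earliest or ts < earliest[name]:
            earliest[name] = ts
    return earliest

def estimate_moves(rclog_entries, diff_log_entries):
    """Return {new name: approximate move time} for pages that look renamed.

    Assumption: a rename rewrote the page name in rclog but not in diff_log,
    so a renamed page shows up in rclog under its new name earlier than that
    name first appears in diff_log.
    """
    first_rc = first_seen(rclog_entries)
    first_diff = first_seen(diff_log_entries)
    return {
        name: first_diff[name]
        for name, ts in first_rc.items()
        if name in first_diff and ts < first_diff[name]
    }

# Invented sample data: (page name, UNIX-style timestamp).
diff_log = [("GeorgeWashington", 100), ("GeorgeWashington", 200),
            ("George Washington", 300), ("HomePage", 50)]
rclog = [("George Washington", 100), ("George Washington", 200),
         ("George Washington", 300), ("HomePage", 50)]

print(estimate_moves(rclog, diff_log))  # {'George Washington': 300}
```

The reported time is an upper bound rather than the exact moment of the move: the rename happened some time before the first diff_log entry recorded under the new name.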
Re: [Foundation-l] Old Wikipedia backups discovered
Brian J Mingus, 15/12/2010 01:36: Here is an interesting bit of history - the Wikipedia logo was first an American flag. Then Scott Moonen suggested we make it a globe: No news, this is already on Meta: http://meta.wikimedia.org/wiki/Logo_history http://meta.wikimedia.org/wiki/OldWikiPediaLogo Nemo
Re: [Foundation-l] Old Wikipedia backups discovered
True to FT2's vision, this story has already been picked up by the major media! http://www.examiner.com/wiki-edits-in-national/original-copy-of-wikipedia-discovered Original copy of Wikipedia discovered December 14, 2010 - by Gregory Kohs, for Examiner.com
Re: [Foundation-l] Old Wikipedia backups discovered
On Wed, Dec 15, 2010 at 12:39 PM, ResearchBiz research...@gmail.com wrote: True to FT2's vision, this story has already been picked up by the major media! http://www.examiner.com/wiki-edits-in-national/original-copy-of-wikipedia-discovered Original copy of Wikipedia discovered December 14, 2010 - by Gregory Kohs, for Examiner.com
"Major media" might be overstating your reach just a little bit, Greg.
Re: [Foundation-l] Old Wikipedia backups discovered
On 15 December 2010 17:39, ResearchBiz research...@gmail.com wrote: True to FT2's vision, this story has already been picked up by the major media! http://www.examiner.com/[spam url snipped] examiner.com is basically a paid blogging host with the only relation to media being a news-site-like skin. http://en.wikipedia.org/wiki/Examiner.com#Pay_scale Basically, the pay is 0.5-1c per click. I suggest any links to examiner.com on this list be treated as spam. - d.
Re: [Foundation-l] Old Wikipedia backups discovered
ViswaPrabha (വിശ്വപ്രഭ), 15/12/2010 01:03: And here is the first http://wikipedia.com archive link available at web archive. http://web.archive.org/web/20010727112808/http://www.wikipedia.org/
No, the first is http://web.archive.org/web/20010331173908/http://www.wikipedia.com/
Tim Starling, 15/12/2010 00:30: You may find this interesting: http://web.archive.org/web/20030318055654/http://nupedia.com/pipermail/interpret-l.mbox/interpret-l.mbox
Uh, didn't know anything about it.
http://web.archive.org/web/20020817032335/www.nupedia.com/pipermail/intlwiki-l.mbox/intlwiki-l.mbox
Isn't intlwiki-l completely archived on gmane? http://blog.gmane.org/gmane.science.linguistics.wikipedia.international If not, we could import this mbox. Nemo
Re: [Foundation-l] Old Wikipedia backups discovered
Is the current CC license retroactive to all of the old versions from the beginning to now? W
Re: [Foundation-l] Old Wikipedia backups discovered
That's fantastic news, and just in time for the 10th anniversary too, when I'm sure the early days of Wikipedia will be in the limelight. Great find Tim! Would it be at all possible to import these into the current system? I know someone was importing edits from the Nostalgia wiki. It would be wonderful to finally have a complete article history. Pete / the wub On 14 December 2010 15:54, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! This is exciting, because there is lots of article history in here which was assumed to be lost forever. I've long been interested in Wikipedia's history, and I've tried in the past to locate such backups. I asked various people who might have had one. I had given up hope. The history of particularly old Wikipedia articles, as seen in the present Wikipedia database, is incomplete, due to Usemod's policy of deleting old revisions of pages after about a month. The script which Brion wrote to import the article histories from UseMod to MediaWiki only fetched those revisions which hadn't been purged yet. I didn't want to believe that those revisions had been lost forever, and I even opened the UseMod source code and stared forlornly at the unlink() call. What I (and Brion before) missed is that UseMod appends a record of every change made to two files, called diff_log and rclog. In these two files is a record of every change made to Wikipedia from January 15 to August 17, 2001. I've put the two log files up on the web, at: http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z The 7-zip archive is only 8.4MB -- much more manageable than today's backups. rclog contains IP addresses. 
The Usemod software made IP addresses of logged-in users public, so the people who made these edits had no expectation that their IP address would be kept private. That, coupled with the passage of time, makes me think that no harm to user privacy can come from releasing these files. -- Tim Starling
Re: [Foundation-l] Old Wikipedia backups discovered
On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! This is exciting, because there is lots of article history in here which was assumed to be lost forever. I've long been interested in Wikipedia's history, and I've tried in the past to locate such backups. I asked various people who might have had one. I had given up hope. The history of particularly old Wikipedia articles, as seen in the present Wikipedia database, is incomplete, due to Usemod's policy of deleting old revisions of pages after about a month. The script which Brion wrote to import the article histories from UseMod to MediaWiki only fetched those revisions which hadn't been purged yet. I didn't want to believe that those revisions had been lost forever, and I even opened the UseMod source code and stared forlornly at the unlink() call. What I (and Brion before) missed is that UseMod appends a record of every change made to two files, called diff_log and rclog. In these two files is a record of every change made to Wikipedia from January 15 to August 17, 2001. I've put the two log files up on the web, at: http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z The 7-zip archive is only 8.4MB -- much more manageable than today's backups. rclog contains IP addresses. The Usemod software made IP addresses of logged-in users public, so the people who made these edits had no expectation that their IP address would be kept private. That, coupled with the passage of time, makes me think that no harm to user privacy can come from releasing these files. -- Tim Starling I have to say this is super cool. It's like digging up a time capsule right before the 10th anniversary. One of my favorite early edits: This is the new WikiPedia! 
The idea here is to write a complete encyclopedia from scratch, without peer review process, etc. Some people think that this may be a hopeless endeavor, that the result will necessarily suck. We aren't so sure. So, let's get to work! -Chad
Re: [Foundation-l] Old Wikipedia backups discovered
Tim, wonderful news! Thank you for making them publicly available! Of course I immediately downloaded them, and I must have a look at them later this week. Though they are from before I became active (2003), I am very curious whether the articles in these files still exist, and how much they have changed. teun spaans
On Tue, Dec 14, 2010 at 4:54 PM, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! This is exciting, because there is lots of article history in here which was assumed to be lost forever. I've long been interested in Wikipedia's history, and I've tried in the past to locate such backups. I asked various people who might have had one. I had given up hope. The history of particularly old Wikipedia articles, as seen in the present Wikipedia database, is incomplete, due to Usemod's policy of deleting old revisions of pages after about a month. The script which Brion wrote to import the article histories from UseMod to MediaWiki only fetched those revisions which hadn't been purged yet. I didn't want to believe that those revisions had been lost forever, and I even opened the UseMod source code and stared forlornly at the unlink() call. What I (and Brion before) missed is that UseMod appends a record of every change made to two files, called diff_log and rclog. In these two files is a record of every change made to Wikipedia from January 15 to August 17, 2001. I've put the two log files up on the web, at: http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z The 7-zip archive is only 8.4MB -- much more manageable than today's backups. rclog contains IP addresses. The Usemod software made IP addresses of logged-in users public, so the people who made these edits had no expectation that their IP address would be kept private. That, coupled with the passage of time, makes me think that no harm to user privacy can come from releasing these files. -- Tim Starling
Re: [Foundation-l] Old Wikipedia backups discovered
Great news indeed! Now I can finally figure out when my first edit was :-) Magnus On Tue, Dec 14, 2010 at 3:54 PM, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! This is exciting, because there is lots of article history in here which was assumed to be lost forever. I've long been interested in Wikipedia's history, and I've tried in the past to locate such backups. I asked various people who might have had one. I had given up hope. The history of particularly old Wikipedia articles, as seen in the present Wikipedia database, is incomplete, due to Usemod's policy of deleting old revisions of pages after about a month. The script which Brion wrote to import the article histories from UseMod to MediaWiki only fetched those revisions which hadn't been purged yet. I didn't want to believe that those revisions had been lost forever, and I even opened the UseMod source code and stared forlornly at the unlink() call. What I (and Brion before) missed is that UseMod appends a record of every change made to two files, called diff_log and rclog. In these two files is a record of every change made to Wikipedia from January 15 to August 17, 2001. I've put the two log files up on the web, at: http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z The 7-zip archive is only 8.4MB -- much more manageable than today's backups. rclog contains IP addresses. The Usemod software made IP addresses of logged-in users public, so the people who made these edits had no expectation that their IP address would be kept private. That, coupled with the passage of time, makes me think that no harm to user privacy can come from releasing these files. 
-- Tim Starling
Re: [Foundation-l] Old Wikipedia backups discovered
On 12/14/2010 7:54 AM, Tim Starling wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! I guess producing database dumps was easier in those days. Seriously though, this is absolutely fantastic news! --Michael Snow
Re: [Foundation-l] Old Wikipedia backups discovered
This is fantastic, and the timing could not be better. If anyone finds anything noteworthy, please add it to the timeline of Wikipedia that we're building at the 10th anniversary wiki,[1] as well as the other tools for cataloging interesting tidbits from our history.[2] 1. http://ten.wikipedia.org/wiki/Wikipedia_timeline 2. http://ten.wikipedia.org/wiki/Share On Tue, Dec 14, 2010 at 8:11 AM, Chad innocentkil...@gmail.com wrote: On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! This is exciting, because there is lots of article history in here which was assumed to be lost forever. I've long been interested in Wikipedia's history, and I've tried in the past to locate such backups. I asked various people who might have had one. I had given up hope. The history of particularly old Wikipedia articles, as seen in the present Wikipedia database, is incomplete, due to Usemod's policy of deleting old revisions of pages after about a month. The script which Brion wrote to import the article histories from UseMod to MediaWiki only fetched those revisions which hadn't been purged yet. I didn't want to believe that those revisions had been lost forever, and I even opened the UseMod source code and stared forlornly at the unlink() call. What I (and Brion before) missed is that UseMod appends a record of every change made to two files, called diff_log and rclog. In these two files is a record of every change made to Wikipedia from January 15 to August 17, 2001. I've put the two log files up on the web, at: http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z The 7-zip archive is only 8.4MB -- much more manageable than today's backups. rclog contains IP addresses. 
The Usemod software made IP addresses of logged-in users public, so the people who made these edits had no expectation that their IP address would be kept private. That, coupled with the passage of time, makes me think that no harm to user privacy can come from releasing these files. -- Tim Starling I have to say this is super cool. It's like digging up a time capsule right before the 10th anniversary. One of my favorite early edits: This is the new WikiPedia! The idea here is to write a complete encyclopedia from scratch, without peer review process, etc. Some people think that this may be a hopeless endeavor, that the result will necessarily suck. We aren't so sure. So, let's get to work! -Chad
Re: [Foundation-l] Old Wikipedia backups discovered
On Tue, Dec 14, 2010 at 7:54 AM, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! This is exciting, because there is lots of article history in here which was assumed to be lost forever. I've long been interested in Wikipedia's history, and I've tried in the past to locate such backups. I asked various people who might have had one. I had given up hope. The history of particularly old Wikipedia articles, as seen in the present Wikipedia database, is incomplete, due to Usemod's policy of deleting old revisions of pages after about a month. The script which Brion wrote to import the article histories from UseMod to MediaWiki only fetched those revisions which hadn't been purged yet. I didn't want to believe that those revisions had been lost forever, and I even opened the UseMod source code and stared forlornly at the unlink() call. What I (and Brion before) missed is that UseMod appends a record of every change made to two files, called diff_log and rclog. In these two files is a record of every change made to Wikipedia from January 15 to August 17, 2001. I've put the two log files up on the web, at: http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z The 7-zip archive is only 8.4MB -- much more manageable than today's backups. rclog contains IP addresses. The Usemod software made IP addresses of logged-in users public, so the people who made these edits had no expectation that their IP address would be kept private. That, coupled with the passage of time, makes me think that no harm to user privacy can come from releasing these files. -- Tim Starling AWESOME. This is so cool. I've copied the research list too, since there's many Wikipedia historians that will be eager to see the older versions. 
I hope we can get them up in a browsable way, like nostalgia.wikipedia.org! -- phoebe
Re: [Foundation-l] Old Wikipedia backups discovered
This is definitely a tremendous asset leading up to our big bday in January. I hope we can extract and post some of the real gems. Thanks for the resourcefulness and the sharing, Tim. On Dec 14, 2010, at 10:04 AM, phoebe ayers wrote: On Tue, Dec 14, 2010 at 7:54 AM, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! This is exciting, because there is lots of article history in here which was assumed to be lost forever. I've long been interested in Wikipedia's history, and I've tried in the past to locate such backups. I asked various people who might have had one. I had given up hope. The history of particularly old Wikipedia articles, as seen in the present Wikipedia database, is incomplete, due to Usemod's policy of deleting old revisions of pages after about a month. The script which Brion wrote to import the article histories from UseMod to MediaWiki only fetched those revisions which hadn't been purged yet. I didn't want to believe that those revisions had been lost forever, and I even opened the UseMod source code and stared forlornly at the unlink() call. What I (and Brion before) missed is that UseMod appends a record of every change made to two files, called diff_log and rclog. In these two files is a record of every change made to Wikipedia from January 15 to August 17, 2001. I've put the two log files up on the web, at: http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z The 7-zip archive is only 8.4MB -- much more manageable than today's backups. rclog contains IP addresses. The Usemod software made IP addresses of logged-in users public, so the people who made these edits had no expectation that their IP address would be kept private. 
That, coupled with the passage of time, makes me think that no harm to user privacy can come from releasing these files. -- Tim Starling AWESOME. This is so cool. I've copied the research list too, since there's many Wikipedia historians that will be eager to see the older versions. I hope we can get them up in a browsable way, like nostalgia.wikipedia.org! -- phoebe -- Jay Walsh Head of Communications WikimediaFoundation.org blog.wikimedia.org +1 (415) 839 6885 x 609, @jansonw
Re: [Foundation-l] Old Wikipedia backups discovered
On Tue, Dec 14, 2010 at 7:54 AM, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! This is exciting, because there is lots of article history in here which was assumed to be lost forever.
Wow, this is really, really amazing! I'm not sure just how you avoided having a heart attack after seeing this:
-- HomePage|979586833
1c1
< Describe the new page here.
---
> This is the new WikiPedia!
Great work! Rob
Re: [Foundation-l] Old Wikipedia backups discovered
This is so exciting! To Steven's point: we've also started a page where folks can add bits of interesting information as they excavate the files [1]. Can't wait to dig in! Congrats, Tim! [1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning Date: Tue, 14 Dec 2010 08:20:10 -0800 From: Steven Walling steven.wall...@gmail.com Subject: Re: [Foundation-l] Old Wikipedia backups discovered To: Wikimedia Foundation Mailing List foundation-l@lists.wikimedia.org Message-ID: aanlktin9cjxr1s_ecfr3nr6xmt6c4o=6ohdhtxp4j...@mail.gmail.com Content-Type: text/plain; charset=ISO-8859-1 This is fantastic, and the timing could not be better. If anyone finds anything noteworthy, please add it to the timeline of Wikipedia that we're building at the 10th anniversary wiki,[1] as well as the other tools for cataloging interesting tidbits from our history.[2] 1. http://ten.wikipedia.org/wiki/Wikipedia_timeline 2. http://ten.wikipedia.org/wiki/Share On Tue, Dec 14, 2010 at 8:11 AM, Chad innocentkil...@gmail.com wrote: On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! This is exciting, because there is lots of article history in here which was assumed to be lost forever. I've long been interested in Wikipedia's history, and I've tried in the past to locate such backups. I asked various people who might have had one. I had given up hope. The history of particularly old Wikipedia articles, as seen in the present Wikipedia database, is incomplete, due to Usemod's policy of deleting old revisions of pages after about a month. The script which Brion wrote to import the article histories from UseMod to MediaWiki only fetched those revisions which hadn't been purged yet. 
I didn't want to believe that those revisions had been lost forever, and I even opened the UseMod source code and stared forlornly at the unlink() call. What I (and Brion before) missed is that UseMod appends a record of every change made to two files, called diff_log and rclog. In these two files is a record of every change made to Wikipedia from January 15 to August 17, 2001. I've put the two log files up on the web, at: http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z The 7-zip archive is only 8.4MB -- much more manageable than today's backups. rclog contains IP addresses. The Usemod software made IP addresses of logged-in users public, so the people who made these edits had no expectation that their IP address would be kept private. That, coupled with the passage of time, makes me think that no harm to user privacy can come from releasing these files. -- Tim Starling I have to say this is super cool. It's like digging up a time capsule right before the 10th anniversary. One of my favorite early edits: This is the new WikiPedia! The idea here is to write a complete encyclopedia from scratch, without peer review process, etc. Some people think that this may be a hopeless endeavor, that the result will necessarily suck. We aren't so sure. So, let's get to work! -Chad
Re: [Foundation-l] Old Wikipedia backups discovered
In a message dated 12/14/2010 8:21:09 AM Pacific Standard Time, steven.wall...@gmail.com writes: This is fantastic, and the timing could not be better. If anyone finds anything noteworthy, please add it to the timeline of Wikipedia that we're building at the 10th anniversary wiki,[1] as well as the other tools for cataloging interesting tidbits from our history.[2] 1. http://ten.wikipedia.org/wiki/Wikipedia_timeline 2. http://ten.wikipedia.org/wiki/Share Hmm I wonder if some things can be added there (sound of feathers ruffling) Btw how does one *open* this tarball thing (on Windows) ?
Re: [Foundation-l] Old Wikipedia backups discovered
FYI, there is an existing timeline at: http://meta.wikimedia.org/wiki/Wikipedia_timeline And lots of other wikipedia history pages on English, too. :) Phoebe On Tue, Dec 14, 2010 at 10:23 AM, Moka Pantages mpanta...@wikimedia.org wrote: This is so exciting! To Steven's point: we've also started a page where folks can add bits of interesting information as they excavate the files [1]. Can't wait to dig in! Congrats, Tim! [1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning Date: Tue, 14 Dec 2010 08:20:10 -0800 From: Steven Walling steven.wall...@gmail.com Subject: Re: [Foundation-l] Old Wikipedia backups discovered To: Wikimedia Foundation Mailing List foundation-l@lists.wikimedia.org Message-ID: aanlktin9cjxr1s_ecfr3nr6xmt6c4o=6ohdhtxp4j...@mail.gmail.com Content-Type: text/plain; charset=ISO-8859-1 This is fantastic, and the timing could not be better. If anyone finds anything noteworthy, please add it to the timeline of Wikipedia that we're building at the 10th anniversary wiki,[1] as well as the other tools for cataloging interesting tidbits from our history.[2] 1. http://ten.wikipedia.org/wiki/Wikipedia_timeline 2. http://ten.wikipedia.org/wiki/Share On Tue, Dec 14, 2010 at 8:11 AM, Chad innocentkil...@gmail.com wrote: On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001!
Re: [Foundation-l] Old Wikipedia backups discovered
WinRAR's your best bet. Other archivers may be equally good. FT2 On Tue, Dec 14, 2010 at 5:53 PM, wjhon...@aol.com wrote: In a message dated 12/14/2010 8:21:09 AM Pacific Standard Time, steven.wall...@gmail.com writes: This is fantastic, and the timing could not be better. If anyone finds anything noteworthy, please add it to the timeline of Wikipedia that we're building at the 10th anniversary wiki,[1] as well as the other tools for cataloging interesting tidbits from our history.[2] 1. http://ten.wikipedia.org/wiki/Wikipedia_timeline 2. http://ten.wikipedia.org/wiki/Share Hmm I wonder if some things can be added there (sound of feathers ruffling) Btw how does one *open* this tarball thing (on Windows) ? ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
Would prefer on its own wiki as this is comprehensive up to a given date. Maybe January2001.wikipedia.org -- immediate impact. (DNS software cannot handle 2001.wikipedia.org) FT2 On Tue, Dec 14, 2010 at 6:04 PM, phoebe ayers phoebe.w...@gmail.com wrote: On Tue, Dec 14, 2010 at 7:54 AM, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! This is exciting, because there is lots of article history in here which was assumed to be lost forever. I've long been interested in Wikipedia's history, and I've tried in the past to locate such backups. I asked various people who might have had one. I had given up hope. The history of particularly old Wikipedia articles, as seen in the present Wikipedia database, is incomplete, due to Usemod's policy of deleting old revisions of pages after about a month. The script which Brion wrote to import the article histories from UseMod to MediaWiki only fetched those revisions which hadn't been purged yet. I didn't want to believe that those revisions had been lost forever, and I even opened the UseMod source code and stared forlornly at the unlink() call. What I (and Brion before) missed is that UseMod appends a record of every change made to two files, called diff_log and rclog. In these two files is a record of every change made to Wikipedia from January 15 to August 17, 2001. I've put the two log files up on the web, at: http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z The 7-zip archive is only 8.4MB -- much more manageable than today's backups. rclog contains IP addresses. The Usemod software made IP addresses of logged-in users public, so the people who made these edits had no expectation that their IP address would be kept private. 
That, coupled with the passage of time, makes me think that no harm to user privacy can come from releasing these files. -- Tim Starling AWESOME. This is so cool. I've copied the research list too, since there's many Wikipedia historians that will be eager to see the older versions. I hope we can get them up in a browsable way, like nostalgia.wikipedia.org ! -- phoebe ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
See the "See also" section etc. in [[History of Wikipedia]]. FT2 On Tue, Dec 14, 2010 at 7:27 PM, phoebe ayers phoebe.w...@gmail.com wrote: FYI, there is an existing timeline at: http://meta.wikimedia.org/wiki/Wikipedia_timeline And lots of other wikipedia history pages on English, too. :) Phoebe On Tue, Dec 14, 2010 at 10:23 AM, Moka Pantages mpanta...@wikimedia.org wrote: This is so exciting! To Steven's point: we've also started a page where folks can add bits of interesting information as they excavate the files [1]. Can't wait to dig in! Congrats, Tim! [1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning Date: Tue, 14 Dec 2010 08:20:10 -0800 From: Steven Walling steven.wall...@gmail.com Subject: Re: [Foundation-l] Old Wikipedia backups discovered To: Wikimedia Foundation Mailing List foundation-l@lists.wikimedia.org Message-ID: aanlktin9cjxr1s_ecfr3nr6xmt6c4o=6ohdhtxp4j...@mail.gmail.com Content-Type: text/plain; charset=ISO-8859-1 This is fantastic, and the timing could not be better. If anyone finds anything noteworthy, please add it to the timeline of Wikipedia that we're building at the 10th anniversary wiki,[1] as well as the other tools for cataloging interesting tidbits from our history.[2] 1. http://ten.wikipedia.org/wiki/Wikipedia_timeline 2. http://ten.wikipedia.org/wiki/Share On Tue, Dec 14, 2010 at 8:11 AM, Chad innocentkil...@gmail.com wrote: On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
On Tue, Dec 14, 2010 at 12:53 PM, wjhon...@aol.com wrote: Btw how does one *open* this tarball thing (on Windows) ? I'm a fan of http://www.7-zip.org/ -- James Alexander jameso...@gmail.com ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
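For anyone who would rather script the extraction than install an archiver, here is a minimal sketch using only Python's standard library. It assumes the file is an ordinary gzip-compressed tarball such as the wiki.tar.gz mentioned in this thread; the function name extract_tarball and the destination directory are made up for illustration. It does not handle the .7z log archive, which needs a separate tool such as 7-Zip.

```python
import tarfile
from pathlib import Path


def extract_tarball(archive, dest):
    """Extract a gzip-compressed tarball into dest and return the member names.

    Only use this on archives you trust: extractall() writes whatever
    paths the archive contains.
    """
    # Create the destination directory if it does not exist yet.
    Path(dest).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, mode="r:gz") as tar:
        names = tar.getnames()      # list the contents before unpacking
        tar.extractall(path=dest)   # unpack everything under dest
    return names
```

A call such as extract_tarball("wiki.tar.gz", "wikipedia_2001_backup") would unpack the archive into a new folder next to it (the folder name is only an example). This works the same way on Windows as on other systems, since the tarfile module does not depend on an external tar program.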
Re: [Foundation-l] Old Wikipedia backups discovered
Right in time! And the rightly early version too! Kudos to the diggers and bashers! On Tue, Dec 14, 2010 at 21:23, Moka Pantages mpanta...@wikimedia.orgwrote: This is so exciting! To Steven's point: we've also started a page where folks can add bits of interesting information as they excavate the files [1]. Can't wait to dig in! Congrats, Tim! [1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning Date: Tue, 14 Dec 2010 08:20:10 -0800 From: Steven Walling steven.wall...@gmail.com Subject: Re: [Foundation-l] Old Wikipedia backups discovered To: Wikimedia Foundation Mailing List foundation-l@lists.wikimedia.org Message-ID: aanlktin9cjxr1s_ecfr3nr6xmt6c4o=6ohdhtxp4j...@mail.gmail.com Content-Type: text/plain; charset=ISO-8859-1 This is fantastic, and the timing could not be better. If anyone finds anything noteworthy, please add it to the timeline of Wikipedia that we're building at the 10th anniversary wiki,[1] as well as the other tools for cataloging interesting tidbits from our history.[2] 1. http://ten.wikipedia.org/wiki/Wikipedia_timeline 2. http://ten.wikipedia.org/wiki/Share On Tue, Dec 14, 2010 at 8:11 AM, Chad innocentkil...@gmail.com wrote: On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! This is exciting, because there is lots of article history in here which was assumed to be lost forever. I've long been interested in Wikipedia's history, and I've tried in the past to locate such backups. I asked various people who might have had one. I had given up hope. The history of particularly old Wikipedia articles, as seen in the present Wikipedia database, is incomplete, due to Usemod's policy of deleting old revisions of pages after about a month. 
The script which Brion wrote to import the article histories from UseMod to MediaWiki only fetched those revisions which hadn't been purged yet. I didn't want to believe that those revisions had been lost forever, and I even opened the UseMod source code and stared forlornly at the unlink() call. What I (and Brion before) missed is that UseMod appends a record of every change made to two files, called diff_log and rclog. In these two files is a record of every change made to Wikipedia from January 15 to August 17, 2001. I've put the two log files up on the web, at: http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z The 7-zip archive is only 8.4MB -- much more manageable than today's backups. rclog contains IP addresses. The UseMod software made IP addresses of logged-in users public, so the people who made these edits had no expectation that their IP address would be kept private. That, coupled with the passage of time, makes me think that no harm to user privacy can come from releasing these files. -- Tim Starling I have to say this is super cool. It's like digging up a time capsule right before the 10th anniversary. One of my favorite early edits: This is the new WikiPedia! The idea here is to write a complete encyclopedia from scratch, without peer review process, etc. Some people think that this may be a hopeless endeavor, that the result will necessarily suck. We aren't so sure. So, let's get to work! -Chad ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
On 14.12.2010 16:54, Tim Starling wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! That's wonderful news. Is this for enWP only or were all languages in one database back then? Ciao Henning ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
On Tue, Dec 14, 2010 at 8:36 PM, Henning Schlottmann h.schlottm...@gmx.net wrote: On 14.12.2010 16:54, Tim Starling wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! That's wonderful news. Is this for enWP only or were all languages in one database back then? There was only English back in the day... ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
Here are a couple of quick indexes into the dump file. I didn't venture into the binary revision data. You'll find an alphabetized list of articles that contains all the diffs for each article in the order that they occurred in the dump and a sorted index into each revision as well. http://grey.colorado.edu/wikipedia_2001/ Given that it's finals I don't even have enough time to dig through this at all. Guess I just wanted a distraction =) - Brian On Tue, Dec 14, 2010 at 12:27 PM, phoebe ayers phoebe.w...@gmail.com wrote: FYI, there is an existing timeline at: http://meta.wikimedia.org/wiki/Wikipedia_timeline And lots of other wikipedia history pages on English, too. :) Phoebe On Tue, Dec 14, 2010 at 10:23 AM, Moka Pantages mpanta...@wikimedia.org wrote: This is so exciting! To Steven's point: we've also started a page where folks can add bits of interesting information as they excavate the files [1]. Can't wait to dig in! Congrats, Tim! [1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning Date: Tue, 14 Dec 2010 08:20:10 -0800 From: Steven Walling steven.wall...@gmail.com Subject: Re: [Foundation-l] Old Wikipedia backups discovered To: Wikimedia Foundation Mailing List foundation-l@lists.wikimedia.org Message-ID: aanlktin9cjxr1s_ecfr3nr6xmt6c4o=6ohdhtxp4j...@mail.gmail.com Content-Type: text/plain; charset=ISO-8859-1 This is fantastic, and the timing could not be better. If anyone finds anything noteworthy, please add it to the timeline of Wikipedia that we're building at the 10th anniversary wiki,[1] as well as the other tools for cataloging interesting tidbits from our history.[2] 1. http://ten.wikipedia.org/wiki/Wikipedia_timeline 2. http://ten.wikipedia.org/wiki/Share On Tue, Dec 14, 2010 at 8:11 AM, Chad innocentkil...@gmail.com wrote: On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. 
I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
Hi Magnus, On 14.12.2010 22:35, Magnus Manske wrote: On Tue, Dec 14, 2010 at 8:36 PM, Henning Schlottmann h.schlottm...@gmx.net wrote: On 14.12.2010 16:54, Tim Starling wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! That's wonderful news. Is this for enWP only or were all languages in one database back then? There was only English back in the day... Not true. The first other languages were introduced on March 15 and could be part of this archive if the different Wikipedias were in one database under UseMod. Do you remember how this worked? Ciao Henning ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
On Tue, Dec 14, 2010 at 9:49 PM, Henning Schlottmann h.schlottm...@gmx.net wrote: Hi Magnus, On 14.12.2010 22:35, Magnus Manske wrote: On Tue, Dec 14, 2010 at 8:36 PM, Henning Schlottmann h.schlottm...@gmx.net wrote: On 14.12.2010 16:54, Tim Starling wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! That's wonderful news. Is this for enWP only or were all languages in one database back then? There was only English back in the day... Not true. The first other languages were introduced on March 15 and could be part of this archive if the different Wikipedias were in one database under UseMod. My earliest recorded entry in de.wikipedia dates from September 2001 (and I have a low two-digit user ID, which was created upon the switch to MediaWiki), so there seem to be some versions missing indeed. Do you know the oldest preserved edit on de.wp? Do you remember how this worked? AFAIR, every language had its own UseMod setup. My import script only took the last version; Brion later wrote one that filled in the previous ones from the stored diffs. Magnus ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
On 14.12.2010 23:47, Magnus Manske wrote: On Tue, Dec 14, 2010 at 9:49 PM, Henning Schlottmann wrote: Not true. The first other languages were introduced on March 15 and could be part of this archive if the different Wikipedias were in one database under UseMod. My earliest recorded entry in de.wikipedia dates from September 2001 (and I have a low two-digit user ID, which was created upon the switch to MediaWiki), so there seem to be some versions missing indeed. Do you know the oldest preserved edit on de.wp? Local lore claims it is your edit http://de.wikipedia.org/w/index.php?title=Polymerase-Kettenreaktion&oldid=2613 in Polymerase-Kettenreaktion. But I never checked that. Do you remember how this worked? AFAIR, every language had its own UseMod setup. My import script only took the last version; Brion later wrote one that filled in the previous ones from the stored diffs. That's unfortunate but only a small dent in the wonderful news that Wikipedia has its very first (English) edits back. Ciao Henning ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
On 15/12/10 07:36, Henning Schlottmann wrote: On 14.12.2010 16:54, Tim Starling wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! That's wonderful news. Is this for enWP only or were all languages in one database back then? Just English, unfortunately. You may find this interesting: http://web.archive.org/web/20030318055654/http://nupedia.com/pipermail/interpret-l.mbox/interpret-l.mbox http://web.archive.org/web/20020817032335/www.nupedia.com/pipermail/intlwiki-l.mbox/intlwiki-l.mbox -- Tim Starling ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
I hope some of you may have seen/discussed these pages (as well as the connected pages): http://web.archive.org/web/20010418152404/www.nupedia.com/ up to http://web.archive.org/web/20030730075209/http://www.nupedia.org/ Of course, the domain name then was nupedia.org. -vp On Wed, Dec 15, 2010 at 02:30, Tim Starling tstarl...@wikimedia.org wrote: On 15/12/10 07:36, Henning Schlottmann wrote: On 14.12.2010 16:54, Tim Starling wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! That's wonderful news. Is this for enWP only or were all languages in one database back then? Just English, unfortunately. You may find this interesting: http://web.archive.org/web/20030318055654/http://nupedia.com/pipermail/interpret-l.mbox/interpret-l.mbox http://web.archive.org/web/20020817032335/www.nupedia.com/pipermail/intlwiki-l.mbox/intlwiki-l.mbox -- Tim Starling ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
And here is the first http://wikipedia.com archive link available at the Web Archive: http://web.archive.org/web/20010727112808/http://www.wikipedia.org/ 2010/12/15 ViswaPrabha (വിശ്വപ്രഭ) vp2...@gmail.com I hope some of you may have seen/discussed these pages (as well as the connected pages): http://web.archive.org/web/20010418152404/www.nupedia.com/ up to http://web.archive.org/web/20030730075209/http://www.nupedia.org/ Of course, the domain name then was nupedia.org. -vp On Wed, Dec 15, 2010 at 02:30, Tim Starling tstarl...@wikimedia.org wrote: On 15/12/10 07:36, Henning Schlottmann wrote: On 14.12.2010 16:54, Tim Starling wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! That's wonderful news. Is this for enWP only or were all languages in one database back then? Just English, unfortunately. You may find this interesting: http://web.archive.org/web/20030318055654/http://nupedia.com/pipermail/interpret-l.mbox/interpret-l.mbox http://web.archive.org/web/20020817032335/www.nupedia.com/pipermail/intlwiki-l.mbox/intlwiki-l.mbox -- Tim Starling ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
Browsing through the earliest revisions in the revision index (http://grey.colorado.edu/wikipedia_2001/revisions.html) is rather interesting and full of fodder for founder debates. Consider these very early revisions: [http://www.nupedia.com Nupedia.com] is an open content, international, peer reviewed project run by LarrySanger, who got the idea of supplementing NuPedia with a less formal wiki encyclopedia project. - http://grey.colorado.edu/wikipedia_2001/979694938.txt EditorInChief of NuPedia and instigator of Nupedia's wiki. http://grey.colorado.edu/wikipedia_2001/979690096.txt Sanger's claims to have come up with the idea of adding the wiki concept to the online encyclopedia concept clearly go all the way back to the beginning. Of course, that doesn't speak to offline conversations that gave rise to the idea. And Sanger clearly didn't have much faith in the concept: None of this is to say that the Nupedia wiki will ''replace'' the main encyclopedia; of course it won't. But it will be an interesting ancillary endeavor! http://grey.colorado.edu/wikipedia_2001/979695982.txt - Brian On Tue, Dec 14, 2010 at 2:41 PM, Brian brian.min...@colorado.edu wrote: Here are a couple of quick indexes into the dump file. I didn't venture into the binary revision data. You'll find an alphabetized list of articles that contains all the diffs for each article in the order that they occurred in the dump and a sorted index into each revision as well. http://grey.colorado.edu/wikipedia_2001/ Given that it's finals I don't even have enough time to dig through this at all. Guess I just wanted a distraction =) - Brian On Tue, Dec 14, 2010 at 12:27 PM, phoebe ayers phoebe.w...@gmail.com wrote: FYI, there is an existing timeline at: http://meta.wikimedia.org/wiki/Wikipedia_timeline And lots of other wikipedia history pages on English, too. :) Phoebe On Tue, Dec 14, 2010 at 10:23 AM, Moka Pantages mpanta...@wikimedia.org wrote: This is so exciting! 
To Steven's point: we've also started a page where folks can add bits of interesting information as they excavate the files [1]. Can't wait to dig in! Congrats, Tim! [1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning Date: Tue, 14 Dec 2010 08:20:10 -0800 From: Steven Walling steven.wall...@gmail.com Subject: Re: [Foundation-l] Old Wikipedia backups discovered To: Wikimedia Foundation Mailing List foundation-l@lists.wikimedia.org Message-ID: aanlktin9cjxr1s_ecfr3nr6xmt6c4o=6ohdhtxp4j...@mail.gmail.com Content-Type: text/plain; charset=ISO-8859-1 This is fantastic, and the timing could not be better. If anyone finds anything noteworthy, please add it to the timeline of Wikipedia that we're building at the 10th anniversary wiki,[1] as well as the other tools for cataloging interesting tidbits from our history.[2] 1. http://ten.wikipedia.org/wiki/Wikipedia_timeline 2. http://ten.wikipedia.org/wiki/Share On Tue, Dec 14, 2010 at 8:11 AM, Chad innocentkil...@gmail.com wrote: On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
Here is an interesting bit of history - the Wikipedia logo was first an American flag. Then Scott Moonen suggested we make it a globe: In its first day of existences, because the nearest thing to hand for JimmyWales that was suitable for a logo was an American flag, WikiPedia had the American flag, OldGlory, for a logo. ScottMoonen sensibly suggested: I'd recommend you change the American flag logo. Exremely ethno-centric ''et. al.'' I think a globe logo would be much more fitting, if you want to keep with that metaphor. Or perhaps a book. http://grey.colorado.edu/wikipedia_2001/979773872.txt - Brian On Tue, Dec 14, 2010 at 5:17 PM, Brian brian.min...@colorado.edu wrote: Browsing through the earliest revisions in the revision index ( http://grey.colorado.edu/wikipedia_2001/revisions.html) is rather interesting and full of fodder for founder debates. Consider these very early revisions: [http://www.nupedia.com Nupedia.com] is an open content, international, peer reviewed project run by LarrySanger, who got the idea of supplementing NuPedia with a less formal wiki encyclopedia project. - http://grey.colorado.edu/wikipedia_2001/979694938.txt EditorInChief of NuPedia and instigator of Nupedia's wiki. http://grey.colorado.edu/wikipedia_2001/979690096.txt Sanger's claims to coming up with the idea of adding the wiki concept to the online encyclopedia concept clearly go all the way back to the beginning. Of course, that doesn't speak to offline conversations that gave rise to the idea. And Sanger clearly didn't have much faith in the concept: None of this is to say that the Nupedia wiki will ''replace'' the main encyclopedia; of course it won't. But it will be an interesting ancillary endeavor! http://grey.colorado.edu/wikipedia_2001/979695982.txt - Brian On Tue, Dec 14, 2010 at 2:41 PM, Brian brian.min...@colorado.edu wrote: Here are a couple of quick indexes into the dump file. I didn't venture into the binary revision data. 
You'll find an alphabetized list of articles that contains all the diffs for each article in the order that they occurred in the dump and a sorted index into each revision as well. http://grey.colorado.edu/wikipedia_2001/ Given that it's finals I don't even have enough time to dig through this at all. Guess I just wanted a distraction =) - Brian On Tue, Dec 14, 2010 at 12:27 PM, phoebe ayers phoebe.w...@gmail.com wrote: FYI, there is an existing timeline at: http://meta.wikimedia.org/wiki/Wikipedia_timeline And lots of other wikipedia history pages on English, too. :) Phoebe On Tue, Dec 14, 2010 at 10:23 AM, Moka Pantages mpanta...@wikimedia.org wrote: This is so exciting! To Steven's point: we've also started a page where folks can add bits of interesting information as they excavate the files [1]. Can't wait to dig in! Congrats, Tim! [1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning Date: Tue, 14 Dec 2010 08:20:10 -0800 From: Steven Walling steven.wall...@gmail.com Subject: Re: [Foundation-l] Old Wikipedia backups discovered To: Wikimedia Foundation Mailing List foundation-l@lists.wikimedia.org Message-ID: aanlktin9cjxr1s_ecfr3nr6xmt6c4o=6ohdhtxp4j...@mail.gmail.com Content-Type: text/plain; charset=ISO-8859-1 This is fantastic, and the timing could not be better. If anyone finds anything noteworthy, please add it to the timeline of Wikipedia that we're building at the 10th anniversary wiki,[1] as well as the other tools for cataloging interesting tidbits from our history.[2] 1. http://ten.wikipedia.org/wiki/Wikipedia_timeline 2. http://ten.wikipedia.org/wiki/Share On Tue, Dec 14, 2010 at 8:11 AM, Chad innocentkil...@gmail.com wrote: On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarl...@wikimedia.org wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! 
___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
Is there any database backup of Nupedia? Or were the articles posted as HTML pages? 2010/12/15 ViswaPrabha (വിശ്വപ്രഭ) vp2...@gmail.com And here is the first http://wikipedia.com archive link available at the Web Archive: http://web.archive.org/web/20010727112808/http://www.wikipedia.org/ 2010/12/15 ViswaPrabha (വിശ്വപ്രഭ) vp2...@gmail.com I hope some of you may have seen/discussed these pages (as well as the connected pages): http://web.archive.org/web/20010418152404/www.nupedia.com/ up to http://web.archive.org/web/20030730075209/http://www.nupedia.org/ Of course, the domain name then was nupedia.org. -vp On Wed, Dec 15, 2010 at 02:30, Tim Starling tstarl...@wikimedia.org wrote: On 15/12/10 07:36, Henning Schlottmann wrote: On 14.12.2010 16:54, Tim Starling wrote: I was looking through some old files in our SourceForge project. I opened a file called wiki.tar.gz, and inside were three complete backups of the text of Wikipedia, from February, March and August 2001! That's wonderful news. Is this for enWP only or were all languages in one database back then? Just English, unfortunately. You may find this interesting: http://web.archive.org/web/20030318055654/http://nupedia.com/pipermail/interpret-l.mbox/interpret-l.mbox http://web.archive.org/web/20020817032335/www.nupedia.com/pipermail/intlwiki-l.mbox/intlwiki-l.mbox -- Tim Starling ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Old Wikipedia backups discovered
On 15/12/10 11:17, Brian J Mingus wrote: Browsing through the earliest revisions in the revision index ( http://grey.colorado.edu/wikipedia_2001/revisions.html) is rather interesting and full of fodder for founder debates. Consider these very early revisions: [http://www.nupedia.com Nupedia.com] is an open content, international, peer reviewed project run by LarrySanger, who got the idea of supplementing NuPedia with a less formal wiki encyclopedia project. - http://grey.colorado.edu/wikipedia_2001/979694938.txt EditorInChief of NuPedia and instigator of Nupedia's wiki. http://grey.colorado.edu/wikipedia_2001/979690096.txt Sanger's claims to coming up with the idea of adding the wiki concept to the online encyclopedia concept clearly go all the way back to the beginning. Of course, that doesn't speak to offline conversations that gave rise to the idea. I've long suspected that the early FAQs and history pages gave Larry Sanger an exaggerated role because he wrote them himself. It will be interesting to see if any such conclusion can be drawn from the archives. Note that 979694938 was by dhcp058.246.lvcm.com, which appears to be Larry. By the way, the numbers in the revisions, e.g. 979694938, are UNIX timestamps. That one was 17 Jan 2001, 01:28:58 UTC. -- Tim Starling ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
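Since the revision numbers are UNIX timestamps, they can be turned into readable dates with a few lines of code. The following is only a sketch: the helper name revision_time is invented here, and the four numbers are simply the revision IDs quoted earlier in this thread.

```python
from datetime import datetime, timezone


def revision_time(revision_id):
    """Interpret a revision number as seconds since the Unix epoch, in UTC."""
    return datetime.fromtimestamp(revision_id, tz=timezone.utc)


# Revision numbers quoted in this thread.
for rev in (979690096, 979694938, 979695982, 979773872):
    print(rev, revision_time(rev).isoformat())

# 979694938 comes out as 2001-01-17T01:28:58+00:00, matching the date given above.
```

Passing tz=timezone.utc matters: without it, fromtimestamp() converts to the local time zone of whoever runs the script, and the results would not be comparable between readers.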