I would prefer this on its own wiki, as it is comprehensive up to a given date. Maybe January2001.wikipedia.org -- immediate impact.
(DNS software cannot handle 2001.wikipedia.org)

FT2

On Tue, Dec 14, 2010 at 6:04 PM, phoebe ayers <[email protected]> wrote:

> On Tue, Dec 14, 2010 at 7:54 AM, Tim Starling <[email protected]>
> wrote:
> > I was looking through some old files in our SourceForge project. I
> > opened a file called wiki.tar.gz, and inside were three complete
> > backups of the text of Wikipedia, from February, March and August 2001!
> >
> > This is exciting, because there is lots of article history in here
> > which was assumed to be lost forever.
> >
> > I've long been interested in Wikipedia's history, and I've tried in
> > the past to locate such backups. I asked various people who might have
> > had one. I had given up hope.
> >
> > The history of particularly old Wikipedia articles, as seen in the
> > present Wikipedia database, is incomplete, due to UseMod's policy of
> > deleting old revisions of pages after about a month. The script which
> > Brion wrote to import the article histories from UseMod to MediaWiki
> > only fetched those revisions which hadn't been purged yet.
> >
> > I didn't want to believe that those revisions had been lost forever,
> > and I even opened the UseMod source code and stared forlornly at the
> > unlink() call. What I (and Brion before) missed is that UseMod appends
> > a record of every change made to two files, called diff_log and rclog.
> > In these two files is a record of every change made to Wikipedia from
> > January 15 to August 17, 2001.
> >
> > I've put the two log files up on the web, at:
> >
> > http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z
> >
> > The 7-zip archive is only 8.4MB -- much more manageable than today's
> > backups.
> >
> > rclog contains IP addresses. The UseMod software made IP addresses of
> > logged-in users public, so the people who made these edits had no
> > expectation that their IP address would be kept private. That, coupled
> > with the passage of time, makes me think that no harm to user privacy
> > can come from releasing these files.
> >
> > -- Tim Starling
>
> AWESOME. This is so cool. I've copied the research list too, since
> there's many Wikipedia historians that will be eager to see the older
> versions.
>
> I hope we can get them up in a browsable way, like nostalgia.wikipedia.org!
>
> -- phoebe
>
> _______________________________________________
> foundation-l mailing list
> [email protected]
> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
