Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-23 Thread Platonides
phoebe ayers wrote:
 It's not news but AFAIK an actual image of the flag used is missing.
 So if that turns up, that would be cool :) But I think it was already
 gone by Feb. 2001.
 
 -- phoebe

Isn't it the first piece of
http://meta.wikimedia.org/wiki/File:Terribly_wrong.png ?


___
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l


Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-21 Thread The Cunctator
Larry didn't have an exaggerated role, he really did run the project in the
early days.

On Tue, Dec 14, 2010 at 7:50 PM, Tim Starling tstarl...@wikimedia.org wrote:

 On 15/12/10 11:17, Brian J Mingus wrote:
  Browsing through the earliest revisions in the revision index (
  http://grey.colorado.edu/wikipedia_2001/revisions.html) is rather
  interesting and full of fodder for founder debates. Consider these very
  early revisions:
 
  [http://www.nupedia.com Nupedia.com] is an open content, international,
  peer reviewed project run by LarrySanger, who got the idea of
  supplementing NuPedia with a less formal wiki encyclopedia project.  -
  http://grey.colorado.edu/wikipedia_2001/979694938.txt
 
  EditorInChief of NuPedia and instigator of Nupedia's wiki. 
  http://grey.colorado.edu/wikipedia_2001/979690096.txt
 
  Sanger's claims to coming up with the idea of adding the wiki concept to
  the online encyclopedia concept clearly go all the way back to the
  beginning. Of course, that doesn't speak to offline conversations that
  gave rise to the idea.

 I've long suspected that the early FAQs and history pages gave Larry
 Sanger an exaggerated role because he wrote them himself. It will be
 interesting to see if any such conclusion can be drawn from the
 archives. Note that 979694938 was by dhcp058.246.lvcm.com, which
 appears to be Larry.

 By the way, the numbers in the revisions, e.g. 979694938, are UNIX
 timestamps. That one was 17 Jan 2001, 01:28:58 UTC.
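
The conversion can be checked with a short Python sketch. The function name revision_time is only illustrative; the revision number is the one quoted above, read as seconds since the UNIX epoch.

```python
from datetime import datetime, timezone


def revision_time(revision_id: int) -> datetime:
    """Interpret a revision number as seconds since the UNIX epoch, in UTC."""
    return datetime.fromtimestamp(revision_id, tz=timezone.utc)


# The revision discussed above.
print(revision_time(979694938).isoformat())
# → 2001-01-17T01:28:58+00:00
```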

 -- Tim Starling




Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-16 Thread Federico Leva (Nemo)
Good news from Wiki-research-l in case you're not subscribed to it...

Nemo

-------- Original Message --------
Subject: Re: [Wiki-research-l] [WikiEN-l] Old Wikipedia backups discovered
Date: Thu, 16 Dec 2010 13:53:14 -0500
From: Joseph Reagle

I have the first 10K edits up reconstructed in their various pages at:
   http://cyber.law.harvard.edu/~reagle/wp-redux/

-------- Original Message --------
Subject: Re: [Wiki-research-l] [WikiEN-l] Old Wikipedia backups discovered
Date: Fri, 17 Dec 2010 00:03:00 +1100
From: Tim Starling

On 16/12/10 23:10, Joseph Reagle wrote:
  On Wednesday, December 15, 2010, Tim Starling wrote:
  There were some changes made to the page text that weren't represented
  in diff_log, specifically changing certain camel-case links to free
  links.
  It appears my problems were related to some CR/LF issues not
  round-tripping between diff and patch, but I hope to be able to address
  that. And yes, in addition to some of the CamelCase issues, I expect
  another problem is that if a page is blanked, "Describe the new page
  here." will reappear outside of the diff_log.
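
One possible way to sidestep the CR/LF round-tripping problem, sketched here in Python, is to normalize line endings before diffs are applied or compared. This is only an assumption about what could help, not a description of what either script does; normalize_newlines is a hypothetical helper.

```python
def normalize_newlines(text: str) -> str:
    """Convert CR/LF and bare CR line endings to LF."""
    # Replace CR/LF pairs first so they do not turn into two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


print(repr(normalize_newlines("first line\r\nsecond line\r")))
# → 'first line\nsecond line\n'
```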

I don't think that will be a problem. But there are other problems
that I've encountered.

UseMod had a deletion feature. It turns out to be easy enough to skip
deleted pages, since they don't have a corresponding entry in rclog.
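
A rough sketch of that filter in Python, assuming both logs have already been parsed into (timestamp, page_name) tuples and that an edit carries the same timestamp in both files. Matching on the timestamp rather than the name is an assumption made here because page names can differ between the two logs. skip_deleted is a hypothetical helper, not part of UseMod.

```python
def skip_deleted(diff_entries, rc_entries):
    """Keep only diff_log entries that have a matching rclog entry.

    Each entry is a (timestamp, page_name) tuple. Entries are matched on
    the timestamp alone (an assumption, see above).
    """
    logged = {ts for ts, _name in rc_entries}
    return [(ts, name) for ts, name in diff_entries if ts in logged]
```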

It also had an admin-only rename feature, which optionally fixed links
in all pages. This accounts for the free link changes I was seeing
earlier. And it had a link replacement feature which could be invoked
without a page move. These features were rarely used, due to the
arcane interface; usually people just moved pages by copying and
pasting. But during the free-link conversion, a lot of pages were
renamed using the admin-only feature.

All these admin-only features were unlogged, but it turns out to be
possible to reconstruct page moves, because when a page was moved, its
name was updated in rclog but not in diff_log. By finding the first
diff_log entry with the new name, you can roughly work out when the
page moves were done.
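
The idea can be sketched in Python as follows. It assumes both logs are already parsed into (timestamp, page_name) tuples and that the same edit has the same timestamp in both files; reconstruct_moves is a hypothetical name, and this is not the import script itself.

```python
def reconstruct_moves(diff_entries, rc_entries):
    """Map (old_name, new_name) to the approximate time of the move.

    A move is inferred when the same timestamp is recorded under one name
    in diff_log and under a different name in rclog. The move time is
    estimated as the first diff_log entry under the new name, or None if
    the page was never edited after the move.
    """
    entries = sorted(diff_entries)
    rc_name_at = {ts: name for ts, name in rc_entries}

    # Earliest diff_log timestamp seen for each page name.
    first_seen = {}
    for ts, name in entries:
        first_seen.setdefault(name, ts)

    moves = {}
    for ts, old_name in entries:
        new_name = rc_name_at.get(ts)
        if new_name is not None and new_name != old_name:
            moves[(old_name, new_name)] = first_seen.get(new_name)
    return moves
```

Entries with no rclog match at all are simply ignored here, which is consistent with skipping deleted pages.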

Anyway, I'm developing a script which will import the dump into a
modified MediaWiki instance, the idea being that I can then export XML
from it. Once it works, I'll upload the XML to somewhere. I'm not sure
when that will be.

-- Tim Starling



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-15 Thread Federico Leva (Nemo)
Brian J Mingus, 15/12/2010 01:36:
 Here is an interesting bit of history - the Wikipedia logo was first an
 American flag. Then Scott Moonen suggested we make it a globe:

No news, this is already on Meta:
http://meta.wikimedia.org/wiki/Logo_history
http://meta.wikimedia.org/wiki/OldWikiPediaLogo

Nemo



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-15 Thread ResearchBiz
True to FT2's vision, this story has already been picked up by the major
media!

http://www.examiner.com/wiki-edits-in-national/original-copy-of-wikipedia-discovered

Original copy of Wikipedia discovered
December 14, 2010
- by Gregory Kohs, for Examiner.com


Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-15 Thread Nathan
On Wed, Dec 15, 2010 at 12:39 PM, ResearchBiz research...@gmail.com wrote:
 True to FT2's vision, this story has already been picked up by the major
 media!

 http://www.examiner.com/wiki-edits-in-national/original-copy-of-wikipedia-discovered

 Original copy of Wikipedia discovered
 December 14, 2010
 - by Gregory Kohs, for Examiner.com


"Major media" might be overstating your reach just a little bit, Greg.



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-15 Thread David Gerard
On 15 December 2010 17:39, ResearchBiz research...@gmail.com wrote:

 True to FT2's vision, this story has already been picked up by the major
 media!
 http://www.examiner.com/[spam url snipped]


examiner.com is basically a paid blogging host with the only relation
to media being a news-site-like skin.

http://en.wikipedia.org/wiki/Examiner.com#Pay_scale

Basically, the pay is 0.5-1c per click.

I suggest any links to examiner.com on this list be treated as spam.


- d.



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-15 Thread Federico Leva (Nemo)
ViswaPrabha (വിശ്വപ്രഭ), 15/12/2010 01:03:
 And here is the first http://wikipedia.com archive link available at web
 archive.
 
 http://web.archive.org/web/20010727112808/http://www.wikipedia.org/

No, the first is 
http://web.archive.org/web/20010331173908/http://www.wikipedia.com/

Tim Starling, 15/12/2010 00:30:
  You may find this interesting:
 
  http://web.archive.org/web/20030318055654/http://nupedia.com/pipermail/interpret-l.mbox/interpret-l.mbox

Uh, didn't know anything about it.

 
  
  http://web.archive.org/web/20020817032335/www.nupedia.com/pipermail/intlwiki-l.mbox/intlwiki-l.mbox

Isn't intlwiki-l completely archived on gmane? 
http://blog.gmane.org/gmane.science.linguistics.wikipedia.international
If not, we could import this mbox.

Nemo



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-15 Thread WJhonson
Is the current CC license retroactive to all of the old versions from the 
beginning to now?

W


Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Peter Coombe
That's fantastic news, and just in time for the 10th anniversary too,
when I'm sure the early days of Wikipedia will be in the limelight.
Great find Tim!

Would it be at all possible to import these into the current system? I
know someone was importing edits from the Nostalgia wiki. It would be
wonderful to finally have a complete article history.

Pete / the wub


On 14 December 2010 15:54, Tim Starling tstarl...@wikimedia.org wrote:
 I was looking through some old files in our SourceForge project. I
 opened a file called wiki.tar.gz, and inside were three complete
 backups of the text of Wikipedia, from February, March and August 2001!

 This is exciting, because there is lots of article history in here
 which was assumed to be lost forever.

 I've long been interested in Wikipedia's history, and I've tried in
 the past to locate such backups. I asked various people who might have
 had one. I had given up hope.

 The history of particularly old Wikipedia articles, as seen in the
 present Wikipedia database, is incomplete, due to UseMod's policy of
 deleting old revisions of pages after about a month. The script which
 Brion wrote to import the article histories from UseMod to MediaWiki
 only fetched those revisions which hadn't been purged yet.

 I didn't want to believe that those revisions had been lost forever,
 and I even opened the UseMod source code and stared forlornly at the
 unlink() call. What I (and Brion before) missed is that UseMod appends
 a record of every change made to two files, called diff_log and rclog.
 In these two files is a record of every change made to Wikipedia from
 January 15 to August 17, 2001.

 I've put the two log files up on the web, at:

 http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z

 The 7-zip archive is only 8.4MB -- much more manageable than today's
 backups.

 rclog contains IP addresses. The UseMod software made IP addresses of
 logged-in users public, so the people who made these edits had no
 expectation that their IP address would be kept private. That, coupled
 with the passage of time, makes me think that no harm to user privacy
 can come from releasing these files.

 -- Tim Starling



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Chad
On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarl...@wikimedia.org wrote:
 I was looking through some old files in our SourceForge project. I
 opened a file called wiki.tar.gz, and inside were three complete
 backups of the text of Wikipedia, from February, March and August 2001!

I have to say this is super cool. It's like digging up a time capsule
right before the 10th anniversary. One of my favorite early edits:

This is the new WikiPedia!  The idea here is to write a complete
encyclopedia from scratch, without peer review process, etc.
Some people think that this may be a hopeless endeavor, that
the result will necessarily suck.  We aren't so sure.  So, let's get
to work!

-Chad



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread teun spaans
Tim,

wonderful news!
Thank you for making them publicly available!

Of course I immediately downloaded them, and I must have a look at them
later this week. Though they are from before I became active (2003) I am
very curious if the articles in these files still exist, and how much they
changed.

teun spaans






Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Magnus Manske
Great news indeed!

Now I can finally figure out when my first edit was :-)

Magnus





Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Michael Snow
On 12/14/2010 7:54 AM, Tim Starling wrote:
 I was looking through some old files in our SourceForge project. I
 opened a file called wiki.tar.gz, and inside were three complete
 backups of the text of Wikipedia, from February, March and August 2001!
I guess producing database dumps was easier in those days. Seriously 
though, this is absolutely fantastic news!

--Michael Snow



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Steven Walling
This is fantastic, and the timing could not be better.

If anyone finds anything noteworthy, please add it to the timeline of
Wikipedia that we're building at the 10th anniversary wiki,[1] as well as
the other tools for cataloging interesting tidbits from our history.[2]

1. http://ten.wikipedia.org/wiki/Wikipedia_timeline
2. http://ten.wikipedia.org/wiki/Share



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread phoebe ayers
On Tue, Dec 14, 2010 at 7:54 AM, Tim Starling tstarl...@wikimedia.org wrote:
 I was looking through some old files in our SourceForge project. I
 opened a file called wiki.tar.gz, and inside were three complete
 backups of the text of Wikipedia, from February, March and August 2001!

AWESOME. This is so cool. I've copied the research list too, since
there's many Wikipedia historians that will be eager to see the older
versions.

I hope we can get them up in a browsable way, like nostalgia.wikipedia.org!

-- phoebe



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Jay Walsh
This is definitely a tremendous asset leading up to our big bday in January. I 
hope we can extract and post some of the real gems.  

Thanks for the resourcefulness and the sharing, Tim.

-- 
Jay Walsh
Head of Communications
WikimediaFoundation.org
blog.wikimedia.org
+1 (415) 839 6885 x 609, @jansonw




Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Rob Lanphier
On Tue, Dec 14, 2010 at 7:54 AM, Tim Starling tstarl...@wikimedia.org wrote:
 I was looking through some old files in our SourceForge project. I
 opened a file called wiki.tar.gz, and inside were three complete
 backups of the text of Wikipedia, from February, March and August 2001!

 This is exciting, because there is lots of article history in here
 which was assumed to be lost forever.

Wow, this is really, really amazing!  I'm not sure just how you
avoided having a heart attack after seeing this:
 --
 HomePage|979586833
 1c1
 < Describe the new page here.
 ---
 > This is the new WikiPedia!
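
An entry like the one above could be replayed with something along these lines. This is a minimal Python sketch, assuming diff_log holds default-format diff output (hunk lines such as 1c1, with "<" and ">" line markers) under a PageName|timestamp header. parse_header and apply_normal_diff are illustrative names, not UseMod or MediaWiki functions, and "\ No newline at end of file" markers are not handled.

```python
import re

# Hunk commands in default diff output: 1c1, 3a4,5, 2,4d1 and so on.
HUNK = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def parse_header(line):
    """Split a 'PageName|timestamp' header into (name, timestamp)."""
    name, _, stamp = line.rpartition("|")
    return name, int(stamp)


def apply_normal_diff(old_lines, diff_lines):
    """Apply a default-format diff to a list of lines; return the new lines."""
    result = []
    pos = 0  # index of the next old line not yet copied
    i = 0
    while i < len(diff_lines):
        match = HUNK.match(diff_lines[i])
        if match is None:
            raise ValueError("unexpected diff line: %r" % diff_lines[i])
        start = int(match.group(1))
        end = int(match.group(2) or start)
        op = match.group(3)
        i += 1
        if op == "a":
            # New lines go after old line number `start`.
            result.extend(old_lines[pos:start])
            pos = start
        else:
            # Change or delete: old lines start..end are dropped.
            result.extend(old_lines[pos:start - 1])
            pos = end
            while i < len(diff_lines) and diff_lines[i].startswith("<"):
                i += 1
            if op == "c" and i < len(diff_lines) and diff_lines[i] == "---":
                i += 1
        # Lines marked ">" are the replacement or added text.
        while i < len(diff_lines) and diff_lines[i].startswith(">"):
            result.append(diff_lines[i][2:])
            i += 1
    result.extend(old_lines[pos:])
    return result


name, stamp = parse_header("HomePage|979586833")
page = apply_normal_diff(
    ["Describe the new page here."],
    ["1c1", "< Describe the new page here.", "---", "> This is the new WikiPedia!"],
)
print(name, stamp, page)
# → HomePage 979586833 ['This is the new WikiPedia!']
```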

Great work!

Rob



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Moka Pantages
This is so exciting!  To Steven's point: we've also started a page
where folks can add bits of interesting information as they excavate
the files [1].   Can't wait to dig in!

Congrats, Tim!

[1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning




Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread WJhonson
In a message dated 12/14/2010 8:21:09 AM Pacific Standard Time, 
steven.wall...@gmail.com writes:


 This is fantastic, and the timing could not be better.
 
 If anyone finds anything noteworthy, please add it to the timeline of
 Wikipedia that we're building at the 10th anniversary wiki,[1] as well as
 the other tools for cataloging interesting tidbits from our history.[2]
 
 1. http://ten.wikipedia.org/wiki/Wikipedia_timeline
 2. http://ten.wikipedia.org/wiki/Share
 

Hmm I wonder if some things can be added there (sound of feathers 
ruffling)

Btw how does one *open* this tarball thing (on Windows) ?


Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread phoebe ayers
FYI, there is an existing timeline at:

http://meta.wikimedia.org/wiki/Wikipedia_timeline

And lots of other wikipedia history pages on English, too.

:)
Phoebe

On Tue, Dec 14, 2010 at 10:23 AM, Moka Pantages mpanta...@wikimedia.org wrote:
 This is so exciting!  To Steven's point: we've also started a page
 where folks can add bits of interesting information as they excavate
 the files [1].   Can't wait to dig in!

 Congrats, Tim!

 [1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning


 Date: Tue, 14 Dec 2010 08:20:10 -0800
 From: Steven Walling steven.wall...@gmail.com
 Subject: Re: [Foundation-l] Old Wikipedia backups discovered
 To: Wikimedia Foundation Mailing List
       foundation-l@lists.wikimedia.org
 Message-ID:
       aanlktin9cjxr1s_ecfr3nr6xmt6c4o=6ohdhtxp4j...@mail.gmail.com
 Content-Type: text/plain; charset=ISO-8859-1

 This is fantastic, and the timing could not be better.

 If anyone finds anything noteworthy, please add it to the timeline of
 Wikipedia that we're building at the 10th anniversary wiki,[1] as well as
 the other tools for cataloging interesting tidbits from our history.[2]

 1. http://ten.wikipedia.org/wiki/Wikipedia_timeline
 2. http://ten.wikipedia.org/wiki/Share

 On Tue, Dec 14, 2010 at 8:11 AM, Chad innocentkil...@gmail.com wrote:

 On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarl...@wikimedia.org
 wrote:
  I was looking through some old files in our SourceForge project. I
  opened a file called wiki.tar.gz, and inside were three complete
  backups of the text of Wikipedia, from February, March and August 2001!



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread FT2
Winrar's your best bet. Other archivers may be equally good.

FT2

On Tue, Dec 14, 2010 at 5:53 PM, wjhon...@aol.com wrote:

 In a message dated 12/14/2010 8:21:09 AM Pacific Standard Time,
 steven.wall...@gmail.com writes:


  This is fantastic, and the timing could not be better.
 
  If anyone finds anything noteworthy, please add it to the timeline of
  Wikipedia that we're building at the 10th anniversary wiki,[1] as well as
  the other tools for cataloging interesting tidbits from our history.[2]
 
  1. http://ten.wikipedia.org/wiki/Wikipedia_timeline
  2. http://ten.wikipedia.org/wiki/Share
 

 Hmm I wonder if some things can be added there (sound of feathers
 ruffling)

 Btw how does one *open* this tarball thing (on Windows) ?


Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread FT2
Would prefer on its own wiki as this is comprehensive up to a given date.
Maybe January2001.wikipedia.org -- immediate impact.

(DNS software cannot handle 2001.wikipedia.org)

FT2

On Tue, Dec 14, 2010 at 6:04 PM, phoebe ayers phoebe.w...@gmail.com wrote:

  On Tue, Dec 14, 2010 at 7:54 AM, Tim Starling tstarl...@wikimedia.org
 wrote:
  I was looking through some old files in our SourceForge project. I
  opened a file called wiki.tar.gz, and inside were three complete
  backups of the text of Wikipedia, from February, March and August 2001!
 
  This is exciting, because there is lots of article history in here
  which was assumed to be lost forever.
 
  I've long been interested in Wikipedia's history, and I've tried in
  the past to locate such backups. I asked various people who might have
  had one. I had given up hope.
 
  The history of particularly old Wikipedia articles, as seen in the
  present Wikipedia database, is incomplete, due to Usemod's policy of
  deleting old revisions of pages after about a month. The script which
  Brion wrote to import the article histories from UseMod to MediaWiki
  only fetched those revisions which hadn't been purged yet.
 
  I didn't want to believe that those revisions had been lost forever,
  and I even opened the UseMod source code and stared forlornly at the
  unlink() call. What I (and Brion before) missed is that UseMod appends
  a record of every change made to two files, called diff_log and rclog.
  In these two files is a record of every change made to Wikipedia from
  January 15 to August 17, 2001.
 
  I've put the two log files up on the web, at:
 
  http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z
 
  The 7-zip archive is only 8.4MB -- much more manageable than today's
  backups.
 
  rclog contains IP addresses. The Usemod software made IP addresses of
  logged-in users public, so the people who made these edits had no
  expectation that their IP address would be kept private. That, coupled
  with the passage of time, makes me think that no harm to user privacy
  can come from releasing these files.
 
  -- Tim Starling

 AWESOME. This is so cool. I've copied the research list too, since
 there's many Wikipedia historians that will be eager to see the older
 versions.

 I hope we can get them up in a browsable way, like nostalgia.wikipedia.org
 !

 -- phoebe



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread FT2
See the "See also" section etc. in [[History of Wikipedia]].

FT2

On Tue, Dec 14, 2010 at 7:27 PM, phoebe ayers phoebe.w...@gmail.com wrote:

 FYI, there is an existing timeline at:

 http://meta.wikimedia.org/wiki/Wikipedia_timeline

 And lots of other wikipedia history pages on English, too.

 :)
 Phoebe

 On Tue, Dec 14, 2010 at 10:23 AM, Moka Pantages mpanta...@wikimedia.org
 wrote:
  This is so exciting!  To Steven's point: we've also started a page
  where folks can add bits of interesting information as they excavate
  the files [1].   Can't wait to dig in!
 
  Congrats, Tim!
 
  [1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning
 
 
  Date: Tue, 14 Dec 2010 08:20:10 -0800
  From: Steven Walling steven.wall...@gmail.com
  Subject: Re: [Foundation-l] Old Wikipedia backups discovered
  To: Wikimedia Foundation Mailing List
foundation-l@lists.wikimedia.org
  Message-ID:
aanlktin9cjxr1s_ecfr3nr6xmt6c4o=6ohdhtxp4j...@mail.gmail.com
  Content-Type: text/plain; charset=ISO-8859-1
 
  This is fantastic, and the timing could not be better.
 
  If anyone finds anything noteworthy, please add it to the timeline of
  Wikipedia that we're building at the 10th anniversary wiki,[1] as well as
  the other tools for cataloging interesting tidbits from our history.[2]
 
  1. http://ten.wikipedia.org/wiki/Wikipedia_timeline
  2. http://ten.wikipedia.org/wiki/Share
 
  On Tue, Dec 14, 2010 at 8:11 AM, Chad innocentkil...@gmail.com wrote:
 
  On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarl...@wikimedia.org
 
  wrote:
   I was looking through some old files in our SourceForge project. I
   opened a file called wiki.tar.gz, and inside were three complete
   backups of the text of Wikipedia, from February, March and August
 2001!



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread James Alexander
On Tue, Dec 14, 2010 at 12:53 PM, wjhon...@aol.com wrote:


 Btw how does one *open* this tarball thing (on Windows) ?


I'm a fan of http://www.7-zip.org/

-- 
James Alexander
jameso...@gmail.com


Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread വിശ്വപ്രഭ
Right in time! And the rightly early version too!
Kudos to the diggers and bashers!




On Tue, Dec 14, 2010 at 21:23, Moka Pantages mpanta...@wikimedia.org wrote:

 This is so exciting!  To Steven's point: we've also started a page
 where folks can add bits of interesting information as they excavate
 the files [1].   Can't wait to dig in!

 Congrats, Tim!

 [1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning


 Date: Tue, 14 Dec 2010 08:20:10 -0800
 From: Steven Walling steven.wall...@gmail.com
 Subject: Re: [Foundation-l] Old Wikipedia backups discovered
 To: Wikimedia Foundation Mailing List
foundation-l@lists.wikimedia.org
 Message-ID:
   aanlktin9cjxr1s_ecfr3nr6xmt6c4o=6ohdhtxp4j...@mail.gmail.com
 Content-Type: text/plain; charset=ISO-8859-1

 This is fantastic, and the timing could not be better.

 If anyone finds anything noteworthy, please add it to the timeline of
 Wikipedia that we're building at the 10th anniversary wiki,[1] as well as
 the other tools for cataloging interesting tidbits from our history.[2]

 1. http://ten.wikipedia.org/wiki/Wikipedia_timeline
 2. http://ten.wikipedia.org/wiki/Share

 On Tue, Dec 14, 2010 at 8:11 AM, Chad innocentkil...@gmail.com wrote:

  On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarl...@wikimedia.org
  wrote:
   I was looking through some old files in our SourceForge project. I
   opened a file called wiki.tar.gz, and inside were three complete
   backups of the text of Wikipedia, from February, March and August 2001!
  
   This is exciting, because there is lots of article history in here
   which was assumed to be lost forever.
  
   I've long been interested in Wikipedia's history, and I've tried in
   the past to locate such backups. I asked various people who might have
   had one. I had given up hope.
  
   The history of particularly old Wikipedia articles, as seen in the
   present Wikipedia database, is incomplete, due to Usemod's policy of
   deleting old revisions of pages after about a month. The script which
   Brion wrote to import the article histories from UseMod to MediaWiki
   only fetched those revisions which hadn't been purged yet.
  
   I didn't want to believe that those revisions had been lost forever,
   and I even opened the UseMod source code and stared forlornly at the
   unlink() call. What I (and Brion before) missed is that UseMod appends
   a record of every change made to two files, called diff_log and rclog.
   In these two files is a record of every change made to Wikipedia from
   January 15 to August 17, 2001.
  
   I've put the two log files up on the web, at:
  
   http://noc.wikimedia.org/~tstarling/wikipedia-logs-2001-08-17.7z
  
   The 7-zip archive is only 8.4MB -- much more manageable than today's
   backups.
  
   rclog contains IP addresses. The Usemod software made IP addresses of
   logged-in users public, so the people who made these edits had no
   expectation that their IP address would be kept private. That, coupled
   with the passage of time, makes me think that no harm to user privacy
   can come from releasing these files.
  
   -- Tim Starling
  
 
  I have to say this is super cool. It's like digging up a time capsule
  right before the 10th anniversary. One of my favorite early edits:
 
  This is the new WikiPedia!  The idea here is to write a complete
  encyclopedia from scratch, without peer review process, etc.
  Some people think that this may be a hopeless endeavor, that
  the result will necessarily suck.  We aren't so sure.  So, let's get
  to work!
 
  -Chad
 


Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Henning Schlottmann
On 14.12.2010 16:54, Tim Starling wrote:
 I was looking through some old files in our SourceForge project. I
 opened a file called wiki.tar.gz, and inside were three complete
 backups of the text of Wikipedia, from February, March and August 2001!

That's wonderful news. Is this for enWP only or were all languages in
one database back then?

Ciao Henning




Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Magnus Manske
On Tue, Dec 14, 2010 at 8:36 PM, Henning Schlottmann
h.schlottm...@gmx.net wrote:
 On 14.12.2010 16:54, Tim Starling wrote:
 I was looking through some old files in our SourceForge project. I
 opened a file called wiki.tar.gz, and inside were three complete
 backups of the text of Wikipedia, from February, March and August 2001!

 That's wonderful news. Is this for enWP only or were all languages in
 one database back then?

There was only English back in the day...



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Brian J Mingus
Here are a couple of quick indexes into the dump file. I didn't venture into
the binary revision data. You'll find an alphabetized list of articles that
contains all the diffs for each article in the order that they occurred in
the dump and a sorted index into each revision as well.

http://grey.colorado.edu/wikipedia_2001/

Given that it's finals I don't even have enough time to dig through this
at all. Guess I just wanted a distraction =)

- Brian

On Tue, Dec 14, 2010 at 12:27 PM, phoebe ayers phoebe.w...@gmail.com wrote:

 FYI, there is an existing timeline at:

 http://meta.wikimedia.org/wiki/Wikipedia_timeline

 And lots of other wikipedia history pages on English, too.

 :)
 Phoebe

 On Tue, Dec 14, 2010 at 10:23 AM, Moka Pantages mpanta...@wikimedia.org
 wrote:
  This is so exciting!  To Steven's point: we've also started a page
  where folks can add bits of interesting information as they excavate
  the files [1].   Can't wait to dig in!
 
  Congrats, Tim!
 
  [1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning
 
 
  Date: Tue, 14 Dec 2010 08:20:10 -0800
  From: Steven Walling steven.wall...@gmail.com
  Subject: Re: [Foundation-l] Old Wikipedia backups discovered
  To: Wikimedia Foundation Mailing List
foundation-l@lists.wikimedia.org
  Message-ID:
aanlktin9cjxr1s_ecfr3nr6xmt6c4o=6ohdhtxp4j...@mail.gmail.com
  Content-Type: text/plain; charset=ISO-8859-1
 
  This is fantastic, and the timing could not be better.
 
  If anyone finds anything noteworthy, please add it to the timeline of
  Wikipedia that we're building at the 10th anniversary wiki,[1] as well as
  the other tools for cataloging interesting tidbits from our history.[2]
 
  1. http://ten.wikipedia.org/wiki/Wikipedia_timeline
  2. http://ten.wikipedia.org/wiki/Share
 
  On Tue, Dec 14, 2010 at 8:11 AM, Chad innocentkil...@gmail.com wrote:
 
  On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarl...@wikimedia.org
 
  wrote:
   I was looking through some old files in our SourceForge project. I
   opened a file called wiki.tar.gz, and inside were three complete
   backups of the text of Wikipedia, from February, March and August
 2001!



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Henning Schlottmann
Hi Magnus,

On 14.12.2010 22:35, Magnus Manske wrote:
 On Tue, Dec 14, 2010 at 8:36 PM, Henning Schlottmann
 h.schlottm...@gmx.net wrote:
 On 14.12.2010 16:54, Tim Starling wrote:
 I was looking through some old files in our SourceForge project. I
 opened a file called wiki.tar.gz, and inside were three complete
 backups of the text of Wikipedia, from February, March and August 2001!

 That's wonderful news. Is this for enWP only or were all languages in
 one database back then?
 
 There was only English back in the day...

Not true. The first other languages were introduced on March 15 and
could be part of this archive if the different Wikipedias were in one
database under UseMod.

Do you remember how this worked?

Ciao Henning




Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Magnus Manske
On Tue, Dec 14, 2010 at 9:49 PM, Henning Schlottmann
h.schlottm...@gmx.net wrote:
 Hi Magnus,

 On 14.12.2010 22:35, Magnus Manske wrote:
 On Tue, Dec 14, 2010 at 8:36 PM, Henning Schlottmann
 h.schlottm...@gmx.net wrote:
 On 14.12.2010 16:54, Tim Starling wrote:
 I was looking through some old files in our SourceForge project. I
 opened a file called wiki.tar.gz, and inside were three complete
 backups of the text of Wikipedia, from February, March and August 2001!

 That's wonderful news. Is this for enWP only or were all languages in
 one database back then?

 There was only English back in the day...

 Not true. The first other languages were introduced on March 15 and
 could be part of this archive if the different Wikipedias were in one
 database under UseMod.

My earliest recorded entry in de.wikipedia dates September 2001 (and I
have a low two-digit user ID, which was created upon the switch to
MediaWiki), so there seem to be some versions missing indeed. Do you
know the oldest preserved edit on de.wp?

 Do you remember how this worked?

AFAIR, every language had its own UseMod setup. My import script only
took the last version; Brion later wrote one that filled in the
previous ones from the stored diffs.

Magnus



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Henning Schlottmann
On 14.12.2010 23:47, Magnus Manske wrote:
 On Tue, Dec 14, 2010 at 9:49 PM, Henning Schlottmann

 Not true. The first other languages were introduced on March 15 and
 could be part of this archive if the different Wikipedias were in one
 database under UseMod.
 
 My earliest recorded entry in de.wikipedia dates September 2001 (and I
 have a low two-digit user ID, which was created upon the switch to
 MediaWiki), so there seem to be some versions missing indeed. Do you
 know the oldest preserved edit on de.wp?

Local lore claims it is your edit
http://de.wikipedia.org/w/index.php?title=Polymerase-Kettenreaktion&oldid=2613
in Polymerase-Kettenreaktion. But I never checked that.

 Do you remember how this worked?
 
 AFAIR, every language had its own UseMod setup. My import script only
 took the last version; Brion later wrote one that filled in the
 previous ones from the stored diffs.

That's unfortunate but only a small dent in the wonderful news that
Wikipedia has its very first (English) edits back.

Ciao Henning




Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Tim Starling
On 15/12/10 07:36, Henning Schlottmann wrote:
 On 14.12.2010 16:54, Tim Starling wrote:
 I was looking through some old files in our SourceForge project. I
 opened a file called wiki.tar.gz, and inside were three complete
 backups of the text of Wikipedia, from February, March and August 2001!
 
 That's wonderful news. Is this for enWP only or were all languages in
 one database back then?

Just English, unfortunately.

You may find this interesting:

http://web.archive.org/web/20030318055654/http://nupedia.com/pipermail/interpret-l.mbox/interpret-l.mbox

http://web.archive.org/web/20020817032335/www.nupedia.com/pipermail/intlwiki-l.mbox/intlwiki-l.mbox

-- Tim Starling




Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread വിശ്വപ്രഭ
I hope some of you may have seen/discussed  these pages (as well as the
connected pages):

http://web.archive.org/web/20010418152404/www.nupedia.com/

up to

 http://web.archive.org/web/20030730075209/http://www.nupedia.org/

Of course the domain name then, was nupedia.org.

-vp


On Wed, Dec 15, 2010 at 02:30, Tim Starling tstarl...@wikimedia.org wrote:

 On 15/12/10 07:36, Henning Schlottmann wrote:
  On 14.12.2010 16:54, Tim Starling wrote:
  I was looking through some old files in our SourceForge project. I
  opened a file called wiki.tar.gz, and inside were three complete
  backups of the text of Wikipedia, from February, March and August 2001!
 
  That's wonderful news. Is this for enWP only or were all languages in
  one database back then?

 Just English, unfortunately.

 You may find this interesting:

 
 http://web.archive.org/web/20030318055654/http://nupedia.com/pipermail/interpret-l.mbox/interpret-l.mbox
 

 
 http://web.archive.org/web/20020817032335/www.nupedia.com/pipermail/intlwiki-l.mbox/intlwiki-l.mbox
 

 -- Tim Starling




Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread വിശ്വപ്രഭ
And here is the first http://wikipedia.com archive link available at web
archive.

http://web.archive.org/web/20010727112808/http://www.wikipedia.org/


2010/12/15 ViswaPrabha (വിശ്വപ്രഭ) vp2...@gmail.com

 I hope some of you may have seen/discussed  these pages (as well as the
 connected pages):

 http://web.archive.org/web/20010418152404/www.nupedia.com/

 up to

  http://web.archive.org/web/20030730075209/http://www.nupedia.org/

 Of course the domain name then, was nupedia.org.

 -vp



 On Wed, Dec 15, 2010 at 02:30, Tim Starling tstarl...@wikimedia.org wrote:

 On 15/12/10 07:36, Henning Schlottmann wrote:
  On 14.12.2010 16:54, Tim Starling wrote:
  I was looking through some old files in our SourceForge project. I
  opened a file called wiki.tar.gz, and inside were three complete
  backups of the text of Wikipedia, from February, March and August 2001!
 
  That's wonderful news. Is this for enWP only or were all languages in
  one database back then?

 Just English, unfortunately.

 You may find this interesting:

 
 http://web.archive.org/web/20030318055654/http://nupedia.com/pipermail/interpret-l.mbox/interpret-l.mbox
 

 
 http://web.archive.org/web/20020817032335/www.nupedia.com/pipermail/intlwiki-l.mbox/intlwiki-l.mbox
 

 -- Tim Starling




Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Brian J Mingus
Browsing through the earliest revisions in the revision index (
http://grey.colorado.edu/wikipedia_2001/revisions.html) is rather
interesting and full of fodder for founder debates. Consider these very
early revisions:

[http://www.nupedia.com Nupedia.com] is an open content, international,
peer reviewed project run by LarrySanger, who got the idea of supplementing
NuPedia with a less formal wiki encyclopedia project.  -
http://grey.colorado.edu/wikipedia_2001/979694938.txt

EditorInChief of NuPedia and instigator of Nupedia's wiki. 
http://grey.colorado.edu/wikipedia_2001/979690096.txt

Sanger's claims to coming up with the idea of adding the wiki concept to the
online encyclopedia concept clearly go all the way back to the beginning. Of
course, that doesn't speak to offline conversations that gave rise to the
idea.

And Sanger clearly didn't have much faith in the concept:

None of this is to say that the Nupedia wiki will ''replace'' the main
encyclopedia; of course it won't. But it will be an interesting ancillary
endeavor! http://grey.colorado.edu/wikipedia_2001/979695982.txt
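
As noted elsewhere in this thread, the numbers in these revision file names
(e.g. 979694938) are UNIX timestamps, so each one can be turned back into the
time of the edit. A minimal sketch, assuming the file name is nothing more
than the timestamp plus ".txt"; the helper name revision_time is made up for
illustration and is not part of the index:

```python
from datetime import datetime, timezone


def revision_time(name: str) -> datetime:
    """Return the UTC time encoded in a revision file name or URL.

    Assumes the last path component looks like '<unix-timestamp>.txt'.
    """
    last_component = name.rsplit("/", 1)[-1]   # drop any URL/path prefix
    stamp = last_component.split(".", 1)[0]    # drop the '.txt' suffix
    return datetime.fromtimestamp(int(stamp), tz=timezone.utc)


if __name__ == "__main__":
    for url in (
        "http://grey.colorado.edu/wikipedia_2001/979690096.txt",
        "http://grey.colorado.edu/wikipedia_2001/979694938.txt",
        "http://grey.colorado.edu/wikipedia_2001/979695982.txt",
    ):
        print(url.rsplit("/", 1)[-1], revision_time(url).isoformat())
```

Under that assumption, the three revisions quoted above all fall in the early
hours of January 17, 2001 (UTC).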


- Brian

On Tue, Dec 14, 2010 at 2:41 PM, Brian brian.min...@colorado.edu wrote:

 Here are a couple of quick indexes into the dump file. I didn't venture
 into the binary revision data. You'll find an alphabetized list of articles
 that contains all the diffs for each article in the order that they occurred
 in the dump and a sorted index into each revision as well.

 http://grey.colorado.edu/wikipedia_2001/

 Given that it's finals I don't even have enough time to dig through this
 at all. Guess I just wanted a distraction =)

 - Brian


 On Tue, Dec 14, 2010 at 12:27 PM, phoebe ayers phoebe.w...@gmail.com wrote:

 FYI, there is an existing timeline at:

 http://meta.wikimedia.org/wiki/Wikipedia_timeline

 And lots of other wikipedia history pages on English, too.

 :)
 Phoebe

 On Tue, Dec 14, 2010 at 10:23 AM, Moka Pantages mpanta...@wikimedia.org
 wrote:
  This is so exciting!  To Steven's point: we've also started a page
  where folks can add bits of interesting information as they excavate
  the files [1].   Can't wait to dig in!
 
  Congrats, Tim!
 
  [1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning
 
 
  Date: Tue, 14 Dec 2010 08:20:10 -0800
  From: Steven Walling steven.wall...@gmail.com
  Subject: Re: [Foundation-l] Old Wikipedia backups discovered
  To: Wikimedia Foundation Mailing List
foundation-l@lists.wikimedia.org
  Message-ID:
aanlktin9cjxr1s_ecfr3nr6xmt6c4o=6ohdhtxp4j...@mail.gmail.com
  Content-Type: text/plain; charset=ISO-8859-1
 
  This is fantastic, and the timing could not be better.
 
  If anyone finds anything noteworthy, please add it to the timeline of
  Wikipedia that we're building at the 10th anniversary wiki,[1] as well
 as
  the other tools for cataloging interesting tidbits from our history.[2]
 
  1. http://ten.wikipedia.org/wiki/Wikipedia_timeline
  2. http://ten.wikipedia.org/wiki/Share
 
  On Tue, Dec 14, 2010 at 8:11 AM, Chad innocentkil...@gmail.com wrote:
 
  On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling 
 tstarl...@wikimedia.org
  wrote:
   I was looking through some old files in our SourceForge project. I
   opened a file called wiki.tar.gz, and inside were three complete
   backups of the text of Wikipedia, from February, March and August
 2001!



Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Brian J Mingus
Here is an interesting bit of history - the Wikipedia logo was first an
American flag. Then Scott Moonen suggested we make it a globe:


In its first day of existences, because the nearest thing to hand for
JimmyWales that was suitable for a logo was an American flag,
WikiPedia had the American flag, OldGlory, for a logo.

 ScottMoonen sensibly suggested:

 I'd recommend you change the American flag logo.  Exremely ethno-centric 
 ''et. al.''  I think a globe logo would be much more fitting, if you want to 
 keep with that metaphor.  Or perhaps a book.

http://grey.colorado.edu/wikipedia_2001/979773872.txt


- Brian

On Tue, Dec 14, 2010 at 5:17 PM, Brian brian.min...@colorado.edu wrote:

 Browsing through the earliest revisions in the revision index (
 http://grey.colorado.edu/wikipedia_2001/revisions.html) is rather
 interesting and full of fodder for founder debates. Consider these very
 early revisions:

 [http://www.nupedia.com Nupedia.com] is an open content, international,
 peer reviewed project run by LarrySanger, who got the idea of supplementing
 NuPedia with a less formal wiki encyclopedia project.  -
 http://grey.colorado.edu/wikipedia_2001/979694938.txt

 EditorInChief of NuPedia and instigator of Nupedia's wiki. 
 http://grey.colorado.edu/wikipedia_2001/979690096.txt

 Sanger's claims to coming up with the idea of adding the wiki concept to
 the online encyclopedia concept clearly go all the way back to the
 beginning. Of course, that doesn't speak to offline conversations that gave
 rise to the idea.

 And Sanger clearly didn't have much faith in the concept:

 None of this is to say that the Nupedia wiki will ''replace'' the main
 encyclopedia; of course it won't. But it will be an interesting ancillary
 endeavor! http://grey.colorado.edu/wikipedia_2001/979695982.txt


 - Brian

 On Tue, Dec 14, 2010 at 2:41 PM, Brian brian.min...@colorado.edu wrote:

 Here are a couple of quick indexes into the dump file. I didn't venture
 into the binary revision data. You'll find an alphabetized list of articles
 that contains all the diffs for each article in the order that they occurred
 in the dump and a sorted index into each revision as well.

 http://grey.colorado.edu/wikipedia_2001/

 Given that it's finals I don't even have enough time to dig through this
 at all. Guess I just wanted a distraction =)

 - Brian


 On Tue, Dec 14, 2010 at 12:27 PM, phoebe ayers phoebe.w...@gmail.com wrote:

 FYI, there is an existing timeline at:

 http://meta.wikimedia.org/wiki/Wikipedia_timeline

 And lots of other wikipedia history pages on English, too.

 :)
 Phoebe

 On Tue, Dec 14, 2010 at 10:23 AM, Moka Pantages mpanta...@wikimedia.org
 wrote:
  This is so exciting!  To Steven's point: we've also started a page
  where folks can add bits of interesting information as they excavate
  the files [1].   Can't wait to dig in!
 
  Congrats, Tim!
 
  [1] http://ten.wikipedia.org/wiki/Wikipedia_in_the_Beginning
 
 
  Date: Tue, 14 Dec 2010 08:20:10 -0800
  From: Steven Walling steven.wall...@gmail.com
  Subject: Re: [Foundation-l] Old Wikipedia backups discovered
  To: Wikimedia Foundation Mailing List
foundation-l@lists.wikimedia.org
  Message-ID:
aanlktin9cjxr1s_ecfr3nr6xmt6c4o=6ohdhtxp4j...@mail.gmail.com
  Content-Type: text/plain; charset=ISO-8859-1
 
  This is fantastic, and the timing could not be better.
 
  If anyone finds anything noteworthy, please add it to the timeline of
  Wikipedia that we're building at the 10th anniversary wiki,[1] as well
 as
  the other tools for cataloging interesting tidbits from our history.[2]
 
  1. http://ten.wikipedia.org/wiki/Wikipedia_timeline
  2. http://ten.wikipedia.org/wiki/Share
 
  On Tue, Dec 14, 2010 at 8:11 AM, Chad innocentkil...@gmail.com wrote:
 
  On Tue, Dec 14, 2010 at 10:54 AM, Tim Starling tstarl...@wikimedia.org wrote:
   I was looking through some old files in our SourceForge project. I
   opened a file called wiki.tar.gz, and inside were three complete
   backups of the text of Wikipedia, from February, March and August 2001!

___
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l


Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread emijrp
Is there any database backup of Nupedia? Or were the articles posted as HTML
pages?

2010/12/15 ViswaPrabha (വിശ്വപ്രഭ) vp2...@gmail.com

 And here is the first archive link for http://wikipedia.com available at
 the Web Archive.

 http://web.archive.org/web/20010727112808/http://www.wikipedia.org/


 2010/12/15 ViswaPrabha (വിശ്വപ്രഭ) vp2...@gmail.com

  I hope some of you have seen or discussed these pages (as well as the
  connected pages):
 
  http://web.archive.org/web/20010418152404/www.nupedia.com/
 
  up to
 
   http://web.archive.org/web/20030730075209/http://www.nupedia.org/
 
  Of course, the domain name then was nupedia.org.
 
  -vp
 
 
 
  On Wed, Dec 15, 2010 at 02:30, Tim Starling tstarl...@wikimedia.org wrote:
 
  On 15/12/10 07:36, Henning Schlottmann wrote:
   On 14.12.2010 16:54, Tim Starling wrote:
   I was looking through some old files in our SourceForge project. I
   opened a file called wiki.tar.gz, and inside were three complete
   backups of the text of Wikipedia, from February, March and August 2001!
  
   That's wonderful news. Is this for enWP only or were all languages in
   one database back then?
 
  Just English, unfortunately.
 
  You may find this interesting:
 
  http://web.archive.org/web/20030318055654/http://nupedia.com/pipermail/interpret-l.mbox/interpret-l.mbox
 
  http://web.archive.org/web/20020817032335/www.nupedia.com/pipermail/intlwiki-l.mbox/intlwiki-l.mbox
 
  -- Tim Starling
 
 

___
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l


Re: [Foundation-l] Old Wikipedia backups discovered

2010-12-14 Thread Tim Starling
On 15/12/10 11:17, Brian J Mingus wrote:
 Browsing through the earliest revisions in the revision index (
 http://grey.colorado.edu/wikipedia_2001/revisions.html) is rather
 interesting and full of fodder for founder debates. Consider these very
 early revisions:
 
 [http://www.nupedia.com Nupedia.com] is an open content, international,
 peer reviewed project run by LarrySanger, who got the idea of supplementing
 NuPedia with a less formal wiki encyclopedia project.  -
 http://grey.colorado.edu/wikipedia_2001/979694938.txt
 
 EditorInChief of NuPedia and instigator of Nupedia's wiki. 
 http://grey.colorado.edu/wikipedia_2001/979690096.txt
 
 Sanger's claims to have come up with the idea of adding the wiki concept to
 the online encyclopedia concept clearly go all the way back to the beginning.
 Of course, that doesn't speak to offline conversations that gave rise to the
 idea.

I've long suspected that the early FAQs and history pages gave Larry
Sanger an exaggerated role because he wrote them himself. It will be
interesting to see if any such conclusion can be drawn from the
archives. Note that 979694938 was by dhcp058.246.lvcm.com, which
appears to be Larry.

By the way, the numbers in the revisions, e.g. 979694938, are UNIX
timestamps. That one was 17 Jan 2001, 01:28:58 UTC.
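
For anyone who wants to date other revisions the same way, here is a minimal
Python sketch. It assumes only what is stated above, namely that the revision
number is a count of seconds since the UNIX epoch. The function name
revision_time is made up for illustration and is not part of any tool
mentioned in this thread.

```python
from datetime import datetime, timezone


def revision_time(revision_id: int) -> datetime:
    """Interpret a revision number as seconds since the UNIX epoch (UTC).

    Hypothetical helper: assumes the revision numbers in the 2001 dump
    are plain UNIX timestamps, as described in the message above.
    """
    return datetime.fromtimestamp(revision_id, tz=timezone.utc)


if __name__ == "__main__":
    # The two revision numbers quoted earlier in the thread.
    for rev in (979690096, 979694938):
        print(rev, "->", revision_time(rev).isoformat())
```

For 979694938 this gives 2001-01-17T01:28:58+00:00, which matches the date
above. Passing tz=timezone.utc matters here: without it, fromtimestamp
returns local time and the result would depend on the machine's time zone.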

-- Tim Starling


___
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l