Re: [Wikitech-l] Importing Wikipedia XML Dumps into MediaWiki

2009-03-13 Thread O. O.
Mohamed Magdy wrote: I don't remember if I already mentioned this: you can split the xml file * into smaller pieces then import it using importDump.php. Use a loop to make a file like this and then run it: #!/bin/bash php maintenance/importDump.php /path/pagexml.1 wait php

[Wikitech-l] HTML not Rendered correctly after Import of Wikipedia

2009-03-13 Thread O. O.
Hi, I attempted to import the English Wikipedia into MediaWiki by first downloading the pages-articles.xml.bz2, uncompressing it, splitting it using xml2sql enwiki-20081008-pages-articles.xml and finally imported the results using mysqlimport -u root -p --local wikidb

[Wikitech-l] Understanding the meaning of “List of page titles”

2009-03-13 Thread O. O.
Hi, I am looking at the dump of the English Wikipedia at http://download.wikimedia.org/enwiki/20081008/ There is a file called “all-titles-in-ns0.gz” which is supposed to contain the List of Page Titles. If I do cat enwiki-20081008-all-titles-in-ns0 | wc -l I get 5716820. On the same

Re: [Wikitech-l] Understanding the meaning of “List of page titles”

2009-03-13 Thread Aryeh Gregor
On Fri, Mar 13, 2009 at 2:44 PM, O. O. olson...@yahoo.com wrote: Hi, I am looking at the dump of the English Wikipedia at http://download.wikimedia.org/enwiki/20081008/ There is a file called “all-titles-in-ns0.gz” which is supposed to contain the List of Page Titles. If I do cat

Re: [Wikitech-l] how to delete Talk:Project_talk:Community Portal?

2009-03-13 Thread Ilmari Karonen
Aryeh Gregor wrote: On Thu, Mar 12, 2009 at 3:58 PM, jida...@jidanni.org wrote: And how did you create it when it's illegal? Usually this happens when namespace names change, so that formerly it didn't start with a namespace prefix but now it does. Since both namespace names in this case

Re: [Wikitech-l] Understanding the meaning of “List of page titles”

2009-03-13 Thread O. O.
Aryeh Gregor wrote: On Fri, Mar 13, 2009 at 2:44 PM, O. O. olson...@yahoo.com wrote: Hi, I am looking at the dump of the English Wikipedia at http://download.wikimedia.org/enwiki/20081008/ There is a file called “all-titles-in-ns0.gz” which is supposed to contain the List of Page

Re: [Wikitech-l] Understanding the meaning of “List of page titles”

2009-03-13 Thread Daniel Kinzler
O. O. schrieb: Aryeh Gregor wrote: On Fri, Mar 13, 2009 at 2:44 PM, O. O. olson...@yahoo.com wrote: Hi, I am looking at the dump of the English Wikipedia at http://download.wikimedia.org/enwiki/20081008/ There is a file called “all-titles-in-ns0.gz” which is supposed to contain the

Re: [Wikitech-l] research-oriented toolserver?

2009-03-13 Thread Morten Warncke-Wang
Hi all, Judging by the replies we think we've failed to communicate clearly some of the ideas we wanted to put forward, and we'd like to take the opportunity to try to clear that up. We did not want to narrow this down to be only about a third party toolserver. Before we initiated contact we

Re: [Wikitech-l] Understanding the meaning of “List of page titles”

2009-03-13 Thread O. O.
Daniel Kinzler wrote: O. O. schrieb: Aryeh Gregor wrote: On Fri, Mar 13, 2009 at 2:44 PM, O. O. olson...@yahoo.com wrote: Hi, I am looking at the dump of the English Wikipedia at http://download.wikimedia.org/enwiki/20081008/ There is a file called “all-titles-in-ns0.gz” which is

Re: [Wikitech-l] Understanding the meaning of “List of page titles”

2009-03-13 Thread Andrew Garrett
On Sat, Mar 14, 2009 at 9:26 AM, O. O. olson...@yahoo.com wrote: The above link says that “only articles” and no redirects are in the namespace NS0. Also Talk: pages are not included in the NS0. Then, when the current English Wikipedia advertises 2,791,033 Articles, I cannot understand

Re: [Wikitech-l] Understanding the meaning of “List of page titles”

2009-03-13 Thread O. O.
Andrew Garrett wrote: On Sat, Mar 14, 2009 at 9:26 AM, O. O. olson...@yahoo.com wrote: The above link says that “only articles” and no redirects are in the namespace NS0. Also Talk: pages are not included in the NS0. Then, when the current English Wikipedia advertises 2,791,033

Re: [Wikitech-l] Understanding the meaning of “List of page titles”

2009-03-13 Thread Andrew Garrett
On Sat, Mar 14, 2009 at 9:34 AM, O. O. olson...@yahoo.com wrote: Andrew Garrett wrote: On Sat, Mar 14, 2009 at 9:26 AM, O. O. olson...@yahoo.com wrote: The above link says that “only articles” and no redirects are in the namespace NS0. Also Talk: pages are not included in the NS0.

[Wikitech-l] not all tables need to be backed up

2009-03-13 Thread jidanni
Gentlemen, it occurred to me that under close examination one finds that when making a backup of one's wiki's database, some of the tables dumped have various degrees of temporariness, and thus though needing to be present in a proper dump, could perhaps be emptied of their values, saving much