On Mon, Oct 20, 2008 at 8:13 AM, Omid Rouhani <[EMAIL PROTECTED]> wrote:
> I'm working with the Wikipedia data dumps to do some data processing.
>
> I know that a page that contains "{{Disambig}}" is considered a
> disambiguation page.
> But apparently there are many other tags that also can be used for
> marking disambiguation pages, such as {{disambig-cleanup}}, {{airport
> disambig}}, {{Geodis}} etc.
> I found these examples on http://en.wikipedia.org/wiki/Template:Disambig .
>
> Does anyone else here know, what is the full list of these
> disambiguation templates?
> Or how can I generate a full list of disambiguation pages?

The list is:
http://en.wikipedia.org/wiki/MediaWiki:Disambiguationspage

This is the list that MediaWiki uses to generate the list of pages
linking to disambiguation pages. It should probably exist on all
language projects. A quick check shows that both French and German
have it.

http://fr.wikipedia.org/wiki/MediaWiki:Disambiguationspage
http://de.wikipedia.org/wiki/MediaWiki:Disambiguationspage

(The wonderful Germans, of course, only have one clean template, it
looks like...) :D

> Also, some pages start with a prefix such as "Template:", "User:",
> "List_of_", "Wikipedia:", "Image:" etc.
> I would like to avoid process pages with these type of prefixes. And I
> would like to do it for all languages.
> Is there a list with of all these prefixes (both for English and
> foreign languages)?
> If not I can always write a script that detects what prefixes are
> frequently occurring for each language, but I thought there might be a
> more formal way of getting a full list of these type of prefixes.

Many of those are namespaces, which I would think are available from
the database. (I don't know how you're getting the data.) I probably
can't help much here, but here is something to read :)

http://en.wikipedia.org/wiki/Wikipedia:Namespace

Judson
http://en.wikipedia.org/wiki/User:Cohesion

_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
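
As a rough illustration of the two suggestions in this thread (not code from either poster), here is a minimal Python sketch. It assumes you have already collected the template names listed on MediaWiki:Disambiguationspage and the namespace names for the wiki you are processing. The function names `is_disambiguation` and `in_namespace` and the sample lists are hypothetical, chosen only for this example. Template matching is simplified: names are compared case-insensitively, and a template whose name is directly followed by a nested template is not detected.

```python
import re

# Matches the template name in "{{Name}}" or "{{Name|arg...}}".
# Simplification: does not try to parse nested templates.
TEMPLATE_RE = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\||\}\})")


def normalize(name):
    """Lowercase, treat underscores as spaces, collapse whitespace."""
    return " ".join(name.replace("_", " ").split()).lower()


def is_disambiguation(wikitext, disambig_templates):
    """True if the page text uses any of the given template names.

    disambig_templates is assumed to hold bare names such as "Disambig",
    without a "Template:" prefix.
    """
    wanted = {normalize(t) for t in disambig_templates}
    return any(
        normalize(match.group(1)) in wanted
        for match in TEMPLATE_RE.finditer(wikitext)
    )


def in_namespace(title, namespace_names):
    """True if the title starts with one of the namespace names plus ':'."""
    prefix, sep, _ = title.partition(":")
    known = {normalize(n) for n in namespace_names}
    return bool(sep) and normalize(prefix) in known


if __name__ == "__main__":
    # Sample values for illustration only, taken from the examples above.
    templates = ["Disambig", "Disambig-cleanup", "Airport disambig", "Geodis"]
    namespaces = ["Template", "User", "Wikipedia", "Image"]

    print(is_disambiguation("Foo may refer to:\n{{Disambig}}", templates))
    print(in_namespace("Template:Disambig", namespaces))
    print(in_namespace("List_of_rivers", namespaces))
```

Note that "List_of_" is an ordinary title prefix in the main namespace rather than a namespace, so a namespace check like the one above will not filter it; it would need a separate title test. If you are reading the XML dumps, the siteinfo header at the top of each dump lists that wiki's namespace names, which may be a convenient per-language source for `namespace_names`.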