I'm working with the Wikipedia data dumps to do some data processing. I know that a page that contains "{{Disambig}}" is considered a disambiguation page. But apparently there are many other tags that can also be used to mark disambiguation pages, such as {{disambig-cleanup}}, {{airport disambig}}, {{Geodis}}, etc. I found these examples on http://en.wikipedia.org/wiki/Template:Disambig .
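To make this concrete, here is a rough sketch of the kind of check I have in mind. The names DISAMBIG_TEMPLATES, TEMPLATE_NAME_RE and is_disambiguation are just placeholders for illustration, and the template set contains only the English examples mentioned above, so it is certainly incomplete. Lowercasing the whole template name is also a simplifying assumption on my part.

```python
import re

# Placeholder set: only the English examples mentioned above, NOT a full list.
DISAMBIG_TEMPLATES = {"disambig", "disambig-cleanup", "airport disambig", "geodis"}

# Captures the template name in "{{Name}}" or "{{Name|args...}}".
TEMPLATE_NAME_RE = re.compile(r"\{\{\s*([^{}|]+?)\s*(?=\||\}\})")


def is_disambiguation(wikitext):
    """Return True if the page text uses one of the known disambiguation templates."""
    for match in TEMPLATE_NAME_RE.finditer(wikitext):
        # Assumption: compare names case-insensitively.
        if match.group(1).strip().lower() in DISAMBIG_TEMPLATES:
            return True
    return False


if __name__ == "__main__":
    sample = "Mercury may refer to:\n* [[Mercury (planet)]]\n{{Disambig}}"
    print(is_disambiguation(sample))
```

This only works as long as the set of template names is complete for the language in question, which is the part I am missing.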
Does anyone here know the full list of these disambiguation templates, or how I can generate a full list of disambiguation pages?

Also, the international data dumps use other tags (in the respective language), so just looking for "{{Disambig}}" would not work; I would need the equivalent tag for each language. How can I solve this if I want to write a script that detects all disambiguation pages in other languages?

In addition, some pages start with a prefix such as "Template:", "User:", "List_of_", "Wikipedia:", "Image:" etc. I would like to avoid processing pages with these types of prefixes, and I would like to do this for all languages. Is there a list of all these prefixes (both for English and for other languages)? If not, I can always write a script that detects which prefixes occur frequently in each language, but I thought there might be a more formal way of getting a full list of these prefixes.

_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion