I added support for the Greek Wikipedia in DisambiguationExtractor.
Other languages can be added easily by adding their disambiguation title
suffix to the map in the extractor.
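
As a rough illustration of what adding a language involves, here is a minimal standalone sketch. It is not the code from the patch: the object name `DisambiguationTitles` and the helper `cleanTitle` are hypothetical, and the "de" entry is an assumed example that the patch does not contain. Unlike the patch, which uses `require` to reject unsupported languages, this sketch leaves the title unchanged when the language has no entry.

```scala
// Hypothetical standalone sketch; names are illustrative, not from the patch.
object DisambiguationTitles {
  // Suffix that each Wikipedia edition appends to disambiguation page titles.
  // "en" and "el" are taken from the patch; "de" is an assumed extra entry.
  val titleParts: Map[String, String] = Map(
    "en" -> " (disambiguation)",
    "el" -> " (αποσαφήνιση)",
    "de" -> " (Begriffsklärung)"
  )

  // Removes the language-specific suffix from a page title.
  // Languages without an entry return the title unchanged.
  def cleanTitle(language: String, title: String): String =
    titleParts.get(language).fold(title)(part => title.replace(part, ""))
}
```

With this layout, supporting one more language is a single new map entry, and a call such as `DisambiguationTitles.cleanTitle("el", pageTitle)` would strip the Greek suffix.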
Regards,
Jim
### Eclipse Workspace Patch 1.0
#P DBPedia
Index: core/src/main/scala/org/dbpedia/extraction/wikiparser/impl/wikipedia/Disambiguation.scala
===================================================================
--- core/src/main/scala/org/dbpedia/extraction/wikiparser/impl/wikipedia/Disambiguation.scala (revision 3848)
+++ core/src/main/scala/org/dbpedia/extraction/wikiparser/impl/wikipedia/Disambiguation.scala (working copy)
@@ -39,7 +39,7 @@
"da" -> Set("Flertydig"),
"de" -> Set("Begriffsklärung"),
"dsb" -> Set("Rozjasnjenje zapśimjeśow"),
- "el" -> Set("Project:Σύνδεσμοι_προς_τις_σελίδες_αποσαφήνισης"),
+ "el" -> Set("αποσαφήνιση"),
"en" -> Set("disambig", "Disambig-Chinese-char-title", "Disambig-cleanup", "Fish-dab", "Geodis", "hndis", "Hndis", "hndis-cleanup", "Hndis-cleanup", "Hospitaldis", "Letter disambig", "Mathdab", "NA Broadcast List", "Numberdis", "POWdis", "Roaddis", "Schooldis", "SIA", "Shipindex", "Schooldis", "Mountainindex", "Given name", "Surname"),
"eo" -> Set("Apartigilo"),
"es" -> Set("Desambiguación"),
Index: core/src/main/scala/org/dbpedia/extraction/mappings/DisambiguationExtractor.scala
===================================================================
--- core/src/main/scala/org/dbpedia/extraction/mappings/DisambiguationExtractor.scala (revision 3810)
+++ core/src/main/scala/org/dbpedia/extraction/mappings/DisambiguationExtractor.scala (working copy)
@@ -18,7 +18,13 @@
if (page.title.namespace == WikiTitle.Namespace.Main && page.isDisambiguation)
{
- val cleanPageTitle = page.title.decoded.replace(" (disambiguation)", "") //TODO only works for english titles
+ val language = extractionContext.language.wikiCode
+ require(Set("en", "el").contains(language))
+ val disambiguationTitlePart = Map(
+ "en" -> " (disambiguation)",
+ "el" -> " (αποσαφήνιση)")
+
+ val cleanPageTitle = page.title.decoded.replace(disambiguationTitlePart(language), "")
val list = collectInternalLinks(page).filter(linkNode => linkNode.destination.decoded.contains(cleanPageTitle)
|| isAcronym(cleanPageTitle, linkNode.destination.decoded))
_______________________________________________
Dbpedia-discussion mailing list
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion