Hello!

I have a local installation of the DBpedia extraction framework and I would
like to extract Slovenian category labels, SKOS categories and article
categories.

For now I have commented out line 14 in CategoryLabelExtractor.scala:
14 // require(Set("en").contains(language))

I now get labels, but the problem is that there are duplicates:
<http://dbpedia.org/resource/Category:Military_aircraft> <http://www.w3.org/2000/01/rdf-schema#label> "Voja\u0161ka letala"@sl .
<http://dbpedia.org/resource/Category:Military_aircraft> <http://www.w3.org/2000/01/rdf-schema#label> "Voja\u0161ki zrakoplovi"@sl .

The reason seems to be that both "letala" and "zrakoplovi" link to the same
English category.
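
To illustrate what I think is happening, here is a minimal sketch. It is not
framework code: the object name, the `toEnglish` map and the helper functions
are all hypothetical, and I am assuming the two Slovenian category titles match
the labels shown above. It only shows how two Slovenian categories that link to
one English category end up as two labels on the same English-based URI.

```scala
// Hypothetical illustration only; none of these names come from the DBpedia framework.
object DuplicateLabels {
  // Assumed mapping: Slovenian category title -> English category it links to.
  val toEnglish: Map[String, String] = Map(
    "Voja\u0161ka letala" -> "Military_aircraft",
    "Voja\u0161ki zrakoplovi" -> "Military_aircraft"
  )

  // Group the Slovenian labels by the English-based resource URI they are attached to.
  def labelsByResource(links: Map[String, String]): Map[String, List[String]] =
    links.toList
      .groupBy { case (_, en) => "http://dbpedia.org/resource/Category:" + en }
      .map { case (uri, pairs) => uri -> pairs.map(_._1).sorted }

  // Keep only resources that receive more than one label in the same language.
  def duplicates(links: Map[String, String]): Map[String, List[String]] =
    labelsByResource(links).filter { case (_, labels) => labels.size > 1 }
}
```

Calling `DuplicateLabels.duplicates(DuplicateLabels.toEnglish)` on this sample
returns a single entry for Category:Military_aircraft carrying both Slovenian
labels, which matches the two triples I am seeing.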

What is the right way to enable the extraction of categories for Slovenian?
Is replacing line 14 with "require(Set("en", "sl").contains(language))"
enough?
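
For reference, this is how I understand such a guard to behave; a minimal
sketch in which `LanguageGuard`, `supported` and `check` are hypothetical
stand-ins, not the extractor's real code. As far as I can tell, `require` only
decides which languages are accepted, so I would not expect this change alone
to affect the duplicate labels.

```scala
// Hypothetical stand-in for an extractor's language check; not framework code.
object LanguageGuard {
  // Languages the guard accepts, mirroring the proposed Set("en", "sl").
  val supported: Set[String] = Set("en", "sl")

  // require throws IllegalArgumentException when its condition is false;
  // otherwise the language code is returned unchanged.
  def check(language: String): String = {
    require(supported.contains(language), "unsupported language: " + language)
    language
  }
}
```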

Is it also possible to extract categories that exist only in the Slovenian
Wikipedia?
I know that the official DBpedia extracts only articles that have an English
page, but can I change something in the extraction framework to base the
extraction on Slovenian titles, or is this a bigger and deeper change?

Regards,
Marko Burjek
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
