Hi Marko,
maybe you could sign up on the DBpedia developers list:
https://lists.sourceforge.net/lists/listinfo/dbpedia-developers
and give us feedback on whether you managed to create a Slovenian dump.
If you have any additions/fixes/extensions/tests for the code base,
please tell me. I could give you write access to the Mercurial
repository so you can commit your code.
Sebastian
On 30.03.2011 10:11, Dimitris Kontokostas wrote:
Hi,
you can check out the "Greece" branch of the framework; it deals with
internationalization issues such as yours. We implemented it for the
creation of the Greek DBpedia, but it is configurable for other
languages as well.
It is not documented yet, but you can find some configuration options
in the following files:
org.dbpedia.extraction.ontology.OntologyNamespaces.scala
val specificLanguageDomain = Set("el", "de", "it")
val encodeAsIRI = Set("el", "de")
org.dbpedia.extraction.mappings.extractor.scala
private def retrieveTitle(page : PageNode) : Option[WikiTitle] =
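To illustrate how two language sets like the ones above might be consulted, here is a minimal sketch. It rests on assumptions that this thread does not confirm: that specificLanguageDomain selects a language-specific host such as sl.dbpedia.org, and that encodeAsIRI decides whether non-ASCII characters are kept or percent-encoded. NamespaceSketch and resourceUri are hypothetical names, not part of the framework, and "sl" has been added to both sets purely as an example.

```scala
import java.net.URLEncoder

object NamespaceSketch {
  // Values quoted in the thread, with "sl" added for illustration only.
  val specificLanguageDomain = Set("el", "de", "it", "sl")
  val encodeAsIRI = Set("el", "de", "sl")

  // Hypothetical helper: build a resource identifier for a page title.
  def resourceUri(language: String, title: String): String = {
    // Assumption: listed languages get their own host, others share dbpedia.org.
    val host =
      if (specificLanguageDomain.contains(language)) s"$language.dbpedia.org"
      else "dbpedia.org"
    val name = title.replace(' ', '_')
    // Assumption: IRI languages keep Unicode, others are percent-encoded.
    val local =
      if (encodeAsIRI.contains(language)) name
      else URLEncoder.encode(name, "UTF-8")
    s"http://$host/resource/$local"
  }
}
```

Under these assumptions, adding a language code to both sets is all the sketch needs in order to produce language-specific identifiers; the real branch may require more than that.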
On Wed, Mar 30, 2011 at 12:41 AM, Marko Burjek <[email protected]> wrote:
Hello!
I have a local installation of the DBpedia extraction framework and I
would like to extract Slovenian category labels, SKOS and article categories.
For now I have commented line 14 in CategoryLabelExtractor.scala
14 // require(Set("en").contains(language))
And I get labels, but the problem is that there are duplicates:
<http://dbpedia.org/resource/Category:Military_aircraft> <http://www.w3.org/2000/01/rdf-schema#label> "Voja\u0161ka letala"@sl .
<http://dbpedia.org/resource/Category:Military_aircraft> <http://www.w3.org/2000/01/rdf-schema#label> "Voja\u0161ki zrakoplovi"@sl .
The reason seems to be that both "letala" and "zrakoplovi" link to
the same English category.
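The duplication can be reproduced in isolation. The sketch below assumes that labels are keyed by the English category URI, as in the two triples above; Label, DuplicateLabelSketch and conflictingLabels are hypothetical names used only for illustration, and the data is just that pair of triples.

```scala
object DuplicateLabelSketch {
  // Hypothetical record for one rdfs:label triple.
  final case class Label(subject: String, text: String, lang: String)

  // The two triples from this message.
  val labels = Seq(
    Label("http://dbpedia.org/resource/Category:Military_aircraft", "Voja\u0161ka letala", "sl"),
    Label("http://dbpedia.org/resource/Category:Military_aircraft", "Voja\u0161ki zrakoplovi", "sl")
  )

  // Return every (subject, language) pair that carries more than one distinct label.
  def conflictingLabels(ls: Seq[Label]): Map[(String, String), Seq[String]] =
    ls.groupBy(l => (l.subject, l.lang))
      .map { case (key, group) => key -> group.map(_.text).distinct }
      .filter { case (_, texts) => texts.size > 1 }
}
```

Running conflictingLabels over an extracted label file would list each English-keyed subject that received several Slovenian labels, which gives a rough measure of how often two local categories collapse onto one English category.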
What is the right way to make extraction of categories possible?
Is replacing line 14 with: "require(Set("en",
"sl").contains(language))" enough?
Is it possible to extract categories that exist only in the
Slovenian Wikipedia?
I know that the official DBpedia extracts only articles that have an
English page, but can I change something in the extraction framework
to base the extraction on Slovenian titles, or is this a bigger and
deeper change?
Regards,
Marko Burjek
------------------------------------------------------------------------------
Enable your software for Intel(R) Active Management Technology to
meet the
growing manageability and security demands of your customers.
Businesses
are taking advantage of Intel(R) vPro (TM) technology - will your
software
be a part of the solution? Download the Intel(R) Manageability Checker
today! http://p.sf.net/sfu/intel-dev2devmar
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
--
Kontokostas Dimitris
--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org