Hi Marko,
maybe you could sign up for the DBpedia developers list:
https://lists.sourceforge.net/lists/listinfo/dbpedia-developers
and let us know whether you managed to create a Slovenian dump.
If you have any additions, fixes, extensions, or tests for the code base, please tell me. I could give you write access to the Mercurial repository so you can commit your code.
Sebastian

On 30.03.2011 10:11, Dimitris Kontokostas wrote:
Hi,

You can check out the "Greece" branch of the framework; it deals with internationalization issues such as yours. We implemented it for the creation of the Greek DBpedia, but it is configurable for other languages as well.

It is not documented yet, but you can find some configuration options in the following files:

org.dbpedia.extraction.ontology.OntologyNamespaces.scala
   val specificLanguageDomain = Set("el", "de", "it")
   val encodeAsIRI = Set("el", "de")

org.dbpedia.extraction.mappings.extractor.scala
   private def retrieveTitle(page : PageNode) : Option[WikiTitle] =
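
As a rough illustration of how two sets like these could drive URI generation, here is a minimal sketch. Only the two Set values come from the files above; the object name, the helpers resourceDomain and encodeTitle, and the <lang>.dbpedia.org domain pattern are assumptions made for illustration, not the framework's actual code.

```scala
// Hypothetical sketch: only the two Set values are taken from the message above.
object LanguageConfigSketch {
  val specificLanguageDomain = Set("el", "de", "it")
  val encodeAsIRI = Set("el", "de")

  // Assumed helper: use a language-specific domain when the language is listed,
  // otherwise fall back to the default domain.
  def resourceDomain(lang: String): String =
    if (specificLanguageDomain.contains(lang)) s"http://${lang}.dbpedia.org/resource/"
    else "http://dbpedia.org/resource/"

  // Assumed helper: keep non-ASCII characters as they are (IRI style) when the
  // language is listed, otherwise percent-encode the title (URI style).
  def encodeTitle(lang: String, title: String): String = {
    val withUnderscores = title.replace(' ', '_')
    if (encodeAsIRI.contains(lang)) withUnderscores
    else java.net.URLEncoder.encode(withUnderscores, "UTF-8")
  }
}
```

Under these assumptions, adding "sl" to one or both sets would presumably be the kind of change needed for a Slovenian extraction; which set applies depends on how the branch really uses them.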



On Wed, Mar 30, 2011 at 12:41 AM, Marko Burjek <[email protected]> wrote:

    Hello!

    I have a local installation of the DBpedia extraction framework and
    I would like to extract Slovenian category labels, SKOS and article
    categories.

    For now I have commented out line 14 in CategoryLabelExtractor.scala:
    14 // require(Set("en").contains(language))

    Now I get labels, but the problem is that there are duplicates:
    <http://dbpedia.org/resource/Category:Military_aircraft>
    <http://www.w3.org/2000/01/rdf-schema#label>
    "Voja\u0161ka letala"@sl .
    <http://dbpedia.org/resource/Category:Military_aircraft>
    <http://www.w3.org/2000/01/rdf-schema#label>
    "Voja\u0161ki zrakoplovi"@sl .

    The reason seems to be that both "letala" and "zrakoplovi" link to
    the same English category.
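
The duplication described here can be reproduced with a small sketch. It assumes that each Slovenian category is keyed by the English category it links to; the object and value names are hypothetical, and only the two labels and the subject URI come from the output quoted in the message.

```scala
// Hypothetical sketch of why two Slovenian labels end up on one subject.
object DuplicateLabelSketch {
  // (Slovenian label, English category it links to)
  val interlanguageLinks = Seq(
    "Voja\u0161ka letala" -> "Military_aircraft",
    "Voja\u0161ki zrakoplovi" -> "Military_aircraft"
  )

  // Keying by the English category merges both labels under one subject URI.
  val labelsBySubject: Map[String, Seq[String]] =
    interlanguageLinks.groupBy(_._2).map { case (en, pairs) =>
      s"http://dbpedia.org/resource/Category:${en}" -> pairs.map(_._1)
    }

  // Subjects that received more than one label.
  val duplicated: Map[String, Seq[String]] = labelsBySubject.filter(_._2.size > 1)
}
```

Keying by the Slovenian title instead would give each category its own subject, which is what the question about basing the extraction on Slovenian titles amounts to.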

    What is the right way to enable the extraction of categories?
    Is replacing line 14 with: "require(Set("en",
    "sl").contains(language))" enough?
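
For reference, the widened guard could look like the sketch below. The require call mirrors line 14 as quoted above; the object name, the supportedLanguages value and the checkLanguage wrapper are hypothetical. Note that for "sl" this behaves the same as commenting the line out, so on its own it would not be expected to remove the duplicate labels.

```scala
// Hypothetical sketch: the guard from line 14, extended to accept Slovenian.
object LanguageGuardSketch {
  val supportedLanguages = Set("en", "sl")

  // require throws IllegalArgumentException when the condition is false.
  def checkLanguage(language: String): Unit =
    require(supportedLanguages.contains(language),
      s"Category label extraction is not enabled for language: $language")
}
```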

    Is it possible to extract categories that exist only in the
    Slovenian Wikipedia?
    I know that the official DBpedia extracts only articles that have
    an English page, but can I change something in the extraction
    framework to base the extraction on Slovenian titles, or is this a
    bigger and deeper change?

    Regards,
    Marko Burjek


    
------------------------------------------------------------------------------
    Enable your software for Intel(R) Active Management Technology to
    meet the
    growing manageability and security demands of your customers.
    Businesses
    are taking advantage of Intel(R) vPro (TM) technology - will your
    software
    be a part of the solution? Download the Intel(R) Manageability Checker
    today! http://p.sf.net/sfu/intel-dev2devmar
    _______________________________________________
    Dbpedia-discussion mailing list
    [email protected]
    <mailto:[email protected]>
    https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion




--
Kontokostas Dimitris




--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org
