Rupert Westenthaler created STANBOL-1046:
--------------------------------------------

             Summary: Create pageId based DBpedia Freebase linker for the 
Entityhub Freebase Indexing Tool
                 Key: STANBOL-1046
                 URL: https://issues.apache.org/jira/browse/STANBOL-1046
             Project: Stanbol
          Issue Type: Bug
          Components: Entityhub
            Reporter: Rupert Westenthaler


While the Freebase Indexing Tool already supports basic linking between 
Freebase topics and DBpedia entities, those links are constructed from the 
local names of the Wikipedia pages, which is error-prone due to encoding issues.

In STANBOL-1034, [~ninniuz] pointed out that linking by the Wikipedia PageId 
is more reliable and that such linking functionality already exists for 
DBpedia [1].

However, using this option would require users to import the

      http://downloads.dbpedia.org/3.8/{language}/page_ids_{language}.nt.bz2

files into the Indexing Source (the Jena TDB store holding the Freebase data) 
or any other data store that can hold those mappings (an in-memory 
representation would also be feasible).

Because of that, a mapping based on PageId will be implemented in a custom 
EntityProcessor. This issue covers the implementation of such a processor.
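
As a rough illustration of the in-memory variant mentioned above, the following 
Java sketch loads PageId mappings from an N-Triples stream and resolves a 
PageId to a DBpedia URI. It is not the Stanbol implementation: the class name 
PageIdIndex and its methods are hypothetical, the line format (subject URI, 
predicate URI, integer literal) is an assumption about the page_ids dumps, and 
the sample PageId used in the usage note is made up. A custom EntityProcessor 
could perform such a lookup for each Freebase topic, but how the PageId is read 
from the topic is left open here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Stanbol): PageId -> DBpedia URI lookup. */
final class PageIdIndex {

    // Assumed line shape of a page_ids dump:
    //   <resource-uri> <predicate-uri> "12345"^^<datatype-uri> .
    // Group 1 is the resource URI, group 2 the numeric PageId.
    private static final Pattern TRIPLE = Pattern.compile(
        "^<([^>]+)>\\s+<[^>]+>\\s+\"(\\d{1,18})\"(?:\\^\\^<[^>]+>)?\\s*\\.\\s*$");

    private final Map<Long, String> uriByPageId = new HashMap<>();

    /** Reads all lines; lines not matching the assumed shape are skipped. */
    static PageIdIndex load(BufferedReader reader) throws IOException {
        PageIdIndex index = new PageIdIndex();
        String line;
        while ((line = reader.readLine()) != null) {
            Matcher m = TRIPLE.matcher(line);
            if (m.matches()) {
                index.uriByPageId.put(Long.parseLong(m.group(2)), m.group(1));
            }
        }
        return index;
    }

    /** Returns the DBpedia URI for the PageId, or null if none is known. */
    String dbpediaUri(long pageId) {
        return uriByPageId.get(pageId);
    }

    /** Number of loaded mappings. */
    int size() {
        return uriByPageId.size();
    }
}
```

With an index loaded from a reader over the decompressed file, a call such as 
index.dbpediaUri(12345L) returns the mapped DBpedia URI or null. Keying on the 
numeric PageId avoids comparing the encoded local names of Wikipedia pages.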


[1] https://github.com/dbpedia/extraction-framework/pull/27

