Mentor: Nilesh Chakraborty
Student: Peng Xu

In the first phase of the project, I completed an approach that finds new
mappings for languages with few existing mappings, based on languages that
already have enough mappings and on cross-language links. This approach
achieves fairly good evaluation results, and after a manual check I
obtained 456 high-quality new mappings for Chinese.
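
To illustrate the idea (this is not the actual project code), here is a
minimal Python sketch of how a new mapping could be proposed by voting over
interlanguage links; the template names, the existing-mapping table and the
vote threshold are made up for the example:

    from collections import Counter

    # Existing mappings in well-mapped languages: (lang, template) -> ontology class
    existing = {
        ("en", "Infobox person"): "dbo:Person",
        ("de", "Infobox Person"): "dbo:Person",
        ("en", "Infobox settlement"): "dbo:Settlement",
    }

    # Interlanguage links for a target-language (here: Chinese) template.
    interlanguage_links = {
        "Infobox 人物": [("en", "Infobox person"), ("de", "Infobox Person")],
    }

    def propose_mapping(template, min_votes=2):
        # Count how often the linked, already-mapped templates point to each class.
        votes = Counter(existing[link]
                        for link in interlanguage_links.get(template, [])
                        if link in existing)
        if votes:
            cls, count = votes.most_common(1)[0]
            if count >= min_votes:
                return cls
        return None

    print(propose_mapping("Infobox 人物"))  # -> dbo:Person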

Because the mapping coverage on DBpedia is incomplete, we also need to
predict ontology types for instances that have no type. In the second
phase, I tried different methods for type prediction on DBpedia, including
tensor factorization and graph embeddings. The experiments show that tensor
factorization achieves good performance on small languages such as
Bulgarian; however, for larger languages it performs poorly because of
memory limits.
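
As a rough illustration of the tensor-factorization side (again, not the
project code), the following Python sketch factorizes a tiny toy graph with
RESCAL-style alternating least squares and scores candidate classes for an
untyped entity via the reconstructed rdf:type slice; the toy graph, rank
and hyperparameters are invented for the example:

    import numpy as np

    def rescal_als(X, rank, n_iter=50, lam=0.1):
        # Factorise a list of n x n relation slices as X_r ~ A @ R_r @ A.T
        n = X[0].shape[0]
        rng = np.random.default_rng(0)
        A = rng.standard_normal((n, rank))
        I = lam * np.eye(rank)
        for _ in range(n_iter):
            AtA = A.T @ A
            G = np.linalg.inv(AtA + I)
            # Regularised least-squares update of each relation core R_r.
            R = [G @ A.T @ Xr @ A @ G for Xr in X]
            # Update of the shared entity/class factor A.
            num = sum(Xr @ A @ Rr.T + Xr.T @ A @ Rr for Xr, Rr in zip(X, R))
            den = sum(Rr @ AtA @ Rr.T + Rr.T @ AtA @ Rr for Rr in R) + I
            A = num @ np.linalg.inv(den)
        return A, R

    # Toy graph: indices 0-3 are entities, 4-5 are ontology classes.
    n = 6
    born_in = np.zeros((n, n)); born_in[0, 1] = born_in[2, 1] = 1.0
    rdf_type = np.zeros((n, n)); rdf_type[0, 4] = rdf_type[1, 5] = 1.0  # entity 2 untyped
    A, R = rescal_als([born_in, rdf_type], rank=3)

    # Rank the candidate classes (4 and 5) for the untyped entity 2.
    scores = A[2] @ R[1] @ A[[4, 5]].T
    print("type scores for entity 2 (class 4, class 5):", scores)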

All the scripts I wrote can easily be applied to other languages once the
datasets are downloaded to the proper paths. All my code and detailed
documentation can be found here:
https://github.com/dbpedia/mappings-autogeneration.

Further work: currently, I ignore the literals in DBpedia when doing tensor
factorization; the next step is to add this literal information.
Furthermore, given the time and memory complexity, a distributed
implementation of the algorithm would be useful if we want to apply these
ideas to languages like English.

Best Regards
Peng Xu
----------------------------------
http://billy-inn.github.io/
M.Sc., Department of Computing Science
University of Alberta