Hi Ritesh,

Thanks for the recap.
This project idea is conceptually similar to another one from GSoC 2014.
We wanted to investigate machine learning approaches to automatically 
align the Wikidata ontology to the DBpedia one.
Unfortunately the project failed, but we still have pointers that can 
guide you.

Have a look at the following links:
- idea: http://wiki.dbpedia.org/gsoc2014/ideas#h359-11
- proposal: https://docs.google.com/document/d/16lAqKLAsAGQW0cp9SA0Egb1vlb6mPCcHYezVN-zB870/edit?pli=1
- stuff done: https://github.com/dbpedia/extraction-framework/wiki/GSoC-2014-Progress-Sergey-Skovorodkin

Cheers!

On 3/5/15 10:50 PM, Ritesh Kumar Singh wrote:
> Hi,
>
> I'm Ritesh, a 3rd-year undergraduate in Computer Science. I've been
> working with the DBpedia framework and codebase for some time. Here's a
> link to my GitHub profile <https://github.com/gone-phishing>. I would
> love to work on improving the mapping support of DBpedia by aligning
> its existing properties with the Freebase data schema and adding
> new classes wherever required.
> The syntactic equivalence method has been shown to yield better results
> than the other dictionary-based or string-matching algorithms,
> giving higher F1 scores. It can match two properties that are neither
> identical nor synonymous by comparing their subject and instance
> values, and thus reduces false positives. Although the referenced paper
> used the Apache Hadoop framework, it ran on a 14-node cluster. I have
> worked with Apache Spark (using Scala), which in this case could give a
> better speedup, but I'm not sure about getting access to such a large
> cluster. We could probably try it on a sample dataset in single-machine
> mode first and then extend it to the full dataset. I'd also welcome any
> approach for cross-checking the matched properties other than manual
> review.
>
> Cheers,
> Ritesh Kumar Singh
>
>
> ------------------------------------------------------------------------------
> Dive into the World of Parallel Programming The Go Parallel Website, sponsored
> by Intel and developed in partnership with Slashdot Media, is your hub for all
> things parallel software development, from weekly thought leadership blogs to
> news, videos, case studies, tutorials and more. Take a look and join the
> conversation now. http://goparallel.sourceforge.net/
>
>
>
> _______________________________________________
> Dbpedia-gsoc mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc
>
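
To make the instance-based idea in the quoted message concrete, here is a
minimal sketch. It is not the method from the paper mentioned above. It
assumes that subjects in both datasets are already linked to shared
identifiers and that values are normalized, and it uses plain Jaccard overlap
as the similarity score. The property names (src:..., tgt:...), the toy
triples, and the 0.5 threshold are all hypothetical.

```python
from collections import defaultdict


def pairs_by_property(triples):
    """Group the (subject, value) pairs observed for each property."""
    grouped = defaultdict(set)
    for subject, prop, value in triples:
        grouped[prop].add((subject, value))
    return grouped


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_properties(source_triples, target_triples, threshold=0.5):
    """Return (source_prop, target_prop, score) for pairs at or above threshold.

    Two properties match when they relate the same subjects to the same
    values, regardless of what the properties are called.
    """
    source = pairs_by_property(source_triples)
    target = pairs_by_property(target_triples)
    matches = []
    for s_prop, s_pairs in source.items():
        for t_prop, t_pairs in target.items():
            score = jaccard(s_pairs, t_pairs)
            if score >= threshold:
                matches.append((s_prop, t_prop, score))
    # Highest-scoring candidates first.
    return sorted(matches, key=lambda m: -m[2])


# Hypothetical toy data: property names differ, instance data overlaps.
source_triples = [
    ("Berlin", "src:country", "Germany"),
    ("Paris", "src:country", "France"),
    ("Berlin", "src:population", "3500000"),
]
target_triples = [
    ("Berlin", "tgt:locatedIn", "Germany"),
    ("Paris", "tgt:locatedIn", "France"),
    ("Berlin", "tgt:name", "Berlin"),
]

matches = match_properties(source_triples, target_triples)
```

Because everything runs in memory on one machine, a sketch like this only
suits a small sample. The pairwise comparison is the part that would need to
be distributed for the full dataset.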

-- 
Marco Fossati
http://about.me/marco.fossati
Twitter: @hjfocs
Skype: hell_j
