Hi,

I'm Ritesh, a third-year undergraduate in Computer Science. I've been working
with the DBpedia framework and its codebase for some time. Here's a link to my
GitHub profile: <https://github.com/gone-phishing>. I would love to work on
improving DBpedia's mapping support by aligning its existing properties with
the Freebase data schema and adding new classes wherever required.

The syntactic equivalence method has been reported to yield better results
than the dictionary-based and string-matching algorithms, giving higher F1
scores. It can match two properties that are neither identical nor synonymous,
because it compares their subject and instance values, and this reduces false
positives. The paper used the Apache Hadoop framework on a 14-node cluster. I
have worked with Apache Spark (using Scala), which could give a better speedup
here, but I'm not sure we can get a cluster that large. We could probably try
it on a sample dataset on a single machine first and then extend it to the
full dataset. I'd also welcome any suggestions for cross-checking the matched
properties other than by manual inspection.
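
To make the idea concrete, here is a rough single-machine sketch in Python of
how I picture instance-based matching on a small sample. It is not the paper's
algorithm: I'm assuming Jaccard overlap of (subject, value) pairs as the
similarity measure, a threshold of 0.5, and subjects that are already aligned
across the two datasets. The property names and triples are made-up toy data,
and the gold set stands in for a small hand-checked sample.

```python
from collections import defaultdict


def pairs_by_property(triples):
    """Group the (subject, value) pairs seen under each property."""
    grouped = defaultdict(set)
    for subject, prop, value in triples:
        grouped[prop].add((subject, value))
    return grouped


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_properties(left_triples, right_triples, threshold=0.5):
    """Pair up properties whose (subject, value) sets overlap enough.

    The threshold is an assumed tuning knob, not a value from the paper.
    """
    left = pairs_by_property(left_triples)
    right = pairs_by_property(right_triples)
    return {
        (lp, rp)
        for lp, lpairs in left.items()
        for rp, rpairs in right.items()
        if jaccard(lpairs, rpairs) >= threshold
    }


def f1_score(predicted, gold):
    """F1 of predicted property pairs against a gold set of pairs."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy triples (subject, property, value); names are illustrative only.
dbpedia_sample = [
    ("Berlin", "dbo:country", "Germany"),
    ("Paris", "dbo:country", "France"),
    ("Berlin", "dbo:populationTotal", "3500000"),
]
freebase_sample = [
    ("Berlin", "fb:containedby", "Germany"),
    ("Paris", "fb:containedby", "France"),
    ("Berlin", "fb:population", "3500000"),
    ("Paris", "fb:population", "2200000"),
]

matches = match_properties(dbpedia_sample, freebase_sample)

# Hypothetical hand-checked sample used only to illustrate the scoring.
gold_sample = {("dbo:country", "fb:containedby")}
score = f1_score(matches, gold_sample)
```

The pairwise comparison over properties is the part I would expect to move to
Spark for the full dataset. Scoring against a small hand-labelled sample like
this still depends on some manual work, so it only reduces the manual checking
rather than replacing it.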

Cheers,
Ritesh Kumar Singh
_______________________________________________
Dbpedia-gsoc mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc