Hi Abhishek, thanks for the contribution. Your suggestions are pretty much
aligned with what we were thinking in any event, and the initial plan
seems good.
On the assumption that there's some code that generates extra possible
surface forms from a canonical surface form, like your 'Michael Jordan' ->
'M. Jordan', 'Jordan' and so on example, it would be worth looking in the
Machine Translation literature for ways to assign a score to each
surface form. That is, if you spot 'M Jordan' in the text, what is the
probability of it being a "translation" of the canonical name 'Michael
Jordan'? If there's a simple way to implement this, we could try to get
the raw data with counts, generate some extra sfs in a principled manner and
use that to calculate probabilities. Still, for the moment I'd focus on
setting the spotlight server up and playing with the warm-up tasks.
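
To make the idea concrete, here is a rough sketch of what I have in mind. It
is only an illustration, not how Spotlight does it: generate_surface_forms
and surface_form_probabilities are hypothetical names, the variant rules are
just the ones from your example, and the counts are made up. It also assumes
each canonical name's count is split evenly over its generated forms, which
is a crude stand-in for a proper translation model.

```python
from collections import defaultdict


def generate_surface_forms(canonical):
    """Hypothetical generator: 'Michael Jordan' -> 'M. Jordan', 'M Jordan', 'Jordan'."""
    tokens = canonical.split()
    forms = {canonical}
    if len(tokens) >= 2:
        first, last = tokens[0], tokens[-1]
        forms.add(f"{first[0]}. {last}")  # initial with period
        forms.add(f"{first[0]} {last}")   # initial without period
        forms.add(last)                   # last name only
    return forms


def surface_form_probabilities(entity_counts):
    """Estimate P(canonical | surface form) from raw counts per canonical name.

    Assumption: each canonical name spreads its count uniformly over the
    forms generated for it.
    """
    mass = defaultdict(lambda: defaultdict(float))
    for canonical, count in entity_counts.items():
        forms = generate_surface_forms(canonical)
        share = count / len(forms)
        for sf in forms:
            mass[sf][canonical] += share

    probs = {}
    for sf, by_entity in mass.items():
        total = sum(by_entity.values())
        if total > 0:  # skip forms whose candidates all have zero counts
            probs[sf] = {e: m / total for e, m in by_entity.items()}
    return probs


# Made-up counts, for illustration only.
counts = {"Michael Jordan": 900, "Barbara Jordan": 100}
probs = surface_form_probabilities(counts)
print(probs["M Jordan"])
print(probs["Jordan"])
```

With those made-up counts, 'M Jordan' can only come from 'Michael Jordan',
while the ambiguous 'Jordan' is shared between both names in proportion to
their counts.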
Thanks for the good work,
Thiago
_______________________________________________
Dbpedia-gsoc mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc