Hi,
My name is Vivek and I am a third-year undergraduate student at the
Institute of Technology, BHU, in Varanasi, India. I am interested in
applying for a GSoC project at DBpedia. I have used Freebase for an
application that recommends movies and music, and I have recently become
interested in natural language processing.
I am specifically interested in the Spotting and Disambiguation ideas for
Spotlight. I have read some papers on entity extraction, pattern learning,
tf-idf, maxent classification, etc. I would like to know what major
improvements the Spotlight team is currently targeting in these two areas.
Have you considered using Redis as an alternative to Lucene? I implemented
a simple proof-of-concept tf*idf search with Redis here:
https://github.com/vivekn/redis-simple-search/blob/master/search.py
Or would Project Voldemort be better suited, since it is JVM-based?
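
To make the tf*idf idea concrete, here is a minimal sketch of the scoring
step. It is not the code from the linked repository: the class and method
names are hypothetical, and plain Python dicts stand in for the Redis
structures (one per-term mapping of document id to term frequency, which in
Redis could be kept as one sorted set per term). It also assumes each
document id is added only once.

```python
import math
from collections import Counter, defaultdict


class TinyTfIdfIndex:
    """In-memory stand-in for a Redis-backed index (hypothetical names)."""

    def __init__(self):
        # term -> {doc_id: term frequency}
        self.postings = defaultdict(dict)
        self.doc_count = 0

    def add(self, doc_id, text):
        """Index a document using whitespace tokenization."""
        counts = Counter(text.lower().split())
        total = sum(counts.values())
        if total == 0:
            return  # nothing to index
        for term, n in counts.items():
            # Normalized term frequency within this document.
            self.postings[term][doc_id] = n / total
        self.doc_count += 1

    def search(self, query):
        """Return (doc_id, score) pairs, highest tf*idf score first."""
        scores = defaultdict(float)
        for term in set(query.lower().split()):
            docs = self.postings.get(term, {})
            if not docs:
                continue
            # Rarer terms get a larger inverse document frequency.
            idf = math.log(self.doc_count / len(docs))
            for doc_id, tf in docs.items():
                scores[doc_id] += tf * idf
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


index = TinyTfIdfIndex()
index.add("d1", "berlin is a city")
index.add("d2", "paris is a city")
index.add("d3", "berlin berlin wall")
print(index.search("berlin"))
```

With a Redis backend, the per-term dicts would be replaced by reads and
writes against the server, but the scoring arithmetic would stay the same.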
It would be great if someone could provide some pointers for getting
started.
Thanks,
Vivek Narayanan
_______________________________________________
Dbp-spotlight-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users