Hi Rafa,

Great to see some movement on the disambiguation topic.

On Wed, Jan 30, 2013 at 12:16 PM, Rafa Haro <[email protected]> wrote:
> We wanted to propose in the list a first approach to a roadmap for
> disambiguation in Stanbol. In our opinion, a high-level list of tasks that
> should be done is the following:
>
> - Agree a disambiguation index model to store entities' surface forms and
> disambiguation contexts independent of the Knowledge Base, enabling also
> disambiguation with custom vocabularies.
>

I would also like such a model to support temporal and spatial contexts
in addition to the lexical context (surface forms), the full-text context
(mentions) and the formal context (relations to/from the Entity in the
knowledge base).

> - Design and develop tools for building such indexes, including an specific
> one for DBpedia - Wikipedia.
>

I would not limit this to DBpedia, but would also consider using
information from Freebase and YAGO for that task.

> - Maintain disambiguation-mlt as a baseline disambiguation algorithm and
> adapt it to work with the new designed index. Adapt it to work with last
> Enhancer Release and merge it with the trunk in SVN.
>

Updating the disambiguation-mlt branch to the now released versions of
Commons, Enhancer and Entityhub is on my TODO list. I also plan to work
on the few remaining issues to make the engine releasable (mainly
improving the engine's configuration).

> - Design and develop new disambiguation algorithms based on entities
> co-occurrence, graph representations and statistical models.
>

As those algorithms will be the main source of requirements for the
disambiguation index model, we might need to investigate them while
designing that model.

Thanks Rafa for taking up this important topic!

best
Rupert

>
> Any comment, feedback or ideas are more than welcome!!!
>
> Regards
>
> [1] - https://github.com/kritarthanand/Disambiguation-Stanbol
> [2] - https://github.com/dbpedia-spotlight/dbpedia-spotlight
> [3] -
> http://blog.iks-project.eu/dbpedia-spotlight-integration-in-apache-stanbol-2/

--
| Rupert Westenthaler [email protected]
| Bodenlehenstraße 11 ++43-699-11108907
| A-5500 Bischofshofen
