Hi All,

Thanks a lot for letting me participate in this project and become part of this community. With your help, I'll work hard to finish the project (I hope so) and (why not) to become a new contributor to Stanbol :)

My proposal abstract: Freebase Entity Disambiguation Engine In Apache Stanbol

In the Web of Data, information is structured and connected, which makes it more feasible to guide users to the correct resources, because these resources are properly described in a language that a machine can understand. To make the Web of Data a reality, it is necessary to structure all the unstructured information. The structuring or semantic enrichment process usually involves the recognition of complex entities and concepts. The extracted entities need to be linked with entries in “real world” knowledge bases in order to acquire their semantics. The task of associating entity mentions in texts with Knowledge Base entries is commonly known as Entity Linking. The most complex issue in Entity Linking is entity disambiguation. Entity Disambiguation tries to resolve the synonymy and homonymy of name mentions, i.e., the fact that an entity can have many different names and the fact that the same name can refer to more than one entity.

The main goal of the present proposal is to develop a disambiguation engine for the open-source project Apache Stanbol using Freebase as the Knowledge Base. Integrating Freebase into Stanbol as a Knowledge Base is not an easy task. Apache Stanbol provides a set of reusable components for Semantic Content Management. One such component is the Content Enhancer, which can be used to extract concepts and entities from texts and link them with any Knowledge Base registered in Stanbol. Apache Stanbol already manages other semantic databases such as DBpedia and Geonames. The GSoC project would contribute all the developments necessary to fully support Freebase in Stanbol, including disambiguation engines for this Knowledge Base.

Freebase is an open, Creative Commons licensed repository of structured data of almost 23 million entities. An entity is a single person, place, or thing. Freebase connects entities together as a graph.
At the time of writing, Freebase contains more than 37 million topics, 1,998 types, and more than 30,000 properties. This is not a small database by any measure.

Thanks
Antonio

On Tue, May 28, 2013 at 4:58 AM, Dileepa Jayakody <[email protected]> wrote:

> Hi All,
>
> Thanks a lot for the wishes and for selecting me to become part of this
> amazing community. :)
> I hope to do my best in this summer project with all your help and
> guidance, and hopefully become a continuous contributor to Stanbol.
>
> My proposal abstract: FOAF Co-reference Based Entity Disambiguation Engine
> In Apache Stanbol
>
> The proposed project focuses on developing an 'Entity Disambiguation
> Engine' in Apache Stanbol by computing co-referent relations in
> friend-of-a-friend (FOAF) data-sets. The same entity (persons,
> organizations) can be referred to by different names and vice-versa on the
> web, which leads to the 'name ambiguity' problem of entities. This problem
> can affect the accuracy and relevance of results inferred by semantic
> engines and leads to the requirement of using effective disambiguation
> techniques to process entities as part of the enhancement process in the
> semantic engines. This proposal focuses on using FOAF profiles as a
> datasource and processing them to resolve the name ambiguity problem in an
> effective way.
>
>
> FOAF is a vocabulary used to describe people, organizations and groups in
> the form of linked data to form an entity network on the web. The
> relationships between these FOAF instances can be very useful to derive new
> knowledge about entities using semantic techniques. The co-reference
> analysis can use FOAF attributes such as mbox, homepage and weblog as unique
> identifiers to match FOAF instances, identify co-referent clusters and
> use them to disambiguate entities over the web. This project aims to develop
> a comprehensive disambiguation algorithm by identifying and clustering
> co-referent FOAF instances which describe the same entity over the web.
>
>
> Thanks,
>
> Dileepa
>
>
> On Tue, May 28, 2013 at 1:01 AM, Rupert Westenthaler <[email protected]> wrote:
>
> > Hi all,
> >
> > Congratulations to Antonio and Dileepa! Great news for Stanbol. Thanks
> > for your interest in Stanbol and the great proposals. This will be an
> > exciting coding summer.
> >
> > Will try my best as a Mentor
> >
> > best
> > Rupert
> >
> > p.s. Antonio, Dileepa: It would be cool if you could provide a summary
> > of your proposals here on the list ^^
> >
> >
> > On Mon, May 27, 2013 at 9:21 PM, Rafa Haro <[email protected]> wrote:
> > > Nice!!!
> > >
> > > Congratulations to the students!! Now is when the fun stuff starts!!
> > >
> > >
> > > On Monday, May 27, 2013, Andreas Kuckartz wrote:
> > >
> > >> A few minutes ago Google announced the selected GSoC projects. These two
> > >> Stanbol proposals were selected:
> > >>
> > >> Freebase Entity Disambiguation in Apache Stanbol
> > >> Antonio David Perez Morales
> > >>
> > >> http://www.google-melange.com/gsoc/project/google/gsoc2013/adperezmorales/10001
> > >>
> > >> FOAF Co-reference Based Entity Disambiguation Engine In Apache Stanbol
> > >> Dileepa Jayakody
> > >>
> > >> http://www.google-melange.com/gsoc/project/google/gsoc2013/dileepaj/14001
> > >>
> > >> Congratulations to the two students!
> > >>
> > >> Cheers,
> > >> Andreas
> > >>
> > >
> > > --
> > >
> > > ------------------------------
> > > This message should be regarded as confidential. If you have received this
> > > email in error please notify the sender and destroy it immediately.
> > > Statements of intent shall only become binding when confirmed in hard copy
> > > by an authorised signatory.
> > >
> > > Zaizi Ltd is registered in England and Wales with the registration number
> > > 6440931. The Registered Office is 222 Westbourne Studios, 242 Acklam Road,
> > > London W10 5JJ, UK.
> > >
> > >
> > --
> > | Rupert Westenthaler [email protected]
> > | Bodenlehenstraße 11 ++43-699-11108907
> > | A-5500 Bischofshofen
> >
>
