Re: [GSOC Idea] Apache Stanbol to verify digital identity over social web

Rafa Haro Tue, 23 Apr 2013 08:07:57 -0700

Hi Dileepa,

On 23/04/13 13:45, Dileepa Jayakody wrote:

Hi Fabian et al,


Thanks a lot for your valuable ideas.
Yes it's really interesting to implement a 'person | organization'
disambiguation module using WebID protocol as part of Stanbol Enhancement
Engine. I went through the documentation of Stanbol and I have gained an
overall idea about the architecture of Stanbol.

+1. That's a great idea. I don't know very much about the WebID protocol,
but as long as you can use some profile data as disambiguation
contexts, it should be feasible to implement a disambiguation algorithm.
Could you please give us a concrete example where WebID is used? I
suppose that the general use case is to link name mentions in web pages
with digital identities. What kind of information is it possible to
gather from WebID identities?


It would be great to get more ideas, suggestions about how to use Stanbol
for people, organization disambiguation and to discuss the objectives and
milestones in the GSOC project idea at [1].
I also think one of the main factors for disambiguation is the
data-set/knowledge base used for the process. What data-set does Stanbol
use to verify data? Is the recently released Google Wiki-links corpus [1]
a candidate for a Stanbol data-set?

Initially, you can use any knowledge base in Stanbol. I always think of the
EntityHub component as a "Knowledge Base" management system, although
maybe formally the EntityHub is not exactly that. Anyway, Google
Wiki-links could be a good resource for disambiguation when the
knowledge base is Wikipedia or DBpedia. In fact, Wiki-links contains 40
million mentions and their contexts retrieved from web pages. This
information could eventually be added to a Wikipedia or DBpedia knowledge
base as disambiguation contexts for the entities covered in the dataset.
Another interesting resource, as the TechCrunch article points out, is the
dictionary of Wikipedia concepts released last year [1]. This resource
can be used to include more labels for each entity (possible names),
thus improving the candidate selection step. As always, we face a
recall/precision problem with such a dictionary.

[1] http://googleresearch.blogspot.com.es/2012/05/from-words-to-concepts-and-back.html
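
To make the two steps above a bit more concrete (candidate selection from a
label dictionary, then ranking candidates by their disambiguation contexts),
here is a minimal sketch in Python. Everything in it is an assumption made
for illustration: the LABELS and CONTEXTS tables, the "ex:" entity IDs and
the function names are hypothetical, and it does not use the Stanbol or
EntityHub APIs. The scoring is a plain word-overlap count, not a tuned
algorithm.

```python
import re
from collections import Counter

# Hypothetical label dictionary: surface form -> candidate entity IDs.
# More labels per entity raise recall but also add ambiguous candidates.
LABELS = {
    "john smith": ["ex:John_Smith_politician", "ex:John_Smith_musician"],
    "j. smith": ["ex:John_Smith_politician", "ex:John_Smith_musician"],
}

# Hypothetical disambiguation contexts: entity ID -> words seen around
# mentions of that entity (e.g. collected from a mentions corpus or a profile).
CONTEXTS = {
    "ex:John_Smith_politician": "parliament election minister party vote",
    "ex:John_Smith_musician": "album guitar band tour concert",
}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def select_candidates(mention):
    """Candidate selection: look the mention up in the label dictionary."""
    return LABELS.get(mention.lower(), [])


def score(entity, text_tokens):
    """Count how many tokens of the surrounding text occur in the entity context."""
    context = Counter(tokenize(CONTEXTS.get(entity, "")))
    return sum(context[token] for token in text_tokens)


def disambiguate(mention, surrounding_text):
    """Return the best-scoring candidate, or None if there are no candidates."""
    candidates = select_candidates(mention)
    if not candidates:
        return None
    tokens = tokenize(surrounding_text)
    # On a tie, max() keeps the first candidate in dictionary order.
    return max(candidates, key=lambda entity: score(entity, tokens))


if __name__ == "__main__":
    print(disambiguate("John Smith", "He released a new album before the tour."))
```

With such a table, adding more labels per entity lets more mentions find
candidates (better recall), but each extra ambiguous label also gives the
ranking step more chances to pick the wrong entity (worse precision).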


Regards!


Thanks,
Dileepa

[1]
http://techcrunch.com/2013/03/08/google-research-releases-wikilinks-corpus-with-40m-mentions-and-3m-entities/

On Mon, Apr 22, 2013 at 7:32 PM, Fabian Christ <[email protected]> wrote:

Hi,

2013/4/22 Dileepa Jayakody <[email protected]>:

Could it be a valid use-case to integrate WebID protocol in Stanbol to
create social graphs and related ontologies?

the already mentioned entity disambiguation for persons might be such
a use case.

Another idea could be that the enhancement process uses some
information from the personal profile of the user who sends the
request. I do not have any concrete example at the moment but engines
might be interested in knowing who is sending an enhancement request.
This may also be relevant information for the disambiguation task.

Best,
   - Fabian


--
Fabian
http://twitter.com/fctwitt



