Hi Dileepa,

On 23/04/13 13:45, Dileepa Jayakody wrote:
Hi Fabian et al,

Thanks a lot for your valuable ideas.
Yes, it would be really interesting to implement a 'person | organization'
disambiguation module using the WebID protocol as part of a Stanbol
Enhancement Engine. I went through the Stanbol documentation and have gained
an overall idea of its architecture.
+1. That's a great idea. I don't know much about the WebID protocol, but as long as you can use some profile data as disambiguation context, it should be feasible to implement a disambiguation algorithm. Could you please give us a concrete example where WebID is used? I suppose the general use case is to link name mentions in web pages with digital identities. What kind of information can be gathered from WebID identities?

It would be great to get more ideas and suggestions about how to use Stanbol
for people and organization disambiguation, and to discuss the objectives and
milestones in the GSOC project idea at [1].
I also think one of the main factors for disambiguation is the
data-set/knowledge base used for the process. What data-set does Stanbol
use to verify data? Is the recently released Google Wiki-links corpus [1] a
candidate for a Stanbol data-set?
In principle, you can use any knowledge base in Stanbol. I always think of the EntityHub component as a "knowledge base" management system, although formally the EntityHub may not be exactly that. Anyway, Google Wiki-links could be a good resource for disambiguation when the knowledge base is Wikipedia or DBpedia. In fact, Wiki-links contains 40 million mentions and their contexts, retrieved from web pages. This information could eventually be added to a Wikipedia or DBpedia knowledge base as disambiguation contexts for the entities covered in the dataset. Another interesting resource, as the TechCrunch article points out, is the dictionary of Wikipedia concepts released last year [1]. This resource can be used to include more labels (possible names) for each entity, thereby improving the candidate selection step. As always, we face a recall/precision trade-off with such a dictionary: more labels raise recall but also bring in more ambiguous candidates.

[1] - http://googleresearch.blogspot.com.es/2012/05/from-words-to-concepts-and-back.html
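
To make the two steps above more concrete, here is a minimal sketch in plain Python (not Stanbol or EntityHub code) of candidate selection from a label dictionary followed by ranking against stored disambiguation contexts. Everything in it is an assumption for illustration only: the toy data, the names LABELS, CONTEXTS, select_candidates and disambiguate, and the prior_weight parameter are hypothetical, and cosine similarity over word counts is just one simple way to compare contexts.

```python
import math
import re
from collections import Counter

# Hypothetical toy data. A real setup would load these from resources such as
# a label dictionary and a corpus of mention contexts.
# Label dictionary: surface form -> {entity: prior score for that form}.
LABELS = {
    "paris": {"Paris_(France)": 0.9, "Paris_Hilton": 0.1},
}
# Disambiguation contexts: entity -> counts of words seen around its mentions.
CONTEXTS = {
    "Paris_(France)": Counter({"france": 5, "city": 4, "capital": 3}),
    "Paris_Hilton": Counter({"hotel": 4, "heiress": 3, "celebrity": 3}),
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_candidates(mention):
    """Candidate selection: look the mention up in the label dictionary."""
    return LABELS.get(mention.lower(), {})


def disambiguate(mention, text, prior_weight=0.3):
    """Rank candidates by a weighted mix of label prior and context similarity."""
    doc = Counter(tokenize(text))
    scores = {}
    for entity, prior in select_candidates(mention).items():
        similarity = cosine(doc, CONTEXTS.get(entity, Counter()))
        scores[entity] = prior_weight * prior + (1 - prior_weight) * similarity
    if not scores:
        return None  # no label matched, so there is nothing to rank
    return max(scores, key=scores.get)
```

Calling disambiguate("Paris", "Paris is the capital city of France") picks the city, while a text about an heiress opening a hotel picks the person despite her lower prior. Adding more labels per entity makes select_candidates return more candidates, which is where the recall/precision trade-off shows up.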

Regards!

Thanks,
Dileepa

[1]
http://techcrunch.com/2013/03/08/google-research-releases-wikilinks-corpus-with-40m-mentions-and-3m-entities/
On Mon, Apr 22, 2013 at 7:32 PM, Fabian Christ <[email protected]> wrote:
Hi,

2013/4/22 Dileepa Jayakody <[email protected]>:
Could it be a valid use-case to integrate WebID protocol in Stanbol to
create social graphs and related ontologies?
The already mentioned entity disambiguation for persons might be such
a use case.

Another idea could be that the enhancement process uses some
information from the personal profile of the user who sends the
request. I do not have a concrete example at the moment, but engines
might be interested in knowing who is sending an enhancement request.
This may also be relevant information for the disambiguation task.

Best,
   - Fabian


--
Fabian
http://twitter.com/fctwitt



