Re: GSoC disambiguation project, any news? (was: GSoC project accepted)

Rupert Westenthaler Fri, 13 Jul 2012 00:20:22 -0700

Hi Kritarth, all

On Thu, Jul 12, 2012 at 4:51 PM, kritarth anand
<[email protected]> wrote:
> Hi Bertrand,
>
> The project is going well.
>
> We now have a working version of an entity disambiguation engine, which is
> based on a simple algorithm. It works well for very simple cases. It does
> require some code cleanup, and I will be sharing it (and my midterm report)
> as an update with you guys in a day or two. Rupert is reviewing it as of now.
>

Kritarth, can you have a look at my pull request [1]? I needed to make
some adaptations to make it work with the API changes of the Entityhub
introduced by STANBOL-673. With those changes the Engine runs fine in
the Stanbol trunk.

There are still some limitations and issues, but it is worth a try (I
recommend downloading and installing one of the bigger DBpedia indexes
for testing).

Kritarth, can you please provide a short how-to for installing, running,
and testing your engine?

[1] https://github.com/kritarthanand/Disambiguation-Stanbol/pull/1

> The initial part of my project was mainly familiarizing myself with Stanbol,
> getting background on entity disambiguation, getting a simple version
> running, etc., and doing some reading to get ideas about the possible
> choices of algorithm. I had issues with those but was mainly interacting
> one-on-one with Rupert and Anuj.
>
> However, for the later part of my project I will be making decisions on
> which algorithms to choose and on many related concerns, and therefore I am
> hoping to interact a lot more with the entire Stanbol community to get
> their views and feedback. I am looking forward to it.
>

I completely agree with that. It is critical to discuss your ideas with
the community, especially to get more feedback on how to apply those
ideas to Apache Stanbol.

For that I see two things that need discussion/feedback from the community:

* Most research papers use Wikipedia/DBpedia as test data, but
Stanbol users more often tend to use company/domain-specific
controlled vocabularies.
* How can we (or do we need to) adapt/improve Stanbol to collect/provide
the information needed by those algorithms?

Next Steps:

* Improve the current Engine in a 2nd iteration (I think we should
create a Jira issue for that)
* Discuss other disambiguation possibilities here on the list and
select one to be implemented

Kritarth, thanks for your work so far.
best
Rupert

> Kritarth
>
>
> On Thu, Jul 12, 2012 at 6:19 PM, Bertrand Delacretaz
> <[email protected]> wrote:
>
>> Hi,
>>
>> On Tue, Apr 24, 2012 at 9:22 AM, Bertrand Delacretaz
>> <[email protected]> wrote:
>> > ...According to [1], Kritarth Anand's GSoC entity disambiguation project
>> > has been accepted, congrats!...
>>
>> How's that project going forward? I don't remember seeing any
>> discussions about it here, did I miss something?
>>
>> -Bertrand
>>

-- 
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
