Re: DBpedia Spotlight and Stanbol

Iavor Jelev Mon, 04 Jun 2012 04:57:08 -0700

Hi Rupert,

thanks for your suggestions! I will look into incorporating them in our
implementation.


On 04.06.2012 13:10, Rupert Westenthaler wrote:
> No, much simpler. Just use the
> 
>     @Component(configurationFactory = true, [..])
> 
> to allow users to configure multiple instances of the DBpedia Spotlight
> Spotting engine. They can then configure different spotting algorithms
> for the different instances.
> 
> Note also that you can configure the Engines and the chain by using the
> 
>     <Install-Path>{path}</Install-Path>
> 
> bundle extension.
> 
> The OSGi service configuration then needs to be located at
> 
>     {module}/src/main/resources/{path}
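
To make sure I understand the idea of several configured instances, here
is a plain-Java sketch (no OSGi or Stanbol API involved). The class name
and the "name" and "spotter" property keys are my own placeholders, not
the actual engine configuration. The point is only that one class yields
one engine instance per configuration, each with its own spotting
algorithm.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java illustration only: no OSGi, no Stanbol API.
// Class name and property keys are hypothetical.
final class SpottingEngineSketch {
    private final String name;    // name of this engine instance
    private final String spotter; // spotting algorithm used by this instance

    SpottingEngineSketch(Map<String, String> config) {
        this.name = require(config, "name");
        this.spotter = require(config, "spotter");
    }

    // Reject configurations that lack a mandatory property.
    private static String require(Map<String, String> config, String key) {
        String value = config.get(key);
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("missing configuration property: " + key);
        }
        return value;
    }

    String getName() { return name; }

    String getSpotter() { return spotter; }

    // Creates one engine instance per configuration, keyed by instance name.
    static Map<String, SpottingEngineSketch> createAll(Iterable<Map<String, String>> configs) {
        Map<String, SpottingEngineSketch> engines = new LinkedHashMap<>();
        for (Map<String, String> config : configs) {
            SpottingEngineSketch engine = new SpottingEngineSketch(config);
            engines.put(engine.getName(), engine);
        }
        return engines;
    }
}
```

With two configurations, e.g. one with spotter "NER" and one with "Kea",
createAll returns two independent engine instances that a chain could
refer to by name.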


best,
Iavor

>>
>> On Mon, Jun 4, 2012 at 11:45 AM, Rupert Westenthaler <
>> [email protected]> wrote:
>>
>>> Hi Pablo
>>>
>>> I made some tests and the spotting looks great. I also tried some of
>>> the different spotting algorithms (NER, LingPipeSpotter (very slow)
>>> and Kea).
>>>
>>> Here are some Questions/Suggestions related to the engine.
>>>
>>> 1. Do you think it might make sense to allow multiple EngineInstances
>>> using different Spotting algorithms?
>>>
>>> 2. I noticed that the created TextAnnotations do not have "dc-terms:type"
>>> information. This property is used by the Stanbol Enhancement
>>> Structure to represent the "nature" (e.g. Person, Organisation, Place
>>> in the case of Named Entities). So if such information is available, it
>>> would be great to set it.
>>>
>>> 3. I would suggest adding support for the type suggestion filter
>>> feature, as shown in the 2nd example of the user manual [1].
>>>
>>> [1] http://wiki.dbpedia.org/spotlight/usersmanual#h139-10
>>>
>>> On Fri, Jun 1, 2012 at 5:37 PM, Pablo Mendes <[email protected]>
>>> wrote:
>>>> Our next step is to create an enhancement chain with two enhancement
>>>> engines: DBpedia Spotlight Spotting and DBpedia Spotlight Disambiguation.
>>>
>>> So basically, to split this engine into two separate ones, right?
>>>
>>>> We have performed preliminary evaluations of the new enhancement engine
>>>> using the Stanbol Benchmark Component (SBC). The SBC allows evaluating
>>>> content enhancement engines based on examples of desired and undesired
>>>> behavior defined through Benchmark Definition Language (BDL) statements. We
>>>> have transformed the dataset from Kulkarni et al. 2009 [4] into BDL. The
>>>> BDL data set is available from:
>>>> http://spotlight.dbpedia.org/download/stanbol/
>>>>
>>>
>>> I have not yet had time to look at the examples in detail, but if the
>>> license of [4] allows it and you agree, we could think about making them
>>> available as part of the Stanbol Enhancer.
>>>
>>>> The SBC is a nice way to perform manual inspection of the behavior of the
>>>> enhancement chain for different examples in the evaluation dataset.
>>>> However, for evaluations with several hundreds of examples, it would be
>>>> interesting to have scores that summarize the performance for the entire
>>>> dataset. We are in the process of conducting large scale experiments with
>>>> existing datasets, aiming at producing precision and recall figures for
>>>> different enhancement chains.
>>>>
>>>
>>> This is completely true. Can you open a Jira issue about that? I
>>> will definitely help with implementing this.
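
For the summary scores, a minimal sketch of what such a figure could look
like, assuming each example in the dataset provides a set of expected
entity URIs and the chain returns a set of found ones. The class and
method names are hypothetical and this is not part of the SBC. It
micro-averages, i.e. it sums the counts over all examples before dividing.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the Stanbol Benchmark Component.
final class ChainScores {
    final double precision; // correct results / all results returned
    final double recall;    // correct results / all results expected

    private ChainScores(double precision, double recall) {
        this.precision = precision;
        this.recall = recall;
    }

    // expected.get(i) and found.get(i) belong to the same example.
    static ChainScores microAverage(List<Set<String>> expected, List<Set<String>> found) {
        if (expected.size() != found.size()) {
            throw new IllegalArgumentException("one result set per example is required");
        }
        long truePositives = 0;
        long totalFound = 0;
        long totalExpected = 0;
        for (int i = 0; i < expected.size(); i++) {
            // Entities that were both expected and found for this example.
            Set<String> hits = new HashSet<>(found.get(i));
            hits.retainAll(expected.get(i));
            truePositives += hits.size();
            totalFound += found.get(i).size();
            totalExpected += expected.get(i).size();
        }
        // Convention chosen for this sketch: an empty denominator gives 0.0.
        double p = totalFound == 0 ? 0.0 : (double) truePositives / totalFound;
        double r = totalExpected == 0 ? 0.0 : (double) truePositives / totalExpected;
        return new ChainScores(p, r);
    }
}
```

Running this once per enhancement chain over the same dataset would give
directly comparable precision and recall figures.
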
>>>
>>> best
>>> Rupert
>>>
>>>
>>>
>>> --
>>> | Rupert Westenthaler             [email protected]
>>> | Bodenlehenstraße 11                             ++43-699-11108907
>>> | A-5500 Bischofshofen
>>>
> 
> 
>
