Re: "Standard" UIMA typesystem

Peter Klügl Tue, 30 Aug 2016 07:39:47 -0700

Hi,


I don't think that providing a standard type system will enforce its
usage. Ruta already provides some type systems and it does not hurt at
all, e.g., normal UIMA users do not care about them.


If there is no standard type system, then people have two options: create
their own or reuse an existing type system of a component
repository, e.g., DKPro Core. As far as I know, LiMoSINe [1] moved from
their own type system to DKPro Core (I am waiting for some text to put on
our external resources page, in case they read this). I was also
thinking about switching our NLP components to the DKPro Core type
system, but there are several issues preventing that, first of all that
I cannot build it :-/


A standard type system will never fulfill all requirements of a special
interest group, but it could be a start. Even if only a small part is
shared, it could increase the interoperability.


There are two main questions:


- Can the community agree on what it should contain and how it is
defined? Only basic stuff like Tokens and Sentences? What about POS tags?
Should coarse and fine-grained tags be represented on feature level or
type level? Which variant of a universal tagset: UD, Google, ...? What
about inter-linkage of annotations?

- Will it be adopted by the community? Changing the type system is
really a lot of work, especially if you have to support everything that
you did before. I wonder if it can survive if DKPro Core does not adopt
it. I could imagine that we (Averbis) are somewhat open to adopting parts
of a standard type system, as I am planning to change our type system
anyway.
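
To make the feature-level vs. type-level question for POS tags concrete,
here is a minimal sketch in plain Python. It is not UIMA API and not any
existing type system; the class names (PosFeature, Pos, Noun, Verb) are
made up for illustration, and the tags are just example values from the
Penn Treebank and universal tagsets.

```python
from dataclasses import dataclass


# Variant A (feature level): one annotation type; the coarse and the
# fine-grained tag are both stored as features.
@dataclass
class PosFeature:
    begin: int
    end: int
    fine: str    # e.g. Penn Treebank "NNS"
    coarse: str  # e.g. universal tag "NOUN"


# Variant B (type level): the coarse category is the type itself;
# only the fine-grained tag remains a feature.
@dataclass
class Pos:
    begin: int
    end: int
    fine: str


class Noun(Pos):
    pass


class Verb(Pos):
    pass


annotations_feature = [
    PosFeature(0, 4, "NNS", "NOUN"),
    PosFeature(5, 9, "VBP", "VERB"),
]
annotations_type = [Noun(0, 4, "NNS"), Verb(5, 9, "VBP")]

# Selecting all nouns: compare a feature value vs. check the type.
nouns_by_feature = [a for a in annotations_feature if a.coarse == "NOUN"]
nouns_by_type = [a for a in annotations_type if isinstance(a, Noun)]
```

In variant A a new coarse tagset only changes feature values; in variant B
it changes the type hierarchy, which is the kind of decision a shared type
system would have to fix once for everybody.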


btw, in my experience, converting annotations between type systems within
a pipeline can easily become a performance bottleneck.
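
A rough sketch of why, in plain Python rather than UIMA code: a converter
has to visit every annotation and create a copy in the target type system,
so each conversion step costs time and memory proportional to the number of
annotations. The Annotation class, the type names and TYPE_MAP are
hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    type_name: str
    begin: int
    end: int


# Hypothetical mapping from the type names of system "a" to system "b".
TYPE_MAP = {
    "a.Token": "b.Token",
    "a.Sentence": "b.Sentence",
}


def convert(annotations, type_map):
    """Copy every annotation that has a mapping into the target type system."""
    converted = []
    for ann in annotations:  # one pass over all annotations per conversion
        target = type_map.get(ann.type_name)
        if target is not None:
            converted.append(Annotation(target, ann.begin, ann.end))
    return converted


source = [
    Annotation("a.Sentence", 0, 9),
    Annotation("a.Token", 0, 4),
    Annotation("a.Token", 5, 9),
    Annotation("a.Other", 0, 9),  # no mapping, so it is dropped
]
target = convert(source, TYPE_MAP)
```

If components with different type systems alternate in a pipeline, such a
pass is needed before each of them, and the copies add up.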


Best,


Peter


[1] https://aclweb.org/anthology/P/P16/P16-4027.pdf


On 30.08.2016 at 15:59, Richard Eckart de Castilho wrote:
> While I think that an endorsed type system is a good idea, I still wonder...
>
> As far as I understood, UIMA has always been advertised as an "empty" framework
> that does explicitly not prescribe a particular type system - probably to underline
> its flexibility. Would that not suffer if UIMA itself provided a standard typesystem?
>
> Cheers,
>
> -- Richard
>
>> On 30.08.2016, at 15:56, Marshall Schor wrote:
>>
>> This is a great idea.  The key will be in discovering and using a workable
>> "crowd-sourced" (?) process (and perhaps supporting tooling :-) ) that lets a
>> diverse set of people with somewhat aligned interests converge on a shared
>> definition.
>>
>> -Marshall
>>
>> On 8/30/2016 5:40 AM, Jens Grivolla wrote:
>>> Hi all,
>>>
>>> at the LREC conference there were some brief discussions about pushing for
>>> a "standard" typesystem (and maybe some more things) to make combining UIMA
>>> annotators from different sources easier.
>>>
>>> While it is great that UIMA itself is a generic framework that is
>>> completely agnostic to the tasks it is used for, there are many users that
>>> want to be able to use existing analysis engines. Currently they are forced
>>> to either choose a specific component collection (DKPro, cTAKES, JCoRe,
>>> OpenNLP, ...) or write adapters to convert type systems.
>>>
>>> There was agreement between some of us (Richard, Peter, etc.) that it would
>>> be very helpful to guide component developers towards a shared type system
>>> to make adoption of UIMA easier and avoid fragmentation.
>>>
>>> Here are some suggestions on how to proceed:
>>>
>>> - go all in and have the UIMA project provide a type system (in the UIMA
>>> namespace)
>>> - develop an independent (unofficial) type system that is recommended on
>>> the UIMA web site
>>> - develop an unofficial type system and gather endorsements from a variety
>>> of institutions (UPF, UKP, JulieLab, Averbis, ...) so as to promote this
>>> type system.
>>>
>>> I think (and there was initial agreement on this) that DKPro's type system
>>> would be a good starting point (with some fixes).
>>>
>>> So, how does everybody feel about this, and how do we get started?
>>>
>>> Best,
>>> Jens
