Re: Handling type conversion (was Re: "Standard" UIMA typesystem)

Joern Kottmann Mon, 12 Sep 2016 06:54:12 -0700

On Mon, Sep 12, 2016 at 12:00 PM, Richard Eckart de Castilho <[email protected]> wrote:


> > On 12.09.2016, at 11:52, Joern Kottmann <[email protected]> wrote:
> >
> >> On Sun, Sep 11, 2016 at 3:38 PM, Peter Klügl <[email protected]>
> >> wrote:
> >>
> >>> On 09.09.2016 at 23:24, Joern Kottmann wrote:
> >>>
> >>> A framework like UIMA has to make it easy to reuse components and in my
> >>> opinion strict compile time typing makes that really difficult to
> >>> achieve.
> >>
> >> Components are already very reusable if they use the same typesystem.
> >> Again, that has nothing to do with compile time typing.
> >>
> >> And, this is not the only purpose but there are many more, e.g., allow
> >> the developer to create large maintainable pipelines.
> >> In my opinion, components in UIMA are much more reusable because of the
> >> static typing, not just throw-away prototypes.
> >
> > I strongly disagree here. I think the really static type system (and with
> > JCas even compile-time static) in UIMA makes it hard to reuse a component,
> > because in many cases I need to write explicit type system converters to
> > be able to use it.
>
> IMHO type converters are necessary whether or not the type system is
> compiled statically. You seem to want per-component converters (what you
> call adapters). I personally prefer converters at the beginning and end of
> pipeline sections (which can be realized e.g. through collection readers,
> CAS consumers or CAS multipliers).
>
> Regarding adapters: IMHO a UIMA component largely *is* the adapter between
> type system X and underlying implementation Y. My hypothesis is that if
> there were a generic configurable mechanism by which this mapping
> functionality could be externalized from a UIMA component, then this
> mechanism would have the same level of complexity as the Java code which
> usually fulfills this purpose in a component. Furthermore, I expect the
> remaining component code to become largely trivial then. - OR - if the
> mapping functionality is reduced in order to become simpler, then it would
> mean certain type system designs are not supported (cf. OpenNLP type
> mapping not being compatible with the DKPro Core type system and others).
>


Today you would write pairs of converters and place them as strategically
as possible in your pipeline, right? So you would want to group AEs that
share a type system in one place.

The OpenNLP UIMA annotators are built in a more generic way and only make
certain assumptions about the type system, e.g. that it has token and
sentence annotations. The user has to configure a type and feature mapping
in the XML descriptor. This works for many cases, but not for all.
There are definitely cases where type mapping isn't enough, e.g. outputting
the best POS tag versus a list of the n best POS tags. For those cases I
propose to use adapters which can adapt the component to the type system the
user is using. Those adapters could also handle the type/feature mapping case.
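
To make the best-tag versus n-best case concrete, here is a minimal sketch in
plain Java. Everything in it (SimpleFs, PosOutputAdapter, BestTagAdapter,
NBestTagAdapter, and the type and feature names) is made up for illustration;
none of it is existing UIMA or OpenNLP API. It only shows that renaming is
enough for the first adapter, while the second one changes the shape of the
output.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a feature structure: a type name plus feature values.
final class SimpleFs {
    final String typeName;
    final Map<String, Object> features = new LinkedHashMap<>();

    SimpleFs(String typeName) {
        this.typeName = typeName;
    }
}

// Hypothetical contract: turn the tagger's ranked tags (best first, non-empty)
// into whatever the user's type system expects.
interface PosOutputAdapter {
    SimpleFs adapt(List<String> rankedTags);
}

// Case 1: only names differ. A type/feature mapping in a descriptor could
// express this without any code.
final class BestTagAdapter implements PosOutputAdapter {
    private final String typeName;
    private final String tagFeature;

    BestTagAdapter(String typeName, String tagFeature) {
        this.typeName = typeName;
        this.tagFeature = tagFeature;
    }

    @Override
    public SimpleFs adapt(List<String> rankedTags) {
        SimpleFs fs = new SimpleFs(typeName);
        fs.features.put(tagFeature, rankedTags.get(0)); // keep only the best tag
        return fs;
    }
}

// Case 2: the structure differs. The user's type system wants a list of the
// n best tags, which a pure name mapping cannot produce.
final class NBestTagAdapter implements PosOutputAdapter {
    private final String typeName;
    private final String listFeature;
    private final int n;

    NBestTagAdapter(String typeName, String listFeature, int n) {
        this.typeName = typeName;
        this.listFeature = listFeature;
        this.n = n;
    }

    @Override
    public SimpleFs adapt(List<String> rankedTags) {
        SimpleFs fs = new SimpleFs(typeName);
        int limit = Math.min(n, rankedTags.size());
        List<String> top = new ArrayList<>(rankedTags.subList(0, limit));
        fs.features.put(listFeature, top);
        return fs;
    }
}
```

The tagger itself would stay untouched in both cases; only the adapter the user
plugs in decides which of the two shapes ends up in the CAS.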

I think if the adapters have support from the framework we could come up
with certain tricks that are not possible with a converter AE.
A converter AE needs to duplicate all (or probably most) of the inputs and
outputs for the conversion. This means everything needs to be copied at
least once.

With special APIs you could probably do the following things:
- Define type name mappings, so that type A looks like type B to the AE
- Define functions which are used to access the features of an FS (the
function can map the feature value to something new) and let the CAS APIs
take care of calling them
- Define functions which convert an entire FS of type A into an FS of type
B and let the CAS APIs take care of calling them
- It could be possible to define adapters for AAEs as well (AEs with the
same TS could be grouped)
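
A rough sketch of the first two points, again in plain Java with invented names
(StoredFs, TypeAdapter, AdaptedFs, mapType, mapFeature are all hypothetical and
not part of any UIMA API). The assumption is that the framework would hand the
AE a thin view over the stored FS and resolve type names and feature reads
through the registered mappings, so nothing is copied.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a feature structure as stored in the CAS.
final class StoredFs {
    final String typeName;
    final Map<String, Object> features = new HashMap<>();

    StoredFs(String typeName) {
        this.typeName = typeName;
    }
}

// Hypothetical registry the framework would consult on every access.
final class TypeAdapter {
    // stored type name -> type name the AE expects to see
    private final Map<String, String> typeAliases = new HashMap<>();
    // "aeType:feature" -> function computing the value the AE should see
    private final Map<String, Function<StoredFs, Object>> accessors = new HashMap<>();

    TypeAdapter mapType(String storedType, String aeType) {
        typeAliases.put(storedType, aeType);
        return this;
    }

    TypeAdapter mapFeature(String aeType, String aeFeature,
                           Function<StoredFs, Object> accessor) {
        accessors.put(aeType + ":" + aeFeature, accessor);
        return this;
    }

    String visibleType(String storedType) {
        return typeAliases.getOrDefault(storedType, storedType);
    }

    Object read(StoredFs fs, String aeFeature) {
        Function<StoredFs, Object> accessor =
                accessors.get(visibleType(fs.typeName) + ":" + aeFeature);
        // Fall back to the stored value when no accessor is registered.
        return accessor != null ? accessor.apply(fs) : fs.features.get(aeFeature);
    }

    // Wraps the stored FS; no feature values are duplicated.
    AdaptedFs view(StoredFs fs) {
        return new AdaptedFs(fs, this);
    }
}

// What the AE would work with: a view that delegates every read.
final class AdaptedFs {
    private final StoredFs delegate;
    private final TypeAdapter adapter;

    AdaptedFs(StoredFs delegate, TypeAdapter adapter) {
        this.delegate = delegate;
        this.adapter = adapter;
    }

    String getTypeName() {
        return adapter.visibleType(delegate.typeName);
    }

    Object getFeature(String name) {
        return adapter.read(delegate, name);
    }
}
```

Because AdaptedFs only delegates, a later change to the stored FS is visible
through the view immediately, which is the part a converter AE cannot do
without copying.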


> Type conversion is an entirely separate thing from JCas classes and managing
> different JCas wrappers at the level of classloaders. Here, UIMA offers
> the PEAR solution and I think a constructive discussion could revolve
> around how PEARs can be improved or replaced by a superior approach.
>
> > The alternative to this would be a type system which is much less static
> > (or dynamic) and APIs to write AEs which can adapt well to similar but
> > different user defined type systems. This could be achieved by allowing
> > type system mappings, by adding explicit support for adapters in the
> > framework, allowing dynamic definition of types,
> >
> > Together with Thilo I wrote a paper which speaks a bit about this topic
> > (see at 6.4):
> > http://www.aclweb.org/anthology/W14-5209
>
> A more dynamic approach to the type system would be great, in particular
> the ability to add types and features at execution time. We have an API
> that in principle supports this (CAS). Again, this is decoupled from JCas
> which is a higher-level API than CAS is.
>

I agree. I think that can be useful in many situations; one example is the
output of debug or log FSes.

Jörn
