Re: Annotation Mapping Annotator

Marshall Schor Tue, 13 May 2008 12:02:11 -0700

The simple name translation capability I think could be provided by an extension to the base framework, using an "alias" notion. The concept would be to allow multiple type and feature names to map to the same internal type. This would allow the same internal CAS object to be referred to by the aliased names, without doing any actual copying of the object.
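
A minimal sketch of what such an alias table might look like. The class TypeAliases and its methods are hypothetical and are not part of the UIMA API; the two type names are the ones used as an example later in this thread. It only shows the lookup idea: several names resolve to one internal name, and nothing is copied.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical alias table: several external names resolve to one internal type name. */
class TypeAliases {
    // alias name -> internal (canonical) type name
    private final Map<String, String> aliasToInternal = new HashMap<>();

    /** Registers alias as an additional name for the internal type. */
    void addAlias(String alias, String internalName) {
        aliasToInternal.put(alias, internalName);
    }

    /** Returns the internal name for an alias, or the name itself if no alias is registered. */
    String resolve(String name) {
        return aliasToInternal.getOrDefault(name, name);
    }

    public static void main(String[] args) {
        TypeAliases aliases = new TypeAliases();
        aliases.addAlias("com.company.PersonName", "org.apache.uima.Person");
        // Both names now lead to the same internal type name.
        System.out.println(aliases.resolve("com.company.PersonName"));
        System.out.println(aliases.resolve("org.apache.uima.Person"));
    }
}
```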


-Marshall


David Buttler wrote:

I agree with Pascal that there are many use cases where simple name transformation (e.g. org.apache.uima.Person to com.company.PersonName) is insufficient. Incorporating a scripting framework that allows arbitrary computation is the easiest way to create a component that is usable for the task. However, I think the simple name translation service does provide value by itself, and it is conceptually much simpler to use.

I would like to see both components. This would allow people to start with something that is simple and easy to use, and then graduate into a more complex mapping once they realize the need. Either that, or a component that does everything Pascal describes, but also allows a simple configuration for easy cases without a performance or configuration penalty for being able to extend a mapping to a scripting framework.
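
One way to picture the combined component is a rename table for the easy cases plus an optional hook that is only consulted when one is registered. This is a sketch under assumptions: TwoTierMapper, its methods, and the fullName feature are made up for illustration, and annotations are reduced to a type name and a map of string features.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical two-tier mapper: a rename table for easy cases, optional hooks for hard ones. */
class TwoTierMapper {
    // Simple tier: source type name -> target type name.
    private final Map<String, String> renames = new HashMap<>();
    // Advanced tier: source type name -> function that rewrites the feature values.
    private final Map<String, UnaryOperator<Map<String, String>>> hooks = new HashMap<>();

    void addRename(String sourceType, String targetType) {
        renames.put(sourceType, targetType);
    }

    void addHook(String sourceType, UnaryOperator<Map<String, String>> hook) {
        hooks.put(sourceType, hook);
    }

    /** Returns the target type name; unmapped types pass through unchanged. */
    String mapType(String sourceType) {
        return renames.getOrDefault(sourceType, sourceType);
    }

    /** Returns the features for the target type; without a hook they are copied as-is. */
    Map<String, String> mapFeatures(String sourceType, Map<String, String> features) {
        UnaryOperator<Map<String, String>> hook = hooks.get(sourceType);
        return hook == null ? new HashMap<>(features) : hook.apply(features);
    }

    public static void main(String[] args) {
        TwoTierMapper mapper = new TwoTierMapper();
        // Easy case: a plain rename is all the configuration needed.
        mapper.addRename("org.apache.uima.Person", "com.company.PersonName");
        // Harder case, added later: merge two features into one (fullName is hypothetical).
        mapper.addHook("org.apache.uima.Person", features -> {
            Map<String, String> out = new HashMap<>();
            out.put("fullName", features.get("firstName") + " " + features.get("lastName"));
            return out;
        });
        Map<String, String> in = new HashMap<>();
        in.put("firstName", "Ada");
        in.put("lastName", "Lovelace");
        System.out.println(mapper.mapType("org.apache.uima.Person"));
        System.out.println(mapper.mapFeatures("org.apache.uima.Person", in));
    }
}
```

Types without a hook pay only for a map lookup, which is the "no penalty for easy cases" property asked for above.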


Dave

Pascal Coupet wrote:

This is indeed an important issue for easy interoperability.

However, a simple mapping solves only part of the issue. In a lot of
cases, the mapping operation requires a lot of intelligence. For a tagger,
for example, one will have 17 tags related to verbs and another 4: a
simple mapping will not work. For a name extractor, one will provide 2
fields, firstName and lastName, and another one will have a middle
initial, a title and so on. So a lot of the time you have to modify the
data itself to move it from one type system to another, and this
requires simple or not so simple code.
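
To make the tagger case concrete, here is a minimal sketch of collapsing a fine-grained tag set into a coarser one. TagCollapser is a hypothetical class, and the Penn Treebank verb tags (VB, VBD, VBG, VBN, VBP, VBZ) and the coarse labels VERB and OTHER are assumptions chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical many-to-one tag mapping from a fine tag set to a coarse one. */
class TagCollapser {
    private final Map<String, String> fineToCoarse = new HashMap<>();
    private final String fallback;

    /** fallback is returned for any fine tag that has no registered coarse tag. */
    TagCollapser(String fallback) {
        this.fallback = fallback;
    }

    /** Registers every tag in fineTags as mapping to the given coarse tag. */
    void map(String coarse, String... fineTags) {
        for (String fine : fineTags) {
            fineToCoarse.put(fine, coarse);
        }
    }

    String collapse(String fineTag) {
        return fineToCoarse.getOrDefault(fineTag, fallback);
    }

    public static void main(String[] args) {
        TagCollapser collapser = new TagCollapser("OTHER");
        collapser.map("VERB", "VB", "VBD", "VBG", "VBN", "VBP", "VBZ");
        System.out.println(collapser.collapse("VBD"));
        System.out.println(collapser.collapse("NN"));
    }
}
```

A table like this only works in the fine-to-coarse direction. Going from 4 tags to 17 needs information the source annotation does not carry, which is where real code becomes necessary.
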
What we usually do is use a Java scripting language like BeanShell to do
the mapping. This is flexible and simple, but still powerful enough to use
the Java libraries you may need for complex things. Maybe an idea could be
to develop a mapping component based on a Java scripting language,
configurable using XML files as you suggest, but also flexible enough to
add code in an easy way.

A flexible mapping annotator would be very useful in UIMA. This is
one way which is needed and pragmatic. Another important and
complementary way to tackle the issue is to have some recommendations
for standard entity types like people names, dates, places and so on,
like Dublin Core for metadata. If we do this, people providing
annotators will be able to add these types as output and will do the
mapping themselves to be "UIMA people compliant", for example. This can be
a degraded mode for them with regard to their standard capabilities, but
it will be nice for fast prototyping.

Pascal

-----Original Message-----

From: Michael Baessler [mailto:[EMAIL PROTECTED]
Sent: Tuesday, May 13, 2008 5:09 PM
To: [email protected]
Subject: Annotation Mapping Annotator

Is there some interest/need in the UIMA community to have an annotation
mapping annotator?

I think some of you might know the issue that different UIMA components
work on different annotations and type systems. A mapping annotator
component could be used to translate the annotations between these
different requirements. E.g. we have a tokenizer component at the
beginning of the analysis flow that produces example.Token annotations
with a POS feature set. Later in the flow we have a component that needs
that information, but expects an example.Noun annotation. Unfortunately
there is no way to configure both components to produce or read different
annotation types, so in that case we need a mapping.

Tokenizer creates:

  example.Token (2,8)
     POS = NN

Mapping annotator translates this to:

  example.Noun (2,8)
     posTag = NN

If there is a need for such a component we can reuse some of the code
developed for the UIMA SimpleServer. The SimpleServer has a mapping syntax
with additional filtering as shown below.

The mapping for the example above looks like:

<type name="example.Token" outputTag="example.Noun">
  <filters>
      <filter featurePath="POS" operator="=" value="NN" />
  </filters>
  <outputs>
      <output featurePath="POS" outputAttribute="posTag"/>
  </outputs>
</type>
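
The effect of such a rule can be sketched in plain Java. This is not the SimpleServer implementation: MappingRule and the stand-in annotation class Ann are hypothetical, only the "=" operator is handled, and the sketch assumes the output rule reads the same POS feature that the filter tests.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical in-memory form of one type rule: one "=" filter plus feature renames. */
class MappingRule {
    /** Stand-in for an annotation: type name, span and string-valued features. */
    static class Ann {
        final String type;
        final int begin;
        final int end;
        final Map<String, String> features;

        Ann(String type, int begin, int end, Map<String, String> features) {
            this.type = type;
            this.begin = begin;
            this.end = end;
            this.features = features;
        }
    }

    private final String sourceType;
    private final String targetType;
    private final String filterFeature;
    private final String filterValue;
    // source feature name -> target feature name
    private final Map<String, String> outputs = new LinkedHashMap<>();

    MappingRule(String sourceType, String targetType, String filterFeature, String filterValue) {
        this.sourceType = sourceType;
        this.targetType = targetType;
        this.filterFeature = filterFeature;
        this.filterValue = filterValue;
    }

    void addOutput(String sourceFeature, String targetFeature) {
        outputs.put(sourceFeature, targetFeature);
    }

    /** Returns the mapped annotation, or empty if the type or the filter does not match. */
    Optional<Ann> apply(Ann in) {
        if (!in.type.equals(sourceType)) {
            return Optional.empty();
        }
        if (!filterValue.equals(in.features.get(filterFeature))) {
            return Optional.empty();
        }
        Map<String, String> mapped = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : outputs.entrySet()) {
            String value = in.features.get(e.getKey());
            if (value != null) {
                mapped.put(e.getValue(), value);
            }
        }
        // The span is carried over unchanged; only the type and feature names differ.
        return Optional.of(new Ann(targetType, in.begin, in.end, mapped));
    }

    public static void main(String[] args) {
        MappingRule rule = new MappingRule("example.Token", "example.Noun", "POS", "NN");
        rule.addOutput("POS", "posTag");
        Map<String, String> features = new HashMap<>();
        features.put("POS", "NN");
        Optional<Ann> out = rule.apply(new Ann("example.Token", 2, 8, features));
        out.ifPresent(a -> System.out.println(a.type + " (" + a.begin + "," + a.end + ") " + a.features));
    }
}
```

Tokens whose POS value is not NN produce an empty result, so they are simply not turned into example.Noun annotations.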

Any feedback/comments for such a component?
Are there any implementations available?

-- Michael
