========================================================================
Final Call for Participation
Unstructured Information Management Architecture (UIMA)
2nd u...@gscl Workshop
October 1st, 2009
Potsdam, Germany
http://www.ling.uni-potsdam.de/acl-lab/gscl09/workshops.en.html
========================================================================
-------------------
Program
-------------------
09:00 - 10:00 - UIMA Tutorial (Graham Wilcock)
10:00 - 10:30 - Coffee Break
10:30 - 10:45 - Opening
10:45 - 11:15 - ClearTK: A Framework for Statistical Natural Language
Processing (Philip V. Ogren, Philipp G. Wetzler, and Steven J. Bethard)
11:15 - 11:45 - Multimedia Feature Extraction in the SAPIR Project
(Aaron Kaplan, Jonathan Mamou, Francesco Gallo, and Benjamin Sznajder)
11:45 - 12:15 - TextMarker: A Tool for Rule-Based Information
Extraction (Peter Kluegl, Martin Atzmueller, and Frank Puppe)
12:15 - 13:00 - Lunch Break
13:00 - 13:30 - LuCas - A Lucene CAS Indexer (Erik Faessler, Rico
Landefeld, Katrin Tomanek, and Udo Hahn)
13:30 - 14:00 - Abstracting the types away from a UIMA type system
(Karin Verspoor, William Baumgartner Jr., Christophe Roeder, and
Lawrence Hunter)
14:00 - 14:30 - Poster Session
14:30 - 15:00 - Round Table/Discussion
-----------------------------
Workshop Description
-----------------------------
For many decades, NLP has suffered from low software engineering
standards, which have limited both the reusability of code and the
interoperability of modules within larger NLP systems. While
this did not really hamper success in limited task areas (such as
implementing a parser), it caused serious problems for the emerging
field of language technology where the focus is on building complex
integrated software systems, e.g., for information extraction or
machine translation. This lack of integration has led to duplicated
software development, work-arounds for programs written in different
(versions of) programming languages, and ad-hoc tweaking of interfaces
between modules developed at different sites.
In recent years, the Unstructured Information Management Architecture
(UIMA) framework has been proposed as a middleware platform which
offers integration by design through common type systems and
standardized communication methods for components analysing streams of
unstructured information, such as natural language. The UIMA framework
offers a solid processing infrastructure that allows developers to
concentrate on the implementation of the actual analytics components.
An increasing number of members of the NLP community have thus adopted
UIMA as a platform facilitating the creation of reusable NLP
components that can be assembled to address different NLP tasks
depending on their order, combination and configuration.
This workshop aims at bringing together members of the NLP community
who are users, developers or providers of either UIMA components or
UIMA-related tools in order to explore and discuss the opportunities
and challenges in using UIMA as a platform for modern, well-engineered
NLP. In the context of an emerging NLP-oriented UIMA community, the
challenge of creating components that are not only reusable but also
interoperable is of particular interest. From a methodological
perspective, interoperability relies largely on UIMA type systems.
Technically, it includes issues related to the packaging and
distribution of UIMA components. Also, tools are important, for
example to assemble complex processing workflows, to manage the
bodies of data that are to be analysed and to visualize, explore, and
further deploy the analysis results. Finally, interoperability is also
affected by legal issues, such as potentially incompatible licenses
of components and tools.
The availability of ready-to-use components plays a major role in
choosing UIMA over other alternatives. To accentuate this, the
workshop puts a focus on UIMA-based components and tools that are
freely available for research.
--------------
Topics
--------------
Participants are invited to present applications realized using UIMA,
general experiences using UIMA as a platform for natural language
processing, as well as technical papers on particular aspects of the
UIMA framework. Alternatives to UIMA and comparisons of other
frameworks (e.g., GATE or LingPipe) with UIMA are of interest, too. More
specifically, workshop topics include, but are not limited to:
• UIMA components with a special focus on genericity and type-system
independence
• repositories of ready-to-use UIMA-based components
• (generic) type systems for UIMA
• distribution of UIMA components: documentation, licensing and
packaging
• sophisticated tools to build and manage complex processing pipelines
• experience reports on combining UIMA-based components from different
sources, as well as solutions to interoperability issues
• processing of very large data collections: scale-out,
parallelization, and performance optimization
• analysis of results: exploration, evaluation, visualization, and
statistical analysis
• developing for UIMA: simplified APIs, debugging, unit testing, and
limitations of UIMA
---------------------------------
Organizers and Contact
---------------------------------
• JULIE Lab, Friedrich-Schiller-Universität Jena
• Udo Hahn
• Katrin Tomanek
• UKP Lab, Technische Universität Darmstadt
• Iryna Gurevych
• Richard Eckart de Castilho
Please address any inquiries regarding the workshop to:
[email protected]
---------------------------------
Program Committee
---------------------------------
• Anni R. Coden, IBM T.J. Watson Research Center, USA
• Branimir K. Boguraev, IBM T.J. Watson Research Center, USA
• Graham Wilcock, University of Helsinki, Finland
• Iryna Gurevych, Technische Universität Darmstadt, Germany
• Katrin Tomanek, Friedrich-Schiller-Universität Jena, Germany
• Leo Ferres, University of Concepcion, Chile
• Michael Tanenblatt, IBM T.J. Watson Research Center, USA
• Nicolas Hernandez, Université de Nantes, France
• Philipp Cimiano, Delft University of Technology, Netherlands
• Richard Eckart de Castilho, Technische Universität Darmstadt, Germany
• Sophia Ananiadou, University of Manchester, Great Britain
• Stefan Geißler, TEMIS GmbH, Germany
• Udo Hahn, Friedrich-Schiller-Universität Jena, Germany