Pipeline Builder & Runner for JULIE Lab JCoRe Components

Erik Fäßler Mon, 05 Oct 2020 07:00:50 -0700

Dear users of UIMA,

The JULIE Lab at the Friedrich Schiller University in Jena has a long-standing 
tradition of creating UIMA components for its own needs.
We have been publishing our components under the JCoRe brand for quite some 
years now. The source code of the components can be found in our GitHub 
organization (https://github.com/JULIELab; 
https://github.com/JULIELab/jcore-base; 
https://github.com/JULIELab/jcore-projects).
The compiled component artifacts are distributed via Maven.
Many JCoRe components focus on biomedical NLP, but there are also components 
of general value and some with a different focus, such as our DTA reader.


The latest additions to these repositories are the JCoRe Pipeline Builder and 
Pipeline Runner programs (https://github.com/JULIELab/jcore-pipeline-modules; 
https://zenodo.org/record/4066619#.X3sPVS8Rp-U). In my experience, they make 
the creation of JCoRe pipelines much easier, as long as one uses JCoRe 
components or manually adds one’s own components to the local pipeline 
component repository, which is also possible.

Please find below a short description of the two programs. A more detailed 
description can be found in the README.md file of the GitHub repository. For 
further questions, please use the project’s issue tracker.

The Pipeline Builder
—————————

On first start, the pipeline builder connects directly to the JCoRe GitHub 
repositories and scans them for JSON files that describe the available 
components. For this purpose, most JCoRe components contain a 
“component.meta” file which, despite the “.meta” extension, is a JSON file 
carrying the name and the type of the component (reader, multiplier, AE, i.e. 
analysis engine, or consumer) as well as the Maven coordinates of the software 
artifacts.
Thus, one of the strong points of the tool is that each component description 
is directly linked to the runnable code.
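
To illustrate the idea, here is a minimal Python sketch of reading such a 
metadata file. Please note that the field names ("name", "category", 
"maven-artifact") and the example values are assumptions made up for this 
illustration; they may not match the actual schema of “component.meta”, so 
please check the README for the real format.

```python
import json

# The four component kinds named in the description above.
KNOWN_CATEGORIES = {"reader", "multiplier", "ae", "consumer"}

# Hypothetical example content; the real "component.meta" schema may differ.
EXAMPLE_META = """
{
  "name": "Example Sentence Splitter",
  "category": "ae",
  "maven-artifact": {
    "groupId": "org.example",
    "artifactId": "example-sentence-splitter",
    "version": "1.0.0"
  }
}
"""


def parse_component_meta(text):
    """Return (name, category, 'groupId:artifactId:version') from a JSON string."""
    meta = json.loads(text)
    category = meta["category"]
    if category not in KNOWN_CATEGORIES:
        raise ValueError(f"unknown component category: {category!r}")
    artifact = meta["maven-artifact"]
    # Maven coordinates are conventionally written as groupId:artifactId:version.
    coordinates = ":".join(
        artifact[key] for key in ("groupId", "artifactId", "version")
    )
    return meta["name"], category, coordinates


if __name__ == "__main__":
    print(parse_component_meta(EXAMPLE_META))
```

Because the description and the Maven coordinates live in the same file, a 
tool only needs this one lookup to go from a menu entry to a downloadable 
artifact.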

On first startup, the user is then shown a menu in which she must select the 
JCoRe repositories to use. Afterwards, a menu is offered to select components, 
configure them, rearrange them, select another version of the Maven artifact 
and more.
One configuration highlight is the possibility to deactivate components. It is 
thus possible to keep components with different configurations and switch them 
on or off, for example.
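
The deactivation idea can be sketched in a few lines of Python. This is only 
an illustration of the concept, not the builder's actual data model: the 
class, its fields and the component names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    # Hypothetical structure; it only mirrors the idea described above.
    name: str
    parameters: dict = field(default_factory=dict)
    active: bool = True


def active_components(pipeline):
    """Return the components that would actually run, in pipeline order."""
    return [component for component in pipeline if component.active]


# Two differently configured taggers are kept side by side; only one is on.
pipeline = [
    Component("reader"),
    Component("tagger", {"model": "model-a"}),
    Component("tagger", {"model": "model-b"}, active=False),
    Component("consumer"),
]

if __name__ == "__main__":
    for component in active_components(pipeline):
        print(component.name, component.parameters)
```

Switching to the other configuration then only means flipping the two 
"active" flags; neither configuration has to be retyped.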

The pipeline can be stored, which populates the specified directory with UIMA 
descriptors, pipeline meta information, a library directory containing the 
Maven artifacts and more.

The Pipeline Runner
—————————

The pipeline runner requires a configuration file in XML format. If the 
program is executed with a non-existent file as its parameter, an empty 
configuration file is written instead, which can be used as a template for the 
new configuration.
The runner supports CPEs and UIMA DUCC, so the configuration contains one 
section for each. You will usually want to keep only one of them.
For example, to run a pipeline as a CPE, the configuration must contain the 
path of the pipeline created with the pipeline builder at the very least. 
Optionally, the maximum heap size and the number of threads to use can also be 
specified.
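
The behaviour described above can be sketched in Python as follows. Please 
treat this strictly as an illustration: the XML element names 
("configuration", "cpe", "ducc", "pipelinepath", "heapsize", "threads") are 
invented for this example and are not the runner's real schema, and writing 
the template to the given path is likewise an assumption. The actual template 
is the one the runner itself writes.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical template; element names are illustrative, not the real schema.
TEMPLATE = """<configuration>
  <cpe>
    <pipelinepath></pipelinepath>
    <heapsize></heapsize>
    <threads></threads>
  </cpe>
  <ducc>
    <pipelinepath></pipelinepath>
  </ducc>
</configuration>
"""


def load_or_create_config(path):
    """Return the parsed config root, or write a template and return None."""
    if not os.path.exists(path):
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(TEMPLATE)
        return None
    return ET.parse(path).getroot()


def cpe_settings(root):
    """Extract the CPE settings; only the pipeline path is mandatory."""
    cpe = root.find("cpe")
    if cpe is None:
        raise ValueError("the configuration has no <cpe> section")
    path = (cpe.findtext("pipelinepath") or "").strip()
    if not path:
        raise ValueError("the CPE section needs a pipeline path")
    return {
        "pipelinepath": path,
        # Optional values are reported as None when left empty.
        "heapsize": (cpe.findtext("heapsize") or "").strip() or None,
        "threads": (cpe.findtext("threads") or "").strip() or None,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        config_path = os.path.join(directory, "run.xml")
        # First call: the file does not exist, so the template is written.
        print(load_or_create_config(config_path))
        # Second call: the template exists and is parsed.
        print(load_or_create_config(config_path).tag)
```

The first call plays the role of running the program with a non-existent 
file; after filling in the pipeline path, the second call would yield a usable 
configuration.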



Please feel free to try out the programs and let me know, preferably through 
the issue tracker (https://github.com/JULIELab/jcore-pipeline-modules/issues), 
if you run into any issues.

I hope that someone will find this useful.

Best regards,

Erik Faessler, JCoRe Maintainer at JULIE Lab, FSU Jena
