Re: UIMA Fit + Pythonator issue

William Colen Thu, 05 Jan 2017 14:13:57 -0800

I would like to share my final solution:

1. Created a combo iterator like the one in OpenNLP
https://gist.github.com/wcolen/037a68fca7e8b402b6e0d3e4df4fab49#file-annotationcomboiterator-py


2. Created a sample.py that iterates over UIMA annotations
https://gist.github.com/wcolen/5edbdcb1d2b6588fead45bbc2dd4fb5b#file-sample-py
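
For anyone who does not want to open the gists, the idea behind item 1 can be
sketched roughly as below. This is only an illustration on plain (begin, end)
tuples. It is not the code from the gist and it does not use the Pythonator
API; the names combo_iterator and Span are made up for this example, and it
assumes both lists are sorted by begin offset.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical stand-in for an annotation: (begin, end) offsets, end exclusive.
Span = Tuple[int, int]


def combo_iterator(
    uppers: Iterable[Span], lowers: Iterable[Span]
) -> Iterator[Tuple[Span, List[Span]]]:
    """Yield each upper span (e.g. a sentence) together with the lower
    spans (e.g. tokens) that lie inside it.

    Assumes both inputs are sorted by begin offset.
    """
    lower_iter = iter(lowers)
    pending = next(lower_iter, None)
    for upper in uppers:
        covered: List[Span] = []
        # Skip lower spans that start before this upper span begins.
        while pending is not None and pending[0] < upper[0]:
            pending = next(lower_iter, None)
        # Collect lower spans that end within this upper span.
        while pending is not None and pending[1] <= upper[1]:
            covered.append(pending)
            pending = next(lower_iter, None)
        yield upper, covered


if __name__ == "__main__":
    sentences = [(0, 11), (12, 20)]
    tokens = [(0, 5), (6, 10), (10, 11), (12, 16), (17, 20)]
    for sentence, sentence_tokens in combo_iterator(sentences, tokens):
        print(sentence, sentence_tokens)
```

Walking both lists in a single pass keeps the cost linear in the number of
annotations, which is why it only works on sorted input.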


William

2017-01-05 1:43 GMT-02:00 William Colen <william.co...@gmail.com>:

> Thank you very much, Richard!
>
> Actually it was an error in my Python iterator.
>
> I am using the sentence detector and tokenizer from OpenNLP.
> For the POS tagger, I am using one written in Python that uses neural
> networks (https://github.com/erickrf/nlpnet).
>
>
>
> 2017-01-04 21:29 GMT-02:00 Richard Eckart de Castilho <r...@apache.org>:
>
>> Hi William,
>>
>> what component collection are you using? OpenNLP? Maybe the components
>> are not set up completely. If you use OpenNLP with uimaFIT, you might
>> find this example here useful:
>>
>>   https://cwiki.apache.org/confluence/display/UIMA/uimaFIT+and+Groovy
>>
>> Cheers,
>>
>> -- Richard
>>
>> > On 04.01.2017, at 21:06, William Colen <william.co...@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > I managed to create a UIMA C++ component that performs POS tagging with
>> > Pythonator. It works very well as a standalone annotator. I created an
>> > XMI with sentence and token annotations; the Python code could iterate
>> > over them and create the POS tags. I could run it as follows:
>> >
>> > runAE.sh PythonAnnotator.xml -xmi xmi_folder
>> >
>> >
>> > Now I am integrating it into the pipeline using uimaFIT.
>> >
>> >
>> >
>> > ...
>> >
>> > AggregateBuilder builder = new AggregateBuilder();
>> >
>> > builder.add(AnalysisEngineFactory.createEngineDescription(
>> >     SentDetect.class,
>> >     SentenceModelResource.PARAM_SENTENCE_MODEL_RESOURCE, sentdetectModelRes));
>> >
>> > builder.add(AnalysisEngineFactory.createEngineDescription(
>> >     Tokenizer.class,
>> >     TokenizerModelResource.PARAM_TOKENIZER_MODEL_RESOURCE, tokenizerModelRes));
>> >
>> > builder.add(AnalysisEngineFactory.createEngineDescriptionFromPath(
>> >     "/complete_path/PythonAnnotator.xml"));
>> >
>> > AnalysisEngine aggregate = builder.createAggregate();
>> >
>> >
>> > It runs OK. I can see from a log in the Python code that the "process"
>> > function was called. It loads the type system. I can also run
>> > getDocumentText and it works as expected.
>> >
>> >
>> > The issue starts when I try to iterate over the sentence annotations.
>> > They are not there! It works in the standalone version when I read it
>> > from XMI.
>> >
>> >
>> > Any clue what I am missing?
>> >
>> >
>> > Thank you,
>> >
>> > William
>>
>>
>
