Re: Running an AnalysisEngine on part of a document

Nils Reiter Tue, 16 Feb 2016 04:14:05 -0800

Hi Richard,

thanks for your reply, and don’t worry: I am planning on using DKPro Core
components :)


So if I understand you correctly, all DKPro Core components rely on the
token/sentence annotations and ignore the rest, right?

Best regards,
Nils

> On 16 Feb 2016, at 12:18, Richard Eckart de Castilho <[email protected]> wrote:
> 
> Ok, sorry, the answer below would assume you are using DKPro Core components 
> ;)
> 
> Sorry Nils, I didn't notice you were posting to the Apache UIMA list.
> 
> So for UIMA in general, I am not aware of a solution other than what you 
> describe. So it would depend on the components / component collection that 
> you are using.
> 
> Cheers,
> 
> -- Richard
> 
>> On 16.02.2016, at 12:17, Richard Eckart de Castilho <[email protected]> wrote:
>> 
>> The easiest would be to remove the token/sentence annotations of those parts 
>> of the text that you do not care about.
>> Or alternatively - if you have annotations that specifically mark the text 
>> sections, then configure the segmenter component to create sentences/tokens 
>> only within the boundaries of these annotations using PARAM_ZONE_TYPES and 
>> PARAM_STRICT_ZONING.
>> 
>> Cheers,
>> 
>> -- Richard
>> 
>>> On 16.02.2016, at 12:02, Nils Reiter <[email protected]> wrote:
>>> 
>>> Hi,
>>> 
>>> is there a way to run an analysis engine on only a part of the CAS?
>>> 
>>> I have UIMA annotations over all the substrings that I want to process. The 
>>> only way I could think of is creating new views or CASs for each string, 
>>> but that would result in > 100 views. Is there a more straightforward way?
>>> 
>>> Background:
>>> Only part of the CAS contains natural language, other parts are lists, 
>>> names and headers. I would like to POS-tag the text, but not the rest.
>>> 
>>> Thanks in advance for any pointers or suggestions,
>>> Nils
>> 
>
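
As a rough illustration of Richard's first suggestion (drop the token annotations outside the parts you care about, so that downstream components only see the rest), here is a minimal sketch in plain Java. It deliberately does not use the UIMA or DKPro Core APIs: `ZoneFilter`, `Span` and `keepCovered` are hypothetical names made up for this example, and `Span` merely stands in for an annotation with begin/end offsets. In a real pipeline the same filtering would be done on the CAS annotations, or avoided altogether by letting the segmenter respect the zones via PARAM_ZONE_TYPES and PARAM_STRICT_ZONING.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** UIMA-free model of "keep only the tokens inside the zones of interest". */
class ZoneFilter {

    /** A character span [begin, end), standing in for an annotation. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        /** True if the other span lies completely inside this one. */
        boolean covers(Span other) {
            return begin <= other.begin && other.end <= end;
        }
    }

    /** Returns the tokens that are fully covered by at least one zone. */
    static List<Span> keepCovered(List<Span> tokens, List<Span> zones) {
        List<Span> kept = new ArrayList<>();
        for (Span token : tokens) {
            for (Span zone : zones) {
                if (zone.covers(token)) {
                    kept.add(token);
                    break; // one covering zone is enough
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String text = "HEADER It rains. LIST";
        // Tokens over the whole document, as a segmenter would produce them.
        List<Span> tokens = Arrays.asList(
                new Span(0, 6), new Span(7, 9), new Span(10, 15),
                new Span(15, 16), new Span(17, 21));
        // A single zone marking the natural-language part "It rains."
        List<Span> zones = Arrays.asList(new Span(7, 16));

        // Print the covered text of every token that survives the filter.
        for (Span token : keepCovered(tokens, zones)) {
            System.out.println(text.substring(token.begin, token.end));
        }
    }
}
```

The header and list tokens are dropped, so a tagger that iterates over tokens would never see them. A token that only partly overlaps a zone is dropped as well; whether that is the desired behaviour depends on how the zone annotations were created.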
