The standard way we avoid redundant processing is to write the CAS for each file to an XMI file after a single pass over the data that runs all the analysis engines.
For example, when we are running experiments, we have one pipeline that does all the NLP feature generation (POS tags, dependency parsing, dictionary lookup, etc.) and writes each document to an XMI file in a directory using uimaFIT's CasIOUtil class:
https://uima.apache.org/d/uimafit-current/api/org/apache/uima/fit/util/CasIOUtil.html

Then, in a second machine learning pipeline, we read the XMI files back (using a different CasIOUtil method) and vary whatever machine learning parameters we want over the same standard annotations.

Hope this helps.
Tim

________________________________________
From: samir chabou [[email protected]]
Sent: Monday, April 13, 2015 11:22 PM
To: [email protected]; [email protected]
Subject: Re: iterate on the features of CAS consumer (FileWriterCasConsumer)

Hi,

How can I load an existing FileWriterCasConsumer in Java code and iterate through the features in the FileWriterCasConsumer?

Note: I was able to load the clinical pipeline in my Java code, create a new JCas, and process it. The problem is that each time I run the Java code I have to reload the clinical pipeline, which takes a bit of time.

Please advise.
Thanks

On Saturday, April 11, 2015 12:54 AM, samir chabou <[email protected]> wrote:

Hi,

How can I load an existing FileWriterCasConsumer in Java code and iterate through the features in the FileWriterCasConsumer?

Note: I was able to load the clinical pipeline in my Java code, create a new JCas, and process it. The problem is that each time I run the Java code I have to reload the clinical pipeline, which takes a bit of time.

Thanks
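
The write-once, read-later approach described in the reply at the top of this thread could look roughly like the sketch below. It is only an illustration, not code from the cTAKES project: it assumes uimaFIT 2.x or later on the classpath, the class name XmiCacheSketch and both helper methods are made up for this example, and the sample sentence stands in for a document that the real pipeline would already have annotated. It also assumes the JCas created for reading sees the same type system that was used when the XMI file was written.

```java
import java.io.File;
import java.nio.file.Files;

import org.apache.uima.fit.factory.JCasFactory;
import org.apache.uima.fit.util.CasIOUtil;
import org.apache.uima.jcas.JCas;

// Hypothetical class name; not part of cTAKES or uimaFIT.
public class XmiCacheSketch {

    /** Pass 1: once the expensive analysis engines have run on jCas, save it as XMI. */
    static void saveProcessed(JCas jCas, File xmiFile) throws Exception {
        CasIOUtil.writeXmi(jCas, xmiFile);
    }

    /** Pass 2: load a previously saved CAS without re-running the pipeline. */
    static JCas loadProcessed(File xmiFile) throws Exception {
        // Assumption: the type system found on the classpath matches the one
        // that was in use when the file was written.
        JCas jCas = JCasFactory.createJCas();
        CasIOUtil.readXmi(jCas, xmiFile);
        return jCas;
    }

    public static void main(String[] args) throws Exception {
        File dir = Files.createTempDirectory("xmi-cache").toFile();
        File xmiFile = new File(dir, "doc1.xmi");

        // Stand-in for a document processed by the full pipeline.
        JCas processed = JCasFactory.createJCas();
        processed.setDocumentText("Patient denies chest pain.");
        // (the real first pass would run its analysis engines here)
        saveProcessed(processed, xmiFile);

        // Later runs start here and only pay the cost of reading the file.
        JCas reloaded = loadProcessed(xmiFile);
        System.out.println(reloaded.getDocumentText());
    }
}
```

With the annotations restored this way, the second pipeline can iterate over them (for example with uimaFIT's JCasUtil.select) instead of reloading and re-running the clinical pipeline on every run.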
