Thanks for the wonderful explanation, Richard! That was very well said. Such a learning curve!
JG

On Wed, Jul 2, 2014 at 1:11 AM, Richard Eckart de Castilho <[email protected]> wrote:

> Hi John,
>
> there is actually no grand difference between analysis engines and
> consumers.
>
> Per default, a UIMA runtime may create multiple instances of an analysis
> engine and run them in parallel (if the runtime supports that),
> but a "consumer" must see all data going through the pipeline, so there
> can only be one instance.
>
> The default value of the flag that controls whether multiple instances
> are allowed is the only real difference.
>
> Basically any analysis engine that only reads annotations from the CAS
> but does not add/change anything is a consumer. Consequently, a consumer
> can be added anywhere in the pipeline, not only at the end (I sometimes do
> that to see intermediate results).
>
> If a component has the "allow multiple instances" flag set to "false"
> (which is usually what you want), then runtimes may react to that
> differently. E.g. the Collection Processing Engine (CPE) will single-thread
> all components (analysis engines or consumers) after it hits the first
> component with "allow multiple instances" set to false (which is typically
> a consumer). So to make optimal use of the CPE's multi-threading
> capabilities, such components should be towards the end of the CPE pipeline.
>
> I believe there is a Java interface declaration and base classes for
> "CasConsumers" in UIMA - I haven't used these in years. The uimaFIT API
> doesn't even support these because everything can also be (and is within
> uimaFIT) nicely modeled using analysis engines and the "allow multiple
> instances" flag.
>
> Cheers,
>
> -- Richard
>
> On 02.07.2014, at 04:01, Masanz, James J. <[email protected]> wrote:
>
> > Hi John,
> >
> > Not positive this is the line you are referring to, but there is a line
> > in cTAKES_clinical_pipeline.groovy (which is not in sandbox, btw) that has
> > a comment about
> >
> > "createAnalysisEngineDescription expects name to not end in .xml even
> > though filename actually does"
> >
> > I am guessing the comment you see is trying to say the same thing.
> >
> > cTAKES_clinical_pipeline.groovy is in ctakes-core/scripts/groovy
> >
> > In that script, line 321 is where the writer is specified. There is no
> > separately defined "consumer" in the same sense that the CPE GUI has
> > consumers that are separate from annotators. The script just uses the last
> > "annotator" as a consumer and convention is AFAIK to call them writers in
> > this case.
> >
> > Hope that helps,
> > -- James
> >
> > -----Original Message-----
> > From: John Green [mailto:[email protected]]
> > Sent: Tuesday, July 01, 2014 7:15 PM
> > To: [email protected]
> > Subject: ytex DBconsumer and groovy parser
> >
> > If someone has a free minute, which, judging from my own life is probably
> > not the case - where in the groovy scripts in sandbox do you define the
> > consumer to use? There is one comment that says "dont put the .xml here"
> > then there is a path to the dictionary ae. I'm working by ssh from the
> > hospital a lot in my "free time" in the ICU and running gui CPEs isn't
> > gonna cut it.
> >
> > Apropos the ytex dbconsumer - I should be able to just tack this on to the
> > end of the ytex aggregate pipeline?
> >
> > I'm probably still asking very naive questions but to date I still haven't
> > had the time to dive into UIMA's base very well, so I apologize.
> >
> > My goal is to run the full ytex pipeline from the command line with the
> > ytex dbconsumer ...
> >
> > Thanks for everyone's patience,
> > John
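
For anyone reading this thread later, here is a small sketch of Richard's point about CPE threading as I understand it from the quoted message. It is plain Java and does not touch the UIMA API at all: the Component class, the method names, and the component names (tokenizer, tagger, dbWriter) are made up for illustration. The only rule it models is the one Richard describes, namely that everything from the first component with "allow multiple instances" set to false onward runs single-threaded. How a real CPE behaves may differ in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CpeThreadingSketch {

    /** Hypothetical stand-in for a pipeline component; NOT a UIMA class. */
    public static final class Component {
        public final String name;
        public final boolean allowMultipleInstances;

        public Component(String name, boolean allowMultipleInstances) {
            this.name = name;
            this.allowMultipleInstances = allowMultipleInstances;
        }
    }

    /**
     * Index of the first component that does not allow multiple instances,
     * or pipeline.size() if every component allows them.
     */
    public static int firstSingleThreadedIndex(List<Component> pipeline) {
        for (int i = 0; i < pipeline.size(); i++) {
            if (!pipeline.get(i).allowMultipleInstances) {
                return i;
            }
        }
        return pipeline.size();
    }

    /**
     * Names of the components that could still run in parallel under the
     * rule from the thread: only those before the first single-instance one.
     */
    public static List<String> parallelizable(List<Component> pipeline) {
        List<String> names = new ArrayList<>();
        int cut = firstSingleThreadedIndex(pipeline);
        for (int i = 0; i < cut; i++) {
            names.add(pipeline.get(i).name);
        }
        return names;
    }

    public static void main(String[] args) {
        // Writer at the end: both annotators stay in the parallel part.
        List<Component> writerLast = Arrays.asList(
                new Component("tokenizer", true),
                new Component("tagger", true),
                new Component("dbWriter", false));

        // Writer in the middle: the tagger after it is single-threaded too.
        List<Component> writerInMiddle = Arrays.asList(
                new Component("tokenizer", true),
                new Component("dbWriter", false),
                new Component("tagger", true));

        System.out.println("writer last:      " + parallelizable(writerLast));
        System.out.println("writer in middle: " + parallelizable(writerInMiddle));
    }
}
```

Under this model, moving the writer from the middle to the end of the pipeline is what keeps the tagger in the multi-threaded part, which is why Richard suggests putting such components towards the end.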
