Actually, that ended up being the solution. I checked whether the collection engine was finished and slept for 5 seconds between checks. Once it was finished, I was able to run another analysis. It works so well that I am now chaining engines together over collections and processing each step individually.
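A minimal sketch of this kind of wait loop, in plain Java with no UIMA dependency. The class name `StagePoller`, the method `awaitCompletion`, and the `BooleanSupplier` standing in for the engine's status check are all hypothetical names for illustration. In a real CPE setup the supplier might be wired to the engine's own "is it still processing?" check, but that wiring is an assumption and is not shown here.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Blocks until a processing stage reports completion, polling at a fixed interval. */
final class StagePoller {

    /**
     * Polls isProcessing until it returns false, sleeping between checks.
     * Returns the number of sleeps that were needed.
     */
    public static int awaitCompletion(BooleanSupplier isProcessing, Duration interval) {
        int sleeps = 0;
        while (isProcessing.getAsBoolean()) {
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for stage", e);
            }
            sleeps++;
        }
        return sleeps;
    }

    public static void main(String[] args) {
        // Stand-in for a running engine: reports "busy" for about 50 ms.
        long deadline = System.nanoTime() + Duration.ofMillis(50).toNanos();
        BooleanSupplier busy = () -> System.nanoTime() < deadline;

        // The thread above used 5 seconds; a short interval keeps the demo quick.
        int sleeps = awaitCompletion(busy, Duration.ofMillis(10));
        System.out.println("stage finished after " + sleeps + " sleeps");
        // The next stage (for example a corpus-wide analysis) would start here.
    }
}
```

Polling is simple and makes it easy to chain stages one after another, at the cost of up to one interval of extra latency per stage. A completion callback would avoid that delay if it ever matters.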
Thanks for the suggestion!

On Sat, Dec 23, 2017 at 1:22 PM, Jens Grivolla <[email protected]> wrote:

> Hi Ben,
>
> if I understand correctly you want to run a process once the whole
> collection has been analyzed. You can have an AnalysisEngine that does this
> by implementing
> http://uima.apache.org/d/uimaj-2.10.0/apidocs/org/apache/uima/analysis_engine/AnalysisEngine.html#collectionProcessComplete()
>
> You just need to make sure that you gather all the necessary information
> somehow. If the AE that calculates the statistics is at the end of the
> pipeline and you have only one instance of it it's easy to gather all the
> information there. Or you could just write everything you need to a
> centralized datastore (i.e. a database) and use that to calculate the
> statistics.
>
> If I didn't misunderstand you, that's really a quite common scenario.
>
> Best,
> Jens
>
> On Fri, Dec 22, 2017 at 6:26 PM, Benedict Holland <[email protected]> wrote:
>
> > Hello All,
> >
> > I find myself in a strange situation. I have a content processing engine
> > working. I have N threads populating N CAS objects and running my pipeline.
> > Each CAS object gets 1 piece of data, like say a row in a database. Each
> > process is entirely independent and can run concurrently. I specifically
> > did not configure this pipeline as an aggregate process as I don't really
> > care when the events trigger since the CPE maintains the order of the
> > engines.
> >
> > Now I want to add an analysis that will run over the aggregate output. For
> > example, I processed N texts using the CPE and now I want to run a TF-IDF
> > analysis over the entire corpora. The TF-IDF analysis should only run once
> > all documents are processed.
> >
> > How would I go about doing this? Does this have to do with not allowing
> > multiple deployments?
> >
> > Thanks,
> > ~Ben
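
The gather-then-compute pattern Jens describes can be sketched in plain Java as follows. This is an illustration only, under stated assumptions: `TfIdfAccumulator` is a hypothetical class that does not extend any UIMA type, its `process` method takes a raw string instead of a CAS, and its `collectionProcessComplete` method merely mirrors the name of the UIMA hook so the lifecycle is easy to see. The tokenizer (split on non-word characters) and the TF-IDF variant (relative term frequency times the natural log of N over document frequency) are likewise choices made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Collects per-document term counts, then scores the whole corpus at the end. */
final class TfIdfAccumulator {
    // Per-document term counts, in arrival order.
    private final List<Map<String, Integer>> termCounts = new ArrayList<>();
    // Number of documents that contain each term.
    private final Map<String, Integer> docFreq = new HashMap<>();

    /** Per-document step; synchronized because several threads may feed one instance. */
    public synchronized void process(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String tok : text.toLowerCase().split("\\W+")) {
            if (!tok.isEmpty()) {
                counts.merge(tok, 1, Integer::sum);
            }
        }
        for (String term : counts.keySet()) {
            docFreq.merge(term, 1, Integer::sum);
        }
        termCounts.add(counts);
    }

    /** End-of-collection step: returns one term-to-score map per document. */
    public synchronized List<Map<String, Double>> collectionProcessComplete() {
        int n = termCounts.size();
        List<Map<String, Double>> result = new ArrayList<>();
        for (Map<String, Integer> counts : termCounts) {
            int total = counts.values().stream().mapToInt(Integer::intValue).sum();
            Map<String, Double> scores = new HashMap<>();
            counts.forEach((term, c) -> {
                double tf = (double) c / total;
                double idf = Math.log((double) n / docFreq.get(term));
                scores.put(term, tf * idf);
            });
            result.add(scores);
        }
        return result;
    }
}
```

Keeping all state in a single instance matches the "only one instance" case from the quoted reply. With several instances, the counts would have to go to a shared store first.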
