Actually, rethinking this, DKPro Lab currently doesn't support this scenario out of the box. It assumes that data flows between tasks, but not that data flows between subsequent iterations of the same task. It would be a nice addition though ;)
So, the basic Java loop really seems to be a low-effort, practical solution.

-- Richard

On 17.06.2013 at 22:54, Oliver Ferschke <[email protected]> wrote:

> I can really recommend the DKPro Lab that Richard suggested!
> ________________________________________
> From: Richard Eckart de Castilho [[email protected]]
> Sent: Monday, 17 June 2013 21:36
> To: [email protected]
> Subject: Re: Processing a Text Collection more than once?
>
> Hi Susanne,
>
> there are two options in UIMA:
>
> 1) you write your own reader which repeatedly outputs the same data
> 2) you write a flow controller which saves all data produced by the reader
>    and re-runs all components again on it
>
> However, I'd recommend we check offline if/how DKPro Lab could fit in with
> your scenario. It may even be easiest to just run your pipeline in a loop,
> using uimaFIT and XMI/binary CAS serialization to feed the output of one
> run into the next one.
>
> Cheers,
>
> -- Richard
>
> On 13.06.2013 at 17:32, Susanne Neumann
> <[email protected]> wrote:
>
>> Hi,
>>
>> is there a (good) way in UIMA to process the whole text collection more than
>> once? The process method processes each document once for the whole
>> collection. But I need to iterate several times over the whole collection.
>>
>> The background is that I want to implement a bootstrapping annotator using
>> UIMA. One of the main characteristics of bootstrapping is that the corpus
>> is processed several times, collecting new rules, terms and evidence each
>> time, based on the results of the previous turns. I planned to write a
>> bootstrapping AE, but I can't figure out how to iteratively process the
>> collection.
>>
>> I am looking for any hints or tips about how to implement this with UIMA.
>>
>> Thanks,
>> Susanne
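
For reference, the plain Java loop suggested at the top of this thread could look roughly like the sketch below. It uses no UIMA, uimaFIT or DKPro classes at all: `BootstrapLoop`, `runOnce` and `bootstrap` are hypothetical names, and `runOnce` is only a toy stand-in for one complete pipeline run over the collection. In a real setup, that method would presumably start the uimaFIT pipeline and pass state along via serialized CASes, as described in the quoted mail. What the sketch does show is the control flow: each pass sees the results of the previous one, and the loop stops once a pass learns nothing new.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these names are UIMA, uimaFIT or DKPro API.
class BootstrapLoop {

    /**
     * Toy stand-in for ONE full pipeline run over the whole collection.
     * Assumed rule, for illustration only: a token that directly follows
     * an already known term is learned as a new term.
     */
    static Set<String> runOnce(List<String> corpus, Set<String> knownTerms) {
        Set<String> learned = new TreeSet<>(knownTerms);
        for (String doc : corpus) {
            String[] tokens = doc.split("\\s+");
            for (int i = 0; i + 1 < tokens.length; i++) {
                // Only terms known BEFORE this pass trigger the rule,
                // so several passes are needed to reach everything.
                if (knownTerms.contains(tokens[i])) {
                    learned.add(tokens[i + 1]);
                }
            }
        }
        return learned;
    }

    /** Repeats the full-collection pass until nothing new is learned. */
    static Set<String> bootstrap(List<String> corpus, Set<String> seeds, int maxIterations) {
        Set<String> terms = new TreeSet<>(seeds);
        for (int i = 0; i < maxIterations; i++) {
            Set<String> next = runOnce(corpus, terms);
            if (next.equals(terms)) {
                break; // fixed point: another pass would not change anything
            }
            terms = next; // the output of this run is the input of the next
        }
        return terms;
    }

    public static void main(String[] args) {
        List<String> corpus = List.of("a b c", "c d");
        System.out.println(bootstrap(corpus, Set.of("a"), 10));
    }
}
```

With the seed term `a`, a single pass only learns `b`; it takes three passes over the two toy documents to also pick up `c` and `d`, which is why the loop has to sit outside the per-document processing.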
