Hi Eddie,

We would like unzipping and other file aggregation steps to be handled outside the collection reader, so that we can support numerous formats and numerous crawling facilities without creating a collection reader for every combination of crawler and format. The CASMultiplier would be a simple and efficient way to handle this case if it worked inside the CPE. Today we handle this through parameters, but it would be easier for people using our collection reader if they could add a CASMultiplier to the CPE to handle the format.
The problems we face with the current implementation are that there is no CAS pool for the CASMultiplier, which can lead to memory issues, and that the CPE does not handle this type of processor. Using an aggregate annotator would make error management more difficult. We can manage without the CASMultiplier by using plugins for our collection readers, but we would rather use a standard UIMA feature.

Best,
Eric Vachon

Eddie Epstein wrote:
> Muon,
>
> I'm not sure exactly what your question is. A CPE based on the CPM uses a
> Collection Reader with an optional Cas Initializer. In UIMA 2.x it is
> possible to have a Cas Multiplier as a UIMA aggregate component. A
> Collection Reader, minus Cas Initializer, is considered a subset of a Cas
> Multiplier and can also be used in a UIMA aggregate.
>
> If none of this answers your question, please try to clarify.
>
> Thanks,
> Eddie Epstein
>
> On 9/24/07, Muon Le <[EMAIL PROTECTED]> wrote:
>> Hi Adam,
>>
>> I am Muon LE. I would like to replace my CAS Initializers with CAS
>> Multipliers.
>> I know that the CPE (UIMA-2.2) has a limitation on using CAS
>> Multipliers.
>> Do you know when this limitation will be removed?
>>
>> Thank you,
>> Muon LE.
>>
>> -----Original Message-----
>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Adam
>> Lally
>> Sent: Thursday, July 19, 2007 15:38
>> To: [email protected]
>> Subject: Re: Multi-threading with a CAS Multiplier.
>>
>> On 7/19/07, Benjamin Sznajder <[EMAIL PROTECTED]> wrote:
>>> <snip/>
>>> I would like to get your opinion about the following workaround:
>>> Why don't we hide the steps done by the CAS Multiplier in the
>>> Collection Reader: the collection reader will read a 10-minute
>>> document and will create 10 CASes, corresponding to our 5 and 5 CASes
>>> of video and speech of 2 minutes duration?
>>> If we do the above, then setting the processingUnitThreadCount to 3
>>> (or more) will create three (or more) instances of our
>>> AggregateEngine2 and we would get real parallelization between our
>>> 10 CASes. Am I missing something?
>>
>> That should work. As Eddie said, the CPE understands Collection Readers
>> but doesn't know anything about CAS Multipliers.
>>
>> -Adam
>>
>
