Hi Eric,

A collection reader typically does three things: it creates or accesses the content to be analyzed, obtains a new CAS to be passed to other CAS processors, and initializes the CAS with the content (assuming the CAS initializer is part of the collection reader). These three things could be done in different components. For example, the collection reader could populate a CAS with a pointer to new content and a content type label, and a second component could then access the content itself and populate the CAS. The second component could be selected based on the content type and would not have to be a CAS multiplier.
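
The split described above could be sketched roughly as follows. This is a language-neutral illustration in Python, not UIMA code: the `Cas` class, `collection_reader`, `ContentLoader`, the `mem://` URIs and the decoder table are all hypothetical stand-ins, and nothing here calls the real UIMA API.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List, Optional, Tuple


@dataclass
class Cas:
    """Hypothetical stand-in for a UIMA CAS (not the real UIMA class)."""
    source_uri: str                      # pointer to the content
    content_type: str                    # content type label
    document_text: Optional[str] = None  # filled in by a later component


def collection_reader(entries: Iterable[Tuple[str, str]]) -> Iterator[Cas]:
    """Step 1: emit one lightweight CAS per item without loading any content."""
    for uri, content_type in entries:
        yield Cas(source_uri=uri, content_type=content_type)


class ContentLoader:
    """Step 2: a downstream component that fills each CAS, chosen by type.

    It returns exactly one CAS per input CAS, so it is not a CAS multiplier.
    """

    def __init__(self,
                 fetch: Callable[[str], bytes],
                 decoders: Dict[str, Callable[[bytes], str]]) -> None:
        self._fetch = fetch
        self._decoders = decoders

    def process(self, cas: Cas) -> Cas:
        decoder = self._decoders.get(cas.content_type)
        if decoder is None:
            raise ValueError(f"no decoder for content type {cas.content_type!r}")
        cas.document_text = decoder(self._fetch(cas.source_uri))
        return cas


def demo() -> List[str]:
    """Run the two steps over a small in-memory content store."""
    store = {"mem://a.txt": b"hello", "mem://b.html": b"<p>hi</p>"}
    loader = ContentLoader(
        fetch=store.__getitem__,
        decoders={
            "text/plain": lambda raw: raw.decode("utf-8"),
            # Naive tag stripping, for illustration only.
            "text/html": lambda raw: re.sub(r"<[^>]+>", "", raw.decode("utf-8")),
        },
    )
    entries = [("mem://a.txt", "text/plain"), ("mem://b.html", "text/html")]
    return [loader.process(cas).document_text for cas in collection_reader(entries)]
```

With this shape, supporting a new format means adding one decoder entry rather than writing a new reader for each combination of crawler and format.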

I don't yet understand enough about your configuration to see the CAS pool issue.

Regards,
Eddie

On 9/25/07, Eric Vachon <[EMAIL PROTECTED]> wrote:
>
> Hi Eddie,
>
> In fact we would like to have the unzipping and other file aggregation
> processes resolved outside the collection reader, in order to be able to
> handle numerous formats and numerous crawling facilities without having
> to create a collection reader for every combination of crawling and
> format. The CASMultiplier looked like a simple and efficient way to manage
> this case, if it worked inside the CPE. As it is today, we handle the
> process through parameters, but it would be easier for people using our
> collection reader to be able to add the CASMultiplier in the CPE to
> handle the format.
>
> The problems we face with the current implementation are that there is no
> CAS pool for the CASMultiplier, which can lead to memory issues, and that
> the CPE does not handle this type of processor. Using an aggregate
> annotator would make error management more difficult.
>
> We can manage without the CASMultiplier by using plugins for our
> collection readers, but we would rather use a standard UIMA feature.
>
> Best,
> Eric Vachon
>
> Eddie Epstein wrote:
> > Muon,
> >
> > I'm not sure exactly what your question is. A CPE based on the CPM uses a
> > Collection Reader with an optional Cas Initializer. In UIMA 2.x it is
> > possible to have a Cas Multiplier as a UIMA aggregate component. A
> > Collection Reader, minus Cas Initializer, is considered a subset of a Cas
> > Multiplier and can also be used in a UIMA aggregate.
> >
> > If none of this answers your question, please try to clarify.
> >
> > Thanks,
> > Eddie Epstein
> >
> > On 9/24/07, Muon Le <[EMAIL PROTECTED]> wrote:
> >> Hi Adam,
> >>
> >> I am Muon LE. I would like to replace my CAS Initializers with CAS
> >> Multipliers.
> >> I know there is a limitation in the CPE (UIMA-2.2) regarding the use
> >> of CAS Multipliers.
> >> Do you know when this limitation will be removed?
> >>
> >> Thank you,
> >> Muon LE.
> >>
> >> -----Original Message-----
> >> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Adam
> >> Lally
> >> Sent: Thursday, 19 July 2007 15:38
> >> To: [email protected]
> >> Subject: Re: Multi-threading with a CAS Multiplier.
> >>
> >> On 7/19/07, Benjamin Sznajder <[EMAIL PROTECTED]> wrote:
> >>> <snip/>
> >>> I would like to get your opinion about the following workaround:
> >>> Why don't we hide the steps done by the CAS Multiplier in the
> >>> Collection Reader: the collection reader will read a 10-minute
> >>> document and create 10 CASes, corresponding to our 5 video CASes
> >>> and 5 speech CASes of 2 minutes each?
> >>> If we do the above, then setting processingUnitThreadCount to 3 (or
> >>> more) will create three (or more) instances of our AggregateEngine2
> >>> and we would get real parallelization between our 10 CASes. Am I
> >>> missing something?
> >>
> >> That should work. As Eddie said, the CPE understands Collection Readers
> >> but doesn't know anything about CAS Multipliers.
> >>
> >> -Adam
