Is each document to be analyzed independently of the others,
so that, for example, annotation offsets are relative to the
beginning of each document? If not, the documents can be
concatenated and analyzed together as a single text.
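
To make the offset point concrete, here is a minimal plain-Java sketch (no UIMA involved) of what concatenation implies: you have to remember where each document starts in order to map an offset in the combined text back to a document-relative one. The class name ConcatenatedOffsets, its methods, and the newline separator are assumptions made up for this illustration, not part of any UIMA API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: concatenates documents and remembers where each starts.
public class ConcatenatedOffsets {
    // Start offset of each document within the concatenated text.
    private final List<Integer> starts = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    // Append one document, separated from the previous one by a newline.
    public void add(String document) {
        if (text.length() > 0) {
            text.append('\n');
        }
        starts.add(text.length());
        text.append(document);
    }

    public String text() {
        return text.toString();
    }

    // Index of the last document whose start is at or before the offset.
    public int documentAt(int globalOffset) {
        int doc = 0;
        for (int i = 0; i < starts.size(); i++) {
            if (starts.get(i) <= globalOffset) {
                doc = i;
            }
        }
        return doc;
    }

    // Convert an offset in the concatenated text to a document-relative one.
    public int localOffset(int globalOffset) {
        return globalOffset - starts.get(documentAt(globalOffset));
    }

    public static void main(String[] args) {
        ConcatenatedOffsets c = new ConcatenatedOffsets();
        c.add("first document");
        c.add("second one");
        int global = c.text().indexOf("one");
        System.out.println("doc " + c.documentAt(global)
                + ", local offset " + c.localOffset(global));
    }
}
```

If you never need document-relative offsets, none of this bookkeeping is necessary and plain concatenation is enough.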

If the documents are to be considered independently, the
annotator has to process each one separately. One option is to
create a view for each document and let the annotator
iterate over all the views. Of course, since the CAS is memory
resident, there is a natural limit to the total size of the
documents that can be processed this way.
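
A rough sketch of the view-per-document idea, assuming uimaj-core is on the classpath. The class ViewPerDocument, the methods load and process, and the use of the map keys as view names are assumptions for illustration only; in a real pipeline the collection reader would create the views and the iteration would live inside the annotator's process method. The CAS is built here with an empty type system just to keep the example standalone.

```java
import java.util.Iterator;
import java.util.Map;

import org.apache.uima.UIMAFramework;
import org.apache.uima.cas.CAS;
import org.apache.uima.resource.metadata.TypeSystemDescription;
import org.apache.uima.util.CasCreationUtils;

// Hypothetical illustration: one CAS view per document.
public class ViewPerDocument {

    // Build a CAS holding one view per entry (view name -> document text).
    public static CAS load(Map<String, String> documents) throws Exception {
        TypeSystemDescription types =
                UIMAFramework.getResourceSpecifierFactory().createTypeSystemDescription();
        CAS cas = CasCreationUtils.createCas(types, null, null);
        for (Map.Entry<String, String> doc : documents.entrySet()) {
            CAS view = cas.createView(doc.getKey());
            view.setDocumentText(doc.getValue());
        }
        return cas;
    }

    // Visit every document view; returns how many views were processed.
    public static int process(CAS cas) {
        int processed = 0;
        Iterator<CAS> views = cas.getViewIterator();
        while (views.hasNext()) {
            CAS view = views.next();
            // Skip the default view, which holds no document in this sketch.
            if (CAS.NAME_DEFAULT_SOFA.equals(view.getViewName())) {
                continue;
            }
            // Offsets of annotations added here are relative to this view's text.
            String text = view.getDocumentText();
            System.out.println(view.getViewName() + ": " + text.length() + " chars");
            processed++;
        }
        return processed;
    }
}
```

All views live in the same in-memory CAS, so the size limit mentioned above applies to the sum of the document texts plus their annotations.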


On Sun, Nov 14, 2010 at 10:10 AM, Drenski <[email protected]> wrote:
> Hi,
> I am new to UIMA and I have been struggling for some time
> with the following problem.
> I have some documents, which I need to process simultaneously.
> So I implemented a collection reader, which reads all the files
> from a directory and annotates them as Documents. But how can
> I put all these files in an array, for example, so that I can
> iterate over them and do my further processing? Basically I
> just want to fetch the files from the directory and put
> them in an array so that I can process them.
> Is a CAS consumer what I need? I saw in the docs that
> it is now deprecated. Or should I use some index like Lucene?
> But I guess this would be too complex for my simple task?
> I would appreciate any suggestions.
> Regards,
> Drenski
>
>
