Hi, thanks! That helps a lot!

My data is in plain text files. Can I assume the other descriptors work the same way? There are also multiple AE XMLs in desc\ctakes-core\desc\analysis_engine and CC XMLs in desc\ctakes-core\desc\cas_consumer, and I only need one of each. In my case, would those be AggregateAE and FilesInDirectoryCasConsumer.example?

Thanks,
Yi-Wen

On Sat, Nov 7, 2015 at 3:46 AM, Miller, Timothy <[email protected]> wrote:

> Hi Yi-Wen,
> There are different collection readers for different data sources, and we
> usually try to give them descriptive names.
> FilesInDirectoryCollectionReader is one of the most useful ones -- it will
> look for a list of text files in a directory and put one file in each cas.
> If your data is in that format or is easy to convert to that format that's
> probably a good starting point.
> Tim
>
> ________________________________________
> From: Yi-Wen Liu <[email protected]>
> Sent: Saturday, November 7, 2015 12:59 AM
> To: [email protected]
> Subject: CR descriptor
>
> Hi,
>
> I am looking for the main collection reader(CR) in cTAKES in order to do
> scale out on UIMA DUCC. And in des/ctakes-core/des/collection_reader/,
> there are multiple CR xml files. I am not sure which is the one that should
> be specified in DUCC's job file...are they all necessary in cTAKES job or
> some of them are offered for other reference?
>
> I am not familiar with cTAKES structure so hope somebody can help me out,
> thanks!
>
> Thanks,
> Yi-Wen
