Hi Tol (and Maite),
I'm not entirely certain that I understand the question, but here is an attempt
to help. If I'm oversimplifying then I apologize.
I think that ExampleAggregatePipeline is intended to represent a very simple
single-note pipeline and that custom code could be produced by using it as an
example.
If you want to process texts in a directory, a web search will turn up
plenty of ways to list files in a directory and read text from files.
org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader might be what you
used in the CPE, and you can certainly peruse the code and take what you need.
Or, if you decide to write something simple yourself, here is one possibility:
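// needs java.io.File, java.util.ArrayList, java.util.Collection
static public Collection<File> getFilesInDir( final File directory ) {
   final File[] files = directory.listFiles();
   // listFiles() returns null if the path is not a directory or cannot be read
   if ( files == null ) {
      System.err.println( "please check the directory " +
            directory.getAbsolutePath() );
      System.exit( 1 );
   }
   final Collection<File> fileList = new ArrayList<>();
   for ( final File file : files ) {
      // skip subdirectories; keep only readable regular files
      if ( file.isFile() && file.canRead() ) {
         fileList.add( file );
      }
   }
   return fileList;
}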
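// needs java.io.IOException, java.nio.file.Files, java.nio.file.Path
// throws IOException; alternatively, catch and handle it in this method
static public String getTextInFile( final File file ) throws IOException {
   final Path nioPath = file.toPath();
   // decodes with the platform default charset
   return new String( Files.readAllBytes( nioPath ) );
}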
static public void main( String ... args ) throws IOException {
   // check the length first; args[0] does not exist when no argument is given
   if ( args.length == 0 || args[0].isEmpty() ) {
      System.out.println( "Enter a directory path" );
      System.exit( 0 );
   }
   final Collection<File> files = getFilesInDir( new File( args[0] ) );
   for ( File file : files ) {
      final String note = getTextInFile( file );
      // --- Insert here code a la ExampleAggregatePipeline ---
      // --- swap out the writer in ExampleAggregatePipeline with the
      //     CasIOUtil method (below) ---
   }
}
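For comparison, here is a minimal, self-contained sketch of the same directory-listing and file-reading steps using only java.nio. The class name NoteFiles and its method names are made up for illustration and are not part of cTAKES or uimaFIT, and UTF-8 is an assumption about how the notes are encoded. The pipeline step is left out; main only reports how much text was read.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class; the name is an assumption, not a cTAKES or uimaFIT type.
final class NoteFiles {

    // Returns the readable regular files directly inside the directory, sorted by path.
    static List<Path> listReadableFiles(final Path directory) throws IOException {
        // Files.list keeps a directory handle open, so close it with try-with-resources.
        try (Stream<Path> entries = Files.list(directory)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(Files::isReadable)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    // Reads the whole file as UTF-8 (an assumption; use the charset your notes were written in).
    static String readText(final Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(final String... args) throws IOException {
        if (args.length == 0) {
            System.err.println("Enter a directory path");
            System.exit(1);
        }
        for (final Path file : listReadableFiles(Paths.get(args[0]))) {
            final String note = readText(file);
            // The pipeline code would go here; this sketch only reports what was read.
            System.out.println(file.getFileName() + ": " + note.length() + " characters");
        }
    }
}
```

Naming the charset explicitly avoids surprises when the same notes are processed on machines with different default encodings.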
I must admit that I have never directly used it, but there is an XMI file
writing method in org.apache.uima.fit.util.CasIOUtil named writeXmi( JCas jCas,
File file ). You could give this a try and see if it produces the type of
output that you want. The same utility class also has a writeXCas(..) method.
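Since writeXmi takes a File, each note needs its own output file. Here is a small sketch of one way to derive it. The class name XmiNames, the method name, and the naming scheme (input file name plus ".xmi" in a separate output directory) are all assumptions for illustration; I don't know that the CPE names its output this way. The writeXmi call is only shown as a comment so that the sketch runs without uimaFIT on the classpath.

```java
import java.io.File;

// Hypothetical helper class; the name and the naming scheme are assumptions.
final class XmiNames {

    // Maps an input note to <outputDir>/<input file name>.xmi
    static File xmiFileFor(final File inputFile, final File outputDir) {
        return new File(outputDir, inputFile.getName() + ".xmi");
    }

    public static void main(final String... args) {
        final File xmiFile = xmiFileFor(new File("notes/note1.txt"), new File("output"));
        System.out.println(xmiFile.getPath());
        // After the pipeline has run on the note's JCas, the write step would be:
        // CasIOUtil.writeXmi( jCas, xmiFile );
    }
}
```

Keeping the full input name, extension included, means note1.txt and note1.xml would not overwrite each other's output.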
If the above has absolutely nothing to do with your needs then please send me a
bulleted list of items, example workflow, etc. and I'll see if I can be of
service.
Oh, and I wrote the above code freehand in MS Outlook, which likes to add
capital letters, etc. If you cut and paste, check for that. I also haven't
run or compiled it, so there might be a typo or missed exception or something.
Or it may not work (in which case I'll throw in a little more effort).
Sean
-----Original Message-----
From: Tol O. [mailto:[email protected]]
Sent: Monday, February 02, 2015 6:56 PM
To: [email protected]
Subject: Re: Question about the pipeline
Maite Meseure Hugues <meseure.maite@...> writes:
>
> Hello all,
>
> Thank you for your preceding answers.
> I have a few questions regarding the pipeline example to run cTAKES
> programmatically.
> I am running ExampleAggregatePipeline.java with
> ExampleHelloWorldAnnotator but I would like to know how I can change
> it to run on my data, as in the CPE, where we can choose the directory of
> our data.
> My second question is about the XML output generated with the CPE: can
> I get the same XML output using the example pipeline? And how?
> Thanks for your time.
I would like to ask the same question. After successfully setting up cTAKES
following the Developers Guide, I would also like to use a modified
ExampleAggregatePipeline to output a CAS file identical to the output obtained
from the CPE or the CVD when following the Users Guide.
This would be a great help for developers as a starting class to be able to
programmatically obtain an annotated file based on a plaintext or XML input,
same as through the two GUIs.
Right now I am reading through the Component Use Guide to replicate the CPE or
the CVD tutorial with the test input, but it is a bit overwhelming.
Any pointers or suggestions would be really appreciated.
Tol O.