Hi folks, I've been trying to run the WhitespaceTokenizer AE (for example) on the CAS documents of a Corpus folder within the CAS Editor for some time now, with no success. I run into a class not found exception each time, indicating that the WhitespaceTokenizer class cannot be found.
I don't know if I'm overlooking something here or if I'm just dumb... I'm using Apache UIMA 2.3, the CAS Editor plugin from the repository, and Eclipse 3.3.2.

Here is what I've tried:

1- The simple way: create the CAS Editor project, import the type system, create a corpus directory, import text files, create a processing directory and import the WhitespaceTokenizer (let's say WT from now on) descriptor into it. Finally, click on a document in the corpus directory: Annotator > WT.xml
=> it fails

2- The improved way: do the same as the simple way, then open Properties > UIMA CDE Property page and specify the absolute path to the WT jar.
=> it fails the same way

3- The hacky way: create a Java project and specify the Build path so that the WT is loaded, edit the .project file, add the NLPProject nature, open the CAS Editor perspective and redo the simple way
=> it still fails

Well, I'm out of ideas... Does anybody have a solution to my problem? Has anybody already been able to launch an AE on a corpus from the CAS Editor?

I'd be grateful for any hint!

--
Fabien Poulard
