On 27/05/12 16:59, Seid Muhie wrote:
> Dear Thilo Goetz
> Thank you for your response
>
> I have already tried different ways of reading the text file with different
> encodings.
>
> For example, using the commons IO FileUtils class, I tried as follows
>
> ............
> String document = FileUtils.file2String(inputFile, "UTF-8");
> tcas.setDocumentText(document);
> tae.process(tcas);
> ......
>
> It again gets stuck at the process() method. It seems the problem is with that method

What do you mean, stuck at? Have you tried debugging to see what it's doing?

>
> thank you very much
>
> On Sun, May 27, 2012 at 8:42 AM, Thilo Goetz <[email protected]> wrote:
>
>> On 26/05/12 23:13, Seid Muhie wrote:
>>> dear all
>>> I have a Unicode document I want to process.
>>> Following the tutorial at
>>> this<http://www.ibm.com/developerworks/webservices/tutorials/ws-uima/>,
>>> the code gets stuck at the last line.
>>>
>>> File taeDescriptor = new File("desc\\DateAnnotatorAEDescriptor.xml");
>>> File inputFile = new File("data\\document1.txt");
>>> XMLInputSource in = new XMLInputSource(taeDescriptor);
>>> ResourceSpecifier specifier =
>>>     UIMAFramework.getXMLParser().parseResourceSpecifier(in);
>>> AnalysisEngine tae = UIMAFramework.produceAnalysisEngine(specifier);
>>> CAS tcas = tae.newCAS();
>>> FileInputStream fis = new FileInputStream(inputFile);
>>> byte[] contents = new byte[(int) inputFile.length()];
>>> fis.read(contents);
>>> fis.close();
>>> String document = new String(contents);
>>> tcas.setDocumentText(document);
>>> *tae.process(tcas);*
>>>
>>> thank you.
>>>
>>
>> Please check the web on how to read in a text file with
>> a specific encoding. An easy way is to use commons io.
>>
>> --Thilo
>>
>>
>
>
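For reference, here is a minimal sketch of reading a text file with an explicit encoding using only the JDK, rather than commons io. It assumes the file is UTF-8. The class name ReadWithEncoding and the helper readUtf8 are made up for illustration and are not part of UIMA or the tutorial. In the original snippet, new String(contents) decodes with the platform default charset, and a single fis.read(contents) call is not guaranteed to fill the buffer. Whether either of those is related to process() not returning is a separate question that a debugger or thread dump would have to answer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadWithEncoding {

    // Hypothetical helper: reads the whole file and decodes it as UTF-8
    // (assumption: the input document really is UTF-8 encoded).
    static String readUtf8(Path file) throws IOException {
        byte[] bytes = Files.readAllBytes(file);          // reads every byte of the file
        return new String(bytes, StandardCharsets.UTF_8); // explicit charset, not the platform default
    }

    public static void main(String[] args) throws IOException {
        // Write a small non-ASCII sample to a temp file, then read it back.
        Path tmp = Files.createTempFile("document", ".txt");
        try {
            String sample = "\u1230\u120B\u121D UIMA";
            Files.write(tmp, sample.getBytes(StandardCharsets.UTF_8));

            String document = readUtf8(tmp);
            System.out.println(document.equals(sample)); // round trip preserved the text
        } finally {
            Files.delete(tmp);
        }
    }
}
```

The resulting string could then be passed to tcas.setDocumentText(document) in place of the one built with new String(contents).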
