Sounds great James, I'll be happy to test it.

Reading that Stack Overflow question I also found this other one:
http://stackoverflow.com/questions/499010/java-how-to-determine-the-correct-charset-encoding-of-a-stream
which mentions:
http://code.google.com/p/juniversalchardet/

This could be used to detect the encoding of the source doc and write the output in the same encoding.

On Mon, Dec 12, 2011 at 3:11 AM, James Kosin <james.ko...@gmail.com> wrote:
> Hi,
>
> I'm currently working on patches to try and fix all this. It really
> depends on the platform running the Java application and is not so much a
> problem with Java.
> I've found references to other articles that go into great detail
> on this issue.
>
> James
>
> On 12/11/2011 3:13 PM, György Chityil wrote:
>> Hi Jörn,
>>
>> Meanwhile I researched the output encoding issue, and found this
>> http://stackoverflow.com/questions/2415597/java-how-to-detect-and-change-encoding-of-system-console
>>
>> Perhaps the output encoding could be passed in as an arg for the
>> opennlp console, and UTF-8 could be defined as the default.
>

-- 
Gyuri
274 44 98 06 30 5888 744
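
P.S. To make the "pass the output encoding as an arg, default to UTF-8" idea concrete, here is a rough sketch using only the standard library. The `-encoding` flag name, the `ConsoleEncoding` class, and its helper methods are my own assumptions for illustration, not anything that exists in the opennlp console today.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class ConsoleEncoding {

    // Assumed default, as suggested in this thread.
    static final String DEFAULT_ENCODING = "UTF-8";

    // Looks for a hypothetical "-encoding <name>" pair in the arguments;
    // falls back to UTF-8 when the flag is absent.
    static String resolveEncoding(String[] args) {
        for (int i = 0; i < args.length - 1; i++) {
            if ("-encoding".equals(args[i])) {
                return args[i + 1];
            }
        }
        return DEFAULT_ENCODING;
    }

    // Wraps a stream so that text is encoded explicitly instead of
    // relying on the platform default charset.
    static PrintStream wrap(OutputStream out, String encoding)
            throws UnsupportedEncodingException {
        return new PrintStream(out, true, encoding);
    }

    public static void main(String[] args) throws Exception {
        PrintStream out = wrap(System.out, resolveEncoding(args));
        // Unicode escapes keep the source file itself encoding-neutral.
        out.println("Gy\u00f6rgy, J\u00f6rn");
    }
}
```

Running it with `-encoding ISO-8859-1` would switch the output bytes accordingly. Whether the terminal then displays them correctly still depends on the platform, as James pointed out.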